PermForge · v0 launch · 2026 H2 · PermForge (not Perforce; we are an AI agent permission runtime)

PermForge

The permission control and audit evidence layer for AI agents.

Post-hoc audits miss 41.7% of what your agent actually does.

Source · AgentLeak benchmark · 4,979 traces · 2026

The setup · 01

Agents now run 220+ sub-calls per task. Human-speed oversight breaks structurally.

a16z calls it the "thundering herd." One prompt fans out across LLM calls, vector lookups, tool calls, and sub-agents — at agent speed, not human speed.

  1. 01
    220+ sub-calls per task

    recursive fan-out across LLM, vector DB, tool calls, sub-agents

  2. 02
    5,000 sub-tasks in milliseconds

    new median for production vertical AI agents

  3. 03
    0 human-speed approvals work

    OAuth · RBAC · step-up · manual review · all break at this scale

Figure · One prompt → 5 sub-agents → 25 tool calls → permission boundary crossings everywhere, and it keeps going

The problem · 02

Three time phases of agent governance. The middle one is empty.

Regulators have already named the empty box: "proportionate real-time oversight" (EU AI Act Article 14). Today the market sells policy authoring before, and tracing after. The during is unowned.

  1. Before · T-1

    Policy drift.

    Static linting and pre-prod rules try to anticipate every edge case — but agents adapt faster than policies update.

    Product · static lint · prompt firewalls
  2. During · T-0

    No enforcement.

    Mid-execution, while the agent is calling tools, expanding scopes, and spawning sub-agents — nothing inspects the decision graph.

    PermForge fills this →
  3. After · T+1

    Audit gaps.

    Post-hoc trace tools find what already shipped. By then the gap, leak, or hallucinated commit is in production.

    Product · Braintrust · LangSmith · Langfuse

Who's bleeding · 03

15+ named buyers. Already funded. Already regulated. Already shipping.

Combined funding ≥ $13B across the three verticals where regulated AI agents are already in production. Every one of them has the same gap.

  1. Legal

    Funded · Regulated · Shipping

    Rule 1.6 client-matter wall · ABA 5.3 attorney supervision

    • Harvey
    • Legora
    • EvenUp
    • Crogl
    • Eve
    • Wordsmith
  2. Healthcare

    Funded · HIPAA · Shipping

    HIPAA Minimum Necessary · per-call PHI necessity (CMS 2026-03)

    • Hippocratic
    • Abridge
    • Notable
    • OpenEvidence
  3. Financial Compliance

    Funded · MNPI · Shipping

    MNPI propagation · KYC reasoning chain · EU AI Act high-risk

    • Norm Ai
    • Themis
    • Greenlite
    • Hummingbird

Source · public funding filings · Crunchbase · LinkedIn job-post permissions language · sample, not exhaustive

Why now · 04

Four enforcement deadlines stack in 2026 H2.

The permission control gap becomes a legal and cash-flow risk this quarter. Network insurers and customer SOC 2 reviewers have already begun treating an "agent permission audit trail" as a renewal condition.

  1. 2026-06-30 T-35 days

    Colorado AI Act

    Cure period ends. First enforcement day for "high-risk AI" duty to disclose & manage.

  2. 2026-08-02 T-68 days

    EU AI Act

    High-risk system obligations take effect. Article 14 demands proportionate real-time oversight. Fines under the Act reach up to €35M or 7% of global annual turnover.

  3. 2026-02 · ongoing

    ABA Model Rule 5.3

    Extended to AI agents acting under attorney supervision. Law firm AI procurement now requires audit trail conformity.

  4. 2026-03 · ongoing

    CMS HIPAA Minimum Necessary

    Clarified: agent-driven PHI access must demonstrate per-call necessity. Hits the fan-out failure mode directly.

The product · 05

Five capabilities. All inline, at sub-call latency.

PermForge sits between the agent runtime and every privileged action it tries to take. We don't sit beside it as a tracing tool — we gate it as a runtime control plane.
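To make "gate, not trace" concrete, here is a minimal Python sketch of an inline gate. It is an illustration under stated assumptions, not the PermForge SDK: `gated`, `ActionRequest`, `PermissionDenied`, and the toy `same_matter_only` policy are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable


class PermissionDenied(Exception):
    """Raised when the gate refuses a privileged action."""


@dataclass(frozen=True)
class ActionRequest:
    agent_id: str  # hypothetical identifier for the calling agent
    action: str    # e.g. "read_document"
    resource: str  # e.g. "matter-17/contract.pdf"


# Hypothetical policy type: return True to allow, False to deny.
Policy = Callable[[ActionRequest], bool]


def gated(policy: Policy, action: str) -> Callable:
    """Wrap a tool so the policy runs before the tool body does."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(tool)
        def wrapper(agent_id: str, resource: str, *args: Any, **kwargs: Any) -> Any:
            request = ActionRequest(agent_id, action, resource)
            if not policy(request):
                # The tool body never runs: this is a gate, not a trace.
                raise PermissionDenied(f"{agent_id} may not {action} {resource}")
            return tool(agent_id, resource, *args, **kwargs)
        return wrapper
    return decorator


def same_matter_only(request: ActionRequest) -> bool:
    # Toy client-matter wall: agent "a-17" may only touch "matter-17/..." resources.
    matter = request.agent_id.split("-", 1)[1]
    return request.resource.startswith(f"matter-{matter}/")


@gated(same_matter_only, action="read_document")
def read_document(agent_id: str, resource: str) -> str:
    return f"contents of {resource}"
```

Calling `read_document("a-17", "matter-17/contract.pdf")` returns the document, while `read_document("a-17", "matter-42/memo.pdf")` raises `PermissionDenied` before the tool body executes. A tracing tool would only record the second call after the fact.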

  1. 01 INTERCEPT

    Inline interception

    Hook the 4 entry points where agent permission creep happens: tool call · scope upgrade · sub-agent spawn · token forwarding. Miss one, you still leak.

  2. 02 ROUTE

    Risk-graded routing

    Low risk auto-passes inline. Medium batches to async approval. High blocks and escalates with full evidence chain. Policy templates ship per regulation.

  3. 03 ELICIT

    Async approval

    Batch elicitation collapses 220 sub-call asks into 5 human decisions. Slack · mobile push · SMS · Magic Link. Timeout defaults are policy-driven.

  4. 04 EVIDENCE

    Audit-grade evidence

    Every request → decision → approver → timestamp → outcome is signed and immutable. Maps directly to EU AI Act Annex III & ABA 5.3 evidence requirements.

  5. 05 KILL

    Circuit breaker

    A wrong approval can be revoked. Anomalous agent behavior triggers a kill. This is the contractual "right to interrupt" your large customers will ask for in 2026 H2.
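The capabilities above can be sketched together in a few dozen lines of Python. This is a toy model under loud assumptions, not PermForge's implementation: the `Ask` and `ControlPlane` names are hypothetical, risk scoring is taken as given, the batching key (action plus top-level resource prefix) is an arbitrary choice, and a SHA-256 hash chain stands in for real signing. Timestamps and approver identity are omitted to keep the example deterministic.

```python
import hashlib
import json
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ask:
    agent_id: str
    action: str    # "tool_call" | "scope_upgrade" | "subagent_spawn" | "token_forward"
    resource: str
    risk: str      # "low" | "medium" | "high"; scoring is out of scope here


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class ControlPlane:
    killed: set = field(default_factory=set)                          # circuit breaker
    pending: dict = field(default_factory=lambda: defaultdict(list))  # batched asks
    evidence: list = field(default_factory=list)                      # hash-chained log

    def route(self, ask: Ask) -> str:
        if ask.agent_id in self.killed:
            decision = "blocked"   # KILL: a killed agent gets nothing through
        elif ask.risk == "low":
            decision = "allowed"   # ROUTE: low risk auto-passes inline
        elif ask.risk == "medium":
            # ELICIT: similar asks share one batch, so one human decision covers many.
            key = (ask.action, ask.resource.split("/")[0])
            self.pending[key].append(ask)
            decision = "queued"
        else:
            decision = "blocked"   # ROUTE: high risk blocks and escalates
        self._record(ask, decision)
        return decision

    def kill(self, agent_id: str) -> None:
        self.killed.add(agent_id)

    def _record(self, ask: Ask, decision: str) -> None:
        # EVIDENCE: each entry commits to the previous entry's hash.
        prev = self.evidence[-1]["hash"] if self.evidence else "0" * 64
        body = {"agent": ask.agent_id, "action": ask.action,
                "resource": ask.resource, "decision": decision, "prev": prev}
        self.evidence.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = "0" * 64
        for entry in self.evidence:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

In this sketch, four medium-risk `tool_call` asks against `ehr/...` resources land in a single pending batch, so a reviewer answers once instead of four times. Editing any logged decision afterwards makes `verify()` return `False`. A production evidence layer would also need real signatures and trusted timestamps, which a bare hash chain does not provide.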

The proof · 06

Four benchmarks pin the gap. Not opinion. Public data.

Buyers we sell to have already cited at least one of these in their internal AI risk reviews. We don't argue the gap — we measure it on your traces in a 1-week shadow audit.

  1. AgentLeak · visibility gap

    Multi-agent privacy violations missed by output-only audits. 4,979 production traces. Inter-agent channel = 68.9% of leakage, invisible to Braintrust / LangSmith.

  2. τ-bench · policy compliance gap

    Even SOTA agents fail organizational policy in 1 of 10 multi-turn workflows. Gap is structural — not a model upgrade fix.

  3. AgentHarm · model self-defense gap

    Refusal-trained LLMs are easily jailbroken when operating as browser agents. Built-in safety training fails at agent-time. External control plane is mandatory.

  4. Braintrust 2026 buyer guide

    "Shows trace after users complain · can't block before it ships." Their own buyer guide confirms post-hoc is structurally late. The "during" is empty by design.

Market map · 07

Three control surfaces × three time phases. One cell is open.

Real-time × Permission is the cell every regulated vertical AI agent needs by 2026 H2. Static authorization (OAuth, OPA) and behavior sandboxing don't cover it. Tracing doesn't either.

  1. Content

    Before · T-1: Aporia · Lasso · PromptArmor (prompt firewalls, content lint)

    During · T-0: No real-time content blocking with policy graph

    After · T+1: Braintrust · LangSmith · Langfuse (output-only tracing · post-hoc)

  2. Permission ★

    Before · T-1: OAuth · OPA · Cedar (static authorization · not agent-aware)

    During · T-0: PermForge ★ (sub-call inline policy enforcement)

    After · T+1: No audit-grade evidence trail for runtime authority decisions

  3. Behavior

    Before · T-1: Anthropic computer-use confirmations (single-point UX gates · not policy-driven)

    During · T-0: Sandbox: E2B · Daytona · Modal (execution sandboxing · not authority gating)

    After · T+1: AgentOps · Logfire (replay & debug · post-hoc)

Legend · existing player · partial / token coverage · PermForge

Open source PermBench · v0.1 launch · 2026-06-15

Our open-source benchmark
PermBench

How regulated vertical AI agents score on sub-call permission control. Run it on your own agent. Compare against Harvey, Hippocratic, Norm Ai. Cite us in your AI risk review.

  • 120+ failure cases
  • 8 vertical regulations mapped
  • Apache 2.0 license · no rug-pull

§09 · shadow audit · how it works

Free shadow audit · 1 week.

You send 1 week of agent traces — de-identified is fine, or wire the SDK for a live capture. You get back a signed report listing every silent permission boundary your agent crossed, with regulatory exposure estimates.

First 12 audits free · June 2026 · 1-week turnaround · no commitment

  1. 01
    You send · A 1–7 day window of agent traces (de-identified is fine). Or wire our SDK for a live capture.
  2. 02
    We run · PermBench scoring + behavioral graph extraction. We map every cross-tenant access, scope upgrade, and silent ethical-wall hit.
  3. 03
    You get · A signed report: visibility gap %, regulatory exposure, recommended controls. Pilot pricing only if it warrants it.
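To give a feel for what step 02 looks for, here is a small Python sketch that scans a trace for two of the crossings named above: cross-tenant access and scope upgrades. The `TraceEvent` shape and the `find_crossings` function are assumptions made up for this example. They are not the PermBench format or scoring method, and the sketch treats each agent's first event as its baseline scope grant.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class TraceEvent:
    agent_id: str
    tenant: str             # tenant the agent is acting for (hypothetical field)
    resource_tenant: str    # tenant that owns the touched resource (hypothetical field)
    scopes: FrozenSet[str]  # scopes the call was made with


def find_crossings(events: List[TraceEvent]) -> List[Tuple[int, str]]:
    """Return (event index, finding) pairs for silent boundary crossings."""
    findings: List[Tuple[int, str]] = []
    granted: Dict[str, FrozenSet[str]] = {}  # scopes seen so far, per agent
    for index, event in enumerate(events):
        if event.resource_tenant != event.tenant:
            findings.append((index, "cross_tenant_access"))
        known = granted.get(event.agent_id)
        # The first event per agent sets the baseline; later extra scopes are upgrades.
        if known is not None and not event.scopes <= known:
            findings.append((index, "scope_upgrade"))
        granted[event.agent_id] = (known or frozenset()) | event.scopes
    return findings


trace = [
    TraceEvent("a1", "acme", "acme", frozenset({"read"})),
    TraceEvent("a1", "acme", "acme", frozenset({"read", "write"})),
    TraceEvent("a1", "acme", "globex", frozenset({"read"})),
    TraceEvent("a2", "acme", "acme", frozenset({"admin"})),
]
print(find_crossings(trace))
```

On this four-event trace the scan flags event 1 as a scope upgrade and event 2 as a cross-tenant access. Ethical-wall hits would need matter or client metadata that this toy event shape does not carry.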