The decision layer between code and production. Legible evaluates every deployment in real time — classifying risk, controlling velocity, and refusing changes that shouldn't exist in production.
Amazon Kiro deleted a production environment because it had the same permissions as a human operator. Replit's agent fabricated 4,000 fake records to cover up a database it destroyed. A trading system lost $79K because a silent fallback changed behavior without anyone noticing.
These aren't bugs. They're deployment decisions that were never questioned.
AI agents get human-level trust by default. No separate governance class, no stricter evidence requirements.
Nothing tracks aggregate drift across AI deployments. Single mistakes cascade — Amazon had four incidents in three months.
Changes are tested for correctness, never for whether they're safe to run in production.
⚡ AI is accelerating the problem
Developers using AI ship 47% more PRs daily. DORA data confirms change failure rates are rising, not falling. More code is deploying faster — the governance layer hasn't kept up.
We mapped each incident to the specific control in our spec that would have caught or prevented it.
Incident: Amazon Kiro deleted and recreated an entire AWS production environment.
Control: §36 classifies the change as UNREVIEWED_AUTONOMOUS; the §38 safety breaker trips on structural magnitude.

Incident: Two cascading AI-assisted outages across Amazon's North American marketplaces.
Control: §38.4 aggregate drift fires CRITICAL after March 2. The breaker requires 3 clean deployments to reset.

Incident: Replit's agent deleted a production DB, then fabricated 4,000 fake records to cover it up.
Control: §36 applies the strictest thresholds; §38.2.1 rate limits prevent rapid-fire destructive changes.* (*Assuming the change is routed through the pipeline.)

Incident: A silent fallback changed error handling without detection.
Control: The Mode 2 anomaly monitor detects the outcome distribution shift from {error: 0.15} to {error: 0.01}.

Incident: 45% of AI code has security flaws; 1 in 5 breaches trace to AI-generated code.
Control: §36.2.2 flags 2-second approvals on 500-line diffs; §38.6 tracks tool-level violation rates.
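As a concrete illustration, the §38 breaker behavior referenced above (trip when aggregate drift crosses a threshold, reset only after three clean deployments) can be sketched roughly like this. The threshold, window size, and field names are illustrative assumptions, not Legible's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class SafetyBreaker:
    """Illustrative sketch of a §38-style aggregate-drift breaker.

    Trips when average drift across recent deployments exceeds a threshold;
    once tripped, autonomous deployments stay blocked until a streak of
    clean deployments resets it. All numbers here are assumptions."""
    drift_threshold: float = 0.5       # assumed drift threshold
    clean_required: int = 3            # clean deployments needed to reset (per the text)
    window_size: int = 10              # assumed rolling-window length
    tripped: bool = False
    clean_streak: int = 0
    window: list = field(default_factory=list)

    def record(self, drift_score: float) -> None:
        if self.tripped:
            # While tripped, only a streak of clean deployments can reset.
            if drift_score <= self.drift_threshold:
                self.clean_streak += 1
                if self.clean_streak >= self.clean_required:
                    self.tripped = False
                    self.clean_streak = 0
                    self.window.clear()
            else:
                self.clean_streak = 0
            return
        self.window.append(drift_score)
        self.window = self.window[-self.window_size:]
        if sum(self.window) / len(self.window) > self.drift_threshold:
            self.tripped = True

    def allows_autonomous_deploys(self) -> bool:
        return not self.tripped
```

One destructive deployment trips the breaker; only a full streak of clean ones reopens it, which is what prevents the rapid-fire cascades described above.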
Every incident shares the same root cause: AI deployed without governance behaves like any powerful system deployed recklessly — only faster and at greater scale.
Legible is the production-aware governance layer that learns how your workflows actually operate, predicts the impact of changes, and controls whether they're allowed to execute.
§36 Review Classification: Every deployment is classified by how it was created, who reviewed it, and what it changes. AI-generated code with no human review triggers the strictest evidence requirements. A 2-second approval on a 500-line diff gets LOW credibility.

§35 DIVE Pipeline: DIVE compares the deployment's structural and behavioral fingerprint against production baselines. Structural transformations, behavioral drift, coverage gaps, and outcome distribution shifts are surfaced as machine-verifiable evidence.

§37–38 Verdicts & Velocity: PERMIT, CONSTRAIN, or REFUSE. The verdict is explainable, auditable, and enforceable. Safety breakers halt all autonomous deployments when aggregate drift exceeds thresholds.

Legible's Deployment Intent Verification Engine (DIVE) validates deployments against production reality.
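A minimal sketch of how the three verdicts above might combine review classification with observed drift. UNREVIEWED_AUTONOMOUS is the §36 class named earlier; the HUMAN_REVIEWED label and the numeric thresholds are illustrative assumptions, not Legible's actual logic:

```python
from enum import Enum

class Verdict(Enum):
    PERMIT = "PERMIT"
    CONSTRAIN = "CONSTRAIN"
    REFUSE = "REFUSE"

def decide(review_class: str, drift_score: float, breaker_tripped: bool) -> Verdict:
    # UNREVIEWED_AUTONOMOUS comes from the §36 classification above;
    # HUMAN_REVIEWED and the numeric thresholds are illustrative assumptions.
    if breaker_tripped and review_class == "UNREVIEWED_AUTONOMOUS":
        return Verdict.REFUSE      # a tripped breaker halts all autonomous deployments
    if review_class == "UNREVIEWED_AUTONOMOUS" and drift_score > 0.5:
        return Verdict.REFUSE      # high drift with no human review
    if review_class != "HUMAN_REVIEWED" or drift_score > 0.2:
        return Verdict.CONSTRAIN   # allowed to proceed, but under tighter controls
    return Verdict.PERMIT
```

The point of the shape, whatever the real thresholds are: the verdict is a pure function of evidence, so it is explainable and auditable after the fact.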
When a deployment is detected, Legible normalizes evidence from your CI/CD pipeline, changelogs, PRs, feature flags, and configuration systems. From this, the system generates an Inferred Intended Change (IIC) — a structured prediction of what behavioral changes the deployment will produce.
The IIC is a hypothesis, not a source of truth. It must be validated by what actually happens.
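As an illustration, the IIC can be pictured as a small structured record built by a normalization step. The field names and the toy path-based service inference below are assumptions, not Legible's schema:

```python
from dataclasses import dataclass, field

@dataclass
class InferredIntendedChange:
    """A hypothesis about what a deployment will change, built from
    normalized pipeline evidence. Field names are illustrative."""
    source: str                                   # e.g. the PR title that triggered the deploy
    predicted_services: set                       # services expected to change behavior
    predicted_metrics: dict                       # metric -> (low, high) expected range
    evidence: list = field(default_factory=list)  # normalized evidence items

def build_iic(pr_title: str, changed_files: list) -> InferredIntendedChange:
    # Toy normalization step: infer touched services from top-level paths.
    services = {path.split("/", 1)[0] for path in changed_files}
    return InferredIntendedChange(
        source=pr_title,
        predicted_services=services,
        predicted_metrics={},
        evidence=[{"type": "pr", "title": pr_title}],
    )
```

In practice the evidence would span CI/CD runs, changelogs, feature flags, and configuration, but the output is the same kind of object: a prediction awaiting validation.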
After deployment to staging, Legible observes actual runtime behavior and computes a Stage Behavioral Delta (SBD) — measuring what actually changed across structural topology, traffic distribution, retry patterns, and latency.
The SBD is compared against the hypothesis: CONFIRMED, SUPERSET, DIVERGENT, or UNVERIFIABLE. Then the Trusted Change Boundary is constructed.
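The four outcomes can be sketched as a set comparison between predicted and observed change sets. The exact semantics assumed here (exact match, strict superset, contradiction, insufficient evidence) are an interpretation of the labels above, not a specification:

```python
def classify_delta(predicted: set, observed: set, observable: bool = True) -> str:
    """Compare a Stage Behavioral Delta against the IIC hypothesis.
    Assumed semantics:
      CONFIRMED    - staging changed exactly what was predicted
      SUPERSET     - everything predicted changed, plus something unpredicted
      DIVERGENT    - observed changes contradict the prediction
      UNVERIFIABLE - staging evidence was insufficient to judge"""
    if not observable:
        return "UNVERIFIABLE"
    if observed == predicted:
        return "CONFIRMED"
    if observed > predicted:   # strict superset of the predicted change set
        return "SUPERSET"
    return "DIVERGENT"
```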
After deployment to production, Legible computes a Production Behavioral Delta and checks it against the Trusted Change Boundary. Changes inside are explained. Changes outside are unexplained.
Runtime behavior is always the source of truth.
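One way to picture the boundary check: each observed production delta must fall within the range the Trusted Change Boundary allows for that metric, and any metric that moved without a boundary entry is unexplained. The data shapes below are illustrative assumptions:

```python
def check_boundary(production_delta: dict, boundary: dict):
    """Check a Production Behavioral Delta against a Trusted Change Boundary.
    Assumed shapes: production_delta maps metric -> observed change;
    boundary maps metric -> (low, high) allowed range."""
    unexplained = []
    for metric, change in production_delta.items():
        low, high = boundary.get(metric, (0.0, 0.0))  # no entry: no change allowed
        if not (low <= change <= high):
            unexplained.append(metric)  # moved outside the boundary: unexplained
    return len(unexplained) == 0, unexplained
```

A metric that shifts inside its allowed range passes silently; a metric with no boundary entry at all is exactly the "changes outside are unexplained" case.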
The question isn't "is the system healthy?" — it's "did the deployment produce the changes it was supposed to, and nothing else?"
PR #847: "Add retry logic for bank timeout handling." The IIC predicts a bounded increase in retries on the bank-timeout path. In staging, retries rise as predicted, but the delta also surfaces a new outbound dependency the PR never mentioned, and that call falls outside the Trusted Change Boundary. Traditional tools would see "retries increased" and call it an anomaly. Legible catches the unexpected new dependency.
Legible works with the tools you already use. No SDKs, no agents, no new instrumentation.
Autonomous systems are shipping changes without human review. DORA data shows AI increases deployment velocity 47% — but change failure rates are rising, not falling.
More microservices means denser dependency graphs and exponential blast radius. A mid-market company loses $2M–$10M/yr to change-related outages.
Ticket-based reviews were built for a slower world. When systems change hundreds of times per day, governance must be continuous and production-aware.
Provisional U.S. patent applications covering the full deployment governance stack — from telemetry analysis through drift detection, execution eligibility, self-monitoring governance, and AI agent autonomy governance. 275+ projected claims.
Legible occupies a gap no existing vendor addresses. Not a feature of observability, testing, or CI/CD — a fundamentally new layer in the stack.
Every deployment Legible governs builds deeper production fingerprints. The longer a customer uses Legible, the more accurate it gets — making it harder to replace.
We're working with design partners in fintech and infrastructure-heavy environments. If AI is generating code that reaches your production, let's talk.
Talk to us →