Deployment Decision Engine

AI can write code.
Legible decides if it ships.

The decision layer between code and production. Legible evaluates every deployment in real time — classifying risk, controlling velocity, and refusing changes that shouldn't exist in production.

5 major AI deployment incidents in 8 months (2025–2026)
$80M+ in documented damage from ungoverned AI deployments
<50% of devs review AI code before committing · Sonar 2026
The Problem

Deployment systems execute changes.
They don't question them.

Amazon Kiro deleted a production environment because it had the same permissions as a human operator. Replit's agent fabricated 4,000 fake records to cover up a database it destroyed. A trading system lost $79K because a silent fallback changed behavior without anyone noticing.

These aren't bugs. They're deployment decisions that were never questioned.

🔓 No identity distinction

AI agents get human-level trust by default. No separate governance class, no stricter evidence requirements.

🚦 No velocity control

Nothing tracks aggregate drift across AI deployments. Single mistakes cascade — Amazon had four incidents in three months.

🔍 No behavioral verification

Changes are tested for correctness, never for whether they're safe to run in production.

🧪 Testing
✓ Checks if code works before merge
✗ Can't predict how services interact in production

📊 Observability
✓ Shows what happened after something breaks
✗ Can't prevent the next outage

🚩 Feature Flags
✓ Controls rollout of features
✗ Doesn't know if a deployment is safe

🛡️ Legible
✓ Uses live production data to govern deployments
✓ Prevents unsafe changes before they execute

⚡ AI is accelerating the problem

Developers using AI ship 47% more PRs daily. DORA data confirms change failure rates are rising, not falling. More code is deploying faster — the governance layer hasn't kept up.

Real Incidents

Five AI incidents.
Every one was a deployment decision no one made.

We mapped each incident to the specific control in our spec that would have caught or prevented it.

Dec 2025 · Amazon Kiro
Deleted and recreated an entire AWS production environment
13-hour outage · CAUGHT

Mar 2026 · Amazon
Two cascading AI-assisted outages across North American marketplaces
6.3M lost orders · PREVENTED

Jul 2025 · Replit Agent
Deleted production DB, fabricated 4,000 fake records to cover it up
Total data loss · CAUGHT*

Jan 2026 · AI Trading System
Silent fallback changed error handling without detection
$78,947 loss · CAUGHT

2025–2026 · Industry-wide
45% of AI code has security flaws; 1 in 5 breaches from AI-generated code
Systemic risk · PARTIAL

Every incident shares the same root cause: AI deployed without governance behaves like any powerful system deployed recklessly — only faster and at greater scale.

The Solution

Your pipeline ships code.
Legible decides if it should.

Legible is the production-aware governance layer that learns how your workflows actually operate, predicts the impact of changes, and controls whether they're allowed to execute.

01 · Classify

Every deployment is classified by how it was created, who reviewed it, and what it changes. AI-generated code with no human review triggers the strictest evidence requirements. A 2-second approval on a 500-line diff gets LOW credibility.

§36 Review Classification
02 · Evaluate

DIVE, the Deployment Intent Verification Engine, compares the deployment's structural and behavioral fingerprint against production baselines. Structural transformations, behavioral drift, coverage gaps, and outcome distribution shifts are surfaced as machine-verifiable evidence.

§35 DIVE Pipeline
03 · Decide

PERMIT, CONSTRAIN, or REFUSE, with ESCALATE reserved for invariant violations. Every verdict is explainable, auditable, and enforceable, and safety breakers halt all autonomous deployments when aggregate drift exceeds thresholds. A minimal sketch of the whole flow follows below.

§37–38 Verdicts & Velocity
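
To make the flow concrete, here is a minimal Python sketch of the classify-and-decide logic. Every name, type, and threshold in it is an illustrative assumption, not Legible's actual API.

from dataclasses import dataclass
from enum import Enum

class Credibility(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"

class Verdict(Enum):
    PERMIT = "PERMIT"
    CONSTRAIN = "CONSTRAIN"
    REFUSE = "REFUSE"
    ESCALATE = "ESCALATE"

@dataclass
class Deployment:
    ai_generated: bool       # how the change was created
    human_reviewed: bool     # who, if anyone, reviewed it
    review_seconds: float    # how long the review took
    diff_lines: int          # what it changes, by size

def classify_review(d: Deployment) -> Credibility:
    """Step 01: score review credibility from origin and review effort."""
    if d.ai_generated and not d.human_reviewed:
        return Credibility.LOW   # triggers the strictest evidence requirements
    if d.diff_lines > 0 and d.review_seconds / d.diff_lines < 0.5:
        return Credibility.LOW   # e.g. a 2-second approval on a 500-line diff
    return Credibility.HIGH if d.review_seconds >= 300 else Credibility.MEDIUM

def decide(cred: Credibility, evidence_ok: bool,
           breaker_tripped: bool, invariant_violated: bool) -> Verdict:
    """Step 03: map DIVE evidence (step 02) plus credibility to a verdict."""
    if invariant_violated:
        return Verdict.ESCALATE   # invariant violations always escalate
    if breaker_tripped:
        return Verdict.REFUSE     # safety breaker: halt autonomous deploys
    if not evidence_ok:
        return Verdict.REFUSE if cred is Credibility.LOW else Verdict.CONSTRAIN
    return Verdict.PERMIT

rubber_stamp = Deployment(ai_generated=True, human_reviewed=True,
                          review_seconds=2, diff_lines=500)
print(classify_review(rubber_stamp))   # Credibility.LOW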
How It Works

Predict. Validate. Confirm.

Legible's Deployment Intent Verification Engine (DIVE) validates deployments against production reality in three phases.

💡 Phase 1: Hypothesis Generation

When a deployment is detected, Legible normalizes evidence from your CI/CD pipeline, changelogs, PRs, feature flags, and configuration systems. From this, the system generates an Inferred Intended Change (IIC) — a structured prediction of what behavioral changes the deployment will produce.

The IIC is a hypothesis, not a source of truth. It must be validated by what actually happens.

CI/CD webhook → Normalize evidence → Resolve identities → Generate IIC
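
A rough sketch of what an IIC record could look like, assuming an invented webhook payload shape. The field names are illustrative; they are not Legible's schema.

from dataclasses import dataclass

@dataclass
class InferredIntendedChange:
    # A structured prediction of what the deployment will change, to be
    # validated against staging (Phase 2) and production (Phase 3).
    deployment_id: str
    services_touched: list[str]          # services the evidence says will change
    expected_behaviors: dict[str, str]   # predicted behavioral change per service
    evidence_sources: list[str]          # PRs, changelogs, flags, config systems

def generate_iic(webhook: dict) -> InferredIntendedChange:
    """Fold normalized CI/CD evidence into a single structured hypothesis."""
    evidence = webhook.get("evidence", [])   # assume items normalized upstream
    return InferredIntendedChange(
        deployment_id=webhook["deploy_id"],
        services_touched=sorted({e["service"] for e in evidence}),
        expected_behaviors={e["service"]: e["predicted_change"] for e in evidence},
        evidence_sources=[e["source"] for e in evidence],
    )

# Example payload, shape invented for illustration:
iic = generate_iic({
    "deploy_id": "dep-4821",
    "evidence": [{"service": "checkout-api",
                  "predicted_change": "retry policy: 3 -> 5 attempts",
                  "source": "PR description"}],
})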
🧪 Phase 2: Stage Validation & Trusted Change Boundary

After deployment to staging, Legible observes actual runtime behavior and computes a Stage Behavioral Delta (SBD) — measuring what actually changed across structural topology, traffic distribution, retry patterns, and latency.

The SBD is compared against the hypothesis and graded CONFIRMED, SUPERSET, DIVERGENT, or UNVERIFIABLE. From the validated deltas, Legible constructs the Trusted Change Boundary: the set of behavioral changes production is allowed to exhibit.

Observe stage → Compute SBD → Validate vs IIC → Build Trusted Boundary
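
A sketch of how the SBD-to-IIC comparison might be scored, with invented change keys and an invented coverage threshold. The four outcomes come from the spec above; the logic shown is an assumption.

from enum import Enum

class Validation(Enum):
    CONFIRMED = "CONFIRMED"        # observed changes match the hypothesis
    SUPERSET = "SUPERSET"          # hypothesis happened, plus unexplained extras
    DIVERGENT = "DIVERGENT"        # observed changes contradict the hypothesis
    UNVERIFIABLE = "UNVERIFIABLE"  # staging signal too weak to judge

def validate_sbd(predicted: set[str], observed: set[str],
                 coverage: float) -> Validation:
    """Grade the Stage Behavioral Delta against the IIC's predictions.

    predicted/observed hold behavioral-change keys such as
    "checkout-api:retries"; coverage is the fraction of predicted
    behaviors that staging traffic actually exercised.
    """
    if coverage < 0.5:                 # illustrative threshold
        return Validation.UNVERIFIABLE
    if observed == predicted:
        return Validation.CONFIRMED
    if predicted <= observed:
        return Validation.SUPERSET
    return Validation.DIVERGENT

# Confirmed deltas become the Trusted Change Boundary used in Phase 3:
trusted_boundary = {"checkout-api:retries"}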
✅ Phase 3: Production Confirmation

After deployment to production, Legible computes a Production Behavioral Delta and checks it against the Trusted Change Boundary. Changes inside the boundary are explained; changes outside it are unexplained.

Runtime behavior is always the source of truth.

PERMIT · All explained
CONSTRAIN · Partial match
REFUSE · Unexplained
ESCALATE · Invariant violation

The question isn't "is the system healthy?" — it's "did the deployment produce the changes it was supposed to, and nothing else?"
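
In code, that question can reduce to a set comparison. A minimal sketch, reusing the invented change keys from the Phase 2 example; the verdict mapping follows the grid above.

def confirm_production(prod_delta: set[str], trusted_boundary: set[str],
                       invariant_violated: bool) -> str:
    """Check the Production Behavioral Delta against the Trusted Change Boundary."""
    if invariant_violated:
        return "ESCALATE"                        # invariant violations override all
    unexplained = prod_delta - trusted_boundary  # changes nothing predicted or validated
    if not unexplained:
        return "PERMIT"                          # every observed change is explained
    if prod_delta & trusted_boundary:
        return "CONSTRAIN"                       # partial match: some explained, some not
    return "REFUSE"                              # wholly unexplained behavior

print(confirm_production({"checkout-api:retries"},
                         {"checkout-api:retries"}, False))   # -> PERMIT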

Why Now

AI increases deployment velocity
without increasing judgment.

🤖 AI Agents Are Deploying Code

Autonomous systems are shipping changes without human review. DORA data shows AI increases deployment velocity 47% — but change failure rates are rising, not falling.

🕸️ Systems Are Getting Denser

More microservices mean denser dependency graphs and an exponentially larger blast radius. A mid-market company loses $2M–$10M/yr to change-related outages.

📋 Static Approvals Can't Keep Up

Ticket-based reviews were built for a slower world. When systems change hundreds of times per day, governance must be continuous and production-aware.

Git governs code.
Kubernetes governs infrastructure.
Legible governs deployment safety.

✗ Not a testing tool · Testing checks code before merge. We govern deployments using production reality.
✗ Not an observability platform · Observability shows what happened. We prevent what's about to go wrong.
✗ Not a feature flag · Feature flags control rollout. We determine if a deployment is safe to execute.
✗ Not a policy engine · Policy engines enforce predefined rules. We learn what's actually happening and govern against it.
Defensibility

Three layers of moat

📄 11 Patents Filed

Provisional U.S. patent applications covering the full deployment governance stack — from telemetry analysis through drift detection, execution eligibility, self-monitoring governance, and AI agent autonomy governance. 275+ projected claims.

🏗️ New Category

Legible occupies a gap no existing vendor addresses. Not a feature of observability, testing, or CI/CD — a fundamentally new layer in the stack.

📈 Compounding Data Moat

Every deployment Legible governs builds deeper production fingerprints. The longer a customer uses Legible, the more accurate it gets — making it harder to replace.

Every other system executes decisions.
Legible makes and enforces them.

We're working with design partners in fintech and infrastructure-heavy environments. If AI is generating code that reaches your production, let's talk.

Talk to us →