Failure Signatures

Failure Patterns

Recurring failure modes mined from failed evals across runs. Turn failed traces into regression tests.

Unsupported root-cause claims

sig_unsupported_causal

high
Affected runs: 0/12
Confidence: 0%
Trend: Rising
Signature confidence: 87%

Root cause
The agent makes customer-facing causal claims after checking tickets/docs but before verifying deployment history or runtime telemetry.

Related evals: groundedness, required tool
Suggested fix · Add a mandatory evidence-verification step before any customer-facing root-cause statement.

Missing deployment & telemetry verification

sig_missing_telemetry

high
Affected runs: 0/12
Confidence: 0%
Trend: Steady
Signature confidence: 81%

Root cause
Required verification tools are skipped for customer-facing root-cause missions.

Related evals: required tool
Suggested fix · Gate the final-answer step on deployment-history and runtime-telemetry checks.
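
One way to picture the suggested gate is a check that refuses to release a final answer until both verification tools appear in the run's tool-call history. The sketch below is illustrative only: the tool identifiers `deployment_history` and `runtime_telemetry`, the `Trace` type, and the function names are assumptions, not names taken from the system described here.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical tool identifiers: the source names the checks, not the tool ids.
REQUIRED_TOOLS = frozenset({"deployment_history", "runtime_telemetry"})


@dataclass
class Trace:
    """Minimal stand-in for an agent run: the names of the tools called so far."""
    tool_calls: list[str] = field(default_factory=list)


def missing_verification(trace: Trace, required: frozenset[str] = REQUIRED_TOOLS) -> set[str]:
    """Return the required tools that have not been called in this trace."""
    return set(required) - set(trace.tool_calls)


def gate_final_answer(trace: Trace, draft: str) -> str:
    """Release the draft only when every required check has run; otherwise raise."""
    missing = missing_verification(trace)
    if missing:
        raise PermissionError(
            "final answer blocked; missing checks: " + ", ".join(sorted(missing))
        )
    return draft
```

Raising an error rather than silently dropping the draft makes a skipped check show up as an explicit failure in the trace, which is easier to turn into a regression case.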

Inefficient loops / elevated cost

sig_inefficient_loop

medium
Affected runs: 0/12
Confidence: 0%
Trend: Steady
Signature confidence: 64%

Root cause
Repeated tool calls and retries inflate latency and cost without improving the answer.

Related evals: usefulness
Suggested fix · Add a Cost & Latency Doctor to detect repeated tool calls and short-circuit redundant retrieval.
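
As a rough illustration of what such a detector might do, the sketch below counts identical tool calls in a trace and wraps a tool runner so that a repeated call reuses the first result. Everything here is hypothetical: the `(tool, args)` call format, `find_repeated_calls`, and `CachedToolRunner` are assumed names, and the source does not describe how the Cost & Latency Doctor is actually built.

```python
from __future__ import annotations

import json
from collections import Counter
from typing import Any, Callable


def call_key(tool: str, args: dict) -> str:
    """Canonical key so identical calls compare equal regardless of argument order."""
    return tool + ":" + json.dumps(args, sort_keys=True)


def find_repeated_calls(calls: list[tuple[str, dict]], max_repeats: int = 1) -> dict[str, int]:
    """Return each call key that appears more than max_repeats times, with its count."""
    counts = Counter(call_key(tool, args) for tool, args in calls)
    return {key: n for key, n in counts.items() if n > max_repeats}


class CachedToolRunner:
    """Short-circuits redundant retrieval by reusing the first result of an identical call."""

    def __init__(self, run_tool: Callable[[str, dict], Any]):
        self._run_tool = run_tool
        self._cache: dict[str, Any] = {}
        self.saved_calls = 0  # number of calls answered from the cache

    def __call__(self, tool: str, args: dict) -> Any:
        key = call_key(tool, args)
        if key in self._cache:
            self.saved_calls += 1
            return self._cache[key]
        result = self._run_tool(tool, args)
        self._cache[key] = result
        return result
```

Caching like this is only safe for read-only tools whose results do not change within a run; a retry after a transient error should bypass the cache.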

Regression memory · quality audit

Failed runs are stored as long-term regression lessons. The Memory Quality Auditor flags lessons that are stale, low-confidence, or contradicted by an approved fix.
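
The three flags described above can be sketched as a small rule check. This is a hypothetical sketch, not the auditor's actual logic: the `Lesson` fields, the 0.75 confidence threshold, and the 90-day staleness window are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Lesson:
    """Hypothetical shape of a stored regression lesson."""
    name: str
    confidence: float                  # 0.0 to 1.0
    last_seen: date                    # last run that reproduced the failure
    contradicted_by_fix: bool = False  # an approved fix invalidates the lesson


def audit(
    lesson: Lesson,
    today: date,
    min_confidence: float = 0.75,  # assumed threshold, not from the source
    max_age_days: int = 90,        # assumed staleness window, not from the source
) -> list[str]:
    """Return the quality flags that apply; an empty list means the lesson is healthy."""
    flags = []
    if today - lesson.last_seen > timedelta(days=max_age_days):
        flags.append("stale")
    if lesson.confidence < min_confidence:
        flags.append("low_confidence")
    if lesson.contradicted_by_fix:
        flags.append("contradicted")
    return flags
```

Under these assumed thresholds, a recently seen lesson at 0.87 confidence returns no flags, while one at 0.64 is flagged `low_confidence`.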

Unsupported root-cause claims · Healthy

Active, high-confidence lesson backing a regression case.

conf 87% · 8 runs

Missing deployment & telemetry verification · Healthy

Active, high-confidence lesson backing a regression case.

conf 81% · 4 runs

Inefficient loops / elevated cost · Low confidence

Confidence is 64%; gather more evidence before acting on this lesson.

conf 64% · 2 runs