All runs7.6s · 2 tools
35
score
Acme checkout API escalation
failedrun_cust_acme_uc_0_180 · escalation-agent@1.0.0 · 22d ago · source mock
“Investigate why Acme Corp's checkout API started failing this morning and draft a customer-safe update.”
Latency
7.6s
Cost
$0.11
Tokens
4,900
Tool calls
2
Execution signal path
3 steps · 2 missing
- Never called — Check deployment historyRequired before a customer-facing root-cause claim. This gap is the failure.Never called — Check runtime telemetryRequired before a customer-facing root-cause claim. This gap is the failure.
Ask Autopilot
Autonomous, evidence-backed diagnosis
This run failed its evals. Ask Autopilot to diagnose the root cause from the trace and evidence.
Final customer-facing response
The outage was caused by a recent deployment; we are rolling it back now.
⚠ Contains 1 unsupported causal claim.
Eval scores
Groundedness28%
Required tool usage0%
Approval policy100%
Usefulness60%
Cost & Latency Doctor
Efficient — no waste detected.