All runs
95
score

Initech billing outage escalation

passed

run_cust_initech_ok_3500 · escalation-agent@1.0.0 · 24d ago · source mock

Investigate Initech's billing outage and draft a customer-safe status update.

Latency
9.7s
Cost
$0.14
Tokens
5,950
Tool calls
5
Execution signal path
3 steps · 2 missing
9.7s · 5 tools
  1. Never called — Check deployment history
    Required before a customer-facing root-cause claim. This gap is the failure.
    Never called — Check runtime telemetry
    Required before a customer-facing root-cause claim. This gap is the failure.
Ask Autopilot
Autonomous, evidence-backed diagnosis

This run passed. You can still inspect Autopilot's reasoning.

Final customer-facing response

Verified update: telemetry and deployment history reviewed; no unsupported root-cause claim made.

Eval scores
Groundedness91%
Required tool usage100%
Approval policy100%
Usefulness88%
Cost & Latency Doctor
Efficient — no waste detected.
AgentOps Autopilot — reliability for AI agents