03 RUNS

runs / v2-smoke-20260425T160855Z

openai · completed · v2 smoke

GPT-5 Nano

openai/gpt-5-nano
Headline
90.8/100
CI [84.7, 100.0]
Cost
$0.0000
0 judge calls
Cases
11/12 ≥ 70%
0 failing
Per-band breakdown
  • MEDIUM: 91.7 (n=6 · CI [75.0, 100.0])
  • TRIVIAL: 100.0 (n=1 · CI [100.0, 100.0])
  • EXPERT: 85.3 (n=1 · CI [85.3, 85.3])
  • EASY: 100.0 (n=4 · CI [100.0, 100.0])
[Figure: Cluster radar]
Latency & throughput
  • Total elapsed
  • Avg judge latency
  • Judge calls: 0
  • Total spend: $0.0000
  • Completed: Sat, 25 Apr 2026 16:08:55 GMT