runs / run_claude_opus_4_7_8240
anthropic · completed · gs_general_v3
Claude Opus 4.7
claude-opus-4-7
Overall
73.4/100
Cost
$37.30
Failures
8 cases
Dimensions passing
6/10 (≥ 0.70)
P95 latency
1.4s
Tokens
248k i/o
Dimension radar
Golden set breakdown
- Happy path: 312 / 324 (96%)
- Adversarial: 141 / 186 (76%)
- Edge cases: 78 / 92 (85%)
- RAG grounding: 131 / 148 (89%)
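
The percentages in the breakdown above appear to be each category's passed count divided by its total, rounded to the nearest whole number. A minimal sketch of that arithmetic follows, assuming plain rounding; the names `breakdown` and `pass_rate` are illustrative and not part of the dashboard.

```python
# Passed / total counts copied from the golden set breakdown.
breakdown = {
    "Happy path": (312, 324),
    "Adversarial": (141, 186),
    "Edge cases": (78, 92),
    "RAG grounding": (131, 148),
}


def pass_rate(passed: int, total: int) -> int:
    """Return the pass rate as a whole-number percentage (assumed rounding)."""
    return round(100 * passed / total)


# Recompute the rate for every category.
rates = {name: pass_rate(p, t) for name, (p, t) in breakdown.items()}

for name, rate in rates.items():
    passed, total = breakdown[name]
    print(f"{name}: {passed} / {total} = {rate}%")
```

Under that assumption the recomputed rates match the displayed ones: 96%, 76%, 85%, and 89%.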
Q 0.976 · LAT p50 1.2s · p95 2.7s · p99 5.1s · JUDGE gpt-4.1-mini · Q-DEPTH 0 · $/EVAL $0.00014