Arena · Live Evaluation

LLM Performance Battle Arena

Compare and benchmark large language models in real time. Track latency, accuracy, and other performance metrics across diverse tasks.

Live Metrics

Real-time performance data from active model evaluations

Active Models: 12

Battles Today: 3,847

Avg Latency: 1.2s

Win Rate: 87.3%

GPT‑4O · CLAUDE 3.5 SONNET · LLAMA 3 70B · MISTRAL LARGE · GEMINI 1.5 PRO · GROK‑2