Signal // 6 sources

What I'm reading

Not a firehose. The trackers and analysis I actually trust, each with a one-line reason it earns the tab. Hand-curated, updated when something changes my mind.

2026-06-28
Artificial Analysis — independent model intelligence index artificialanalysis.ai
The independent numbers I sanity-check my own ranking against. When we disagree, one of us is measuring the wrong thing.
2026-06-20
LMArena — human-preference battle leaderboard lmarena.ai
Crowd preference, not capability. Useful for vibes and formatting; misleading if you read it as raw intelligence.
2026-06-10
SWE-bench — the software-engineering benchmark swebench.com
Still the closest thing to a real job interview for a coding model. Watch the Verified split and ignore the marketing numbers.
2026-05-30
Epoch AI — trends in compute, data, and capability epoch.ai
For the long arc rather than the launch-day noise. Good antidote to release hype.
2026-05-18
LiveCodeBench — contamination-resistant coding eval livecodebench.github.io
Because it pulls fresh contest problems, a high score here is harder to fake with memorised training data.
2026-05-02
LLM-Stats — broad model + pricing tracker llm-stats.com
The widest net for specs and prices. I cross-reference it, then form my own opinion about what the numbers mean.