Aide (aide.dev)

Blog

Writing about AI agents, inference scaling, and software engineering.

2024-12-13 / Sandeep Kumar Pani
SOTA on swebench-verified: (re)learning the bitter lesson
Resolving 62.2% of issues by scaling test-time inference with Sonnet 3.5, and re-learning that general methods that leverage computation win.