
Someone at OpenAI screwed up the SWE-bench graph. The o3 and GPT-4o bars are the same height but are labeled with different values.




The graph is more screwed up than that: the split bar is also split in a nonsensical way.

It feels a bit intentional.



