Hacker News
hnuser123456 82 days ago | on: GPT-5
Haha, even with that, it says 4o does worse with 2 passes than with 1.
Edit: Never mind, I just now noticed the first one is SWE-bench and the 2nd is aider.
croemer 82 days ago
Those are different benchmarks
hnuser123456 82 days ago
I see now on the website: the screenshot cut off the header for the first benchmark, so it looked like it was just comparing 1-pass and 2-pass.
croemer 82 days ago
Yes, sorry, I couldn't fit everything in the screenshot.