
There hasn't been much change in models from 6 months ago.

I made the same claim in a widely-circulated piece a month or so back, and have come to believe it was wildly false, the dumbest thing I said in that piece.






I have my own test to measure performance: https://omarabid.com/gpt3-now

So far, the only model that has shown significant advancement and differentiation is GPT-4.5. I'd advise looking at the problem and reading GPT-4.5's answer. It shows the difference from other "normal models" (including GPT-3.5), as it demonstrates a considerable level of understanding.

Other normal models are now more chatty and have a bit more data. But they do not show increased intelligence.


I was able to have Opus 4 one-shot it. Happy to share a screenshot if that wasn't your experience.

Interested to see your Opus 4 one-shot. I tried it very recently on Opus 4 and it burbled nonsense.

Sorry for the delay, I'm out for the weekend. I'll get it to you tomorrow!


