*There hasn't been much change in models from 6 months ago.* I made the same cla...

csomar · 2025-07-04T06:12:18 1751609538

I have my own test to measure performance: https://omarabid.com/gpt3-now

So far, the only model that has shown significant advancement and differentiation is GPT-4.5. I'd advise looking at the problem and reading GPT-4.5's answer. It shows the difference from other "normal models" (including GPT-3.5), as it demonstrates a considerable level of understanding.

Other normal models are now more chatty and have a bit more data. But they do not show increased intelligence.

Karrot_Kream · 2025-07-04T07:38:03 1751614683

I was able to have Opus 4 one-shot it. Happy to share a screenshot if that wasn't your experience.

csomar · 2025-07-04T11:22:18 1751628138

I'd be interested to see your Opus 4 one-shot. I tried it very recently on Opus 4 and it burbled nonsense.

Karrot_Kream · 2025-07-06T08:30:00 1751790600

Sorry for the delay, I'm out for the weekend. I'll get it to you tomorrow!