Hacker News

Because it's a model trained with roughly 100x the compute of GPT-4.

GPT-5.5 will be a 10x compute jump.

GPT-4.5 was 10x over GPT-4.



Even worse optics. They scaled the training compute by 100x and got <1% improvement on several benchmarks.


It's almost as if there's a documented limit to how much you can squeeze out of autoregressive transformers by throwing compute at them.
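The diminishing returns being described can be sketched numerically. A minimal sketch, assuming a Chinchilla-style power law loss(C) = a * C^(-b); the constants a and b below are illustrative placeholders, not fitted values from any published scaling study:

```python
# Illustrative power-law scaling: loss falls slowly as compute grows.
# a and b are hypothetical constants chosen for demonstration only.

def loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
    """Hypothetical loss as a power-law function of training compute."""
    return a * compute ** (-b)

base = loss(1.0)
scaled = loss(100.0)  # 100x more training compute
improvement = (base - scaled) / base
print(f"relative loss reduction at 100x compute: {improvement:.1%}")
```

With these made-up constants, a 100x compute increase buys only about a 20% reduction in loss, and each further 10x buys proportionally less, which is the shape of the argument being made, even though the exact benchmark-to-loss mapping is far murkier than this.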


Is the 1% relative to more recent models like o3, or to the (old and obsolete at this point) GPT-4?


It was relative to the number in the comment I replied to. I would assume GPT-5 is nowhere near 100x the parameters of o3. My point is: if this release isn't notable for parameter count, nor (importantly) for performance, what is it notable for? I guess it unifies the thinking and non-thinking models, but that's more of a product improvement than a model improvement.



