Hacker News

Because it's a model trained with roughly 100x the compute of GPT-4.

GPT-5.5 will be a 10x compute jump.

GPT-4.5 was 10x over GPT-4.



Even worse optics. They scaled the training compute by 100x and got <1% improvement on several benchmarks.


It's almost as if there's a documented limit to how much you can squeeze out of autoregressive transformers by throwing compute at them.
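The diminishing returns being described can be sketched numerically. A minimal sketch, assuming a Chinchilla-style power law loss(C) = a * C^(-b); the constants a and b below are illustrative placeholders, not fitted values from any published scaling study:

```python
# Illustrative power-law scaling: loss falls slowly as compute grows.
# a and b are hypothetical constants chosen for demonstration only.

def loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
    """Hypothetical loss as a power-law function of training compute."""
    return a * compute ** (-b)

base = loss(1.0)
scaled = loss(100.0)  # 100x more training compute
improvement = (base - scaled) / base
print(f"relative loss reduction at 100x compute: {improvement:.1%}")
```

With these made-up constants, a 100x compute increase buys only about a 20% reduction in loss, and each further 10x buys proportionally less, which is the shape of the argument being made, even though the exact benchmark-to-loss mapping is far murkier than this.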


Is the 1% relative to more recent models like o3, or to the (old and obsolete at this point) GPT-4?


It was relative to the number in the comment I replied to. I would assume GPT-5 is nowhere near 100x the parameters of o3. My point is: if this release isn't notable for parameter count, nor (importantly) for performance, what is it notable for? I guess it unifies the thinking and non-thinking models, but that's more of a product improvement than a model improvement.



