> GPT‑5 is a unified system . . .

OK

> . . . with a smart and fast model that answers most questions, a deeper reasoning model for harder problems, and a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, and explicit intent (for example, if you say “think hard about this” in the prompt).
So that's not really a unified system, then; it's just supposed to appear as if it is.
This looks like they're not training a single big model but have instead gone off to develop specialized sub-models and are attempting to gloss over them with yet another model. That's what you resort to only when end-to-end training has become too expensive for you.
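To make the "router" part concrete: as described, it amounts to a classifier sitting in front of two endpoints. Here's a minimal sketch of that idea, purely for illustration; the model names, thresholds, and the needs_reasoning heuristic are my own assumptions, and the real router is presumably a learned model rather than hand-written rules.

```python
# Hypothetical sketch of a prompt router: inspect the request, then
# dispatch to a fast model or a deeper reasoning model. Everything
# here is illustrative, not OpenAI's actual implementation.

def needs_reasoning(prompt: str) -> bool:
    # Crude stand-ins for "complexity, tool needs, and explicit intent".
    explicit = any(cue in prompt.lower() for cue in ("think hard", "step by step"))
    looks_complex = len(prompt) > 2000 or "prove" in prompt.lower()
    return explicit or looks_complex

def route(prompt: str) -> str:
    # In practice the router would itself be a model; here it's a rule.
    return "reasoning-model" if needs_reasoning(prompt) else "fast-model"

print(route("What's the capital of France?"))    # fast-model
print(route("Think hard about this proof ..."))  # reasoning-model
```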
I know this is just arguing semantics, but wouldn't you call it a unified system since it has a single interface that automatically interacts with different components? It's not a unified model, but it seems correct to call it a unified system.
Altman et al. have been saying that the many-model interface in ChatGPT is confusing to users and that they want to move to a unified system that exposes one model which routes based on the task, rather than depending on users understanding how and when to do that themselves. Presumably this is what they've been discussing for some time. I don't know that it was intended to mean they would be working toward some unified inference architecture and model, although I'm sure goalposts will be moved to ensure it's insufficient.
He's the boss of the researchers so he knows more than them /s
But seriously tho, what the parent is saying isn't a deep insight: it makes sense from a business perspective to consolidate your products into one so you don't confuse users.
So OpenAI is in the business of GPT wrappers now? I'm guessing their open model is an escape hatch for those who wanted a "plain" model, though from my systematic testing it's not much better than Kimi K2.
The API lets you directly choose the model you want. Automatic thinking is a ChatGPT feature, since ChatGPT has always been a “GPT wrapper” in that sense.
> While GPT‑5 in ChatGPT is a system of reasoning, non-reasoning, and router models, GPT‑5 in the API platform is the reasoning model that powers maximum performance in ChatGPT. Notably, GPT‑5 with minimal reasoning is a different model than the non-reasoning model in ChatGPT, and is better tuned for developers. The non-reasoning model used in ChatGPT is available as gpt-5-chat-latest.
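For anyone who wants to pin this down themselves, the API exposes these as separate choices. A rough sketch, assuming the current OpenAI Python SDK; the prompts and parameter values are just to illustrate the distinction the docs draw:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The reasoning model, dialed down: gpt-5 with minimal reasoning effort.
resp = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "minimal"},
    input="Summarize this changelog in two sentences: ...",
)
print(resp.output_text)

# The non-reasoning chat model used in ChatGPT, exposed separately.
chat = client.chat.completions.create(
    model="gpt-5-chat-latest",
    messages=[{"role": "user", "content": "Summarize this changelog in two sentences: ..."}],
)
print(chat.choices[0].message.content)
```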
Too expensive maybe, or just not effective anymore, as they've used up the available training data. New data is generated slowly and is massively poisoned with AI-generated content, so it might be useless.
That's a lie people repeat because they want it to be true.
People evaluate dataset quality over time. There's no evidence that datasets from 2022 onwards perform any worse than ones from before 2022. There is some weak evidence of an opposite effect, causes unknown.
It's easy to make "model collapse" happen in lab conditions - but in real world circumstances, it fails to materialize.
> This looks like they're not training a single big model but have instead gone off to develop specialized sub-models and are attempting to gloss over them with yet another model. That's what you resort to only when end-to-end training has become too expensive for you.
The corollary to the bitter lesson strikes again: any hand-crafted system will outperform any general system for the same budget, by a wide margin.
The bitter lesson doesn't say that you can't split your solution into multiple models. It says that learning from more data via scaled compute will outperform humans injecting their own assumptions about the task into models.
A broad generalization like "there are two systems of thinking: fast and slow" doesn't necessarily fall into this category. The transformer itself (plus the choice of positional encoding etc.) contains inductive biases about modeling sequences. The router is presumably still learned with a fairly generic architecture.
Sure, all of machine learning involves making assumptions. The bitter lesson in a practical sense is about minimizing these assumptions, particularly those that pertain to human knowledge about how to perform a specific task.
I don't agree with your interpretation of the lesson if you say it means to make no assumptions. You can try to model language with just a massive fully connected network to be maximally flexible, and you'll find that you fail. The art of applying the lesson is separating your assumptions that come from "expert knowledge" about the task from assumptions that match the most general structure of the problem.
"Time spent thinking" is a fundamental property of any system that thinks. To separate this into two modes: low and high, is not necessarily too strong of an assumption in my opinion.
I completely agree with you regarding many specialized sub-models where the distinction is arbitrary and informed by human knowledge about particular problems.
so many people at my work need it to just switch. they just leave it on 4o. you can still set the model yourself if you want, but this will for sure improve the quality of output for my non-technical workmates who are confused by model selection.
I'm a technical person who has yet to invest the time in learning proper model selection, too. This will be good for all users who don't bring AI to the forefront of their attention and simply use it as a tool.
I say that as a VIM user who has been learning VIM commands for decades. I understand more than most how important it is to invest in one's tools. But I also understand that only so much time can be invested in sharpening the tools when we have actual work to do with them. Using the LLMs as a fancy autocomplete, but leaving the architecture up to my own NS (natural stupidity), has shown the default models to be more than adequate for my needs.
> The ultimate reason for this is Moore's law, or rather its generalization of continued exponentially falling cost per unit of computation
Is it though? To me it seems like performance gains are slowing down and additional computation in AI comes mostly from insane amounts of money thrown at it.
Yes, a custom hand-crafted model will always outperform a general statistical model when given the same compute budget. Given that we've basically saturated the power grid at this point, we may have to do the unthinkable and start thinking again.
We already did this for object/face recognition; it works, but it's not the way to go. It's the way to go only if you don't have enough compute power (and data, I suspect) for an end-to-end network.
No, it's what you do if your model architecture is capped out on its ability to profit from further training. Hand-wrapping a bunch of sub-models stands in for models that can learn that kind of substructure directly.
You could train that architecture end-to-end, though. You just have to run both models and backprop through both of them in training. Sort of like a mixture of experts, but with two very different experts.
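Mechanically that's straightforward as long as the routing decision stays differentiable. A toy sketch in PyTorch, with two made-up "experts" and a learned gate, just to show that gradients reach the router and both experts in a single training pass; none of this reflects how GPT-5 is actually built.

```python
import torch
import torch.nn as nn

class TwoExpertRouter(nn.Module):
    """Toy end-to-end trainable router over two very different experts."""

    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.fast = nn.Linear(d_in, d_out)            # stand-in "fast" expert
        self.deep = nn.Sequential(                    # stand-in "reasoning" expert
            nn.Linear(d_in, 4 * d_in), nn.ReLU(), nn.Linear(4 * d_in, d_out)
        )
        self.gate = nn.Linear(d_in, 2)                # learned router

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Soft routing: a weighted mix of both experts, so backprop flows
        # through the gate and both experts. (At inference you could
        # harden this to an argmax over the gate.)
        w = torch.softmax(self.gate(x), dim=-1)       # shape (batch, 2)
        return w[:, :1] * self.fast(x) + w[:, 1:] * self.deep(x)

model = TwoExpertRouter(d_in=16, d_out=4)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(8, 16), torch.randn(8, 4)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()   # gradients flow through the router and both experts
opt.step()
```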
I do agree that the current evolution is moving further and further away from AGI and more toward a spectrum of niche specialisation.
It feels less and less likely that AGI is even possible with the data we have available. The one unknown is quantum computing: if we manage to get usable quantum computers, I am curious what that will do to AI.