> It is frequently suggested that once one of the AI companies reaches an AGI threshold, they will take off ahead of the rest.
This seems to be a result of using overly simplistic models of progress. After a company makes a breakthrough, the next breakthrough requires exploring many more paths, and it is much easier for competitors to catch up than to find a breakthrough. Even if you get lucky and find the next breakthrough before everyone catches up, they will probably catch up before you find the one after that. A runaway only happens if, after each breakthrough, finding the next one is easier than it is for everyone else to catch up.
Consider the following game:
1. N parties take turns rolling a D20. Anyone who rolls a 20 gets 1 point.
2. Any party that is 1 or more points behind only needs to roll a 19 or higher to get a point. That is, being behind gives you a slight advantage in catching up.
Although points keep accumulating, the players end up with nearly identical scores.
I ran a simulation of this game for 10,000 turns with 5 players:

Game 1: [852, 851, 851, 851, 851]
Game 2: [827, 825, 827, 826, 826]
Game 3: [827, 822, 827, 827, 826]
Game 4: [864, 863, 860, 863, 863]
Game 5: [831, 828, 836, 833, 834]
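For reference, here is a minimal Python sketch of that simulation. It is my reconstruction, not the original code, and it assumes the standings update after every individual roll:

```python
import random

def play(num_players=5, turns=10_000):
    """Simulate the catch-up game: a 20 scores a point, but any
    player behind the current leader scores on a 19 or 20 instead."""
    scores = [0] * num_players
    for _ in range(turns):
        for i in range(num_players):
            # Players behind the leader get the easier threshold.
            threshold = 19 if scores[i] < max(scores) else 20
            if random.randint(1, 20) >= threshold:
                scores[i] += 1
    return scores

for game in range(1, 6):
    print(f"Game {game}: {play()}")
```

The catch-up rule is mild (19 versus 20), yet it is enough to keep all five scores within a few points of each other.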
Supposedly the idea was that once you get closer to AGI, it starts exploring these breakthrough paths for you, providing a positive feedback loop. Hence the expected exponential explosion in power.
But yes, so far it feels like we are in the latter stages of the innovation S-curve for transformer-based architectures. The exponential may be out there, but reaching it probably requires jumping onto a new S-curve.
> Supposedly the idea was that once you get closer to AGI, it starts exploring these breakthrough paths for you, providing a positive feedback loop.
I think it does let you start exploring the paths faster, but the search space you need to cover grows even faster. You can do research twice as fast, but you need to do ten times as much research, and your competition can quickly catch up because they know which path works.
Basically, what we have done over the last few years is notice the neural scaling laws and drive them to their logical conclusion. Those laws are power laws, which are not quite as bad as logarithmic laws, but you would still expect most of the big gains early on, followed by diminishing returns.
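To make the diminishing returns concrete, here is a small illustration; the exponent of 0.05 is an assumed, illustrative magnitude, not a measured value:

```python
# Illustrative power-law scaling: loss ~ compute^(-0.05).
# The exponent is assumed for illustration, not taken from any paper.
for compute in [1, 10, 100, 1_000, 10_000]:
    loss = compute ** -0.05
    print(f"compute {compute:>6}x  ->  loss {loss:.3f}")
```

Each 10x in compute multiplies the loss by the same constant factor (about 0.89 here), so every additional order of magnitude buys less absolute improvement than the last.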
Barring a grey-swan event of groundbreaking algorithmic innovation, I don't see how we get out of this. I suppose some of those diminishing returns could still be large enough to bridge the gap to an AI that can meaningfully recursively improve itself, but I personally don't see it.
At the moment, I would say everything is progressing exactly as expected and will continue to do so until it doesn't. Whether or when that happens is not predictable.
Do you consider GPT itself and reasoning models to be two grey swan events? I expect another of similar magnitude within two years for sure. I think we are searching for such ideas more efficiently than before, with more compute and funding.
I would say GPT itself is less an event and more the culmination of decades of research and development in algorithms, hardware, and software. Of course, to some degree, this is true for any novel development. In this case, the convergence of advances in GPUs, software that utilizes them well while allowing very high levels of abstraction, and algorithms that can scale is something I'm not sure we will see again so quickly. All this preexisting research is a kind of resource that will be completely exploited at some point. And then the only thing that can drive you forward is truly novel ideas. Reasoning models were a fairly obvious next step too, as the concepts of System 1 and System 2 thinking have been around for a while.
You are completely right that the compute and funding right now are unprecedented. I don't feel confident making any predictions.
You are forgetting that we are talking about AI. That AI will be used to speed up progress on making the next, better AI, which will be used to speed up progress on making the next, better AI, and so on.
Consider the research work required for five breakthroughs in series: 1, 2, 16, 8, 128, where each breakthrough doubles your research power.
If you start at a research rate of 1, you get the first breakthrough after 1/1 = 1 year. Then you get the second breakthrough after 2/2 = 1 year, the third after 16/4 = 4 years, the fourth after 8/8 = 1 year, and the fifth after 128/16 = 8 years.
If it only takes one year for a competitor to learn your breakthrough, they can catch up despite the fact that your research rate is doubling after every breakthrough.
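A quick sketch of that arithmetic, using the same assumed costs and doubling rule as above:

```python
# Toy model: breakthrough costs as given above, research rate
# doubling after each breakthrough is achieved.
costs = [1, 2, 16, 8, 128]   # work required per breakthrough
rate = 1                     # research done per year
elapsed = 0
for i, cost in enumerate(costs, 1):
    years = cost / rate
    elapsed += years
    print(f"breakthrough {i}: {cost}/{rate} = {years:g} year(s), total {elapsed:g}")
    rate *= 2
```

The gaps between breakthroughs come out to 1, 1, 4, 1, and 8 years, so a competitor who copies each breakthrough in one year is never left behind for long.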