tibbar's comments | Hacker News

This is mind-bending. I can't imagine that it performs well enough to be particularly fit for production just yet, but... wow.

This is the real reason people are switching to Claude Code.

It's interesting to consider the counterfactual: what do large projects look like when they're run poorly? The answer often includes:

* Poorly defined goals / definition of success

* Overly-complex plans, slowly executed against

* A focus on issues that aren't the real bottleneck

* Large cost and time overruns

* Project is eventually cancelled

I've had the interesting experience of watching the same type of "transformation" project run twice at similar companies. In the first case, the project was bogged down to the extent that I genuinely updated to believe it wasn't possible to achieve. In the second case, I saw incredible progress / pace with a much smaller team, pushing on all the key points with the right planning, and learned some lessons I wish I'd known on take 1.


"Overly-complex plans, slowly executed against"

lol this usually happens when those leading the project have no vision and aren't ruthless about achieving a well-defined outcome state.

Having a vision is underrated and very rare to find in people. Many people pretend, or wish, they had 'it'.


GitHub used to publish some pretty interesting postmortems. Maybe they still do. IIRC they were struggling with scaling their SQL DB and were starting to hit its limits. It's a tough position to be in, because you either have to do a massive migration to a data layer with much different semantics, or you have to keep desperately squeezing out performance and skirting the edge of outages with a DB that wasn't really meant to handle what you're doing with it now. The OpenAI blog post on "scaling" Postgres to their current scale has much the same flavor, although I think they're doing it better than GitHub appears to be doing.

I’d be surprised by this: GitHub pretty famously used Vitess, and I doubt each shard is too big for modern hardware. Based on previous reporting [0], they’re running out of space in the main data center, and new management is determined to move to Azure in a hurry. I’d bet that these outages are a combination of a worsening capacity crunch in the old data center and…well, Azure.

[0]: https://thenewstack.io/github-will-prioritize-migrating-to-a...
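To make the sharding discussion concrete, here is a minimal Python sketch of hash-based shard routing, the general idea behind a Vitess "hash" vindex. The shard count, the `repo_id` key, and the md5 hash are illustrative assumptions on my part; Vitess's actual vindex computes keyspace IDs with its own scheme.

```python
import hashlib

NUM_SHARDS = 16  # assumed shard count, purely for illustration

def shard_for(repo_id: int) -> int:
    """Route a sharding key to a shard by hashing it.

    Vitess's real 'hash' vindex maps keys to keyspace IDs with its
    own scheme; md5 here just illustrates the routing principle.
    """
    digest = hashlib.md5(str(repo_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# A query that carries the sharding key can be sent to exactly one shard:
#   SELECT * FROM repositories WHERE repo_id = 42
print(shard_for(42))  # -> a stable shard number for repo 42
```

The squeeze described above tends to come from queries that don't carry the key (cross-shard joins, global secondary indexes), which is why "just shard it" is never the whole story.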


Here’s a post from 2021 about the migration! [0]

I guess 2021 is a long time ago now. How did that happen…

[0] https://github.blog/engineering/infrastructure/partitioning-...


I think most large platforms eventually split the tools out, because you can indeed get MUCH better CI/CD, ticket management, documentation, etc. from dedicated platforms for each. However, when you're just starting out, the cognitive overhead and cost of signing up for and connecting multiple services is a lot higher than using all the tools bundled (initially for free) with your repo.

There's lots of dedicated CI/CD out there that works well. CircleCI has worked for me.

You would kind of expect that, with the pressure of supporting OpenAI and GitHub etc., Azure would have been whipped into shape by now.

AZDO (Azure DevOps) has been in keep-the-lights-on maintenance mode for years.

I always felt that AZDO is basically TFS rebranded, so yeah, not much action since they killed SourceSafe.

GitHub isn't VC funded at the moment, though. It's owned by Microsoft. Not that this necessarily changes your point.

> Of course it's going to change for the worse

> It's owned by Microsoft.

I see no contradictions here.


I mean, yeah, probably, but also OpenAI literally can't afford to give this away for free. They are losing a lot of money. Open-source AI will continue to be a thing, and they will have to compete to give you something better than what you can do yourself.

OpenAI is far from the stage of "grinding out more and more profits for investors." It's more like the stage of "most serious observers doubt that it can continue as a going concern."


Wait, you're completely skipping the emergence of reasoning models, though? 4.5 was slower and moderately better than 4o, o3 was dramatically stronger than 4o, and GPT-5 was basically a light iteration on that.

What's happening now is training models for long-running tasks that use tools, taking hours at a time. The latest models, like 4.6 and 5.3, are starting to make good on this. If you're not using models that are wired into tools and allowed to iterate for a while, then you're not seeing the current frontier of abilities.

(E.g., if you're just using models for general-knowledge Q&A, then sure, there's only so much better you can get at that, and models tapered off there long ago. But the vision is to use agents to perform a substantial fraction of white-collar work; there are well-defined research programmes to get there, and there is steady progress.)
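For readers who haven't used these harnesses, here is a minimal Python sketch of the tool loop they run. The `model.complete` interface, the `reply.tool_call` field, and the `run_shell` tool are hypothetical stand-ins, not any vendor's actual SDK.

```python
import subprocess

def run_shell(cmd: str) -> str:
    """Hypothetical tool: run a shell command and capture its output."""
    result = subprocess.run(cmd, shell=True, capture_output=True,
                            text=True, timeout=60)
    return result.stdout + result.stderr

def agent_loop(model, task: str, max_steps: int = 50) -> str:
    """Let the model iterate: propose a command, see the result, repeat."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = model.complete(history)   # assumed model interface
        if reply.tool_call is None:       # no tool requested: task is done
            return reply.text
        history.append({"role": "assistant", "content": reply.tool_call})
        history.append({"role": "tool", "content": run_shell(reply.tool_call)})
    return "step budget exhausted"
```

The point of the newer training runs is to make models that stay coherent across many iterations of a loop like this, not just to answer a single prompt better.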


> Wait, you're completely skipping the emergence of reasoning models, though?

o1 was something like 16-18 months ago. o3 was kinda better, and GPT-5 was considered a flop because it was basically just o3 again.

I’ve used all the latest models in tools like Claude Code and Codex, and I guess I’m just not seeing the improvement? I’m not even working on anything particularly technically complex, but I still have to constantly babysit these things.

Where are the long-running tasks? Cursor’s browser that didn’t even compile? Claude’s C compiler that had gcc as an oracle and still performed worse than gcc without any optimizations? Yeah, I’m completely unimpressed at this point, given the promises these people have been making for years now. I’m not surprised that, given enough constraints, they can kinda sorta dump out some code that resembles something else in their training data.

