This is it for me. If you ask these models to write something new, the result can be okay.
But the second you start iterating with them... the codebase goes to shit, because they never delete code. Never. They always bolt new shit on to solve any problem, even when there's an incredibly obvious path to achieve the same thing in a much more maintainable way with what already exists.
Show me a language model that can turn rube goldberg code into good readable code, and I'll suddenly become very interested in them. Until then, I remain a hater, because they only seem capable of the opposite :)
That's not true in my experience. Several times now i've given Claude Code a too-challenging task and after trying repeatedly it eventually gave up, removing all the previous work on that subject and choosing an easier solution instead.
.. unfortunately that was not at all what i wanted lol. I had told it "implement X feature with Y library", ie specifically the implementation i wanted to make progress towards, and then after a while it just decided that was difficult and to do it differently.
You'd be surprised what a combination of structured review passes and agent rules (even simple ones such as "please consider whether old code can be phased out") might do to your agentic workflow.
> Show me a language model that can turn rube goldberg code into good readable code, and I'll suddenly become very interested in them.
They can already do this. If you have any specific code examples in mind, I can experiment for you and return my conclusions if it means you'll earnestly try out a modern agentic workflow.
An LLM which was capable of refactoring all the duplicated logic into the common core and restructuring all the drivers to be simpler would be very very useful for me. It ought to be able to remove a few thousand lines of code there.
It needs to do it iteratively, in a sting of small patches that I can review and prove to myself are correct. If it spits out a giant single patch, that's worse than nothing, because I do systems work that actually has to be 100% correct, and I can't trust it.
But the second you start iterating with them... the codebase goes to shit, because they never delete code. Never. They always bolt new shit on to solve any problem, even when there's an incredibly obvious path to achieve the same thing in a much more maintainable way with what already exists.
Show me a language model that can turn rube goldberg code into good readable code, and I'll suddenly become very interested in them. Until then, I remain a hater, because they only seem capable of the opposite :)