My experience is that it varies a lot by model, dev, and field: I've seen juniors (and indeed people with a decade of experience) keeping thousands of lines of unused code around for reference, or not understanding how optionals work, or leaving the FAQ full of placeholder values in English when the app is only on the German market, and so on. Good LLMs don't make those mistakes.
But the worst LLMs? One of my personal tests is "write Tetris as a web app", and the worst local LLM I've tried started badly and then halfway through switched to "write a toy ML project in Python".