I think this is true because I caught myself thinking the same thing: "it is pointless for me to create a library or abstraction for the developers of my project; much better to write everything verbosely using the most popular libraries on the web".
Until yesterday, having an abstraction (or, better, a library/framework) could be very convenient to save time writing a lot of code.
Today, if the code is mostly generated, there is no need to create an abstraction.
AI understands 1000 lines of Python pandas code much better than 10 lines of code using my library (which rationalises the use of pandas).
The result will be a disincentive not only to use new technologies, but also to build products with an architecture that is efficient in terms of lines of code, and in particular a disincentive to abstraction.
Some products may become a hell of millions of lines of code that no one knows how to evolve and manage.
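For concreteness, the sketch below is roughly what I mean by "10 lines of code using my library" versus raw pandas; the function, column names and data are made up for illustration, not my actual library:

  import pandas as pd

  # Hypothetical thin wrapper of the kind meant above; names are invented.
  def top_spenders(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
      return (
          df.groupby("customer_id", as_index=False)["amount"]
            .sum()
            .sort_values("amount", ascending=False)
            .head(n)
      )

  # With the abstraction, call sites are one readable line:
  #   top = top_spenders(orders)
  # Without it, the groupby/sort/head chain gets copy-pasted everywhere,
  # which is exactly the verbose style an LLM reproduces most fluently.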
This is completely wrong, and it assumes an LLM is much better at its job than it actually is. An LLM doesn't do better with a chaotic code base; nobody does. A deeply nonsensical system that sort of works is by far the hardest thing to reason about when you want to fix or change anything, especially for something with subhuman intelligence.
LLMs work best matching patterns. If the 1k LOC matches common patterns and the 10 LOC doesn't, that's a problem.
The one thing the OP is missing, which combines the best of both worlds, is to always put the source and/or docs for his abstractions into the LLM's context window.
If your abstractions match common design patterns then you've solved your problem. It's ridiculous to assume that an LLM will understand 1k LOC of standard library code better than 10 lines of a custom abstraction which uses a common design pattern.
It's more prone to hallucinating things if your custom abstraction is not super standard, but at least you'd be able to check its mistakes (you're checking the code generated by your LLMs, right?). If it makes a mistake in the 1k LOC, you're probably not going to find that error.
LLMs are not human; they see the whole context window at once. On the contrary, it's ridiculous to assume otherwise.
I’ll reiterate what I said before: put the whole source of the new library in the context window and tell the LLM to use it. It will, at least if it’s Claude.
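As a rough sketch of what I mean (the "mylib" path and the prompt wording are placeholders; the actual API call is omitted because it depends on your client and model):

  from pathlib import Path

  # Concatenate your library's source so the model generates code against
  # *your* abstraction instead of raw pandas. "mylib" is a placeholder.
  lib_source = "\n\n".join(
      p.read_text() for p in sorted(Path("mylib").rglob("*.py"))
  )

  prompt = (
      "Below is the full source of our internal library.\n\n"
      + lib_source
      + "\n\nUsing only this library, implement: <task description>"
  )
  # Send `prompt` with whatever client/model you use; the call itself
  # is omitted because it depends on your setup.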
Attention works better on smaller contexts, since there are fewer confounding tokens; so even if the LLM can see the entire context, it's better to keep the amount of confounding context low. And at some point the source code will exceed the size of the context window; even the newer models with millions of tokens of context can't hold the entirety of many large codebases.
Of course, but OP's 1k LOC is nowhere near any contemporary limit. Not using the tool for what it's designed for, just because it isn't designed for a harder problem, is… unwise.
I have experienced quite a few mistakes by Claude as documentation grows larger (and not necessarily too large compared to certain standards). E.g., some time ago I fed a whole JS documentation for some sensors into the context window and asked it to generate code. The documentation mentioned specifically that the platform does not fully support ES6, and explicitly that it does not support const. Claude did not bother and used const. And many times I have seen Claude make syntax mistakes in a language (much less common than JS or Python) that would make sense in some other language, maybe, but not in that one. I have put instructions in system prompts not to make those specific mistakes, told it to make sure the syntax is valid for language X, but once in a while Claude keeps making the same mistakes. Negative prompts are hard, especially when they probably go against a huge chunk of the training set.
I think I might be forced to do this by the metrics I'm measured against at work: "things have to work right away and have to scale quickly to other low-skilled people".
On the other hand, you should ask yourself why you care. If you assume no human will ever read the code except in very extraordinary circumstances, why wouldn't you do that?
Only in one sense. As code is now cheaper, abstractions meant to decrease code quantity have decreased in value. But abstractions meant to organize logic so it's easier to comprehend retain their value.
Previously there was a tension between easy-to-write (helper functions to group together oft-repeated lines of code, etc.) and easy-to-read (where modest repetition is often fine and clearer). I felt this tension a lot in tests, where the future reader is very happy with explicit lines of code setting things up, whereas the test author gets bored and writes layers of helper functions to speed their work up.
But for LLMs, it seems readability of code pretty much equals its writability?
To make code more authorable by LLM, we approximately just need to make it more readable in the traditional sense (code comments, actual abstractions not just code-saving helper functions, etc).
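A small, made-up illustration of that trade-off (plain pytest-style functions; the domain and names are invented):

  # Tiny made-up function just so both tests run.
  def apply_discount(prices, percent):
      return [round(p * (1 - percent / 100), 2) for p in prices]

  # Explicit setup: more typing, but the reader (human or LLM) sees
  # exactly what the test starts from.
  def test_discount_explicit():
      prices = [20.0, 20.0]
      assert apply_discount(prices, 10) == [18.0, 18.0]

  # Helper-heavy setup: quicker to author, but the reader has to chase
  # the helper to learn what state the test actually begins with.
  def _two_books():
      return [20.0, 20.0]

  def test_discount_via_helper():
      assert apply_discount(_two_books(), 10) == [18.0, 18.0]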
I hope so, but it adds an extra difficulty.
"Easy to understand" is not always an absolute metric: a project with many lines of code can be easy to understand for a team with a certain experience and hard for another team with a different experience (not less, just different).
Now I will also have to think about "easy to understand" for AI.
I did a quick test and, in my use case, the performance improves mainly on a complex aggregation pipeline.
I still have to run extensive benchmarks.
I think one of the improvements is the New Query Engine: https://laplab.me/posts/inside-new-query-engine-of-mongodb/
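For reference, the kind of multi-stage aggregation I mean looks roughly like the pipeline below (a made-up pymongo example; collection and field names are placeholders, not my actual workload):

  from pymongo import MongoClient

  client = MongoClient("mongodb://localhost:27017")
  orders = client["shop"]["orders"]

  # Match, group, sort, limit: the sort of pipeline the new query engine
  # is supposed to speed up.
  pipeline = [
      {"$match": {"status": "shipped"}},
      {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
      {"$sort": {"total": -1}},
      {"$limit": 10},
  ]
  for doc in orders.aggregate(pipeline):
      print(doc)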
I've tried these LLM "code from test" things (and vice-versa) dozens of times over the last couple of years... they're not even close to being practical.
Why? It will evolve into a slightly higher level language where the compiler is an ML model. Was it a tragedy when developers mostly didn’t have to write assembly any more?
I think it's different... I like high-level languages, but this is not a programming language; it's a technique for writing tests in an existing language and leaving the implementation to the AI.
I like programming for problem solving; I don't really like writing tests, but that's personal taste. A lot of people like to just use PowerPoint and Jira and tell others what they need to implement, but those people are not software developers.
> Was it a tragedy when developers mostly didn’t have to write assembly any more?
It wasn't, but for starters compilers have always been generally deterministic.
I'm not saying this is completely useless (I personally think code-completion tools such as GitHub Copilot are fantastic), but it is still too early to compare it to a compiler.
I appreciate that your workflow is so linear.
I often write tests, then the implementation, then I realize the tests need to be corrected, then I change the implementation, then I change the tests, then I add other tests, and so on.
I don't really like maintaining tests; it's often a lot of code that needs to be understood and changed carefully.
Really it's just validator code instead of feature code. I think this is the only realistic way forward for production level code written by AI, don't ask it to write code - ask it to pass your validation tests.
Essentially, everyone becomes a red team member trying to think of clever ways to outwit the AI's code, which I for one think is going to be a lot of fun in the future; though we're still quite a way from there yet!
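A rough sketch of that workflow (ask_model is a placeholder for whatever LLM client you use; only the tests are hand-written):

  import subprocess

  def ask_model(prompt: str) -> str:
      # Stand-in for your LLM client of choice; not a real API.
      raise NotImplementedError

  # Keep asking the model for an implementation until the human-written
  # validation tests pass, feeding the test output back on each failure.
  def generate_until_green(task: str, max_rounds: int = 5) -> bool:
      feedback = ""
      for _ in range(max_rounds):
          code = ask_model(task + "\n\nPrevious test output:\n" + feedback)
          with open("impl.py", "w") as f:
              f.write(code)
          result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
          if result.returncode == 0:
              return True
          feedback = result.stdout + result.stderr
      return False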
Arguably we write instructions: instead of writing out the problem and what the solution looks like, we describe a set of steps we go through—and if those steps are incorrect, there's nothing to compare against, because that was what we called the "specification".
Whether there's a difference there is in the eye of the beholder, but it does look like specification languages such as TLA+/PlusCal/Squint or Alloy, or theorem-proving languages like Coq (to be renamed Rocq) or Lean, look a lot different from the likes of C, JavaScript or even Haskell.
Before I even went to the article I thought to myself: If this list is missing k9s and kubectx/kubens, then the whole article is basically a fluff piece.
Lo and behold... it's a fluff piece. And stolen from another source.
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 503
< server: nginx
< date: Thu, 13 Dec 2018 16:07:36 GMT
< content-type: text/html; charset=iso-8859-1
< content-length: 323
< x-sucuri-id: 15005
< vary: Accept-Encoding
< x-sucuri-cache: MISS
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Temporarily Unavailable</title>
</head><body>
<h1>Service Temporarily Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>
* Connection #0 to host brutelogic.com.br left intact