The state-of-the-art chatbots are getting more and more functionality that is not just LLM inference. They can search the web, process files, and integrate with other apps. I think that's why most people will soon consider local LLMs insufficient.
This is a natural response to software enshittification. You can hardly find an iOS app that is not plagued by ads, subscriptions, or hostile data collection. Now you can have your own small utilities that work for you. This sort of personal software might be very valuable in a world where you are expected to pay $5 to click any button.
Yeah, sure, but have you considered that the actual cost of running these models is much greater than whatever you might be shelling out for the ad-free apps? You're talking to someone who hates the slopification and enshittification of everything, so you don't need to convince me of that. However, everything I've seen described in the replies to my initial comment - while cute, and potentially helpful on a case-by-case basis - does NOT warrant the amount of resources we are pouring into AI right now. Not even fucking close. It'll all come crashing down, taxpayers the world over will be left holding the bag, and for what? So that we can all have a less robust version of an app that already exists, but that has the colours we want and the button where we want it?
If AI cost nothing and wasn't absolutely decimating our economy, I'd find what you've shared cute. However, we are putting literally all of our eggs, and the next generation's eggs, and the one after that, AND the one after that, into this one thing, which, I'm sorry, is so far away from everything that keeps on being promised to us that I can't help but feel extremely depressed.
At this point it doesn't matter much whether we use AI or not: the apps are not selling, and they are being produced at an alarming rate.
The number of projects being submitted to Product Hunt is 4x what it was the year before.
The market is shrinking rapidly because more people now make their own apps.
Even when you make a typo and land on a random website, there is a good chance it's selling more AI snake oil. Yet none of these apps are feature complete, and they are easily beaten by apps made by solo developers in the 2010s (tldr & Sketchbook in the drawing space).
The only way to excite investors is to fake ARR by giving out free trials and selling before the first recurring charge occurs.
You are attempting to move the goalposts. There are two different points in this debate:
1) Modern LLMs are an inflection point for coding.
2) The current LLM ecosystem is unsustainable.
This submission's discussion is only about #1, which #2 does not invalidate. Even if the ecosystem crashes, open-source LLMs that leverage the same tricks Opus 4.5 does will just be used instead.
But it's only an inflection point if it's sustainable. When this comes crashing down, how many people are going to be buying $70k GPUs to run an open source model?
I said open-source models, not locally-hosted models. Essentially, more power to inference-only providers such as Groq and Together AI, which host the large-scale OSS LLMs and will be less affected by a crash as long as the demand for coding agents is there.
Ok, and then? Taking a one time discount on a rapidly depreciating asset doesn’t magically make this whole industry profitable, and it’s not like you’re going to start running a GB200 in your basement.
Checked your history. From a fellow skeptic: I know how hard it is to reason with people around here. You and I need to learn to let it go. In the end, the people at the top have set this up so that either way, they win. And we're down here telling the people at our level to stop feeding the monster, but we're told to fuck off anyway.
So cool bro, you managed to ship a useless (except for your specific use-case) app to your iphone in an hour :O
What I think this is doing is confronting people with the fact that most jobs in the modern economy (mine included, btw) are devoid of purpose. This is something that, as a person on the far left, I've understood for a long time. However, a lot (and I mean a loooooot) of people have never even considered this. So when they find that an AI agent is able to do THEIR job for them in a fraction of the time, they MUST understand it as AI being some finality of human ingenuity and progress, given the self-importance they've attributed to themselves and their occupation - all this instead of realizing that, you know, all of our jobs are useless, we all do the exact same useless shit which is extremely easy to replicate quickly (except for a select few occupations), and that's it.
I'm sorry to tell anyone who's reading this with a differing opinion, but if AI agents have proven revolutionary to your job, you produced nothing of actual value for the world before their advent, and still don't. I say this, again, as someone who beyond their PhD thesis (and even then) does not produce anything of value to the world, while being paid handsomely for it.
> if AI agents have proven revolutionary to your job, you produced nothing of actual value for the world before their advent, and still don't.
This doesn’t logically follow. AI agents produce loads of value. Cotton picking was and still is useful. The cotton gin didn’t replace useless work. It replaced useful work. Same with agents.
> I'm sorry to tell anyone who's reading this with a differing opinion, but if AI agents have proven revolutionary to your job, you produced nothing of actual value for the world before their advent, and still don't.
I agree with this, but I think my take on it is a lot less nihilistic than yours. I think people vastly undersell how much effort they put into doing something, even if that something is vibecoding a slop app that probably exists. But if people are literally prompting claude with a few sentences and getting revolutionary results, then yes, their job was meaningless and they should find something to do that they’re better at.
But what frustrates me the most about this whole hype wave isn't just that the powers that be have bet the entire economy on a fake technology; it's that it's sucking all of the air out of the room. I think most people's jobs can actually provide value, and there's so much work to be done to make _real_ progress. But instead of actually improving the world, all the time, money, and energy is being thrown into such a wasteful technology that is actively making the world a worse place. I'm sure it's always been like this and I was just too naive to see it, but I much preferred it when at least the tech companies pretended they cared about the impact their products had on society rather than simply trying to extract the most value out of the same 5 ideas.
Yeah, I do tend to have a rather nihilistic view on things, so apologies.
I really think we're just cooked at this point. The number of people (some great friends whom I respect) who have told me in casual conversation that if their LLM were taken from them tomorrow, they wouldn't know how to do their work (or some flavour of that statement) has made me realize how deep the problem is.
We could go on and on about this, but let's both agree to try to look inward more and keep our own things in order, while most other people get hooked on the absolute slop machine that is AI. Eventually, the LLM providers will need to start ramping up the costs of their subscriptions, and maybe then it will click for people that the shitty code that was generated for their pointless/useless app is not worth the actual cost of inference (which some conservative estimates put at thousands of dollars per month on a subscription basis). For now, people are just putting their heads in the sand and assuming that physicists will somehow find a way to use quantum computers to speed up inference by a factor of 10^20 in the next few years, while simultaneously slashing its costs (lol).
But hey, Opus 4.5 can cook up a functional app that goes into your emails and retrieves all outstanding orders - revolutionary. Definitely worth the many kWh and thousands of liters of water required, eh?
The studies focus on a single representative task, but in a thread about coding entire apps in hours as opposed to weeks, you can imagine the multiples involved in terms of resource conservation.
The upshot is, generating and deploying a working app that automates a bespoke, boring email workflow will be way, way, wayyyyy more efficient than the human manually doing that workflow every time.
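To make that concrete, the kind of bespoke tool being described can be tiny. Here is a minimal sketch using Python's stdlib imaplib; the server, credentials, and the "order" subject filter are all hypothetical placeholders, not anyone's actual workflow:

```python
# Hypothetical sketch: list unread emails whose subject mentions "order".
# Host, login, and search term are placeholders for illustration only.
import email
import imaplib

imap = imaplib.IMAP4_SSL("imap.example.com")
imap.login("me@example.com", "app-password")
imap.select("INBOX")

# Naive filter: unread messages with "order" in the subject line
_, data = imap.search(None, '(UNSEEN SUBJECT "order")')
for num in data[0].split():
    _, msg_data = imap.fetch(num, "(RFC822)")
    msg = email.message_from_bytes(msg_data[0][1])
    print(msg["From"], "-", msg["Subject"])

imap.logout()
```

Twenty-odd lines like this, run on a schedule, versus a human eyeballing an inbox every morning: that's the efficiency comparison being made.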
I want to push back on this argument, as it seems suspect given that none of these tools are creating profit, and so require funds / resources that are essentially coming from the combined efforts of much of the economy. I.e. the energy externalities here are monstrous and never factored into these things, even though these models could never have gotten off the ground if not for the massive energy expenditures that were (and continue to be) needed to sustain the funding for these things.
To simplify, LLMs haven't clearly created the value they have promised, but have eaten up massive amounts of capital / value produced by everyone else. But producing that capital had energy costs too. Whether or not all this AI stuff ends up being more energy efficient than people needs to be measured on whether AI actually delivers on its promises and recoups the investments.
EDIT: I.e. it is wildly unclear at this point that if we all pivot to AI that, economy-wide, we will produce value at a lower energy cost, and, even if we grant that this will eventually happen, it is not clear how long that will take. And sure, humans have these costs too, but humans have a sort of guaranteed potential future value, whereas the value of AI is speculative. So comparing energy costs of the two at this frozen moment in time just doesn't quite feel right to me.
> For now, people are just putting their heads in the sand and assuming that physicists will somehow find a way to use quantum computers to speed up inference by a factor of 10^20 in the next few years, while simultaneously slashing its costs (lol).
GPT-3 Da Vinci cost $20/million tokens for both input and output.
GPT-5.2 is $1.75/million for input and $14/million for output.
I'd call that pretty strong evidence that they've been able to dramatically increase quality while slashing costs, over just the past ~4 years.
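Back-of-envelope, using just the two price points above:

```python
# Implied cost drops, per million tokens, from the prices quoted above
gpt3_in, gpt3_out = 20.00, 20.00  # GPT-3 Da Vinci
new_in, new_out = 1.75, 14.00     # GPT-5.2

print(f"input:  {gpt3_in / new_in:.1f}x cheaper")   # ~11.4x
print(f"output: {gpt3_out / new_out:.1f}x cheaper") # ~1.4x
```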
Isn't that kind of related to the amount of money thrown at the field? If the economy gets worse for any reason, do you think we can still expect this level of cost cutting in the future?
> But hey, Opus 4.5 can cook up a functional app that goes into your emails and retrieves all outstanding orders - revolutionary. Definitely worth the many kWh and thousands of liters of water required, eh?
The thing is, in a vacuum this stuff is actually kinda cool. But hundreds of billions in debt-financed capex that will never see a return, and this is the best we've got? Absolutely cooked indeed.
It's not really a mystery why it happens. LLM APIs are non-deterministic from the user's point of view because your request is going to get batched with other users' requests. The batch behavior is deterministic, but your batch is going to be different each time you send your request.
The size of the batch influences the order of the individual floating-point operations, and because floating-point operations are not associative, the results can differ.
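The non-associativity itself is easy to demonstrate; a minimal Python illustration:

```python
# Floating-point addition is not associative: regrouping the same three
# values (as different batch shapes effectively do) changes the result.
a, b, c = 0.1, 1e20, -1e20

print((a + b) + c)  # 0.0 -- the 0.1 is rounded away against 1e20
print(a + (b + c))  # 0.1
```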
Not convinced. There is an obvious value in having more food or more products for almost anybody on Earth. I am not sure this is the case for software. Most people's needs are completely fulfilled with the amount and quality of software they already have.
> There is an obvious value in having more food or more products for almost anybody on Earth
Quite the opposite is true. For a large proportion of people, eating less would increase both the number of years they live and their quality of life.
I think the days when more product is always better are coming to an end - we just need to figure out how the economy should work.
But how about some silly software just for a giggle? Like "write a website that plays a fart sound when you push a button"? That can be a thing for the kids at school.
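For what it's worth, that one barely even needs an LLM; a sketch, assuming you have a fart.mp3 on hand:

```python
# Write a one-button page, then serve the folder it sits in.
# Assumes a fart.mp3 file exists next to index.html.
from pathlib import Path

Path("index.html").write_text(
    "<!doctype html>"
    "<button onclick=\"new Audio('fart.mp3').play()\">Press me</button>"
)
# Then run: python -m http.server 8000
```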
Do we have a better estimate? I don't think it's particularly difficult to get information from the occupied territories; the people there seem to use the Internet freely.
It's my understanding that this war is really not particularly bloody for civilians, as it is moving so slowly that the Russians are taking months to conquer pretty small towns and cities, and the civilians can usually evacuate or hide. The bombing campaign has some civilian casualties, but I mostly see headlines mentioning <5 dead overall per occasional huge wave of drones and missiles.
Yes, we have better estimates. In Mariupol, for example, estimates are above 20k civilians dead and murdered.
The UN cannot independently verify any of this, though, so it counts them as zero. The real figure should be at least double their estimate.
> It's my understanding that this war is really not particularly bloody for civilians, as it is moving so slowly that the Russians are taking months to conquer pretty small towns and cities, and the civilians can usually evacuate or hide.
Russia's advance has slowed to a crawl, yes, but the number of people murdered in the places where Russia does take control is still very high (see Mariupol as an example). Especially in the early days of the war, they took a lot of land.
> The bombing campaign has some civilian casualties, but I mostly see headlines mentioning <5 dead overall per occasional huge wave of drones and missiles.
5 per day is too low, as that would only add up to around 5.5k civilians, and even the UN's own count is higher than that.
They've been targeting civilians, including schools and hospitals, daily since the war started.
It's almost like both numbers are heavily biased in the UN. Almost. Surely such bias and possible corruption couldn't happen in the esteemed institution, known for its impartial and objective rulemaking. Right?
> It's almost like both numbers are heavily biased in the UN.
Yes, but in the opposite direction. It would be baffling if the UN's claim of 16,000 Ukrainian victims wasn't at least 100,000 in reality.
And, let's be honest, in Gaza, it does not seem realistic that there are even 50,000 civilian victims out of the 70,000 claimed in total. Don't get me wrong, that is a significant number of victims, but much less than reported. And on top of that, it doesn't seem realistic that none of those are militants. I'd guess, say, at least half of those are militants, not civilians.
And on top of that, the UN has no problem stating that of those 16,000, about 70 Ukrainian dead are not victims of Russia but of Ukrainian friendly fire. Again, of the 70,000 claimed dead in Gaza ... let's assume at the very least 300 are victims of Hamas friendly fire (probably more, since Hamas is no stranger to booby-trapping civilian buildings), rather than enemy action.
If you counted in Ukraine the way the UN counts in Gaza, Russia has killed some 400,000 people minimum. Maybe half a million, and of course climbing fast. No distinction between civilians and military, no distinction between accidents vs friendly fire ...
And I guess in the Gaza case I sort of understand. But why downplay Ukrainian victims? Why by a factor of 2 (not counting military deaths, which would make it at least a factor of 5 lower than the real number)? I guess if you discounted everything the same way in Gaza, the numbers would also drop by a factor of 5 there, but still.
My current job cold contacted me via LinkedIn. I use LI minimally, basically only to establish connections with my network, and it already gave me huge value back.
My skills are alright, nothing too crazy. I've had messages that were more spammy/scammy, but the volume was not crazy high, they were usually fairly obvious, and I dealt with them by simply ignoring them. I would say that the random chance of getting an interesting job offer, however small, is probably worth it for most professionals.
However, one thing I haven't mentioned is that I am based in Europe. My small sample of people reaching out to me suggests that US contacts are usually less serious (e.g. ghosting). Maybe the US experience is so much worse overall because of this?
I suspect you may be onto something. I'm a run-of-the-mill dev. At least in the past, when I was last looking for a job and more active on LinkedIn, the volume of contacts did not represent legitimate interest at all.
- LLMs are absolutely abysmal at PyTorch. They can handle basic MLP workflows, but that's more or less it. 0% efficiency gained.
- LLMs are great at short autocompletes, especially when the code is predictable. The typing itself is very efficient. Using vim-like shortcuts is now the slower way to write code.
- LLMs are great at writing snippets for tech I am not using that often. Formatting dates, authorizing GDrive, writing advanced regex, etc. I could do it manually, but I would have to check docs, now I can have it done in seconds.
- LLMs are great at writing boilerplate code, e.g. setting up argparse, printing the results in tables, etc. (see the sketch after this list). I think I am saving hours per month on these.
- Nowadays I often let LLMs build custom HTML visualization/annotation tools. This is something I would never do before due to time constraints, and the utility is crazy good. It allows my team to better understand the data we are working with.
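As promised above, a sketch of the argparse-plus-table boilerplate in question; the flags and metrics are made-up placeholders, not from any particular project:

```python
# Hypothetical boilerplate: CLI args plus a plain-text results table.
import argparse

def main():
    parser = argparse.ArgumentParser(description="Run the experiment.")
    parser.add_argument("--input", required=True, help="path to input file")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--verbose", action="store_true")
    args = parser.parse_args()

    results = [("accuracy", 0.91), ("f1", 0.88)]  # placeholder numbers
    print(f"{'metric':<12}{'value':>8}")
    for name, value in results:
        print(f"{name:<12}{value:>8.2f}")

if __name__ == "__main__":
    main()
```

Nothing in it is hard, but it's exactly the kind of rote scaffolding that eats half an hour when written by hand.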
The interesting question is how much more software we actually need. Will software be done one day, all built up, similar to railway networks? With LLMs, software engineering might get cheaper, but that can also lead to increased demand. A resource getting cheaper very often leads to demand skyrocketing, as it becomes accessible to new markets.
Definitely feels like a good amount of dev work is writing the same things over and over, in a different language, codebase, or context. And it seems like LLMs are particularly good at translating, specializing, and contextualizing across existing knowledge.
AI alignment is not a solved problem by any means. As long as LLMs hallucinate, they cannot be considered aligned. You can only be aligned if you have a zero probability of generating hallucinations. The two problems, alignment and hallucinations, can be considered equivalent.
A human who hates maths is different from one who adds up wrong because they think the first digit counts units, second digit how many tens, third digit how many twenties (as one of my uni lecturers recounted of her own childhood).
Alignment is, approximately, "are we even training this AI on the correct utility function?" followed up by the second question "even if we specified the correct utility function, did the AI learn a representation of that function or some weird approximation of that function with edge cases we've not figured out how to spot?"
With, e.g. RLHF, the first is "is optimising for thumbs-up/thumbs-down the right objective at all?", the second is "did it learn the preference, or just how to game the reward?"
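A toy illustration of that second failure, not how RLHF is actually implemented: suppose a proxy reward (predicted thumbs-up) and the true utility disagree on one candidate response, and we greedily optimise the proxy. All the numbers are made up.

```python
# Made-up numbers: proxy reward vs. true utility for three candidate replies
responses = {
    "honest-but-blunt":     {"proxy": 0.60, "true_utility": 0.90},
    "flattering-but-wrong": {"proxy": 0.95, "true_utility": 0.20},
    "hedged-and-vague":     {"proxy": 0.70, "true_utility": 0.50},
}

# Greedy optimisation of the proxy picks the reward-gaming answer
best = max(responses, key=lambda r: responses[r]["proxy"])
print(best)                             # flattering-but-wrong
print(responses[best]["true_utility"])  # 0.2 -- proxy maximised, objective missed
```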