Hacker Newsnew | past | comments | ask | show | jobs | submit | more th0ma5's commentslogin

The imperfections of people are less imperfect than the machines.


A lot of the claims of usefulness evaporate when tested. The word useful has many meanings. Perhaps their only reliable use will be the rubber duck effect.


> A lot of the claims of usefulness evaporate when tested

In your personal experience? Because that's been my personal experience too, in lots of cases with LLMs. But I've also been surprised the other way, and overall it's been a net-positive for myself, but I've also spent a lot of time "practicing" getting prompts and tooling right. I could easily see how people give it try for 20-30 minutes, not getting the results they expected and give up, which yeah, you probably won't get any net-positive effects by that.


Anecdotes unfortunately are not data :/


Not for me they haven't.


Anecdotes unfortunately are not data :/


I keep hearing anecdotes but the data, like a widely covered BBC study, say they only compress and shorten and routinely fail outside of testing on real world selection of only the most important content or topics.


You don't have to trust my word -- all you have to do is provide an LLM with a text that you are well familiar with and ask the LLM questions about it.


Yup! I've done this and it sucks!


It was still very much like modern systems. If you didn't install, uninstall, or aggressively reconfigure things they were pretty stable, and controlled changes could be achieved. Some of the problem though was that the systems required a lot of that to do anything fun with them at home.


So long as the control messages and the processed results are the same channel, they will be at an insecure standoff. This is the in-band vs. out-of-band signalling issues like old crossbar phone systems and the 2600hz tone.


Huh? We've had pretty good translation in some languages in many general purpose contexts for a while. The LLM stuff if you're referring to that to my knowledge only has some gains in some languages in some contexts. Which is exciting no doubt.


Compared to google translate of yore, it’s gotten way more fluent thanks to transformers. Good translation relies heavily on context of course. Voice recognition and text to speech quality have increased dramatically. And near real-time (or as real-time as is possible given a pair of languages) is becoming feasible.


For sure, just the gains of LLMs in the mix cannot be measured and most still recommend human in the loop as always.


Another falsehood is that airplane data companies won't cave to legal or monetary threats. They might!


The errors that LLMs make and the errors that people make are not probably not comparable enough in a lot of the discussions about LLM limitations at this point?


We have different failure modes. And I'm sure researchers, faced with these results, will be motivated to overcome these limitations. This is all good, keep it coming. I just don't understand the some of the naysaying here.


They naysayers just says that even when people are motivated to solve a problem the problem might still not get solved. And there are unsolved problems still with LLM, the AI hypemen say AGI is all but a given in a few years time, but if that relies on some undiscovered breakthrough that is very unlikely since such breakthroughs are very rare.


Ther may be actually no way to ever know. A baked in bias could be well hidden at many levels. There is no auditing of any statements or products from any vendor. It may not be possible.


Exactly my point but that person seemed to have insider info or a source we all missed.


They are not. I think it is a good criticism though. Many people seem to be touting productivity that is only in the context of productivity towards more LLM inference operations and not productive in the sense of solving real world computing problems. There is a lot of material that suggests positive results are a kind of wish casting and people are not aware of the agency they are bringing to the interaction. The fact that you can hack things that do cool stuff is more of a reflection that you do those things, and that these models are not capable of it. That's why I recommend you work with others and you'll see your concepts that you feel are generalizable are not, and any learnings or insights are not like learning math or how to read, but more like learning a specific video game's rules. This is also why it is enthralling to you because you actually have only the illusion of controlling it.


No one is happy about the need for prompt engineering and other LLM hacks, but part of being a professional in the real world is doing what works.


Being a professional also involves reading about the technology like how there are studies saying that prompt engineering has hit a limit which is why people are talking about reasoning, agentic systems, this vendor vs that vendor vs offline, etc... Part of being a professional is also wanting to understand why and how something works and not disclosing my secrets to a third party vendor or otherwise wish casting over a black box that I cannot audit. Part of being a professional includes a history of natural language processing experience and knowing the limits of the digital representation of information, and having played with the early incarnations of OpenAI products and seeing them fail in the same way ever since, in the well documented ways that transformers are known to fail. There's also a real problem in the overloading of terms where the lay people are hearing "we have to cross our fingers and hear rumors and the magic box will make things correct for us" vs. machine learning experts saying "you have to limit the context to precisely the problem at hand and then their accuracy will improve" except... in machine learning terms the second is actually about as good as the first and they are both bad ways of working. There is very exciting ongoing research operating on these things in parallel to just absolute mysticism and here on HN that line is pretty much eroded away, and a lot of that is due to thought-terminating responses that blame the users.


Software worked quite well before the plagiarizing chat bots.

This is just a new fad like agile, where the process and tool obsessed developers blog and preach endlessly without delivering any results.

For them "it works". The rest of us can either join the latest madness or avoid places that use LLMs, which of course includes all open source projects that do actual work that is not measured in GitHub kLOCs and marketing speak.


I don't think software has ever worked "well".


I am inclined to agree with you, but Simonw's experiments are also illuminating.

And I have personally had good results with ChatGPT. I use it maybe 2 hours a month, on tasks that it is perfect for. The tasks would have taken 6 to 8 hours without ChatGPT. I don't find it at all useful in my normal programming activities.


The hype: "LLM's will replace all of the coders".

The hate: "LLM's can't do anything without hand holding"

I think both of these takes a disingenuous.

> not productive in the sense of solving real world computing problems.

Solving problems is a pretty vague term.

> The fact that you can hack things that do cool stuff

Lots of times this is how problems actually get solved. I would argue most of the time this is how problems get solved. More so if your working with other peoples software because your not reinventing CUPS, Dashboards, VPN's, Marketing Tools, Mail clients, Chat Clients and so on... I would argue that LOTS of good software is propped up directly and indirectly by this sort of hacking.


Oh I mean for sure, but the point I was trying to make is that OP did those things, and I would argue the LLMs mostly just give him the permission to do it rather than actually producing something that he actually fully let the system do. I mean, he even says this too. I understand hacking is how things are done, but the thing that managers don't understand is that hacking is not their personal army, and you can't rub AI on things and make them better without losing determinism.


Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: