Who is "we" here? I can't count how many times I've argued against just an apparently broadly-held view that free speech ends at the first amendment and isn't a general principle that should be practiced at, eg., universities. Looks like when I argued that here, I was told that I should pick a different term for the principle of free speech in order to disambiguate from the first amendment (they recommended calling it 'my personal content preferences').
Likewise uncountable is the number of times I've said normalizing free speech restrictions against the other side will come back to bite you once they're (inevitably, especially given these tactics) in power.
I can see how 'pro-speech' might have appeared to be a right-leaning position when violations were typically against right-leaning expression, but I never got the sense that either side really gave a damn.
Everyone claimed we invaded Iraq back in the early 2000s to take their oil, but the US spent a whole bunch of money on the military operations, and opened up oil and gas to basically every other country, including geopolitical rivals like China and Russia. Maybe "oil" is too simple of an explanation.
Oil is important, but mainly as a lever to pull on because it affects China.
The invasion is meant to orient the US to fight China. We are cutting away the Middle East war baggage, trying to end the Ukraine war baggage so we can focus on China. Russia would be a nice ally against China.
China has been moving around Latin America, and we are removing the communists from the hemisphere.
China likes oil. Loves oil, but can't get enough of it, which is why it's building solar and nuclear so quickly. The US can clamp down on the oil if Venezuela is an ally. So the US wants a strong Venezuela that can't be used against us.
It’s hard to conduct war without oil.
The US has a strong incentive to make sure Venezuela comes out strong, and the Chinese have a strong incentive to not let that happen.
Typically we don't say that someone with cancer is slowly committing suicide. Technically correct, perhaps, but it needlessly imports the autonomy that's central to the term where it doesn't really exist.
Apparently it's not just "Wall Street" but "the market" which is a singular entity, which ruined the Roomba and then blamed Lina Khan and which can hardly complain about being described as a singular entity since it describes itself as such.
Take this all with a grain of salt as it's hearsay:
From what I understand, nobody has done any real scaling since the GPT-4 era. 4.5 was a bit larger than 4, but not as much as the orders of magnitude difference between 3 and 4, and 5 is smaller than 4.5. Google and Anthropic haven't gone substantially bigger than GPT-4 either. Improvements since 4 are almost entirely from reasoning and RL. In 2026 or 2027, we should see a model that uses the current datacenter buildout and actually scales up.
4.5 is widely believed to be an order of magnitude larger than GPT-4, as reflected in the API inference cost. The bottleneck is how many parameters you can fit in the memory of one GPU. Pretty much every large GPT model from 4 onwards has been a mixture of experts, but for a model at the 10-trillion-parameter scale, you'd be talking about a lot of experts and a lot of inter-GPU communication.
With FP4 on the Blackwell GPUs, it should become much more practical to serve a model of that size by the time the GPT-5.x generation rolls out. We're just going to have to wait for the GBx00 systems to be physically deployed at scale.
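For a sense of why the inter-GPU communication matters, here's some back-of-the-envelope arithmetic under assumed numbers (a hypothetical 10-trillion-parameter model, 80 GB and 192 GB GPU memory sizes; these figures are my illustration, not vendor specs): how many GPUs it takes just to hold the weights at different precisions, ignoring KV cache, activations, and replication for throughput.

    def gpus_for_weights(n_params, bytes_per_param, gpu_mem_gb):
        # GPUs needed just to hold the raw weights, nothing else.
        return (n_params * bytes_per_param) / (gpu_mem_gb * 1024**3)

    N = 10e12  # hypothetical 10-trillion-parameter model

    for precision, bytes_per_param in [("FP16/BF16", 2), ("FP8", 1), ("FP4", 0.5)]:
        for gpu, mem_gb in [("80 GB (H100-class)", 80), ("192 GB (B200-class)", 192)]:
            n = gpus_for_weights(N, bytes_per_param, mem_gb)
            print(f"{precision:>9} on {gpu}: ~{n:.0f} GPUs for weights alone")

Even at FP4 on 192 GB parts, the weights alone span a couple dozen GPUs, which is where the expert-parallel communication cost comes from.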
What is the motivation for killing off the population in scenario 2? That's a post-scarcity world where the elites can have everything they want, so what more are they getting out of mass murder? A guilty conscience, potentially for some multiple of human lifespans? Considerably less status and fame?
Even if they want to do it for no reason, they'll still be happier if their friends and family are alive and happy, which recurses about 6 times before everybody on the planet is alive and happy.
It's not a post-scarcity world. There's no obvious upper bound on the resources AGI could use, and there's no obvious stopping point where you can call it smart enough. So long as there are other competing elites, the incentive is to keep improving it. All the useless people will be using resources that could be used to make more semiconductors and power plants.
I think what it comes down to is that the advocates making false claims are relatively uncommon on HN. So, for example, I don't know what advocates you're talking about here. I know people exist who say they can vibe-code quality applications with 100k LoC, or that guy at Anthropic who claims that software engineering will be a dead profession in the first half of '26, and I know that these people tend to be the loudest on other platforms. I also know sober-minded people exist who say that LLMs save them a few hours here and there per week trawling documentation, writing a 200 line SQL script to seed data into a dev db, or finding some off-by-one error in a haystack. If my main or only exposure to AI discourse was HN, I would really only be familiar with the latter group and I would interpret your comment as very biased against AI.
Alternatively, you are referring to the latter group and, uh, sorry.
The whole point I tried to make when I said “you need to learn how to use it” is that it’s not vibe coding. It has nothing to do with vibes. You need to be specific and methodical to get good results, and use it for appropriate problems.
I think the AI companies have over-promised in terms of “vibe” coding, as you need to be very specific, not at all based on “vibes”.
I’m one of those advocates for AI, but on HN I consistently get downvoted no matter how I try to explain things. There’s a super strong anti-AI sentiment here.
It may be acceptable for them, but I'd prefer people who have financial sense in charge of the budget, and people who are so power-hungry as to forgo money kept as far from governance as possible.
Only more so than the counterfactual. The status quo only rewards those with financial sense who are willing to be a little corrupt, which undermines the effect. Really, we should just pay them 7 figures.
I think Anthropic has established that LLMs have at least a rudimentary world model (regions of tensors that represent concepts and relationships between them) and that they modify behavior due to a prediction (putting a word at the end of the second line of a poem based on the rhyme they need for the last). Maybe they come up short on 'analyzing the circumstances'; not really sure how to define that in a way that is not trivial.
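To make "regions of tensors that represent concepts" concrete, here is a toy sketch of one common operationalization, a linear probe on hidden states. This is a simplified illustration with synthetic activations and a made-up hidden-state dimension, not Anthropic's actual methodology (their published interpretability work uses techniques like sparse-autoencoder features and attribution graphs).

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    d = 512                                  # hypothetical hidden-state dimension
    concept = rng.normal(size=d)
    concept /= np.linalg.norm(concept)       # a made-up "concept direction"

    # Synthetic "activations": noise, plus the concept direction when present.
    X_pos = rng.normal(size=(200, d)) + 2.0 * concept
    X_neg = rng.normal(size=(200, d))
    X = np.vstack([X_pos, X_neg])
    y = np.array([1] * 200 + [0] * 200)

    # If a simple linear probe separates the two groups, the concept is (in this
    # weak sense) linearly represented somewhere in the activation space.
    probe = LogisticRegression(max_iter=1000).fit(X, y)
    print("probe accuracy:", probe.score(X, y))

Findings like the rhyme-planning one are a stronger claim than this: not just that a concept is represented, but that it is computed before the tokens that depend on it.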
This may not be enough to convince you that they do think. It hasn't convinced me either. But I don't think your confident assertions that they don't are borne out by any evidence. We really don't know how these things tick (otherwise we could reimplement their matrices in code and save $$$).
If you put a person in charge of predicting which direction a fish will be facing in 5 minutes, they'll need to produce a mental model of how the fish thinks in order to be any good at it. Even though their output will just be N/E/S/W, they'll need to keep track internally of how hungry or tired the fish is. Or maybe they just memorize a daily routine and repeat it. The open question is what needs to be internalized in order to predict ~all human text with a low error rate. The fact that the task is 'predict next token' doesn't tell us very much at all about the internals. The resulting weights are uninterpretable. We really don't know what they're doing, and there's no fundamental reason it can't be 'thinking', for any definition.
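For reference, the training objective in question is just cross-entropy on the next token; nothing in it specifies how the model must arrive at its distribution. A toy numpy version, reusing the N/E/S/W fish "vocabulary" from above (an illustration, not any lab's training code):

    import numpy as np

    def next_token_loss(logits, target_id):
        # logits: the model's scores over the vocabulary for the next position.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return -np.log(probs[target_id])      # cross-entropy for the true token

    vocab = {"N": 0, "E": 1, "S": 2, "W": 3}
    logits = np.array([0.2, 2.5, -1.0, 0.1])          # however they were computed
    print(next_token_loss(logits, vocab["E"]))        # low loss if "E" was likely

The loss only ever sees the output distribution, so memorizing a daily routine and modeling the fish's hunger are indistinguishable to it as long as the predictions match.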
> I think Anthropic has established that LLMs have at least a rudimentary world model
It's unsurprising that a company heavily invested in LLMs would describe clustered information as a world model, but it isn't one. Transformer models, whether for video or text, don't have the kind of machinery you would need to have a world model. They can mimic some level of consistency as long as the context window holds, but that disappears the second the information leaves that space.
In terms of human cognition, it would be like the difference between short-term memory, long-term memory, and being able to see the stuff in front of you. A human can instinctively know the relative weight, direction, and size of objects, and if a ball rolls behind a chair you still know it's there 3 days later. A transformer model cannot do any of those things; at best it can remember the ball behind the chair until enough information comes in to push it out of the context window, at which point it cannot reappear.
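A minimal sketch of that limitation, using a toy token list and a made-up tiny window size rather than a real model: a decoder-only transformer conditions only on the tokens currently in its window, so a fact that has scrolled out is simply not an input to the next prediction.

    CONTEXT_WINDOW = 8  # hypothetical tiny window for illustration

    def visible_context(tokens, window=CONTEXT_WINDOW):
        # The model's entire working "memory" is whatever survives this slice.
        return tokens[-window:]

    story = ("the ball rolled behind the chair then we talked "
             "about many other things").split()
    print(visible_context(story))
    # ['chair', 'then', 'we', 'talked', 'about', 'many', 'other', 'things']
    # "ball" is gone: nothing in the forward pass can refer to it unless it is
    # re-introduced from outside (retrieval, running summaries, external memory).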
> putting a word at the end of the second line of a poem based on the rhyme they need for the last)
That is the kind of work that exists inside its context window. Feed it a 400-page book, which any human could easily read, digest, parse, and understand; have it do a single read and then ask questions about different chapters. You will quickly see it make shit up that fits the information given previously and not the original text.
> We really don't know how these things tick
I don't know enough about the universe either. But if you told me that there are particles smaller than the Planck length, or others that went faster than the speed of light, then I would tell you that it cannot happen due to the basic laws of the universe. (I know there are studies on FTL neutrinos and dark matter, but in general terms, if you said you saw carbon going FTL I wouldn't believe you.)
Similarly, transformer models are cool, emergent properties are super interesting to study in larger datasets, adding tools on the side for deterministic work helps a lot, and agentic multimodal use is fun. But a transformer does not and cannot have a world model as we understand it; Yann LeCun left Facebook because he wants to work on world-model AIs rather than transformer models.
> If you put a person in charge of predicting which direction a fish will be facing in 5 minutes,
What that human will never do is think the fish is gone because it went inside the castle and they lost sight of it. That is something a transformer would do.
Anthropic may or may not have claimed this was evidence of a world model; I'm not sure. I say this is a world model because it is objectively a model of the world. If your concept of a world model requires something else, the answer is that we don't know whether they're doing that.
Long-term memory and object permanence don't seem necessary for thought. A 1-year-old can think, as can a late-stage Alzheimer's patient. Neither could get through a 400-page book, but that's irrelevant.
Listing human capabilities that LLMs don't have doesn't help unless you demonstrate these are prerequisites for thought. Helen Keller couldn't tell you the weight, direction, or size of a rolling ball, but this is not relevant to the question of whether she could think.
Can you point to the speed-of-light analogy laws that constrain how LLMs work in a way that excludes the possibility of thought?
> I say this is a world model because it is objectively a model of the world.
A world model in AI has a specific definition: an internal representation that the AI can use to understand and simulate its environment.
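For concreteness, here is a minimal sketch of a world model in that sense, along the lines of the model-based RL usage: a learned transition function the agent can use to simulate its environment by predicting the next state from the current state and an action. The class name, shapes, and toy dynamics below are illustrative assumptions, not taken from any particular paper.

    import numpy as np

    class LinearWorldModel:
        """Learns next_state ~= W @ [state, action] from observed transitions."""

        def __init__(self, state_dim, action_dim):
            self.W = np.zeros((state_dim, state_dim + action_dim))

        def fit(self, states, actions, next_states):
            X = np.hstack([states, actions])                      # (N, s + a)
            # Least-squares fit of the transition dynamics.
            self.W = np.linalg.lstsq(X, next_states, rcond=None)[0].T

        def imagine(self, state, action):
            # "Simulate the environment": roll it forward internally,
            # without touching the real environment at all.
            return self.W @ np.concatenate([state, action])

    # Toy usage: 1-D physics (position, velocity), action = applied force.
    rng = np.random.default_rng(1)
    s = rng.normal(size=(500, 2))
    a = rng.normal(size=(500, 1))
    ns = np.stack([s[:, 0] + 0.1 * s[:, 1], s[:, 1] + 0.1 * a[:, 0]], axis=1)

    wm = LinearWorldModel(state_dim=2, action_dim=1)
    wm.fit(s, a, ns)
    print(wm.imagine(np.array([0.0, 1.0]), np.array([0.5])))      # ~[0.1, 1.05]

Whether the internal structures found in LLMs amount to this kind of simulatable representation is exactly the point under dispute.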
> Long-term memory and object permanence don't seem necessary for thought. A 1-year-old can think, as can a late-stage Alzheimer's patient
Both those cases have long-term memory and object permanence; they also have a developing memory or memory issues, but those issues are not constrained by a context window. Children develop object permanence in the first 8 months, and, like distinguishing between their own body and their mother's, that is them developing a world model. Toddlers are not really thinking; they are responding to stimulus. They feel hunger, they cry. They hear a loud sound, they cry. It's not really them coming up with a plan to get fed or get attention.
> Listing human capabilities that LLMs don't have doesn't help unless you demonstrate these are prerequisites for thought. Helen Keller couldn't tell you the weight, direction, or size of a rolling ball
Helen Keller had an understanding in her mind of what different objects were; she started communicating because she understood the word "water" when her teacher ran her finger across her palm.
Most humans have multiple sensory inputs (sight, smell, hearing, touch); she only had one, which is perhaps closer to an LLM. But the capacities she had that LLMs don't have are agency, planning, long-term memory, etc.
> Can you point to the speed-of-light analogy laws that constrain how LLMs work in a way that excludes the possibility of thought?
Sure, let me switch the analogy if you don't mind. In the Chinese room thought experiment we have a man who gets a message, opens a Chinese dictionary, and translates it perfectly word by word, and the person on the other side receives and reads a perfect Chinese message.
The argument usually revolves around whether the person inside the room "understands" Chinese if he is capable of producing 1:1 perfect Chinese messages.
But an LLM is that man, and what you cannot argue is that the man is THINKING. He is mechanically going to the dictionary and returning a message that can pass as human-written because the book is accurate (if the vectors and weights are well tuned). He is not an agent; he simply does. He is not creating a plan or doing anything beyond transcribing the message as the book demands.
He doesn't have a mental model of the Chinese language; he cannot formulate his own ideas or execute a plan based on predicted outcomes; he can do nothing but perform the job perfectly and boringly, as per the book.