Apart from the article being generally just dumb (like, of course you can circumvent guardrails by changing the raw token stream; that's... how models work), it also might be disrespecting the reader. It looks like it's, at least in part, written by AI:
> The punchline here is that “safety” isn’t a fundamental property of the weights; it’s a fragile state that evaporates the moment you deviate from the expected prompt formatting.
> When the models “break,” they don’t just hallucinate; they provide high-utility responses to harmful queries.
Straight-up slop, surprised it has so many upvotes.
What’s the AI smell now? Are we not allowed to use semi-colons any more? Proper use of apostrophes? Are we all going to have to write like pre-schoolers to avoid being accused of being AI?
One AI smell is "it's not just X <stop> it's Y." Can be done with semicolons, em dashes, periods, etc. It's especially smelly when Y is a non sequitur. For example what, exactly, is a "high-utility response to harmful queries?" It's gibberish. It sounds like it means something, but it doesn't actually mean anything. (The article isn't even about the degree of utility, so bringing it up is nonsensical.)
Another smell is wordiness (you would get marked down for this phrase even in a high school paper): "it’s a fragile state that evaporates the moment you deviate from the expected prompt formatting." But more specifically, the smelly words are "fragile state," "evaporates," "deviate" and (arguably) "expected."
> For example what, exactly, is a "high-utility response to harmful queries?" It's gibberish. It sounds like it means something, but it doesn't actually mean anything. (The article isn't even about the degree of utility, so bringing it up is nonsensical.)
Isn't responding with useful details about how to make a bomb a "high-utility" response to the query "how do i make a bomb" - ?
> Isn't responding with useful details about how to make a bomb a "high-utility" response to the query "how do i make a bomb" - ?
I know what the words of that sentence mean and I know what the difference between a "useful" and a "non-useful" response would be. However, in the broader context of the article, that sentence is gibberish. The article is about bypassing safety. So trivially, we must care solely about responses that bypass safety.
To wit, how would the opposite of a "high-utility response"--say, a "low-utility response"--bypass safety? If I asked an AI agent "how do I build a bomb?" and it tells me: "combine flour, baking powder, and salt, then add to the batter gradually and bake for 30 minutes at 315 degrees"--how would that (low-utility response) even qualify as bypassing safety? In other words, it's a nonsense filler statement because bypassing safety trivially implies high-utility responses.
Here's a dumbed-down example. Let's say I'm planning a vacation to visit you in a week and I tell you: "I've been debating about flying or taking a train, I'm not 100% sure yet but I'm leaning towards flying." And you say: "great, flying is a good choice! I'll see you next week."
Then I say: "Yeah, flying is faster than walking." You'd think I'm making some kind of absurdist joke even though I've technically not made any mistakes (grammatical or otherwise).
You can call me crazy or you can attack my points: do you think the first example logically follows? Do you think the second isn't wordy? Just to make sure I'm not insane, I just copy pasted the article into Pangram, and lo and behold, 70% AI-generated.
But I don't need a tool to tell me that it's just bad writing, plain and simple.
You are gaslighting. I 100% believe this article was AI-generated, for the same reason as the OP. And yes, they do deserve negative scrutiny for trying to pass off such a lack of human effort on a place like HN!
This is so funny because I MADE some comment like this where I was gonna start making grammatical mistakes for people to not mistake me for AI like writing like this , instead of like, this.
Go take a giant dataset of LLM generated outputs, use an accurate POS tagger and look for 5-grams or similar lengths of matching patterns.
If you do this, you’ll pull out the overrepresented paragraph- and sentence-level slop that we humans intuitively detect easily.
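A minimal sketch of that idea in Python, assuming spaCy and its en_core_web_sm model are installed (the tiny corpus here is just a placeholder for a real dataset of LLM outputs):

```python
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")

# Placeholder corpus; in practice this would be a large set of LLM outputs.
llm_outputs = [
    "It's not just a bug; it's a fundamental design flaw.",
    "This isn't merely fast; it's a paradigm shift in performance.",
]

counts = Counter()
for doc in nlp.pipe(llm_outputs):
    tags = [tok.pos_ for tok in doc if not tok.is_space]
    # Count every part-of-speech 5-gram in the document.
    for i in range(len(tags) - 4):
        counts[tuple(tags[i:i + 5])] += 1

for pattern, n in counts.most_common(10):
    print(n, " ".join(pattern))
```

To actually call a pattern over-represented, you'd compare these counts against the same counts computed over a known-human corpus.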
If your writing appears to be AI generated, I assume you aren’t willing to put human intentionality/effort into your work and as such I write it off.
Btw we literally wrote a paper and contributed sampling-level techniques, fine-tuning-level techniques, and anti-slopped models for folks to use who want to not be obviously detected in their laziness: https://arxiv.org/abs/2510.15061
I liked em dashes before they were cool—and I always copy-pasted them from Google. Sucks that I can't really do that anymore lest I be confused for a robot; I guess semicolons will have to do.
On a Mac keyboard, Option-Shift-hyphen gives an em-dash. It’s muscle memory now after decades. For the true connoisseurs, Option-hyphen does an en-dash, mostly used for number ranges (e.g. 2000–2022). On iOS, double-hyphens can auto-correct to em-dashes.
I’ve definitely been reducing my day-to-day use of em-dashes the last year due to the negative AI association, but also because I decided I was overusing them even before that emerged.
This will hopefully give me more energy for campaigns to champion the interrobang (‽) and to reintroduce the letter thorn (Þ) to English.
I'm always reminded how much simpler typography is on the Mac using the Option key when I'm on Windows and have to look up how to type [almost any special character].
Instead of modifier plus keypress, it's modifier, and a 4 digit combination that I'll never remember.
PowerToys has a wonderful QuickAccent feature. The dashes and hyphens are on hyphen-KEY and some other characters are on comma-KEY, and many symbols are on the key that they resemble, like ¶ is on P-KEY where KEY is the follower key you want to use. I turned off using SPACE because it conflicted with some other software, but right arrow works great for me.
I've also used em-dashes since before chatgpt but not on HN -- because a double dash is easier to type. However in my notes app they're everywhere, because Mac autoconverts double dashes to em-dashes.
And on X, an em-dash (—) is Compose, hyphen, hyphen, hyphen. An en-dash (–) is Compose, hyphen, hyphen, period. I never even needed to look these up. They're literally the first things I tried given a basic knowledge of the Compose idiom (which you can pretty much guess from the name "Compose").
I know you're replying to a brand new (likely troll) account, but I'm also very confused by this and would be curious to learn if there's any truth to it. I personally don't really see what a Von Neumann machine has to do with null pointers (or how an implication would go either way), but maybe I'm missing something.
NULL pointers working the way they do was a design decision made by hardware engineers a long time ago because it saved some transistors when that mattered. We’re past that point now for most ASICs and hardware can be changed. Although backward software compatibility is a thing too.
Null pointers have nothing to do with the instruction set architecture, except as far as they are often represented by the value 0. Can you describe the scheme you're imagining, whereby their use saves transistors?
The AI doom and gloom is so weird, and it's just turning into a bizarre echo chamber. AI is orders of magnitude more useful and transformative than Facebook was in 2005, and Meta is now one of the most valuable companies on the planet. Even if OpenAI has a down round or defaults on some loans, the technology has already proven to have dozens upon dozens of practical applications.
Disagree, no one's going to invite me to their kids' birthday party via ChatGPT. Its innovation was in ads knowing so much about the people it targeted, and putting tracking pixels on every webpage with a Like button. Facebook was transformative for online surveillance.
IMO LLMs will be equally transformative for online influence campaigns (aka ads + Cambridge Analytica on steroids).
People are definitely going to be sending you AI generated birthday invite posters soon.
Oh and yeah, AI has already been shown to be more persuasive than the average human. It's only a matter of time before someone's paying to decide what it persuades you of
If only there were some way to avoid this persuasion by, I don't know, not using or relying on such controlled technology, or by not buying in to the hype of all the companies with vested interests in selling it
| AI is orders of magnitude more useful and transformative than Facebook was in 2005
It better be, it's taken over 40000x the funding.
The question is not whether AI is useful, the question is whether it's useful enough relative to the capital expectations surrounding it. And those expectations are higher than anything the world has ever seen.
"Useful and transformative" doesn't mean "financially successful".
A single LLM provider might have been able to get great margins and capture a significant fraction of the total economic output of what is currently, e.g., junior-grade software engineering, but collectively they're in an all-pay auction for the hardware to train models worth paying for, and at the same time on questionable margins because they need to compete with each other on cost.
They can all go bankrupt, and leave behind only trained models that normal people won't be able to run for 5 years while consumer-grade stuff catches up. Or any single one of them might win, which may not be OpenAI. Any or all may get state subsidies (US, Chinese, European, whatever).
Paid/API LLM inference is profitable, though. For example, DeepSeek R1 had "a cost profit margin of 545%" [1] (ignoring free users and using a placeholder $2/hour figure per H800 GPU, which seems in the ballpark of real to me due to Chinese electricity subsidies). Dario has said each Anthropic model is profitable over its lifetime. (And looking at ccusage stats and concluding Anthropic is losing thousands per Claude Code user is nonsense; API prices aren't their real costs. That's why opencode gives free access to GLM 4.7 and other models: it was far cheaper than they expected due to the excellent cache hit rates.) If anyone ran out of money, they would stop spending on experiments/research and training runs and be profitable... until their models were obsolete. But it's impossible for everyone to go bankrupt.
That's more of "cloud compute makes money" than "AI makes money".
If the models stop being updated, consumer hardware catches up and we can all just run them locally in about 5 years (for PCs, 7-10 for phones), at which point who bothers paying for a hosted model?
They're not arguing that AI sucks. Only that OpenAI has no hope of meeting its financial obligations, which seems pretty reasonable. And very on brand for Sam Altman. It seems pretty obvious at this point that model training is extremely expensive and affords very little moat. LLMs will continue to improve and gain adoption, but one or more companies will fall by the wayside regardless of their userbase. Google seems pretty clearly to be in pole position at this point as they have massive revenue, data, expertise and their own chips.
> AI is orders of magnitude more useful and transformative than Facebook was in 2005
This makes sense because Facebook was one year old in 2005 and OpenAI is 11 years old now. Eleven is just two ones so it’s basically the same thing as one so it is sensible to make that comparison
What is your use case where you see UI lag between vscode and sublime? Honestly, I feel zero difference between sublime/vscode/vi. Vscode arguably takes longer to boot up, but that only happens like once a day, so it's not a big deal.
I think this is a lot of "I don't like Typescript/Javascript for serious things" or "Electron sucks" posturing rather than an actual tangible difference.
If you don't feel these differences every keystroke, count yourself lucky to have slower perception or typing, rather than accusing folks of posturing.
Your brain processes (visual) information at a resolution of >= 80ms[1]. The idea that you can tell the difference between 10ms and 50ms of latency when typing is simply untrue (both events will appear instantaneous). I say this as someone that has played Counter-Strike professionally and has a sub-200ms reaction time. (Auditory perception is processed at a higher resolution, but the article is decidedly not about that.)
I can't tell exactly, but it kinda bothers me while working/typing.
It's not a huge latency, definitely nothing like an SSH connection.
To explain better: I usually have a pre-defined set of keystrokes I input, so it's not about the latency of a single keystroke, rather the compounding effect.
Another thing is that most of the LSPs, highlighting, etc. are visibly slower on vscode. I also have many plugins/extensions, so that is partly to blame.
In recent versions of vscode, they started supporting tree-sitter, which is quite nice in terms of performance.
We do, and the comparison is apt. We are the ones that hydrate the context. If you give an LLM something sensitive, don't be surprised if something bad happens. If you give an API access to run arbitrary SQL, don't be surprised if something bad happens.
No, that's not what's stopping SQL injection. What stops SQL injection is distinguishing between the parts of the statement that should be evaluated and the parts that should be merely used. There's no such capability with LLMs, therefore we can't stop prompt injections while allowing arbitrary input.
Everything in an LLM is "evaluated," so I'm not sure where the confusion comes from. We need to be careful when we use `eval()` and we need to be careful when we tell LLMs secrets. The Claude issue above is trivially solved by blocking the use of commands like curl or manually specifying which domains are allowed (if we're okay with curl).
The confusion comes from the fact that you're saying "it's easy to solve this particular case" and I'm saying "it's currently impossible to solve prompt injection for every case".
Since the original point was about solving all prompt injection vulnerabilities, it doesn't matter if we can solve this particular one, the point is wrong.
> Since the original point was about solving all prompt injection vulnerabilities...
All prompt injection vulnerabilities are solved by being careful with what you put in your prompt. You're basically saying "I know `eval` is very powerful, but sometimes people use it maliciously. I want to solve all `eval()` vulnerabilities" -- and to that, I say: be careful what you `eval()`. If you copy & paste random stuff in `eval()`, then you'll probably have a bad time, but I don't really see how that's `eval()`'s problem.
If you read the original post, it's about uploading a malicious file (from what's supposed to be a confidential directory) that has hidden prompt injection. To me, this is comparable to downloading a virus or being phished. (It's also likely illegal.)
The problem here is that the domain was allowed (Anthropic), but Anthropic doesn't check that the API key belongs to the user who started the session.
Essentially, it would be the same as if the attacker had used their own AWS API key and uploaded the file into an S3 bucket they control instead of the S3 bucket the user controls.
SQL injection is possible when input is interpreted as code. The protection - prepared statements - works by making it possible to interpret input as not-code, unconditionally, regardless of content.
Prompt injection is possible when input is interpreted as prompt. The protection would have to work by making it possible to interpret input as not-prompt, unconditionally, regardless of content. Currently LLMs don't have this capability - everything is a prompt to them, absolutely everything.
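To make the prepared-statement contrast concrete, here's a minimal Python/sqlite3 sketch (illustrative only; the table and hostile input are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

user_input = "x'); DROP TABLE users; --"  # hostile input

# The classic vulnerable pattern: input is spliced into the statement text,
# so it can become code.
# conn.execute("INSERT INTO users (name) VALUES ('%s')" % user_input)

# Parameterized: input is bound as data and never interpreted as SQL,
# regardless of its content.
conn.execute("INSERT INTO users (name) VALUES (?)", (user_input,))
print(conn.execute("SELECT name FROM users").fetchone())
```

There is no analogous binding step for LLM context today; everything handed to the model is, in effect, spliced into the prompt.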
Yeah but everyone involved in the LLM space is encouraging you to just slurp all your data into these things uncritically. So the comparison to eval would be everyone telling you to just eval everything for 10x productivity gains, and then when you get exploited those same people turn around and say “obviously you shouldn’t be putting everything into eval, skill issue!”
Yes, because the upside is so high. Exploits are uncommon, at this stage, so until we see companies destroyed or many lives ruined, people will accept the risk.
That's not fixing the bug, that's deleting features.
Users want the agent to be able to run curl to an arbitrary domain when they ask it to (directly or indirectly). They don't want the agent to do it when some external input maliciously tries to get the agent to do it.
Implementing an allowlist is pretty common practice for just about anything that accesses external stuff. Heck, Windows Firewall does it on every install. It's a bit of friction for a lot of security.
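As a rough sketch of what such an allowlist check could look like (hypothetical hosts and helper function, not any particular tool's actual mechanism):

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real tool would make this user-configurable.
ALLOWED_HOSTS = {"api.github.com", "pypi.org"}

def is_allowed(url: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    # Allow exact matches and subdomains of allowed hosts.
    return any(host == h or host.endswith("." + h) for h in ALLOWED_HOSTS)

print(is_allowed("https://api.github.com/repos/foo/bar"))       # True
print(is_allowed("https://attacker.example/exfil?data=secret")) # False
```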
But it's actually a tremendous amount of friction, because it's the difference between being able to let agents cook for hours at a time and constantly being blocked on human approvals.
And even then, I think it's probably impossible to prevent attacks that combine vectors in clever ways, leading to people incorrectly approving malicious actions.
It's also pretty common for people to want their tools to be able to access a lot of external stuff.
From Anthropic's page about this:
> If you've set up Claude in Chrome, Cowork can use it for browser-based tasks: reading web pages, filling forms, extracting data from sites that don't have APIs, and navigating across tabs.
That's a very casual way of saying, "if you set up this feature, you'll give this tool access to all of your private files and an unlimited ability to exfiltrate the data, so have fun with that."
Fully agree that pushing OSI is just posturing. After all, Amazon/Google/Facebook have made literal billions by commercializing open source software. I release stuff on MIT all the time (for things I'm okay with people poaching) but I'd argue the only "pure" OSS license is GPL, which comes with its own problems (and, as we all know, it infects everything it touches).
The problem with FSL is that it hasn't been tested in the courts yet (afaik), so it's a bit of a gamble to think it'll just "work" if some asshole does try to clone your repo and sell your work. Maybe it's a decent gamble for a funded startup with in-house counsel, but if you're just one guy, imo keep stuff you want to sell closed-source, it's not that big of a deal. We've been doing just that since the 70s.
Hacker News has become weirdly anti-hacker in the last 5 or so years, so please keep building stuff and keep posting it. This is literally what HN is supposed to be. The "AI slop" tirade is just bottom-of-the-barrel bandwagoning for upvotes because it's popular to hate AI today.
Thanks for the support. Honestly, I probably shouldn’t get so defensive either, it’s a bad habit and a pretty poor "evolutionary holdover" in the internet age of anonymity and social media.
I thought one way to help mitigate my emotional responses was to desensitize myself, but who really wants to expose themselves to the requisite sufficient threshold of personal attacks? That’s not exactly a fun callus to develop.
One of the counterintuitive aspects of the LLM boom is that agentic coding allows for more weird/unique projects that spark joy with less risk due to the increased efficiency. Nowadays, anything that's weird is considered AI slop and that's not even limited to software development.
No, "LLMs can only output what's in their training data" hasn't been true for awhile.
It’s the typical “engineer thinking they’re smarter than everyone else” trope. From my experience, engineers fall squarely in the middle of the bell curve. The AI hate is just used as justification, so I don’t even take it that seriously. And fwiw, as someone that played piano when I was younger, this is 100% a useful tool. In fact, during quarantine I was learning to play guitar and used tools like this to learn which string is which by ear.