Yet you are unavoidably eating microplastics too, which have been linked to adverse cardiovascular (CV) events.
Also:
- If you are eating more fish (as opposed to eating meat), you are likely consuming more mercury.
- If you are eating more fresh veggies you are probably ingesting more pesticides.
- If you are eating dark chocolate for its health benefits, you are also ingesting cadmium and other heavy metals.
So all of the above should be done in moderation. Even things that seem like an unalloyed good can be dangerous. A burst of exercise beyond your conditioning can lead to a CV event. Too much water can be poisonous. Some people get constipation from too many veggies in their diet.
For example, instead of sticking to a narrow, faddish, supposedly healthy diet, you can enjoy a wide range of foods, which makes it more likely you are getting all the nutrients that will do you good (of course clearly unhealthy food should be avoided).
The body is more complex than we can ever know. There are some general principles for good health (including CV health) that should be followed, but to me it is clear that good health does not arise from a slavish devotion to a very detailed set of rules.
Funnily enough, I've heard that one reason obesity is so prevalent is that we have too many varieties of food. It seems our hunger regulation suspends satiety when we eat too much of one food, but when we eat a little of lots of different foods, that mechanism breaks down.
It'd be funny if lots of fad diets actually work because people are forced to eat a single type of food, and that alone is enough for satiety to kick in.
Your post sounds like "bad things can happen so why bother". Having a good diet isn't "slavish devotion", it's more like "don't eat something obviously terrible"
Nothing will really work when the models fail at the most basic of reasoning challenges.
I've had models do the complete opposite of what I've put in the plan and guidelines. I've had them go re-read the exact sentences, and still see them come to the opposite conclusion, and my instructions are nothing complex at all.
I used to think one could build a workflow and process around LLMs that extract good value from them consistently, but I'm now not so sure.
I notice that sometimes the model will be in a good state, and do a long chain of edits of good quality. The problem is, it's still a crap-shoot how to get them into a good state.
In my experience this was an issue 6-8 months ago. Ever since Sonnet 4 I haven’t had any issues with instruction following.
Biggest step-change has been being able to one-shot file refactors (using the planning framework I mentioned above). 6 months ago refactoring was a very delicate dance and now it feels like it’s pretty much streamlined.
I recently ran into two baffling, what felt like GPT-3.5-era, completely backwards misinterpretations of an unambiguous sentence, once each in Codex and CC/Sonnet, a few days apart and in completely different scenarios (both very early in the context window). And to be fair, they were notable partially as an "exception that proves the rule" where it was surprising to see, but OP's example can definitely still happen in my experience.
I was prepared to go back to my original message and spot an obvious-in-hindsight grey area/phrasing issue on my part as the root cause but there was nothing in the request itself that was unclear or problematic, nor was it buried deep within a laundry list of individual requests in a single message. Of course, the CLI agents did all sorts of scanning through the codebase/self debate/etc in between the request and the first code output. I'm used to how modern models/agents get tripped up by now so this was an unusually clear cut failure to encounter from the latest large commercial reasoning models.
In both instances, literally just restating the exact same request with "No, the request was: [original wording]" was all it took to steer them back, and it didn't become a concerning pattern. But with the unpredictability of how the CLI agents decide to traverse a repo and ingest large amounts of distracting code/docs, it seems much too overconfident to believe that random, bizarre LLM "reasoning" failures won't still occur from time to time in regular usage, even as models improve, given their inherent limitations.
(If I were bending over backwards to be charitable/anthropomorphize, it would be the human failure mode of "I understood exactly what I was asked for and what I needed to do, but then somehow did the exact opposite, haha oops brain fart!" but personally I'm not willing to extend that much forgiveness/tolerance to a failure from a commercial tool I pay for...)
It's complicated. Firstly, don't love that this happens. But the fact you're not willing to provide tolerance to a commercial tool that costs maybe a few hundred bucks a month but are willing to do so for a human who probably costs thousands of bucks a month is revealing of a double standard we're all navigating.
It's like the fallout when a Waymo kills a "beloved neighborhood cat". I'm not against cats, and I'm deeply saddened at the loss of any life, but if it's true that, comparable mile for mile, Waymos reduce deaths and injuries, that is a good thing - even if they don't reduce them to zero.
And to be clear, I often feel the same way - but I am wondering why and whether it's appropriate!
For me I was just pointing out some interesting and noteworthy failure modes.
And it matters. If the models struggle sometimes with basic instruction following, they can quite possibly make insidious mistakes in large complex tasks that you might not have the wherewithal or time to review.
The thing about good abstractions is that you should be able to trust them in a composable way. The simpler or more low-level the building blocks, the more reliable you should expect them to be. With LLMs you can't really make this assumption.
I mean, we typically architect systems depending on humans around an assumption of human fallibility. But when it comes to automation, randomly still doing the exact opposite even if somewhat rare is problematic and limits where and at what scale it can be safely deployed without needing ongoing human supervision.
For a coding tool it's not as problematic, as hopefully you vet the output to some degree, but it still means I don't feel comfortable using them as expansively (like the mythical personal assistant doing my banking and replying to emails, etc.) as they might otherwise be used with more predictable failure modes.
I’m perfectly comfortable with Waymo on the other hand, but that would probably change if I knew they were driven by even the newest and fanciest LLMs as [toddler identified | action: avoid toddler] -> turns towards toddler is a fundamentally different sort of problem.
I'm curious in what kinds of situations you are seeing the model consistently do the opposite of your intention where the instructions were not complex. Do you have any examples?
Mostly Gemini 3 Pro. When I ask it to investigate a bug and provide options for fixing it (I do this mostly so I can see whether the model loaded the right context for large tasks), Gemini immediately starts fixing things, and I just can't trust it.
Codex and Claude give a nice report, and if I see they're not considering this or that, I can tell them.
But why is it a big issue? If it does something bad, just reset the worktree and try again with a different model/agent. They are dirt cheap at $20/month, and I have 4 subscriptions (Claude, Codex, Cursor, Zed).
Same, I have multiple subscriptions and layer them. I use Haiku to plan and send a queue of tasks to Codex and Gemini, whose command lines can be scripted.
The issue to me is that I have no idea what the code looks like, so I have to have a reliable first-layer model that can summarize the current codebase state, so I can decide whether the next mutation moves the project forward or reduces technical debt. I can delegate much more that way, while Gemini's "do first" approach tends to result in many dead ends that I have to unravel.
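For what it's worth, the "command lines can be scripted" part can be as small as a driver script. The sketch below is my own illustration of the shape of that workflow, not the setup described above; the CLI invocations (`claude -p`, `codex exec`, `gemini -p`) and the assumption that the planner returns clean JSON are mine and may not match your tool versions.

```python
import json
import subprocess

def run(cmd: list[str]) -> str:
    # Run a CLI agent in non-interactive mode and capture its stdout.
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# 1. Ask the cheap "planner" model for a machine-readable task list.
#    Command and flags are assumptions; substitute whatever planner CLI you use.
plan = run(["claude", "-p",
            "Break this refactor into a JSON array of short task descriptions: ..."])
tasks = json.loads(plan)  # assumes the planner actually returned clean JSON

# 2. Queue each task to a coding agent, alternating between subscriptions.
#    "codex exec" and "gemini -p" are assumed to be the non-interactive modes;
#    exact flags may differ between versions.
for i, task in enumerate(tasks):
    agent = ["codex", "exec"] if i % 2 == 0 else ["gemini", "-p"]
    print(run(agent + [f"Implement exactly this task, then stop: {task}"]))
```

The useful part is the separation of roles: the cheap model plans and summarizes, while the heavier agents only ever see one small, bounded task at a time.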
The issue is that if it's struggling sometimes with basic instruction following, it's likely to be making insidious mistakes in large complex tasks that you might not have the wherewithal or time to review.
The thing about good abstractions is that you should be able to trust them in a composable way. The simpler or more low-level the building blocks, the more reliable you should expect them to be. With LLMs you can't really make this assumption.
I'm not sure you can make that assumption even when a human wrote that code. LLMs are competing with humans not with some abstraction.
> The issue is that if it's struggling sometimes with basic instruction following, it's likely to be making insidious mistakes in large complex tasks that you might not have the wherewithal or time to review.
Yes, that's why we review all code even when written by humans.
Yes, the move to 48 countries is driven by greed, but it kind of makes the tournament unwieldy. The previous 32-country format was, I think, optimal, though some would argue that even that was too many.
Currently they are just milking the spectacle. Maybe in the future all countries will be allowed in (no playoffs), and together with sky-high ticket prices, this will ensure the maximum payoff.
Of course it was; but it's also a valid observation that the capitalist aspect has jumped the shark. Qatar was my shark, but I was also dismayed by the South Africa finals, when the organizers banned private street food vendors near the venues, because they supposedly competed with Official World Cup Sponsors. Those vendors had always done business in those locations. More to the point of this story, tickets to live events in general have become exclusionary to anyone who is not wealthy, and now those live events also have become difficult to find on the air.
I don't know. I actually find it harder and more stressful to write code in a way that does not meet a certain quality level. It requires me to actually think more.
It's kind of weird, but I have tried over the years to develop a do-just-what-is-necessary-now mindset in my software engineering work, and I just can't make my mind work that way.
For me, doing things right is a way to avoid having to hold too much context in my head while working on my projects. I know the idiomatic way to do something, and if I just do it that way, then when I come back to it I know how it should be, and is, architected.
That's pretty obvious to anyone who has had to maintain a high-traffic site. This is just the tip of the iceberg (I haven't even included legal and other issues):
1.1 Strong protection against account takeover
Email change is one of the most abused recovery vectors in account takeover (ATO). Eliminating email changes removes:
- Social-engineering attacks on support
- SIM-swap → email-change chains
- Phished session → email swap → lockout of real user
Attacker must compromise the original inbox permanently, which is much harder.
1.2 No “high-risk” flows
Email change flows are among the highest-risk product flows:
- Dual confirmation emails
- Cooldown periods
- Rollback windows
- Manual reviews
Fixed email removes an entire class of security-critical code paths.
1.3 Fewer recovery attack surfaces
No need for:
- “I lost access to my email” flows
- Identity verification uploads
- Support-driven ownership disputes
Every recovery mechanism is an attack surface; removing them reduces risk.
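To make the "high-risk flows" point concrete, here is a minimal sketch, purely illustrative and not taken from the comment above, of the dual-confirmation plus cooldown machinery a site needs once it does allow email changes. All names (EmailChangeRequest, COOLDOWN, etc.) are hypothetical.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta

COOLDOWN = timedelta(days=7)  # rollback window before the change becomes final

@dataclass
class EmailChangeRequest:
    user_id: str
    old_email: str
    new_email: str
    old_token: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    new_token: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    old_confirmed: bool = False
    new_confirmed: bool = False
    requested_at: datetime = field(default_factory=datetime.utcnow)

    def confirm(self, token: str) -> None:
        # Both inboxes must confirm: the old one proves the legitimate owner
        # agreed, the new one proves the new address is reachable.
        if secrets.compare_digest(token, self.old_token):
            self.old_confirmed = True
        elif secrets.compare_digest(token, self.new_token):
            self.new_confirmed = True

    def can_finalize(self, now: datetime) -> bool:
        # Only apply the change after both confirmations and the cooldown,
        # so a change made from a phished session can still be rolled back.
        return (self.old_confirmed and self.new_confirmed
                and now - self.requested_at >= COOLDOWN)
```

Every branch of that flow (expired tokens, one-sided confirmation, rollback during the cooldown, support overrides) is something security and support have to reason about; pinning the email deletes all of it.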
You're very wrong, because account takeover can still happen due to a compromised email account. People can and do permanently lose access to their email account to a third party.
Having worked in security on a fairly high-profile, highly visible, widely used product, I can say one of the fundamental decisions that paid off very well was intentionally including mechanisms to prevent issues at other businesses (like Google) from impacting what users could do with ours.
Not having email change functionality would have been a huge usability, security, and customer service nightmare for us.
Regardless of anything else, not enabling users to change their email address effectively binds them to business with a single organization. It also ignores the fact that people can and do change emails for entirely opaque reasons from the banal to the authentically emergent.
ATO attacks are a fig leaf for such concerns, because you, as an organization, always have the power to revert a change to contact information. You just need to establish a process. It takes some consideration and some tabletop exercises, but it's not rocket science for a competent team.
What logical fallacy, exactly? I think you're perhaps misunderstanding the conversation. This translates just fine to your proposed analogy.
In your analogy, the claim would be that some online account is tied to a laptop and whoever possesses the laptop has access to that account. The online service does not permit the account owner to revoke access from that laptop and move the account to a different laptop. I stand by my statement that this would be a serious security hazard. Because yes, laptops can and do get hacked or stolen, just like email addresses.
Where your analogy isn't quite as strong is that at least you can generally add additional anti-theft protections such as full-disk encryption to a laptop, while with an email account generally 2FA is the best you can do.
> Attacker must compromise the original inbox permanently, which is much harder
This may need further analysis. I'd guess that a significant fraction of the people that want to change the email address that identifies them to a service want to do so because they have a new email address that they are switching to.
Many of those will be people who lose access to the old email address after switching. For example people who were using an email address at their ISP's domain who are switching ISPs, or people who use paid email hosting without a custom domain and are switching to a different email provider.
A new customer of that old provider might then be able to get that old address. You'd think providers would obviously never allow addresses used by former customers to be reused, but nope, some do. Even some that you'd expect to not do so, such as mailbox.org [1] and fastmail.com, allow addresses to be recycled.
The funny thing is that if you ask Claude if you should use email address as a primary key it will pretty adamantly warn you away from it:
> I'd recommend against using email as the primary key for a large LLM chat website. Here's why:
> Problems with email as primary key:
> 1. Emails change - Users often want to update their email addresses. With email as PK, you'd need to cascade updates across all related tables (chat sessions, messages, settings, etc.), which is expensive and error-prone
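To illustrate the advice it's giving, here is a minimal sketch of the usual alternative: a surrogate key as the identity, with the email as a mutable unique attribute. The table and column names are assumed for illustration, not taken from any real schema.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id TEXT PRIMARY KEY,             -- surrogate key (UUID), never changes
    email TEXT NOT NULL UNIQUE       -- mutable attribute, not an identifier
);
CREATE TABLE chat_sessions (
    id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL REFERENCES users(id)  -- FK to the stable id, not the email
);
""")

user_id = str(uuid.uuid4())
conn.execute("INSERT INTO users (id, email) VALUES (?, ?)", (user_id, "old@example.com"))
conn.execute("INSERT INTO chat_sessions (id, user_id) VALUES (?, ?)",
             (str(uuid.uuid4()), user_id))

# Changing the email touches exactly one row; nothing in chat_sessions,
# messages, settings, etc. has to be rewritten.
conn.execute("UPDATE users SET email = ? WHERE id = ?", ("new@example.com", user_id))
conn.commit()
```

The same shape works with bigserial ids or UUIDv7; the point is only that identity and contact address are separate concerns, so the address can change (or be frozen, per the argument above) without touching any keys.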
Maybe if they can pinpoint its whereabouts at a specific time when it's not heavily guarded, they can send a team to snatch it with minimal casualties.