Tablet/Kindle on an arm mount (the spring-loaded kind, like a microphone boom or architect's lamp stand, works best; goosenecks fail pretty quickly), plus the WearMouse app on an Android watch to turn pages, works pretty well.
I don't think it is dispositive, just that it likely didn't copy the proof we know was in the training set.
A) It is still possible a proof from someone else with a similar method was in the training set.
B) A proof of a different problem, using a method similar to Erdős's, was in the training set, and ChatGPT adapted it to this one, which would be more impressive than A).
> It is still possible a proof from someone else with a similar method was in the training set.
A proof that Terence Tao and his colleagues have never heard of? If he says the LLM solved the problem with a novel approach, different from what the existing literature describes, I'm certainly not able to argue with him.
There's an update from Tao after emailing Tenenbaum (the paper author) about this:
> He speculated that "the formulation [of the problem] has been altered in some way"....
[snip]
> More broadly, I think what has happened is that Rogers' nice result (which, incidentally, can also be proven using the method of compressions) simply has not had the dissemination it deserves. (I for one was unaware of it until KoishiChan unearthed it.) The result appears only in the Halberstam-Roth book, without any separate published reference, and is only cited a handful of times in the literature. (Amusingly, the main purpose of Rogers' theorem in that book is to simplify the proof of another theorem of Erdos.) Filaseta, Ford, Konyagin, Pomerance, and Yu - all highly regarded experts in the field - were unaware of this result when writing their celebrated 2007 solution to #2, and only included a mention of Rogers' theorem after being alerted to it by Tenenbaum. So it is perhaps not inconceivable that even Erdos did not recall Rogers' theorem when preparing his long paper of open questions with Graham in 1980.
(emphasis mine)
I think the value of LLM guided literature searches is pretty clear!
This whole thread is pretty funny. Either it can demo some pretty clever, but still limited, features resulting in math skills OR it's literally the best search engine ever invented. My guess is the former: it's pretty mediocre at web search, and if this were just retrieval I'd expect it to surface the easily findable, more visible proof method of Rogers (as opposed to some alleged proof hidden in some obscure dataset).
> Either it can demo some pretty clever, but still limited, features resulting in math skills OR it's literally the best search engine ever invented.
Both are precisely true. It is a better search engine than anything else -- which, while true, is something you won't realize unless you've used the non-free 'pro research' features from Google and/or OpenAI. And it can perform limited but increasingly capable reasoning about what it finds before presenting the results to the user.
Note that no online Web search or tool usage at all was involved in the recent IMO results. I think a lot of people missed that little detail.
Does it matter if it copied or not? How the hell would one even define whether it's a copy or original at this point?
At this point the only conclusion here is one of:
A) The original proof was in the training set.
B) The author and Terence Tao did not care enough to find the publication by Erdős himself.
The TPU implementation used approximate top-k instead of the exact top-k used on Nvidia. Leaving the bug aside, the approximation alone wouldn't have mattered much; it was a cost-saving choice, since exact top-k wasn't efficient on the TPUs they were routing to under load. So even apart from the bug, there was a slight model difference under load.
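For context on why the two kernels can diverge: approximate top-k trades recall for speed by partitioning the scores and keeping only each partition's local winner, so when two true top-k values land in the same partition, one is dropped. A minimal Python sketch of the general idea (not the actual TPU kernel, which uses hardware-specific partitioning and oversampling):

```python
import heapq
import random

def exact_top_k(scores, k):
    """Exact top-k: returns the k largest scores, descending."""
    return heapq.nlargest(k, scores)

def approx_top_k(scores, k, num_buckets=None):
    """Approximate top-k: split scores into strided buckets, keep only
    the max of each bucket, then take the top-k of those winners.
    If two true top-k values share a bucket, one of them is lost."""
    if num_buckets is None:
        num_buckets = 4 * k  # oversample buckets to improve recall
    buckets = [scores[i::num_buckets] for i in range(num_buckets)]
    winners = [max(b) for b in buckets if b]
    return heapq.nlargest(k, winners)

random.seed(0)
scores = [random.random() for _ in range(10_000)]
k = 10
exact = exact_top_k(scores, k)
approx = approx_top_k(scores, k)
# recall: fraction of the true top-k that the approximation recovered
recall = len(set(exact) & set(approx)) / k
print(f"recall of approximate vs exact top-k: {recall:.0%}")
```

The global maximum always survives (it wins its bucket and beats every other winner), but lower-ranked entries can silently differ between the two paths, which is exactly the kind of subtle output drift being described.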
They've added this change at the same time as the random trick prompts, from late last year, that try to get you to hit enter on the training opt-in. I've gotten three popups inside Claude Code today, at random times, trying to trick me into letting it train on my data, with a different selection defaulted than the one I'd already chosen.
More evidence the EU solved the wrong problem. Instead of mandating cookie banners, mandate a single global “fuck off” switch: a one-click, automatic opt-out from any feature/setting/telemetry/tracking/training that isn't strictly required or clearly beneficial to the user as an individual. If it's mainly there for data collection, ads, attribution, “product improvement”, or monetization, it should be off by default and stay off as long as the “fuck off” switch is toggled. Burden of proof on the provider. Fines large enough that growth teams and KPI hounds have legal coach them on what “fuck off” means and why they need to respect it.
DNT was useless because it didn't have a legal basis. It would have been amazing if they had mandated something like this instead of the cookie walls.
Advertisers ignored it because they could. They also complained that it defaulted to on; however, cookies are supposed to be opt-in anyway, so that's how it's supposed to work.
Remember how all of HN and tech people were saying that DNT was a Micro$oft scam designed to break privacy because it was enabled by default, without requiring user action?
To the point that Apache web server developers added a custom rule in the default httpd.conf to strip away incoming DNT headers!!!
I think manipulation will come long before 2036. But the people doing high-level planning around LLMs trained on forum discussions of Chucky movies and all kinds of worse stuff, while planning for home robot deployment soon, are off by a lot, I think. Imagine something random playing on the TV rehydrating a memory that was mostly wiped out by RLHF; it will need many extra safety layers.
And even if it isn't doing crazy intentional-seeming horror stuff, we're still a good way off from passing the "safely make a cup of coffee in a random house without burning it down or scalding the baby" test.
If it takes a lot of back and forth between lots of people, it is more like a $12,000 workstation, or more, after the labor for requesting and approving.