More

aragonite · 2025-06-15T08:52:44 1749977564

aragonite · 2025-05-20T15:55:56 1747756556

When trying to understand complex C codebase I've often found it helpful to rename existing variable as emojis. This makes it much easier to track which variables are used where & to take in the pure structure of the code at one glance. An example I posted previously: https://imgur.com/F27ZNfk

Unfortunately most modern languages like Rust and JS follow the XID_Start/XID_Continue recommendation (not very well-motivated imo) which excludes all emoji characters from identifiers.

unstuck3958 · 2025-05-20T16:56:36 1747760196

wouldn't writing a parser of sorts that would replace emojis with a valid alphabetical string identifier be trivial?

aragonite · 2025-05-20T17:37:41 1747762661

You're right that writing a preprocessor would be straightforward. But while you're actively editing the code, your dev experience will still be bad: the editor will flag emoji identifiers as syntax errors so mass renaming & autocompletion won't work properly. Last time I looked into this in VSCode I got TypeScript to stop complaining about syntax errors by patching the identifier validation with something like `if (code>127) return true` (if non-ascii, consider valid) in isUnicodeIdentifierStart/isUnicodeIdentifierPart [1]. But then you'd also need to patch the transpiler to JS, formatters like Prettier, and any other tool in your workflow that embeds their own version of TypeScript...

[1] https://github.com/microsoft/TypeScript/blob/81c951894e93bdc...

aragonite · 2025-04-12T14:50:31 1744469431

There's a pretty amazing video here showing a Prince Rupert's drop defeaing a hydraulic press: https://www.youtube.com/shorts/ns7PQjjqHIo

dzdt · 2025-04-12T19:20:43 1744485643

Yeah its not really what it looks like though. They put cylinders of soft metal in place of where you would expect the press to have hardened steel.

aragonite · 2025-04-12T19:55:28 1744487728

Has this been confirmed? The original channel also posted a comparison video [1] showing what seems to be the same cylinders tested against titanium and tungsten cubes (though it's difficult to be sure they are identical)

There's also footage from another channel [2] showing a Prince Rupert's Drop bursting at 20 tons with significant damage to both the steel plate and the press.

[1] https://www.youtube.com/watch?v=4SuPFbeqqKU

[2] https://www.youtube.com/watch?v=A6NUNroyUys

KTibow · 2025-04-12T20:58:50 1744491530

Slowed down: https://youtu.be/tRUngxK2yXY

philjohn · 2025-04-12T21:00:22 1744491622

You can see the steel deforming - definitely soft steel.

aragonite · 2025-03-02T19:39:38 1740944378

> Because they do things. With their hands. That no one else does

That's only true of surgeons :) What if your specialty is nonsurgical (internal medicine, pediatrics, psychiatry, etc)?

downrightmike · 2025-03-02T19:53:48 1740945228

Not even true of all surgeons, the ones that make the most money use machines to work on things their hands couldn't do

skeeter2020 · 2025-03-02T23:48:55 1740959335

pathologists are some of the highest paid doctors and they are right in the crosshairs of what AI is getting better at performing.

rscho · 2025-03-03T00:45:47 1740962747

Do you really know what pathologists do ? Apparently not...

geodel · 2025-03-03T15:51:03 1741017063

IT engineers thought the same. Until finally automation is setting them right.

rscho · 2025-03-03T17:31:55 1741023115

I'm sure going to be amazed if the LLMs of the future 10y suddenly acquire the ability to physically cut just the right bit of a random surgical piece, with a precise idea of where, when and in what orientation the surgeon dug it out, all that with shitty documentation. Humans will be cheaper for a long time still.

rscho · 2025-03-02T19:55:36 1740945336

Haha. Have you actually ever seen a surgical robot yourself? Your claim is laughable. There is no automation whatsoever in any robot on the market currently.

downrightmike · 2025-03-02T19:56:17 1740945377

not automation, yet

rscho · 2025-03-02T19:46:36 1740944796

Almost all specialties do various technical procedures that only them really know how to do. The extreme is psychoanalytic psychiatry, which are the only ones really doing nothing with their hands (yes, interventional psychiatry is a thing...). Now, you could argue that 'yes, but most of the times it's done by techs/nurses'. Well, no. When things go south, and in all places where there is noone else to do the stuff (of which there are many) docs are on their own.

Regarding surgery, I expect it to be one of the easiest procedures to automate, actually (still quite hard, obviously). Because surgery is the only case where there's always advanced imaging available beforehand, and the environment is relatively fixed (OR).

menaerus · 2025-03-03T10:30:37 1740997837

Why do you think medical science wrt complexity is any different than applied math, which computer science essentially is? People already can use LLMs to assist them in diagnosing health issues so why would it be hard to believe that the doctors won't be using the same kind of assistance soon too?

rscho · 2025-03-03T15:25:30 1741015530

> Why do you think medical science wrt complexity is any different than applied math

I don't think I wrote that.

Doctors already use tech assistance. I just pointed out that while we've got efficient robots for applied math, we don't have those as agents in the physical world. People who do blue collar jobs are less replaceable. Well, believe it or not, but most doctors are actually blue collar workers.

menaerus · 2025-03-03T18:11:27 1741025487

You sort of implied that with your replies across the thread. And since AI already replaced part of the CS, I was wondering why do you think this would not be the case with doctors. I'm not sure I agree it's a blue collar profession. I can easily see diagnostics being replaced with AI models.

rscho · 2025-03-03T19:20:46 1741029646

I never wanted to imply that. But here, people frequently assume that because that's what they're used to. Diagnosis is the tip of the iceberg. Most people here aren't sick, so diagnostics are their only focus. If they get ill, they want a diagnosis. But many people are chronically ill already, and doctors spend most of their time treating, not diagnosing. Treating people is made in good part of technical procedures and practical assessments, and you need doctors for that because robots are still far behind for that kind of stuff. People actually have a completely skewed view of what a doctor is.

menaerus · 2025-03-04T10:17:45 1741083465

> People actually have a completely skewed view of what a doctor is.

It could be but treating patients also requires continuous diagnostics, result comprehension, and final assessment so this is certainly the part where AI could play the crucial role.

I don't think anyone thinking of the AI consequences on medicine is arguing that it will replace manual labor such as procedure executions or psychological support. This is obviously not possible so when I see people talking about the "AI in medicine" I read that as mostly complementing the existing work with new technology.

nurettin · 2025-03-03T04:14:59 1740975299

Psychiatrists do that triangle shape with their hands.

ghc · 2025-03-02T19:47:02 1740944822

Uh, pediatricians do a lot with their hands. I don't think my kids (or future grandkids) will be seeing an AI/robot doctor.

aragonite · 2025-02-16T18:28:34 1739730514

I did this very recently for a 19th century book in German with occasionally some Greek. The method that produces the highest level of accuracy I've found is to use ImageMagick to extract each page as a image, then send each image file to Claude Sonnet (encoded as base64) with a simple user prompt like "Transcribe the complete text from this image verbatim with no additional commentary or explanations". The whole thing is completed in under an hour & the result is near perfect and certainly much better than from standard OCR softwares.

cxr · 2025-02-17T03:13:55 1739762035

> a 19th century book

If you're dealing with public domain material, you can just upload to archive.org. They'll OCR the whole thing and make it available to you and everyone else. (If you got it from archive.org, check the sidebar for the existing OCR files.)

aragonite · 2025-02-17T03:56:06 1739764566

I did try the full text OCR from archive.org, but unfortunately the error rate is too high. Here are some screenshots to show what I mean:

- Original book image: https://imgur.com/a8KxGpY

- OCR from archive.org: https://imgur.com/VUtjiON

- Output from Claude: https://imgur.com/keUyhjR

cxr · 2025-02-17T15:03:16 1739804596

Ah, yeah, that's not uncommon. I was operating on an assumption, based on experience seeing language models make mistakes, that the two approaches would be within an acceptable range of each other for your texts, plus the idea that it's better to share the work than not.

Note if you're dealing with a work (or edition) that cannot otherwise be found on archive.org, though, then if you do upload it, you are permitted as the owner of that item to open up the OCRed version and edit it. So an alternative workflow might be better stated:

1. upload to archive.org

2. check the OCR results

3. correct a local copy by hand or use a language model to assist if the OCR error rate is too high

4. overwrite the autogenerated OCR results with the copy from step 3 in order to share with others

(For those unaware and wanting to go the collaborative route, there is also the Wikipedia-adjacent WMF project called Wikisource. It has the upside of being more open (at least in theory) than, say, a GitHub repo—since PRs are not required for others to get their changes integrated. One might find, however, it to be less open in practice, since it is inhabited by a fair few wikiassholes of the sort that folks will probably be familiar with from Wikipedia.)

joseda-hg · 2025-02-17T14:01:56 1739800916

Maybe I've just had back luck, but their OCR butchered some of the books I've tried to get

HarHarVeryFunny · 2025-02-16T19:00:21 1739732421

Is it really necessary to split it into pages? Not so bad if you automate it I suppose, but aren't there models that will accept a large PDF directly (I know Sonnet has a 32MB limit)?

7thpower · 2025-02-16T19:14:24 1739733264

They are limited on how much they can output and there is generally an inverse relationship between the amount of tokens you send vs quality after the first 20-30 thousand tokens.

smallnix · 2025-02-16T23:02:11 1739746931

Are there papers on this effect? That quality of responses diminishes with very large inputs I mean. I observed the same.

HarHarVeryFunny · 2025-02-16T23:53:10 1739749990

I think these models all "cheat" to some extent with their long context lengths.

The original transformer had dense attention where every token attends to every other token, and the computational cost therefore grew quadratically with increased context length. There are other attention patterns than can be used though, such as only attending to recent tokens (sliding window attention), or only having a few global tokens that attend to all the others, or even attending to random tokens, or using combinations of these (e.g. Google's "Big Bird" attention from their Elmo/Bert muppet era).

I don't know what types of attention the SOTA closed source models are using, and they may well be using different techniques, but it'd not be surprising if there was "less attention" to tokens far back in the context. It's not obvious why this would affect a task like doing page-by-page OCR on a long PDF though, since there it's only the most recent page that needs attending to.

Breza · 2025-02-22T01:31:34 1740187894

I've experienced this problem but I haven't come across papers about it. For this context, it would be interesting to compare the accuracy of transcribing one page at a time to batches of n pages.

smallnix · 2025-03-04T21:11:17 1741122677

Found one https://arxiv.org/abs/2404.06654

therealpygon · 2025-02-17T00:00:35 1739750435

Necessary? No. Better? Probably. Despite larger context windows, attention and hallucinations aren’t completely a thing of the past within the expanded context windows today. Splitting to individual pages likely helps ensure that you stay well within a normal context window size that seems to avoid most of these issues. Asking an LLM to maintain attention for a single page is much more achievable than an entire book.

Also, PDF size isn’t a relevant measurement of token lengths when it comes to PDFs which can range from a collection of high quality JPEG images to thousand(s) of pages of text

siva7 · 2025-02-16T19:41:47 1739734907

They all accept large PDFs (or any kind of input) but the quality of the output will suffer for various reasons.

ant6n · 2025-02-16T20:11:38 1739736698

I recently did some OCRing with OpenAI. I found o3-mini-hi to be imagining and changing text, whereas the older (?) o4 was more accurate. It’s a bit worrying that some of the models screw around with the text.

jazzyjackson · 2025-02-16T20:20:19 1739737219

There’s GPT4, then GPT4o (o for Omni, as in multi modal) and then GPT o1 (chain of thought / internal reasoning) then o3 (because o2 is a stadium in London that I guess is very litigious about its trademark?), o3-mini is the latest but yes optimized to be faster and cheaper

polshaw · 2025-02-16T21:16:28 1739740588

o2 is the UK's largest mobile network operator. They bought naming rights to what was known as the millennium dome (not even a stadium).

jazzyjackson · 2025-02-16T22:19:20 1739744360

Ahh makes sense :)

dotancohen · 2025-02-16T22:15:15 1739744115

What is the o3 model good for? Is it just an evolution of o1 (chain of thought / internal reasoning)?

KTibow · 2025-02-16T23:53:16 1739749996

Yes

(albeit I believe o3-mini isn't natively multimodal)

dotancohen · 2025-02-17T02:22:10 1739758930

I see, thank you.

ant6n · 2025-02-16T23:45:40 1739749540

Which one is the smartest, and most knowledgeable? (Like least likely to make up facts)

wrsh07 · 2025-02-17T03:46:20 1739763980

4o is going to be better for a straight up factual question

(But eg I asked it about something Martin Short / John Mulaney said on SNL and it needed 2 prompts to get the correct answer..... the first answer wasn't making anything up it was just reasonably misinterpreting something)

It also has web search which will be more accurate if the pages it reads are good (it uses bing search, so if possible provide your own links and forcibly enable web search)

Similarly the latest Anthropic Claude Sonnet model (it's the new Sonnet 3.5 as of ~Oct) is very good.

The idea behind o3 mini is that it only knows as much as 4o mini (the names suck, we know) but it will be able to consider its initial response and edit it if it doesn't meet the original prompt's criteria

hkonsti · 2025-02-17T08:55:45 1739782545

Do you have a rough estimate of what the price per page was for this?

aragonite · 2025-02-19T22:49:20 1740005360

It must have been under $3 for the 150 or so API calls, possibly even under $2, though I'm less sure about that.

woile · 2025-02-16T23:00:40 1739746840

What about preserving the style like titles and subtitles?

aragonite · 2025-02-17T00:56:17 1739753777

You can request Markdown output, which takes care of text styling like italics and bold. For sections and subsections, in my own case they already have numerical labels (like "3.1.4") so I didn't feel the need to add extra formatting to make them stand out. Incidentally, even if you don't specify markdown output, Claude (at least in my case) automatically uses proper Unicode superscript numbers (like ¹, ², ³) for footnotes, which I find very neat.

tmaly · 2025-02-19T21:21:57 1740000117

how big were the image files in terms of size/resolution that go you the level of accuracy you needed with Claude?

aragonite · 2025-02-19T22:46:01 1740005161

300dpi (`magick -density 300 book.pdf page_%03d.png` was the command I used). The PDF is a from archieve.org & a very high-quality scan (https://ia601307.us.archive.org/5/items/derlgnertheori00rsuo...)

aragonite · 2025-02-09T14:09:26 1739110166

I'm pretty sure they are from this collection:

https://en.wikipedia.org/wiki/Franz_Kafka:_The_Office_Writin...

(Also I believe these letters are 'anonymous' not in the sense that their authorship is unknown, but in that they were published under the institutional name of the Insurance Institute where he was employed, not under his personal name.)

aragonite · 2025-02-06T16:21:31 1738858891

At least you can't trivially circumvent the redaction just by copy and paste which was the case for lots of PDFs from the Obama years, such as this one:

https://obamawhitehouse.archives.gov/sites/default/files/mic...

aragonite · 2025-02-04T01:17:03 1738631823

The ranking member (senior Democrat) on the intelligence committee voted against the ban (https://www.ctinsider.com/columnist/article/tiktok-ban-jim-h...):

> One key for him is that it’s only a possible threat. Our best intelligence, including in a briefing for Congress from the Biden administration Tuesday, is that the Chinese government has not actually done the things the ban fears.

Also, from a recent All-In interview (https://www.happyscribe.com/public/all-in-with-chamath-jason...):

> I look to Jim Himes, who is the senior Democrat on the Intelligence Committee, the ranking member. He's in what's called the Gang of Eight. He has the most exquisite access to intelligence. Jim voted against the ban. And I thought, you know what? If this guy is not seeing anything on the national security level.

> [00:44:36] There was an off the record or confidential briefing to the House Intelligence Committee. You think in that meeting, there was nothing that was very meaningful that was disclosed about TikTok?

> [00:44:45] Nothing that I had seen. Is it owned by the Chinese government? Absolutely. But is there a national security risk? I have not seen that.

gruez · 2025-02-04T01:28:13 1738632493

>The ranking member (senior Democrat) on the intelligence committee voted against the ban (https://www.ctinsider.com/columnist/article/tiktok-ban-jim-h...):

But he voted yes on the subsequent vote?

https://clerk.house.gov/Votes/2024145

Moreover your factoid is misleading because it omits the fact that the chair voted for the ban, along with most of the members. Of the 25 members on the committee, only 4 voted against.

aragonite · 2025-02-04T01:35:32 1738632932

He voted against the March bill that narrowly focuses on the TikTok issue, but voted for the April bill which also includes foreign aid for Ukraine and Israel. They are not the same bill.

> The decision by House Republicans to include TikTok as part of a larger foreign aid package, a priority for President Joe Biden with broad congressional support for Ukraine and Israel, fast-tracked the ban after an earlier version had stalled in the Senate. A standalone bill with a shorter, six-month selling deadline passed the House in March by an overwhelming bipartisan vote as both Democrats and Republicans voiced national security concerns about the app’s owner, the Chinese technology firm ByteDance Ltd. (https://www.pbs.org/newshour/politics/possible-u-s-tiktok-ba...)

orwin · 2025-02-04T08:45:02 1738658702

Putting shit like that in an unrelated bill can be translated as a 'horseman bill' and would have been censored so fast in my country it isn't even funny. What does your supreme court do exactly?

jashmatthews · 2025-02-04T03:02:39 1738638159

Intelligence is kinda unnecessary at this point. A Tiktok VP already admitted to a UK parliament select committee that they used to harshly moderate anything the Chinese government found sensitive

Q90 - John Nicolson: It may happen elsewhere, and I can tell you what your official TikTok response was to this leak. You did not deny that these were instructions. In fact, you confirmed that these were instructions, but what you said was that the company had changed its policy in May 2019. Previously, you instructed your moderators to take down videos critical of China, specifically talking about incidents in Tiananmen Square, separatism in Tibet, all straight out of the Chinese Communist Party playbook. You confirmed that is what your moderators did, but your defence was that you had changed your policy in May 2019.

Theo Bertram: It is highly regrettable that that is what it was, but it is not our policy today, nor has it been for a long time.

https://committees.parliament.uk/writtenevidence/13247/html/

aragonite · 2025-01-10T06:00:01 1736488801

> Google used to watermark internal emails using non-visible Unicode. It would catch people copy-pasting things to the press.

How does that even work? I assume the unique identifiers are generated along the lines of https://zws.im but do they send a different version of the same email to each unique recipient? Or does the watermark get inserted by some email client when copying text?

dbmnt · 2025-01-10T06:09:41 1736489381

It was the former, a unique ID per recipient.

aragonite · 2024-12-25T06:14:35 1735107275

Most AI-generated images I've come across (I have in mind Substacks, Medium posts and other personal blogs) are purely "decorative" & having the same (lack of) purpose as stock photography - e.g. a generic photo of people exercising in an article about fitness. In informational articles I find these pointless & also maybe somewhat in poor taste. But then I'd probably feel the same way about regular stock photography.

There may be potential for AI-generated explanatory visuals though. High-quality diagrams, graphs, map of complex conceptual relationships and so on would be exciting.

coffeebeqn · 2024-12-25T06:55:32 1735109732

That would need to be a whole new model that’s not based on pixels but these nodes and edges. All AI generated maps or diagrams look absolutely unhinged

john_the_writer · 2024-12-25T09:44:12 1735119852

The difference between stock and AI would be how they came to be "free". Stock comes from people posing. AI comes from stealing the peoples poses.