Yeah, that is one of my main uses for AI: getting the build stuff and scripts out of the way so that I can focus on the application code. That and brainstorming.
In both cases, it works because I can mostly detect when the output is bullshit. I'm just a little bit scared, though, that it will stop working if I rely too much on it, because I might lose the brain muscles I need to detect said bullshit.
I'm super interested to know how juniors get along. I have dealt with build systems for decades, and half the time it's just using Google or Stack Overflow to get past something quickly, or manually troubleshooting deps. Now I automate that entirely. And for code, I know what's good or not; I check its output and have it redo anything that doesn't pass my known standards. It makes using it so much easier. The article is so on point.
> have it learn your conventions, pull in best practices
What do you mean by "have it learn your conventions"? Is there a way to somehow automatically extract your conventions and store them in CLAUDE.md?
> For example, we have a custom UI library, and Claude Code has a skill that explains exactly how to use it. Same for how we write Storybooks, how we structure APIs, and basically how we want everything done in our repo. So when it generates code, it already matches our patterns and standards out of the box.
Did you have to develop these skills yourself? How much work was that? Do you have public examples somewhere?
> What do you mean by "have it learn your conventions"?
I'll give you an example: I use ruff to format my Python code, which has an opinionated way of formatting certain things. After an initial formatting, Opus 4.5, without prompting, will write code in this same style, so that the ruff formatter almost never has anything to do on new commits. Sonnet 4.5 is actually pretty good at this too.
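To make that concrete, here's the sort of shape it converges on (a made-up snippet, not from my codebase). Ruff's formatter is black-compatible: double quotes, and a magic trailing comma that keeps the parameter list exploded one-per-line:

    # Made-up example of the style ruff's (black-compatible) formatter
    # emits: double quotes, trailing comma keeping parameters one-per-line.
    def build_query(
        limit: int = 100,
        *,
        include_inactive: bool = False,
    ) -> dict[str, object]:
        return {
            "limit": limit,
            "include_inactive": include_inactive,
        }

    print(build_query(limit=10))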
Isn't this a meaningless example? Formatters already exist. Generating code that doesn't need to be formatted is exactly the same as generating code and then formatting it.
I care about the norms in my codebase that can't be automatically enforced by machine. How is state managed? How are end-to-end tests written to minimize change detectors? When is it appropriate to log something?
We have some tests in "GIVEN WHEN THEN" style, and others in other styles. Opus will try to match each style of testing by the project it is in by reading adjacent tests.
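For example, the GIVEN/WHEN/THEN ones look roughly like this (an illustrative pytest example with a made-up domain, not an actual test from one of our projects):

    import pytest

    class Account:
        def __init__(self, balance: int) -> None:
            self.balance = balance

        def withdraw(self, amount: int) -> None:
            if amount > self.balance:
                raise ValueError("insufficient funds")
            self.balance -= amount

    def test_withdrawal_rejected_when_balance_insufficient():
        # GIVEN an account holding 50
        account = Account(balance=50)
        # WHEN a withdrawal of 100 is attempted
        # THEN it is rejected and the balance is unchanged
        with pytest.raises(ValueError):
            account.withdraw(100)
        assert account.balance == 50

Opus picks up on exactly this kind of structure from the neighboring test files.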
The one caveat with this is that in messy codebases it will perpetuate bad things unless you're specific about what you want. Then again, human developers will often do the same, and are much harder to force to follow new conventions.
But I think it should be doable. You can tell it how YOU want the state to be managed and then have it write a custom "linter" that makes the check deterministic. I haven't tried this myself, but Claude did create some custom clippy lints in Rust when I wanted to enforce something that isn't automatically enforced by anything out there.
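As a sketch of what such a deterministic check could look like (untested, and the "no bare print() in library code" rule is just a hypothetical stand-in for whatever convention you actually care about):

    # Hypothetical convention check: ban bare print() calls under src/,
    # forcing everything through the logging module instead.
    import ast
    import sys
    from pathlib import Path

    def find_print_calls(source: str, filename: str) -> list[str]:
        violations = []
        for node in ast.walk(ast.parse(source, filename=filename)):
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "print"
            ):
                violations.append(f"{filename}:{node.lineno}: use logging, not print()")
        return violations

    if __name__ == "__main__":
        problems: list[str] = []
        for path in Path("src").rglob("*.py"):
            problems += find_print_calls(path.read_text(), str(path))
        for problem in problems:
            print(problem, file=sys.stderr)
        sys.exit(1 if problems else 0)

Hook something like that into CI or a pre-commit hook and the convention stops depending on anyone's memory.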
Lints are typically well suited for syntactic properties or some local semantic properties. Almost all interesting challenges in software design and evolution involve nonlocal semantic properties.
Since starting to use Opus 4.5, I've reduced the instructions in CLAUDE.md and just ask Claude to look at the codebase to understand the patterns already in use. Going from prompts/docs to having the code be the "truth". Show, don't tell. I've found this pattern has made a huge leap with Opus 4.5.
"Model your application's behavior first, as data, and derive everything else automatically. Ash resources center around actions that represent domain logic."
I feel like I've been doing this since Sonnet 3.5 or Sonnet 4. I'll clone projects/modules/whatever into the working directory and tell claude to check it out. Voila, now it knows your standards and conventions.
When I ask Claude to do something, it independently, without me even asking or instructing it to, searches the codebase to understand what the convention is.
I’ve even found it searching node_modules to find the API of non-public libraries.
If they're using Opus then it'll be the $100/month Claude Max 5x plan (could be the more expensive 20x plan depending on how intensive their use is). It does consume a lot of tokens, but I've been using the $100/mo plan and get a lot done without hitting limits. It helps to be mindful of context (regularly amending/pruning your CLAUDE.md instructions, clearing context between tasks, sizing your tasks to stay within the Opus context window). Claude Code plans have token limits that work in 5-hour blocks (that start when you send your first token, so it's often useful to prime it as early in the morning as possible).
Claude Code will spawn sub-agents (which often use the cheap Haiku model) for exploration and planning tasks, with only the results imported into the main context.
I've found the best results come from a more interactive collaboration with Claude Code. As long as you describe the problem clearly, it does a good job on small-to-moderate tasks. I generally give two instances of Claude Code separate tasks and run them concurrently (unlike handing a task to a colleague, the interaction with Claude Code distracts me too much to do my own independent coding at the same time, though I do work on architecture/planning tasks in parallel).
The one matter of taste that I have had to compromise on is the sheer amount of code: it likes to write a lot of code. I have a better experience if I sweat the low-level code less and just periodically have it clean up areas where I think it has written too much or overly repetitive code.
As you give it more freedom it's more prone to failure (and can often get itself stuck in a fruitless spiral); however, as you use it more you get a sense of what it can do independently and what it's likely to choke on. A codebase with good human-designed unit & Playwright tests is very good.
Crucially, you get the best results where your tasks are complex but on the menial side of the spectrum - it can pay attention to a lot of details, but on the whole don't expect it to do great on senior-level tasks.
To give you an idea, in a little over a month "npx ccusage" shows that via my Claude Code 5x sub I've used 5M input tokens, 1.5M output, 121M Cache Create, 1.7B Cache Read. Estimated pay-as-you-go API cost equivalent is $1500 (N.B. for the tail end of December they doubled everybody's API limits, so I was using a lot more tokens on more experimental on-the-fly tool construction work)
FYI Opus is available and pretty usable in Claude Code on the $20/mo plan if you are at all judicious.
I exclusively use opus for architecture / speccing, and then mostly Sonnet and occasionally Haiku to write the code. If my usage has been light and the code isn't too straightforward, I'll have Opus write code as well.
The problem with current approaches is the lack of feedback loops with independent validators that never lose track of the acceptance criteria. That's the next level that will truly allow no-babysitting implementations that are feature complete and production grade. Check out this repo that offers that: https://github.com/covibes/zeroshot/
That's helpful to know, thanks! I gave Max 5x a go and didn't look back. My suspicion is that Opus 4.5 is subsidised, so good to know there's flexibility if prices go up.
The $20 plan for CC is good enough for 10-20 minutes of Opus every 5 hours, and you'll be out of your weekly limit after 4-5 days if you sleep at night. I wouldn't be surprised if Anthropic actually makes a profit here. (Yeah, probably not, but they aren't burning cash.)
I use the $200/month Claude Code plan, and in the last week I've had it generate about half a million words of documentation without hitting any session limits.
I have hit the weekly limit before, briefly, but that took running multiple sessions in parallel continuously for many days.
/init in Claude Code already automatically extracts a bunch, but for something more comprehensive, just tell it which additional types of things you want it to look for and document.
> Did you have to develop these skills yourself? How much work was that? Do you have public examples somewhere?
I don't know about the person above, but I tell Claude to write all my skills and agents for me. With some caveats, you can do this iteratively in a single session ("update the X agent, then re-run it. Repeat until it reliably does Y")
"Claude, clone this repo https://github.com/repo, review the coding conventions, check out any markdown or readme files. This is an example of coding conventions we want to use on this project"
The answer to that would very much be: "it depends".
Yes, of course, network I/O > local I/O > most things you'll do on your CPU. But regardless, the answer is always to measure performance (through benchmarking or telemetry), find your bottlenecks, then act upon them.
I recall a case in Firefox in which we were bitten by an O(n^2) algorithm running at startup, where n was the number of tabs to restore; another in which several threads were fighting each other to load components of Firefox and ended up hammering the I/O subsystem; but also cases of the executable being too large, data not fitting in the CPU cache, Windows requiring a disk access to normalize paths, etc.
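To illustrate the first of those (a toy reconstruction, not the actual Firefox code): the quadratic cost usually hides as a linear scan inside a loop, invisible at 5 tabs and painful at 500.

    # Toy reconstruction of an accidental O(n^2) restore loop (not real
    # Firefox code): `tab_id not in restored` scans the whole list on
    # every iteration, so restoring n tabs costs ~n^2/2 comparisons.
    def restore_tabs_quadratic(tab_ids: list[int]) -> list[int]:
        restored: list[int] = []
        for tab_id in tab_ids:
            if tab_id not in restored:
                restored.append(tab_id)
        return restored

    # The O(n) fix: a set gives O(1) membership tests.
    def restore_tabs_linear(tab_ids: list[int]) -> list[int]:
        seen: set[int] = set()
        restored: list[int] = []
        for tab_id in tab_ids:
            if tab_id not in seen:
                seen.add(tab_id)
                restored.append(tab_id)
        return restored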
> Every report and available evidence shows he is barely technologically astute, never mind genius; the accomplishments of his teams are despite him, not because of him.
In particular, nothing that comes out of his mouth regarding AI makes any sense.
And still, people listen to him as if he were an expert. Go figure.
His latest bullshit was about Tesla cameras and fog/rain/snow - on an investor call, no less - "Oh, we do photon counting directly from the sensor, so it's a non-issue".
First, Tesla cameras are not capable of that: you need a special sensor, one that's not useful for producing any real visual representation. And second, even if you did, photon counting requires a closed "box", so to speak; you can't count photons in "open air".
I just don't get it. Do people hang on his every word just because he's rich? What are they expecting from this worship… it's not like he's going to start throwing $100 bills to people who agree with him on Twitter.
Seen from the other side of the Atlantic, I've regularly felt that the US is rather prone to hero worship, see e.g. the passion dedicated to presidential candidates, former presidents, billionaires, but also how the main characters of pretty much all American biopics I recall can't ever be wrong.
If my observation is correct, I guess what we're witnessing with Musk could be a case of hero worship – and in any narrative in which Musk is a hero, he's of course right.
Can confirm, at least for Firefox. When I worked on it, I spent literal years shaving seconds from startup or shutdown, and milliseconds from tab switching.
Everybody likes to hate telemetry, and yes, it can be abused, but that's how Mozilla (and its competitors) manage to make users' lives more comfortable.
As a variant, I recently stumbled upon a post that basically sums up to "people who disagree with me on AI are clearly blinded by their prejudice, it's so sad."
Your argument is dumb because it's objectively better to optimize x conditioned on y than optimize y conditioned on x.
Maybe the worst variant of this is when people don't realize they're actually arguing for different things, but because it's the same general topic they assume everything is the same (duals are common). I feel like this describes many political arguments, and it feels partly intentional...
But still, a good gag gift takes effort. It's not like you walk into a random store and pick the first thing you see.
The whole aspect of stealing gifts demonstrates this. It'd be pointless if the gifts were all low-grade garbage; they'd be effectively fungible. Yet the theft part is critical to making white elephant fun, regardless of whether you're doing gag gifts or good gifts.
A white elephant is a gift that you cannot refuse, cannot regift, and is so expensive/complicated to take care of that it will become your primary concern for the rest of your life.
Well, yes, but it also means a gag gift; I'd hazard a guess that >99% of uses of the term in the past several decades have been of the "gag gift" persuasion. There are many white elephant parties thrown by people who care little for history.
Even then, intentionally ruining someone's financial life requires more care and attention than telling an AI agent to perform random acts of kindness (so far).
> Well, yes, but it also means a gag gift; I'd hazard a guess that >99% of uses of the term in the past several decades have been of the "gag gift" persuasion. There are many white elephant parties thrown by people who care little for history.
Is this an Americanism? I've never heard "white elephant" used with such a meaning.
> Even then, intentionally ruining someone's financial life requires more care and attention than telling an AI agent to perform random acts of kindness (so far).