__mharrison__ · 2026-02-12T23:11:27 1770937887

Great, they can pay me the $60k they owe me for pirating my books...

utopiah · 2026-02-13T06:38:48 1770964728

No, no didn't you hear? If they were to actually play by the rules, then there wouldn't be an industry! /$

__mharrison__ · 2026-02-12T15:02:29 1770908549

Is there a skill file I can use for these edits?

__mharrison__ · 2026-02-09T15:05:30 1770649530

I was expecting photos...

brucehoult · 2026-02-10T01:40:11 1770687611

Here's a not bad (and accidental!) 342 km one from 7 months ago:

https://www.reddit.com/r/newzealand/comments/1m9p0bh/tapuaeo...

In theory there should be a 365.3 km one just barely to the left of the obvious twin peaks, but I'm not seeing it.

Quite a bit short of the longest in the world, but still pretty far to see.
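For anyone wondering how a theoretical maximum like that can be estimated: a rough sketch, assuming a spherical Earth and the usual small-height approximation, where the distance to the horizon from height h is about sqrt(2·R·h). The function names, the mean radius, and the optional refraction coefficient are all assumptions for illustration, not how the 365.3 km figure above was actually derived.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def horizon_km(height_m: float, refraction_k: float = 0.0) -> float:
    """Approximate distance to the horizon from a given height.

    Uses d = sqrt(2 * R_eff * h), valid when h is tiny compared to R.
    refraction_k is an assumed refraction coefficient; 0.0 means pure
    geometry, and larger values bend the sightline further.
    """
    effective_radius_km = EARTH_RADIUS_KM / (1.0 - refraction_k)
    return math.sqrt(2.0 * effective_radius_km * height_m / 1000.0)


def max_sightline_km(h1_m: float, h2_m: float, refraction_k: float = 0.0) -> float:
    """Longest sightline between two summits that just grazes sea level."""
    return horizon_km(h1_m, refraction_k) + horizon_km(h2_m, refraction_k)


if __name__ == "__main__":
    # Hypothetical summit heights in metres, not the peaks in the photo.
    print(round(max_sightline_km(2800.0, 2500.0), 1))
```

Anything in between that rises above sea level along the path shortens the real figure, so this is an upper bound rather than a prediction.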

__mharrison__ · 2026-02-09T05:55:08 1770616508

I'm reading all these articles and having the same thought. These folks aren't using the same tools I'm using.

the-grump · 2026-02-09T06:03:43 1770617023

I feel so weird not being the grumpy one for once.

Can't relate to GP's experience of one-shotting. I need to try a couple of times and really hone in on the right plan and constraints.

But I am getting so much done. My todo list used to grow every year. Now it shrinks every month.

And this is not mindless "vibe coding". I insist on what I deploy being quality, and I use every tool I can that can help me achieve that (languages with strong types, TDD with tests that specify system behaviour, E2E tests where possible).

acjohnson55 · 2026-02-10T17:05:14 1770743114

I regret using the term "one-shot", because my reality isn't really that. It's more that the first shot gets the code 80-90% of the way there, usually, and it short-circuits a ton of the "code archaeology" I would normally have to do to get to that point.

Some bugs really can be one-shotted, but that's with the benefit of a lot of scaffolding our company has built and the prompting process. It's not as simple as Claude Code being able to do this out of the box.

all2 · 2026-02-09T06:14:40 1770617680

I'm on my 5th draft of an essentially vibe-coded project. Maybe it's because I'm using not-frontier models to do the coding, but I have to take two or three tries to get the shape of a thing just right. Drafting like this is something I do when I code by hand, as well. I have to implement a thing a few times before I begin to understand the domain I'm working in. Once I begin to understand the domain, the separation of concerns follows naturally, and so do the component APIs (and how those APIs hook together).

the-grump · 2026-02-09T07:13:19 1770621199

My suggestions:

- like the sister comment says, use the best model available. For me that has been opus but YMMV. Some of my colleagues prefer the OAI models.

- iterate on the plan until it looks solid. This is where you should invest your time.

- Watch the model closely and make sure it writes tests first, checks that they fail, and only then proceeds to implementation

- the model should add pieces one by one, ensuring each step works before proceeding. Commit each step so you can easily retry if you need to. Each addition will involve a new plan that you go back and forth on until you're happy with it. The planning usually gets easier as the project moves along.

- this is sometimes controversial, but use the best language you can target. That can be Rust, Haskell, Erlang depending on the context. Strong types will make a big difference. They catch silly mistakes models are liable to make.
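To make the "tests first, check that they fail, then implement" bullet concrete, here is a minimal sketch in Python. The `slugify` function and its expected behaviour are hypothetical, picked only so the example is small; the point is the order of the three steps, not the function itself.

```python
import re


def slugify_stub(title: str) -> str:
    """Step 1: a placeholder so the tests can run and be seen to fail."""
    raise NotImplementedError


def run_tests(fn) -> bool:
    """Step 2: the behaviour spec, written before any implementation."""
    cases = {
        "Hello World": "hello-world",
        "  Rust & Go!  ": "rust-go",
        "": "",
    }
    try:
        return all(fn(given) == expected for given, expected in cases.items())
    except NotImplementedError:
        return False


def slugify(title: str) -> str:
    """Step 3: the simplest implementation that satisfies the spec."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


if __name__ == "__main__":
    assert not run_tests(slugify_stub)  # red: tests fail against the stub
    assert run_tests(slugify)           # green: tests pass after implementing
    print("red -> green confirmed")
```

Committing right after the green step gives you the retry point mentioned above.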

Cursor is great for trying out the different models. If opus is what you like, I have found Claude code to be better value, and personally I prefer the CLI to the vscode UI cursor builds on. It's not a panacea though. The CLI has its own issues like occasionally slowing to a crawl. It still gets the work done.

all2 · 2026-02-10T22:52:41 1770763961

My options are 1) pay about a dollar per query from a frontier model, or 2) pay a fraction of that for a not-so-great model that makes my token spend last days/weeks instead of hours.

I spend a lot of time on plans, but unfortunately the gotchas are in the weeds, especially when it comes to complex systems. I don't trust these models with even marginally complex, non-standard architectures (my projects center around statecharts right now, and the semantics around those can get hairy).

I git commit after each feature/bugfix, so we're on the same page here. If a feature is too big, or is made up of more than one "big" change, I chunk up the work and commit in small batches until the feature is complete.

I'm running golang for my projects right now. I can try a more strongly typed language, but that means learning a whole new language and its gotchas and architectural constraints.

Right now I use claude-code-router and Claude Code on top of openrouter, so swapping models is trivial. I use mostly Grok-4.1 Fast or Kimi 2.5. Both of these choke less than Anthropic's own Sonnet (which is still more expensive than the two alternatives).

girvo · 2026-02-09T07:23:01 1770621781

> and personally I prefer the CLI to the vscode UI cursor builds on

So do I, but I also quite like Cursor's harness/approach to things.

Which is why their `agent` CLI is so handy! You can use cursor in any IDE/system now, exactly like claude code/codex cli

the-grump · 2026-02-09T07:29:04 1770622144

I tried it when it first came out and it was lacking then. Perhaps it's better now--will give it a shot when I sign up for cursor again.

Thank you for sharing that!

chrispyfried · 2026-02-09T15:22:29 1770650549

When you say “iterate on the plan” are you suggesting to do that with the AI or on your own? For the former, have any tips/patterns to suggest?

the-grump · 2026-02-10T04:03:51 1770696231

With the AI. I read the whole thing and correct the model where it makes mistakes, fill the gaps where I find them.

I also always check that it explicitly states my rules (some from the global rules, some from the session up until that moment) so they're followed at implementation time.

In my experience opus is great at understanding what you want and putting it in a plan, and it's also great at sticking to the plan. So just read through the entire thing and make sure it's a plan that you feel confident about.

There will be some trial and error before you notice the kind of things the model gets wrong, and that will guide what you look for in the plan that it spits out.

mistercow · 2026-02-09T07:01:28 1770620488

> Maybe its because I'm using not-frontier models to do the coding

IMO it’s probably that. The difference between where this was a year ago and now is night and day, and not using frontier models is roughly like stepping back in time 6-12 months.

__mharrison__ · 2026-02-08T07:19:16 1770535156

I'm playing with local first openclaw and qwen3 coder next running on my LAN. Just starting out but it looks promising.

bluerooibos · 2026-02-08T16:00:19 1770566419

On what sort of hardware/RAM? I've been trying ollama and opencode with various local models on a machine with 16 GB of RAM, but the speed and accuracy/behaviour just aren't good enough yet.

__mharrison__ · 2026-02-09T04:57:12 1770613032

DGX Spark (128 GB)

__mharrison__ · 2026-02-05T18:54:49 1770317689

I never really used Codex (found it too slow), just 5.2, which I'm finding to be an excellent model for my work. This looks like another step up.

This week, I'm all local though, playing with opencode and running qwen3 coder next on my little spark machine. With the way these local models are progressing, I might move all my llm work locally.

andix · 2026-02-05T19:24:26 1770319466

I think codex got much faster for smaller tasks in the last few months. Especially if you turn thinking down to medium.

raffkede · 2026-02-05T20:20:15 1770322815

I think the slow feeling is a UI thing in codex

__mharrison__ · 2026-02-05T22:11:47 1770329507

I realize my comment was unclear. I use codex the CLI all the time, but generally with this invocation: `codex --full-auto -m gpt-5.2`

However, when I use the 5.2codex model, I've found it to be very slow and worse (hard to quantify, but I preferred straight-up 5.2 output).

__mharrison__ · 2026-02-03T17:55:09 1770141309

Really enjoying using prek.

Dedicated a whole chapter to it in my latest book, Effective Testing.

The trend of fast core (with rust) and convenient wrapper is great while we are still writing code.

__mharrison__ · 2026-01-30T05:50:40 1769752240

I'm working on a Django app. This would make production deployment a bit easier.

Also sad that the test suite isn't open source. Would help drive development of the new DB...

__mharrison__ · 2026-01-30T03:07:16 1769742436

We recently flew overseas and a "klymit"-style sleeping pad came in very handy.

Very small and lightweight. Takes the edge off of a hard floor.

__mharrison__ · 2026-01-28T18:53:54 1769626434

"every way" are strong words.

Pandas is better for plotting and third party integration.