Hacker News | dceddia's comments

Interesting about the level of detail. I’ve noticed that myself but I haven’t done much to address it yet.

I can imagine some ideas (ask it for more detail, ask it to make a smaller plan and add detail to that) but I’m curious if you have any experience improving those plans.


I’m trying to solve this myself by implementing a whole planner workflow at https://github.com/solatis/claude-config

Effectively it tries to resolve all ambiguities by making every decision explicit: if a decision cannot be traced back to documentation or some other source, it asks the user.

It also tries to capture all “invisible knowledge” by documenting everything, so that all these decisions and business context are captured in the codebase again.

Which - in theory - should make long term coding using LLMs more sane.

The downside is that it takes 30min - 60min to write a plan, but it’s much less likely to make silly choices.


Have you tried the compound engineering plugin? [^1]

My workflow with it is usually brainstorm -> lfg (planning) -> clear context -> lfg (giving it the produced plan to work on) -> compound if it didn’t on its own.

[^1]: https://github.com/EveryInc/compound-engineering-plugin


That’s super interesting, I’ll take a look to see if I can learn something from it, as I’m not familiar with the concept of compound engineering.

Seems like a lot of it aligns with what I’m doing, though.


> The downside is that it takes 30min - 60min to write a plan

Oof you weren't kidding. I've got your skills running on a particularly difficult problem and it's been running for over three hours (I keep telling it to increase the number of reviews until it's satisfied).


Yeah I’m working on some improvements in this area, should make things faster. But yeah I’ve frequently had 1h-2h planning sessions as well, depending upon the complexity of the task.

I have had good success with the plans generated by https://github.com/obra/superpowers I also really like the Socratic method it uses to create the plans.

I iterate around issues. I have a skill that launches a new tmux window for a worktree, with Claude in one pane and Codex in the other, plus instructions on which issue to work on. Claude has instructions to create a plan, while Codex has instructions to understand the background information necessary for the issue.

By the time they're both done, I can feed Claude's plan into Codex, and Codex is ready to analyze it. Then Codex feeds the plan back to Claude, and they ping-pong like that a couple of times. After several iterations there's enough refinement that things usually work. Then Claude clears context and executes the plan. Codex reviews the commit, and because it still has all the original context, it knows what we have been planning and what the research on the infrastructure was. It does a really good job reviewing. Then they ping-pong back and forth a couple more times, and the end product is pretty decent.

Codex's strength is that it really goes in-depth. I usually run it at a high reasoning effort. But Codex has zero EQ or communication skills, so it works really well as a pedantic reviewer. Claude is much more pleasant to interact with. There's just no comparison. That's why I like planning with Claude much more: we can iterate.

I am just a hobbyist though. I do this to run my Ansible/Terraform infrastructure for a good-sized homelab with 10 hosts. So we actually touch real hardware a lot and there are always some gotchas to deal with. But together, this is a pretty fun way to work. I like automating stuff, so it really scratches that itch.

I concur, also with no benchmarks to share, but I had the experience of rewriting a video editor timeline to use WebGL instead of the 2D canvas I started with and it got much faster. Like being able to draw 10k+ rectangles at 60fps became easy, where with 2D canvas it was stumbling.

I don't know any project which uses the 2D canvas. It's horribly inefficient except for the most trivial use-cases (basically demos). Any serious web graphics uses WebGL and shaders.

Hard disagree. Canvas 2D is fully GPU-accelerated in modern browsers and can easily handle thousands of draw calls at 60fps, more than enough for most practical applications. For data visualization, interactive tools, drawing apps, and UI rendering, it's a robust and performant choice. WebGL is often overkill unless you're dealing with extreme datasets or 3D scenes. With its simpler API and faster startup, Canvas 2D is perfectly suited for the vast majority of 2D use cases. Labeling it as 'horribly inefficient' is simply wrong ._.

With their tagline being “video for developers”, isn’t this their whole thing? It seems like another service would be a better fit if having a management UI is a requirement.

So I’m probably in a similar spot - I mostly prompt-and-check, unless it’s a throwaway script or something, and even then I give it a quick glance.

One thing that stands out in your steps, and that I’ve noticed myself: yeah, by prompt 10, it starts to suck. If it ever hits “compaction” then it’s past the point of no return.

I still find myself slipping into this trap sometimes because I’m just in the flow of getting good results (until it nosedives), but the better strategy is to do a small unit of work per session. It keeps the context small and that keeps the model smarter.

“Ralph” is one way to do this. (decent intro here: https://www.aihero.dev/getting-started-with-ralph)

Another way is “Write out what we did to PROGRESS.md” - then start new session - then “Read @PROGRESS.md and do X”

Just playing around with ways to split up the work into smaller tasks basically, and crucially, not doing all of those small tasks in one long chat.


I will check out Ralph (thank you for that link!).

> Another way is “Write out what we did to PROGRESS.md” - then start new session - then “Read @PROGRESS.md and do X”

I agree on small context and if I hit "compacting" I've normally gone too far. I'm a huge fan of `/clear`-ing regularly or `/compact <Here is what you should remember for the next task we will work on>` and I've also tried "TODO.md"-style tracking.

I'm conflicted on TODO.md-style tracking because in practice I've had an agent work through every item on the list, confidently telling me steps are done, only to find that's not the case when I check its work. A TODO.md that I created and one I had the agent create both suffer from this. Also, getting it to update the TODO.md has been frustrating. Even when I add "Make sure to mark tasks as complete in the TODO.md as you finish them" to CLAUDE.md, or append the same message to the end of all my prompts, it won't always update it.
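For what it's worth, here is a rough sketch of the kind of helper I mean when I say I check its work. It assumes the TODO.md uses GitHub-style checkboxes (`- [ ]` / `- [x]`), which is my assumption and not something any agent guarantees, and the function names are made up for illustration. It does not verify that a checked item is really done. It only splits the file into what is claimed done and what is still open, so I know which claims to go review by hand.

```python
import re

# Matches task-list lines such as "- [ ] task" or "* [x] task".
# Assumption: the TODO file uses this checkbox convention.
TASK_RE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def parse_tasks(markdown: str) -> list[tuple[bool, str]]:
    """Return (claimed_done, text) for every checkbox line in the markdown."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK_RE.match(line)
        if match:
            claimed_done = match.group(1).lower() == "x"
            tasks.append((claimed_done, match.group(2).strip()))
    return tasks


def unfinished(markdown: str) -> list[str]:
    """Return the text of every task that is not checked off."""
    return [text for done, text in parse_tasks(markdown) if not done]


if __name__ == "__main__":
    sample = """\
# TODO
- [x] Add login form
- [ ] Write tests for login
* [X] Update README
"""
    for done, text in parse_tasks(sample):
        label = "claimed done, review by hand" if done else "still open"
        print(f"{label}: {text}")
```

Running something like this between sessions at least makes it obvious when the agent finished a step and never ticked the box, or ticked everything at once.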

I've been interested in trying out beads to see if it works better than a markdown TODO file but I haven't played with that yet.

But overall I agree with you, smaller chunks are key to success.


I hate TODO.mds too. If I ever have to use one, I'll keep track of it manually, and split the work myself into chunks of the size I believe CC/codex can handle. TODO.md is a recipe for failure because you'll quickly have more code than you can review and nothing to trust that it was executed well.

That looks so ridiculous that it has me wondering how hard of a technical change it would’ve been to change that drag target, and if they just punted on it.


Is it possible for Ghostty to figure out how much memory its child processes (or tabs) are using? If so maybe it would help to surface this number on or near the tab itself, similar to how Chrome started doing this if you hover over a tab. It seems like many of these stem from people misinterpreting the memory number in Activity Monitor, and maybe having memory numbers on the tabs would help avoid that.


In many cases today “gif” is a misnomer anyway and mp4 is a better choice. Not always, since not everywhere supports actual video.

But one case I see often: If you’re making a website with an animated gif that’s actually a .gif file, try it as an mp4 - smaller, smoother, proper colors, can still autoplay fine.
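A minimal sketch of that conversion, assuming ffmpeg is installed and on your PATH. The helper name and file names are made up for illustration. The flags are ones I'd commonly reach for here: `yuv420p` for broad player compatibility, a scale filter because that pixel format needs even dimensions, and `+faststart` so playback can begin before the whole file has downloaded. The script only builds and prints the command, it doesn't run it.

```python
import shlex


def gif_to_mp4_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg argument list that converts a GIF to a web-friendly MP4."""
    return [
        "ffmpeg",
        "-i", src,
        "-movflags", "+faststart",   # put the index at the front of the file
        "-pix_fmt", "yuv420p",       # widely supported pixel format
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",  # force even width/height
        dst,
    ]


# Markup that behaves like a GIF: autoplays, loops, and is silent.
# Browsers generally require "muted" before they will autoplay a video.
VIDEO_TAG = '<video autoplay loop muted playsinline src="{src}"></video>'


if __name__ == "__main__":
    # Print a copy-pasteable shell command and the matching HTML tag.
    print(shlex.join(gif_to_mp4_command("demo.gif", "demo.mp4")))
    print(VIDEO_TAG.format(src="demo.mp4"))
```

To actually run the conversion from Python you could pass the list to `subprocess.run(cmd, check=True)`.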


I had kinda suspected this just based on my own experience of paper vs screen, but hadn’t run across any research.

After seeing your comment I went looking! I found this interesting: https://phys.org/news/2024-02-screens-paper-effective-absorb...


That was one of the studies that I saw too.

There are some others about learning more from writing with pen on paper compared to writing on a tablet or typing notes digitally.

I am a digital note taker at heart but can't deny using a notebook still has better outcomes sometimes.


The situation on Windows got remarkably better and cheaper recently-ish with the addition of Azure code signing. Instead of hundreds or thousands for a cert it’s $10/month, if you meet the requirements (I think the business must have existed for some number of years first, and some other things).

If you go this route I highly recommend this article, because navigating through Azure to actually set it up is like getting through a maze. https://melatonin.dev/blog/code-signing-on-windows-with-azur...


Thanks for the link. I see it's only available in basically the US, Canada and the EU though.


That's not easier and cheaper than before. That's how it's always been only now you can buy the cert through Azure.

For an individual the Apple code signing process is a lot easier and more accessible since I couldn't buy a code signing certificate for Windows without being registered as a business.


> That's how it's always been only now you can buy the cert through Azure.

Where can you get an EV cert for $120/year? Last time I checked, all the places were more expensive and then you also had to deal with a hardware token.

Lest we talk past each other: it's true that it used to be sufficient to buy a non-EV cert for around the same money, where it didn't require a hardware token, and that was good enough... but they changed the rules in 2023.


> it’s $10/month

So $120 a year but no it's only Apple with a "tAx"


Millions of Windows power users are accustomed to bypassing SmartScreen.

A macOS app distributed without a trusted signature will reach a far smaller audience, even of the proportionately smaller macOS user base, and that's largely due to deliberate design decisions by Apple in recent releases.


As you said, you need to have a proper legal entity for about 2 years before this becomes an option.

My low-stakes conspiracy theory is that MS is deliberately making this process awful to encourage submission of apps to the Microsoft Store since you only have to pay a one-time $100 fee there for code-signing. The downside is of course that you can only distribute via the MS store.


Not quite a blooper but I thought it was neat:

I searched Kagi for “veterans day 2025” the other day (on Veterans Day, when I was unsure) and it answered

“= today”

