
Hopefully this brings some of the Superhuman product magic to Grammarly. Although both products could improve AI functionality significantly IMO.

Nearly 10 years later.

Appreciated this premise as it feels like a natural progression of what's needed now that AI can be used to generate more code, more quickly.

"AI-assisted coding has fundamentally shifted where complexity lives. We can now go from concept to code in minutes, but deploying reliably at scale still takes days or weeks. The constraint has moved from creation to delivery.

More people can build software, so more software gets built. More software means more infrastructure to provision, monitor, and maintain. The operational burden compounds while our ability to manage it manually approaches its limits."


Just saw this on X but haven't read it yet: https://ampcode.com/how-to-build-an-agent


I just gave this a read - super interesting! On a slightly tangential note, I've been thinking a lot about simulations, so it's interesting to see how you're using agents to test other agentic products.


+1 to this comment! Until this post I wasn't aware of knit chickens being trendy, but I have noticed chicken content picking up steam, at least on my feed (e.g. drinking with chickens on Instagram).


This is really cool and a fascinating application of agents (and of problem-solving when you need 100% accuracy but are using a technology that's known for hallucinations). Would love to read more technical blog posts in the future that get further into the nitty-gritty details of what you built, how you built it, and the iteration I imagine this required.

Also a big fan of the name Iris for a tax development agent :)


"To do that, we need to own the full stack — without leaning on layers of abstraction we don't control. That means no critical dependencies, not even React. We're starting with a fork of Preact, a mature virtual DOM library already used heavily at Shopify, Google, and countless others."

I really wish there were more details around why they chose to start with a fork of Preact beyond just this sentence. Sort of a surprising decision to me.


I'd guess to add something like RSC or more integrated data loading. Seems like we have to wait until October to find out more, but hopefully they drop more information earlier.


In the preview video, I appreciated Katy Shi's comment: "I think this is a reflection of where engineering work has moved over the past where a lot of my time now is spent reviewing code rather than writing it."

Preview video from OpenAI: https://www.youtube.com/watch?v=hhdpnbfH6NU&t=878s

As I think about what "AI-native" development, or just the future of building software, looks like, it's interesting to me that - right now - developers are still just reading code and tests rather than looking at simulations.

While a new(ish) concept for software development, simulations could provide a wider range of outcomes and, especially for the front end, are far easier to evaluate than just code/tests alone. I'm biased because this is something I've been exploring but it really hit me over the head looking at the Codex launch materials.


> a lot of my time now is spent reviewing code rather than writing it.

Reviewing has never been a panacea. It's a best-effort attempt at catching obvious mistakes, like a second opinion. Only with highly rigorous tests can reviewing give me confidence as high as the trust I place in another engineer or in myself. Generally, the cadence of code output has never been a bottleneck for me; rather the opposite (if I had more time I'd write you a shorter letter).

Most importantly, writing code that is testable at meaningful boundaries is an extremely difficult and delicate art form, which ime is something you really want to get right if possible. Not saying an AI can or can't do that, only that it's the hardest part. An army of automated junior engineers still can't win over the complexity beast that yolo programming causes. At some point code mutations will cause more problems as side effects than what they fix.


> An army of automated junior engineers still can’t win over the complexity beast that yolo programming causes. At some point code mutations will cause more problems as side effects than what they fix.

This resonates a lot with me; completely agreed.


++ Kind of my whole thesis with Graphite. As more code gets AI-generated, the weight shifts to review, testing, and integration. Even as someone helping build AI code reviewers, we'll _need_ humans stamping forever - for many reasons, but fundamentally for accountability. A computer can never be held accountable.

https://constelisvoss.com/pages/a-computer-can-never-be-held...


> A computer can never be held accountable

I think the issue is not about humans being entirely replaced. Instead, the issue is that if AI replaces enough knowledge workers while there's no new or expanded market to absorb the workforce, the new balance of supply and demand will mean many of us see suppressed pay or, worse, lose our jobs forever.


That is true regardless of whether there is or isn't a "new or expanded market to absorb the workforce".

It's a crucial insight that's usually missed or elided in discussions about automation and the workforce: unless you're literally at the beginning of your career, losing your career to automation screws you over big time, forever. At best, you'll have to downsize your entire lifestyle, and that of your family, to be commensurate with your now entry-level pay. If you're halfway through the career that suddenly ended, you won't recover.

All the new jobs and markets are for the kids. Mind you, not your kids - your kids are going to be disadvantaged by their household being suddenly thrown into financial insecurity or downright poverty, and may not even get a chance to start a good career path with their peers.

That, not "anti technology sentiment", is why Luddites smashed the looms. Those were people who got rug-pulled by business decisions and thrown into poverty, along with their families and communities.


> A computer can never be held accountable

I feel like I've been thinking along similar lines recently (I'm due to re-read this though!), but instead of "computer" I'm replacing it with "AI" or "agents" these days. The same point holds true.


> I think this is a reflection of where engineering work has moved over the past where a lot of my time now is spent reviewing code rather than writing it.

This was always true. Front-end code is not really code. Most back-end code is just converting and moving data around. For most functionality where you need "real code" - crypto, compression, math, etc. - you use a library used by another 100k developers.


Re: simulation - Deebo does this for debugging: https://github.com/snagasuri/deebo-prototype


Thanks for sharing - wasn't familiar with Deebo!


> rather than looking at simulations

You mean like automated test suites?


Automated visual fuzzy-testing with some self-reinforcement loops.

There are already libraries for QA testing, and VLMs can give critique on a series of screenshots automated by a Playwright script per branch.
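A rough sketch of the screenshot-gathering half, assuming a dev server on localhost:3000; the route names and output paths are placeholders:

    // Run as an ES module; captures per-branch screenshots for a VLM to critique later.
    import { chromium } from 'playwright';

    const PAGES = { home: '/', settings: '/settings', checkout: '/checkout' }; // hypothetical routes

    const browser = await chromium.launch();
    const page = await browser.newPage();
    for (const [name, route] of Object.entries(PAGES)) {
      await page.goto(`http://localhost:3000${route}`);
      await page.screenshot({ path: `shots/${name}.png`, fullPage: true });
    }
    await browser.close();

The self-reinforcement part is then feeding those images back to a model per branch and acting on its critique.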


Cool. Putting vision in the loop is a great idea.

Ambitious idea, but I like it.


I used Cline to build a tiny testing helper app and this is exactly what it did!

It made changes in TS/Next.js given just the boilerplate from create-next-app, ran `yarn dev`, then opened its mini LLM browser and navigated to localhost to verify everything looked correct.

It found one mistake and fixed the issue, then ran `yarn dev` again, opened a new browser, navigated to localhost (pointing at the original server it brought up, not the new one on another port), and confirmed the change was correct.

I was very impressed but still laughed at how it somehow backed its way into a flow that worked, but only because Next has hot-reloading.


SmolVLM, Gemma, and LLaVA, in case you wanna play with some of the ones I've tried.

https://huggingface.co/blog/smolvlm

Recently both llama.cpp and Ollama got better support for them too, which makes this kind of integration with local/self-hosted models more attainable and less expensive.
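For example, a rough sketch of sending one of those screenshots to a local model through Ollama's generate endpoint (the model name, prompt, and file path are all placeholders):

    // Ask a local LLaVA-style multimodal model to critique a screenshot.
    import { readFileSync } from 'node:fs';

    const image = readFileSync('shots/checkout.png').toString('base64');

    const res = await fetch('http://localhost:11434/api/generate', {
      method: 'POST',
      body: JSON.stringify({
        model: 'llava',   // any multimodal model you've pulled into Ollama
        prompt: 'You are a UI reviewer. List visual defects: overlap, clipped text, broken layout.',
        images: [image],  // base64-encoded screenshot
        stream: false,
      }),
    });
    const { response } = await res.json();
    console.log(response);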


Also this, for the visual regression testing part, though you can add some AI into the mix ;) https://github.com/lost-pixel/lost-pixel
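For the local/OSS route, lost-pixel is driven by a small TS config, roughly along these lines (field names as I remember them from their docs; treat this as a sketch, not gospel):

    // lostpixel.config.ts - approximate shape, check the project's docs for exact fields
    import { CustomProjectConfig } from 'lost-pixel';

    export const config: CustomProjectConfig = {
      pageShots: {
        pages: [{ path: '/', name: 'home' }], // hypothetical page list
        baseUrl: 'http://localhost:3000',
      },
      generateOnly: true,      // run fully locally against committed baselines
      failOnDifference: true,
    };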


Yes, the above reply is more what I meant! Vision/visualization, not just more automated testing.

Definitely ambitious!


I have met someone who mentioned using them to take photos of their kid, hands-free, while playing together to capture cute moments, but it sounded very much like a hobbyist use case, not necessarily their intended purpose.

