
Really depends on the application, no? I wouldn't want my IDE opening every file in a new window.


That’s why the window manager and the user should be in control.

I’d love to be able to arrange different tabs of different apps in one window


Is this satire?


Nope, it isn’t. I did it as a joke initially (I also had a version where every 2 stories there was a meeting, and if someone underperformed they would get fired). I think there are multiple reasons why it actually works so well:

- I built a system where context (+ the current state + goal) is properly structured, and coding agents only get the information they actually need and nothing more. You wouldn’t let your product manager develop your backend, and I let the backend dev do only the things it is supposed to and nothing more. If an agent crashes (or quota limits are reached), the next agent can continue exactly where the previous one left off.

- Agents are “fighting against” each other to some extent? The Architect tries to design while the CAB tries to reject.

- Granular control. I wouldn’t call “the manager” _a deterministic state machine that is calling probabilistic functions_, but that’s to some extent what it is? The manager has clearly defined tasks (like “if file is in 01_design → call Architect”).
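
To make that concrete, a toy sketch of the routing (folder names and call_agent are illustrative stand-ins, not my actual code):

    from pathlib import Path

    ROUTES = {
        "01_design": "architect",
        "02_review": "cab",
        "03_implement": "backend_dev",
    }

    def call_agent(name: str, prompt: str) -> str:
        # Stand-in for the real agent invocation (the probabilistic part).
        raise NotImplementedError("wire this to your agent runner")

    def manager_step(feature_file: Path) -> str:
        stage = feature_file.parent.name   # e.g. "01_design"
        agent = ROUTES[stage]              # deterministic routing
        return call_agent(agent, feature_file.read_text())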

Here’s one example of an agent log after a feature has been implemented from one of the older codebases: https://pastebin.com/7ySJL5Rg


Thanks for clarifying - I think some of the wording was throwing me off. What a wild time we are in!


What OpenCode primitive did you use to implement this? I'd quite like a "senior" Opus agent that lays out a plan, a "junior" Sonnet that does the work, and a senior Opus reviewer to check that it agrees with the plan.


You can define the tools that agents are allowed to use in the opencode.json (also works for MCP tools I think). Here’s my config: https://pastebin.com/PkaYAfsn
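
Roughly, a stripped-down version looks something like this (written from memory, so double-check the field names against the OpenCode docs and the pastebin):

    {
      "$schema": "https://opencode.ai/config.json",
      "agent": {
        "architect": {
          "mode": "subagent",
          "model": "anthropic/claude-opus-4",
          "prompt": "{file:./agents/architect.md}",
          "tools": { "write": false, "edit": false }
        },
        "backend-dev": {
          "mode": "subagent",
          "model": "anthropic/claude-sonnet-4",
          "prompt": "{file:./agents/backend-dev.md}",
          "tools": { "write": true, "edit": true, "bash": true }
        }
      }
    }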

The models can call each other if you reference them using @username.

This is the .md file for the manager: https://pastebin.com/vcf5sVfz

I hope that helped!


This is excellent, thank you. I came up with half of this while waiting for this reply, but the extra pointers about mentioning with @ and the {file} syntax really help, thanks again!


> [...]coding agents only get the information they actually need and nothing more

Extrapolating from this concept led me to a hot take I haven't had time to blog about: agentic AI will revive the popularity of microservices, mostly due to the deleterious effect of context size on agent performance.


Why would they revive the popularity of microservices? They can just as well be used to enforce strict module boundaries within a modular monolith, keeping the codebase coherent without splitting off microservices.
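
For example, a boundary check the agent (or CI) can run before anything merges; a toy Python sketch with made-up module names (a real project might use import-linter instead):

    import ast
    import pathlib

    # Forbid myapp.billing from importing myapp.orders (names made up).
    FORBIDDEN = {"myapp.billing": {"myapp.orders"}}

    def check(root: str = "myapp") -> list[str]:
        errors = []
        for path in pathlib.Path(root).rglob("*.py"):
            module = ".".join(path.with_suffix("").parts)
            banned = {b for src, targets in FORBIDDEN.items()
                      if module.startswith(src) for b in targets}
            for node in ast.walk(ast.parse(path.read_text())):
                if isinstance(node, ast.Import):
                    names = [alias.name for alias in node.names]
                elif isinstance(node, ast.ImportFrom) and node.module:
                    names = [node.module]
                else:
                    continue
                errors.extend(f"{module} imports {name}"
                              for name in names
                              if any(name.startswith(b) for b in banned))
        return errors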


And that's why they call it a hot take. No, it isn't going to give rise to microservices. You absolutely can have your agent perform high-level decomposition while maintaining a monolith. A well-written, composable spec is awesome. This has been true for human and AI coders for a very, very long time. The hat trick has always been getting a well-written, composable spec. AI can help with that bit, and I find that is probably the best part of this whole tooling cycle. I can actually interact with an AI to build that spec iteratively. Have it be nice and mean. Have it iterate among many instances and other models, all that fun stuff. It still won't make your idea awesome or make anyone want to spend money on it, though.


In a fresh project that is well documented and set up, it might work better. Many of the issues that agents have in my work stem from endpoints not always being documented correctly.

A real example that happened to me: the agent forgets to rename an expected parameter in the API spec for service 1. Now, when working on service 2, there is no way for the agent to find this mistake other than being given access to service 1. And now you are back to "... effect of context size on agent performance ...". For context, we might have ~100 services.

One could argue these issues reduce over time as instruction files are updated, etc., but that also assumes the models follow instructions and don't hallucinate.

That being said, I do use Agents quite successfully now - but I have to guide them a bit more than some care to admit.


> In a fresh project that is well documented and set up it might work better.

I guess this may be dependent on domain, language, codebase, or some combination of the three. The biggest issues I've had with agents are when they go down the wrong path and it snowballs from there. Suddenly they are loading more context unrelated to the task and getting more confused. Documenting interfaces doesn't help if the source is available to the agent.

My agentic sweet spot is human-designed interfaces. Agents cannot mess up code they don't have access to, e.g. by inadvertently changing the interface contract and the implementation.

> Agent forgets to rename an expected parameter in API spec for service 1

Document and test your interfaces/logic boundaries! I have witnessed this break many times on human teams: field renames, changes in optionality, undocumented field dependencies, etc. There are also challenging trade-offs with API versioning. Agents can't fix process issues.
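
A sketch of what I mean: a shared contract both services test against in CI, so a rename in service 1 fails a build instead of surfacing in service 2 at runtime. Schema and field names are hypothetical; requires `pip install jsonschema`.

    import jsonschema

    USER_EVENT_V1 = {
        "type": "object",
        "required": ["user_id", "created_at"],
        "properties": {
            "user_id": {"type": "string"},
            "created_at": {"type": "string"},
        },
        "additionalProperties": False,
    }

    def test_producer_payload_matches_contract():
        # In a real test this payload would come from service 1's serializer.
        payload = {"user_id": "u-123", "created_at": "2024-01-01T00:00:00Z"}
        # If service 1 renames user_id to userId, this raises in its own CI,
        # long before service 2 (or an agent) trips over it.
        jsonschema.validate(payload, USER_EVENT_V1)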


Isn't all this a manual implementation of prompt routing, and, to a lesser extent, Mixture of Experts?

These tools and services are already expected to do the best job for specific prompts. The work you're doing pretty much proves that they don't, while also throwing much more money at them.

How much longer are users going to have to manually manage LLM context to get the most out of these tools? Why is this still a problem ~5 years into this tech?


I'm confused: when you say you have a manager, scrum master, architect, all supposedly sharing the same memory, does each of those "employees" "know" what it is? And if so, based on what are their identities defined? Prompts? Or something more? Or am I just too dumb to understand / swimming against the current here. Either way, it sounds amazing!


Their roles are defined by prompts. The only memory is shared files and the conversation history that's looped back into stateless API calls to an LLM.
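
A minimal sketch of what that statelessness means in practice (call_llm is a stand-in for whatever chat-completion API is used):

    def call_llm(messages: list[dict]) -> str:
        # The model keeps no memory between calls; the "memory" is just
        # the history list we re-send every time.
        raise NotImplementedError("wire to your provider")

    history = [{"role": "system", "content": "You are the Architect. ..."}]

    def step(user_msg: str) -> str:
        history.append({"role": "user", "content": user_msg})
        reply = call_llm(history)   # sees the whole history, every call
        history.append({"role": "assistant", "content": reply})
        return reply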


quite a storyteller


It's not satire but I see where you're coming from.

Applying distributed human-team concepts to a porting task squeezes extra performance from LLMs much further up the diminishing-returns curve. That matters because porting projects are actually well suited to autonomous agents: existing code provides context, objective criteria catch more LLM-grade bugs than in greenfield work, and established unit tests offer clear targets.

I guess what I'm trying to say is that the setup seems absurd because it is. Though it also carries real utility for this specific use case. Apply the same approach to running a startup or writing a paid service from scratch and you'd get very different results.


I don't know about something this complex, but right this moment I have something similar running in Claude Code in another window, and it is very helpful even with a much simpler setup:

If you have these agents do everything at the "top level" they lose track. The moment you introduce sub-agents, you can have the top level run in a tight loop of "tell agent X to do the next task; tell agent Y to review the work; repeat" or similar (add as many agents as makes sense), and it will take a long time to fill up the context. The agents get fresh context, and you get to manage explicitly what information is allowed to flow between them. It also tends to make it a lot easier to introduce quality gates; e.g., your testing agent and your code review agent will not decide they can skip testing because they "know" they implemented things correctly, since there is no memory of that in their context.
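
In pseudo-Python, the top-level loop is something like this (call_agent is a stand-in for however your runner spawns a fresh-context sub-agent; the prompt strings are illustrative):

    def call_agent(role: str, prompt: str) -> str:
        # Each call spawns a sub-agent with a FRESH context; the only
        # information crossing between agents is what we pass explicitly.
        raise NotImplementedError("spawn a sub-agent, return its output")

    def run(tasks: list[str]) -> None:
        for task in tasks:
            result = call_agent("implementer", task)
            # The reviewer never saw the implementation happen, so it
            # can't "know" the tests passed; it has to actually check.
            verdict = call_agent("reviewer", f"Task: {task}\n\nResult: {result}")
            while "approved" not in verdict.lower():
                result = call_agent("implementer",
                                    f"Address this review: {verdict}")
                verdict = call_agent("reviewer",
                                     f"Task: {task}\n\nResult: {result}")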

Sometimes too much knowledge is a bad thing.


Humans seem to be similar. If a real product designer dove into all the technical details and code of a product, they would likely forget at least some of the vision behind what the product is actually supposed to be.


Doubt it. I use a similar setup from time to time.

You need to have different skills at different times. This type of setup helps break those skills out.


why would it be? It's a creative setup.


I just actually can't tell, it reads like satire to me.


to me, it reads like mental illness


maybe it's a mix of both :)


Why would it be satire? I thought that's a pretty standard agentic workflow.

My current workplace follows a similar workflow. We have a repository full of agent.md files for different roles and associated personas.

E.g., for project managers, you might have a feature-focused one, a delivery-driven one, and one that aims to minimise scope/technology creep.


I mean no offence to anyone, but whenever new tech progresses rapidly it usually catches most people unaware, and they tend to ridicule the concepts or dismiss them.


yeah, nfts, metaverse, all great advances

same people pushing this crap


ai is actually useful tho. idk about this level of abstraction but the more basic delegation to one little guy in the terminal gives me a lot of extra time


Maybe that's because you're not using your time well in the first place


bro im using ai swarms, have you even tried them?


bro wanna buy some monkey jpegs?

100% genuine


[flagged]


> Laughing about them instead of creating intergenerational wealth for a few bucks?

it's not creating wealth, it's scamming the gullible

criminality being lucrative is not a new phenomenon


Are you sure that yours would sell for $80K, if you aren't using it to launder money with your criminal associates?


If the price floor is 80k and there are thousands of them, then even if just one was legit it would sell for 80k.

Weird, I'm getting downvoted for just stating facts again.


I think many people really like the gamification and complex role playing. That is how GitHub got popular, that is how Rube Goldberg agent/swarm/cult setups get popular.

It attracts the gamers and LARPers. Unfortunately, management is on their side until they find out after four years or so that it is all a scam.


I've heard some people say that "vibe coding" with chatbots is like slot machines: you just keep "prompting" until you hit the jackpot. And there was an earlier study that people _felt_ more productive even if they weren't (caveat that this was with older models), which aligns with the sort of time dilation people feel when gambling.

I guess "agentic swarms" are the next evolution of the meta-game, the perfect nerd-sniping strategy. Now you can spend all your time min-maxing your team, balancing strengths/weaknesses by tweaking sub-agents, adding more verifiers and project managers. Maybe there's some psychological draw: people can feel like gods and get a taste of the power execs feel, even though that power is ultimately a simulacrum as well.


Extending this: unlike real slot machines, there is no definite state of having won or not for the person prompting, only whether they've been convinced they've won. That comes down to how much you're willing to verify the code it has provided, or better, fully test it (which no one wants to do). In reality, people do a little light testing, say it's good enough, and move on.

I recently fixed a problem over a few days, and found that it was duplicated, though differently enough that I asked my coworker to try fixing it with an LLM (he was the originator of the duplicated code, and I didn't want to mess up what was mostly functioning code). Using an LLM, he seemingly did in one hour what took me a day or two of tinkering and fixing. After we hopped off the call, I did a code read to make sure I understood it fully, immediately saw an issue, and tested further only to find out: it did not in fact fix it, and suffered from the same problems, but it convincingly LOOKED like it fixed it. He was ecstatic at the time saved while presenting it, and afterwards, alone, all I could think about was how our business users were going to be really unhappy at being gaslit into thinking it was fixed, because literally every tester I've ever met would have missed it without understanding the code.

People are overjoyed with good enough, and I'm starting to think maybe I'm the problem when it comes to progress? It gives me Big Short vibes: why am I drawing attention to this obvious issue in quality? I'm just the guy in the casino screaming "does no one else see the obvious problem with shipping this?" And then I start to understand that yes, I am the problem: people have been selling each other dog-water product for millennia, because at the end of the day Edison is the person people remember, not the guy who came after and made it near perfect or hammered out all the issues. Good enough takes its place in history, not perfection. The trick others have figured out is that they just need to secure the money and get away before the customer realizes the world of hurt they've paid for.


I don't think so.



Not sure what you mean by "looks to be by"? Their GitHub is linked at the bottom of the page


This happened to me too, you need a phone number unfortunately


If this is all that's blocking you (and not the fact that they don't want your business), you might have a friend who's been playing the T-Mobile free-phone-number game. I know some people with 4+ phone numbers they don't need/use, simply because they were free (one-time activation with an old BYOD or a taxes-only free phone).

I've considered asking to borrow a number to verify with Discord so they don't actually have my phone number, but decided I'd rather just be unverified.


You can get one for a few bucks


I'll check this out! I've been using Doll for a while, but I don't think it's maintained anymore.

https://github.com/xiaogdgenuine/Doll


Nice! Didn't know about it. ExtraBar doesn't add the notifications to the apps, but it will give you a fully customized menu for them.


Ah! I misunderstood your product :) Notifications would be an awesome feature to add


I will definitely think about it


Believe it when you see it folks. No sooner.


While this may work, it just seems like incredibly unhealthy advice to give or to take. People should be able to focus on their work without being expected to take up random side quests, and if there is a career path, make the requirements clear.


> make the requirements clear

idealistic, but more often than not, unrealistic, unfortunately


If your company does not make career growth requirements clear and actionable, your best vector for career growth is changing companies.


most everything is unrealistic if we just assume it to be so and give up on seeing the world in any different way.


Doing work without contributing to identifying what is the most important work to do can be soul killing.

Some people just want to be given exact instructions on what to do. Others find that role very frustrating.


I agree. This outlook also implies a greater degree of meritocracy than usually exists in a competitive corporate environment. Doing a good job and taking initiative sometimes leads to promotions, but it sometimes just leads to more work. Meanwhile, many ladder climbers are busy optimizing for their own success, not the corp's.


You want to leave any company where doing a good job doesn’t lead to raises and promotions, as soon as possible.


Let's assume those requirements aren't made clear. People will still get promoted, based on side quests and initiative. And those people will have demonstrated a lot of important qualities that all the others (who were blocked on lack of requirements) never showed.

So what's the problem again?


Just a thought but you could probably argue that the US is the largest and most advanced of the capitalist systems in the world, and that makes it somewhat of a bellwether for the future of other capitalist countries.


I think the US is rather unique among capitalist countries in having such a large portion of its population so poorly educated. This simply isn't the case in EU or Commonwealth countries. In the US, it's led people to vote against their own interests, allowing companies to lobby for things that benefit the companies and hurt people, in an ever-increasing trend. I don't think it's to do with the size of the country, but the education level of the population.


Hm, this doesn't seem to be backed up by, say, PISA scores [1], by which the US looks very similar to its OECD peers.

[1] https://en.wikipedia.org/wiki/Programme_for_International_St...


Looking at the wiki page, PISA seems to have some criticisms, one being it seems very gameable. I'm not really familiar with it.

It might be interesting to try and find some more objective support for my claim, and I'll try and post anything worth posting, but anecdotally... the difference between the US and other developed countries is night and day. It's incredibly easy to run into people in the US who genuinely, astoundingly lack basic knowledge that other countries take for granted in adults.

There's a reason other countries' talk shows don't have segments asking random pedestrians to name any country, literally any country, on a world map, so they can laugh when they fail.


Scientific advancement has suffered from the light pollution, and that advancement is a driving force behind your modern life. So you have suffered (or will suffer) indirectly over time.


> Scientific advancement has suffered from the light pollution

Has it?

Destroying the Amazon destroys information. Light pollution simply raises the cost of our accessing it. I suppose one could model this out to some effect on deep-space astronomy's productivity. But if that effect is real--and I've seen zero evidence it is--the solution is a tax on satellite launches to fund more observatories.


Your response is not in good faith - this is very easy to google.


> this is very easy to google

Then it should be easy to cite. Astronomers have complained. But I haven't seen anyone link that to output, including the complaining astronomers.


Search term: "low earth orbit satellite effects on astronomy" first result:

https://www.nature.com/articles/s41550-023-01904-2


OP said "scientific advancement has suffered from the light pollution," past tense. Your source explores a "potentially large rise in global sky brightness," and an "expected...rapid rise in night sky brightness."

These are not risks to be ignored. But we haven't even observed or quantified them, which is the first step to weighing mitigation options. (Which could be physical, e.g. lowering satellite reflectivity. Or geographic, putting more observatories at higher latitudes. Or even statistical, by launching space-based calibration telescopes, or building more array-based observatories.)


This paper shows how in 2023 scientists were already annoyed by this, that they had to accommodate this into their observations, and adjust their measurements accordingly. Suffered (past tense) may be hyperbolic, but it isn’t untrue either.

This 2023 paper is also issuing a warning that if this continues without mitigation, ground-based astronomy will be affected. They have the calculations to prove it. What they are particularly concerned about is that detecting faint objects in the radio spectrum will become impossible because the signal will be lost in noise.

Now 2 years have passed since this paper was published, and we still don't have mitigations for ground-based radio astronomy. I seriously doubt we ever will. And the predictions of degraded astronomy will come true, the cost externalized for a type of internet you could have gotten with traditional cable, fiber optics, or a 5G radio tower.

EDIT:

> But we haven't even observed or quantified them, which is the first step to weighing mitigation options.

The paper I cited does that. In the abstract they say:

> We present calculations of the potentially large rise in global sky brightness from space objects in low Earth orbit, including qualitative and quantitative assessments of how professional astronomy may be affected.

and inside the paper they devote a whole chapter (chapter 5) to possible mitigations which is titled:

> Mitigations: potential gains and risks


> They have the calculations to prove that

They have calculations that show this is how our models play out.

> What they are particularly concerned about is detecting faint objects inside the radio wave spectrum will be impossible because it will be lost in noise

Could become. They're not talking about mitigation because we haven't observed the problem yet.

> Now 2 years have passed since this paper was published, and we still don’t have mitigations for ground based radio astronomy

Again, where is the "scientific advancement" that "has suffered"?

> seriously doubt we will ever have one

Based on what?!


You wanted to see the computations, I provided you with them, and instead of admitting that you were wrong, you responded by casting doubt on their models. This doesn't strike me as arguing in good faith. But very well, the fifth result for the same search term gave me this:

Vera C. Rubin Observatory – Impact of Satellite Constellations

https://www.lsst.org/content/lsst-statement-regarding-increa...

The Vera Rubin Observatory came online only this June, but they were complaining about Starlink already last year, providing preliminary observations of how the satellites affect their measurements, and describing how they plan to mitigate it.

Both the 2023 paper and the Vera Rubin Observatory statement call for a set of policies to mitigate the effect of these satellites. However, policymakers have not enacted any of them, other than some NSF science grants to study potential solutions (I don't know whether or not they were defunded by DOGE; although if they were, that would seem like a criminal conflict of interest). And I have my reservations about the willingness of governments around the world to come together and set the universal regulatory framework required to enforce these proposed mitigations.

Note that the increased exposure time required because of these satellites will affect the number of available observations, which in turn will decrease the amount of astronomy done with this telescope. I want to note especially the conclusion:

> Overall, large numbers of bright satellites — and the necessary steps to avoid, identify, and otherwise mitigate them — will impact the ability of LSST to discover the unexpected.

When you are disputing this, you are disputing top engineers and scientists in astronomy. You had better have a good reason for that (other than protecting the wealth of billionaires).


> You wanted to see the computations, I provided you with them

No, you didn’t. I asked for evidence this had happened. I read the ‘23 paper two years ago. It’s neat. But it’s a model. We don’t have great model parameters for high-atmosphere nanoparticles. We also have great surveillance of the ozone layer, and aren’t seeing damage.

> other then some NSF science grants to study potential solution

Yeah, I agree with this. (It may have been DOGE’d.)

We need to know what we’re up against. We need to know if it’s a problem that call for a pause, or a mandate that aluminum structures to transitioned to steel and carbon, or if the problem goes away as satellites get bigger and burn up less.

> When you are disputing this you are disputing top engineers and scientists in astronomy

I really am not. I’m taking them at their word that this is a potential problem. Again, if you have evidence this is currently a problem, the language I originally objected to, I’d love to see it.


The scientists and engineers are raising the alarm that this will become an issue if nothing is done. We should not simply ignore them until they are proven right, as we are doing with climate scientists. So much damage can be prevented.


I think your attempted connection between astronomy and modern technological conveniences is pretty thin.


Does your phone have a camera on it?


After just coming back from a trip to Maui, yeah you can totally say the same about flights to Hawaii.

