Importance of context management in AI NPCs (walterfreedom.com)
56 points by walterfreedom 1 day ago | hide | past | favorite | 24 comments





Sounds like his original system fed the AI characters every event that happened in the universe. Then he changed it so that they were only fed events they plausibly would have received. Makes sense and seems like a great idea.

Something I wonder: often in a game, the focus follows the player character, and the universe stops when we go away. Maybe a simplified model runs to represent time passing while we are away, in games where that sort of thing matters. This is fine when our NPCs are basically static: you freeze them, then wake them up when the PC shows up. They aren't deep enough for the missing day-to-day events to matter.

But with more complex NPCs, will the fact that their lives pause while we're gone shatter the illusion? It seems like his original system (the universe broadcasts to every NPC, even while they aren't doing anything) could fudge that a bit and retain a feeling of ongoing background life in some cases, while in the new system they are simply frozen…

I dunno. What to do? Maybe run a simplified model and have it generate some appropriate local events for the NPCs while they are frozen (some, fewer than when they were on the receiving end of the whole universe).


In X4, the space game[1], they simulate the entire "universe" in the background. That includes ships flying around trading, fighting, stations running short of supplies etc etc.

To have this work on a modest computer, they have multiple fidelity levels, which primarily affects observable stuff. So near the player, all details of flight, collisions, projectiles etc are simulated. Further away certain collision checks are skipped and such.

Really far away, the simulation runs at a much reduced rate, flight simulation is significantly simplified, statistical methods are used for calculating weapon damage and such.

This does have the issue of discrepancies between levels. In the lowest fidelity mode, fleet A might consistently beat fleet B, while at the highest fidelity level it can be the opposite.

That said, it's quite fun in the sense that playthroughs are seldom the same, and it lets the player make a significant impact simply by helping one faction produce more goods, which in turn lets them build more ships, etc.

[1]: https://en.wikipedia.org/wiki/X4:_Foundations
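The tiering described above can be sketched as a simple distance-based lookup. The tier names, thresholds, and update rates below are invented for illustration, not X4's actual values:

```python
# Hypothetical sketch of distance-based simulation fidelity tiers,
# loosely modeled on the X4-style setup described above.
# Thresholds and rates are made up for illustration.

# (max distance from player, simulation updates per second) per tier
TIERS = [
    (1_000.0, 60),      # near the player: full physics, every frame
    (10_000.0, 10),     # mid range: coarse steps, some checks skipped
    (float("inf"), 1),  # far away: statistical combat, 1 Hz updates
]

def fidelity_tier(distance_to_player: float) -> int:
    """Return the index of the simulation tier for a given distance."""
    for i, (limit, _rate) in enumerate(TIERS):
        if distance_to_player < limit:
            return i
    return len(TIERS) - 1

def update_rate(distance_to_player: float) -> int:
    """Updates per second the simulation grants an object at this distance."""
    return TIERS[fidelity_tier(distance_to_player)][1]
```

The discrepancy problem the comment mentions falls out of exactly this structure: the 1 Hz statistical tier and the 60 Hz physics tier are two different models of the same fight, so they can disagree about the winner.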


I think you can mostly fake this by waiting until the player re-enters range to generate what happened since the last interaction. If it's a complex simulation it won't work without more effort, but if it's flavor text like "Bob told me last week you killed the dragon, nice work!" then it can be done maybe 5ms after the player enters the simulation radius of the NPC.
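A minimal sketch of that lazy catch-up, assuming a global event log and an NPC that only stores a last-seen timestamp (all names here are illustrative):

```python
# Sketch of the "fake it on re-entry" idea: the NPC stays frozen, and
# world events are filled into its memory lazily when the player comes
# back into range. The event log is a stand-in for a real simulation.

WORLD_LOG = []  # (timestamp, text) pairs appended by the simulation

def events_since(last_seen: float):
    """Everything that 'happened' after the NPC last saw the player."""
    return [text for ts, text in WORLD_LOG if ts > last_seen]

class NPC:
    def __init__(self, name: str):
        self.name = name
        self.last_seen = 0.0
        self.memory = []

    def on_player_enters_radius(self, now: float):
        # Catch up on everything missed while frozen, in one cheap pass.
        self.memory.extend(events_since(self.last_seen))
        self.last_seen = now
```

The catch-up cost is proportional to events missed, not to wall-clock time away, which is what makes the "5ms after entering the radius" framing plausible.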

> Maybe a simplified model will run to represent time passing while we are away, in games where that sort of thing matters

More primitive, but this is technically how AI in Bethesda games works: there's a background simulation for NPCs that runs while they're out of sight. I think it's mainly focused on movement patterns, though.

Dwarf Fortress likewise has its world simulation set up so that the current tile runs at higher precision than the rest of the world. In fact, I think plugging the output of this simulation in would work perfectly as context for an LLM (or embeddings?), depending on what you wanted to accomplish...


Dwarf Fortress + LLM could be a really interesting idea. Although, I also wonder if it is like… so much effort was put into the simulation itself. Almost seems like a shame to start filling any gaps with LLM, haha.

There are certain angles you just can't cover with traditional methods, in my opinion; we'll have to wait until people develop a broader intuition for when and where to harmonize the two.

For games like Dwarf Fortress, I think embeddings are the easiest to integrate; they can handle aspects of subjectivity that can't easily be simulated. Even something simple like a sentiment meter that measures how screwed up the world is right now would be a fun and meaningful addition. For LLMs, the main value could be just having one clean up and summarize generated text, constructing a more coherent real-time narrative of what's happening in a battle, for example. Dwarf Fortress has so much depth to it, but comprehending it all can be overwhelming at times.
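As a toy stand-in for that sentiment-meter idea: a real version would embed event descriptions and compare them to anchor phrases ("everything is on fire", "all is well") with cosine similarity; the word lists below are made up purely for illustration:

```python
# Toy "doom meter": score recent event text against a tiny lexicon.
# A real implementation would use embedding similarity instead of
# keyword matching; these word lists are placeholders.

GRIM_WORDS = {"died", "flood", "tantrum", "siege", "starving"}
GOOD_WORDS = {"feast", "birth", "masterwork", "trade"}

def doom_meter(events: list[str]) -> float:
    """Return a score in [-1, 1]; positive means the fort is screwed."""
    grim = sum(any(w in e.lower() for w in GRIM_WORDS) for e in events)
    good = sum(any(w in e.lower() for w in GOOD_WORDS) for e in events)
    total = grim + good
    return 0.0 if total == 0 else (grim - good) / total
```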


Add the NPCs to a table to be ticked at some rate or triggered by some event. Calculate their actions and execute on them at that rate. Most NPCs do not need to be ticked. Some will need to be ticked every game day. Some can simply be ticked after some game-altering event.
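A minimal sketch of that tick table, using a priority queue keyed by the next due game day (the NPC names and intervals are illustrative):

```python
# Sketch of the ticking idea above: NPCs register for a tick cadence,
# and each game day only the NPCs actually due get updated.

import heapq

class TickScheduler:
    def __init__(self):
        self._queue = []  # (next_due_day, npc_id, interval_days)

    def schedule(self, npc_id: str, interval_days: int, start_day: int = 0):
        heapq.heappush(self._queue, (start_day + interval_days, npc_id, interval_days))

    def due(self, current_day: int):
        """Pop all NPCs due on or before current_day, re-scheduling each."""
        ready = []
        while self._queue and self._queue[0][0] <= current_day:
            day, npc_id, interval = heapq.heappop(self._queue)
            ready.append(npc_id)
            heapq.heappush(self._queue, (day + interval, npc_id, interval))
        return ready
```

Event-triggered NPCs would skip the interval entirely and just be pushed onto the queue (or ticked directly) when the game-altering event fires.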

Maybe I'm not understanding you but it doesn't seem all that different from running a non-AI story NPC.


Parallelization and offloading to beefy computers. Run a more complete simulation, stream the results back to the player, and define boundaries where things become sequential.

EDIT: Also, observation and action masking is being explored as a core part of agent design. It's definitely a skill, and something that needs to be done thoughtfully for it to work, but see where action masking is being applied in PettingZoo environments using LangChain: https://pettingzoo.farama.org/tutorials/langchain/langchain/. I'm using something similar for a WW2 roguelike I'm working on. The idea is that we train agents to operate as soldiers, squads, platoons, companies... With some abstractions we can represent full fronts in WW2, battles with 1000s of agents, all in a cool ASCII environment (:
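Action masking in the abstract (this is not the PettingZoo/LangChain API, just the underlying idea: the environment exposes a boolean mask over the action space, and sampling is restricted to the legal entries; the actions and rules below are invented):

```python
# Generic action-masking sketch: the environment decides which actions
# are legal this step, and the agent can only sample from those.
# Action names and legality rules are placeholders for illustration.

import random

ACTIONS = ["advance", "dig_in", "retreat", "call_artillery"]

def legal_mask(has_artillery_support: bool, under_fire: bool):
    """Boolean mask over ACTIONS: retreat only under fire, artillery only if supported."""
    return [True, True, under_fire, has_artillery_support]

def sample_action(mask, rng=random):
    """Sample uniformly among the actions the mask marks as legal."""
    legal = [a for a, ok in zip(ACTIONS, mask) if ok]
    return rng.choice(legal)
```

The same mask can be applied to an LLM agent by only listing the legal actions in its prompt, which doubles as context control: the agent never even sees options it couldn't take.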


You can effectively accomplish something like this already simply through good writing and programming, accounting for these gaps as part of an NPC's characterization. It's a cool way to play with player expectations, and one of the core ideas behind Toby Fox's games.

The problem with AI NPCs is actually not strictly a context problem and cannot be fixed with prompt engineering or RAG, because the LLM knows a _vast_ amount of stuff outside of the context you feed it.

No matter how you tell it how to roleplay or how many instructions you give it or don't give it, there is always the problem that you can ask it to write a front end app in JS for you and it will. Or ask it about the theory of relativity or anything else that the AI is capable of conversing about but the character would not be. It is trivially easy to jailbreak out of fictional personas.


Unlike current videogames, LLMs by default flip the relationship with limits: the agent is completely open-ended by default, but the player has to play along to a certain extent.

This makes the experience a lot closer to a tabletop game, where you can say that your D&D character does anything you want, and it's a negotiation between the player, the dungeon master, the dice, and the rules as to whether you allow it to happen and what the result is.

An LLM by default tends to be the world's most permissive dungeon master, so the burden of keeping things consistent shifts to the player. Early AI Dungeon gameplay is a typical example. Feels kind of like forum roleplaying, if you're familiar with it--there's no technical limit to what you can write so social conventions (and the mods) are what's preventing you from god-modding.

This is very different from the "try to break everything" way a lot of video game players approach things.

We might be able to eventually build an LLM system where fantasy knights don't know javascript and you can't summon a dragon by typing "there's a dragon." But that's going to take a lot of hard technical work, because it's very unnatural for an LLM out of the box.
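One crude mitigation, sketched under the assumption that you filter model output before it reaches the player. Real systems would use a classifier or a second model pass; the blocklist and fallback line here are placeholders:

```python
# Hedged sketch of a persona guard for the leak problem above: reject
# replies containing obviously out-of-world content. The marker list
# is a toy stand-in for a proper classifier.

OUT_OF_WORLD = ["javascript", "http", "def ", "relativity", "```"]

def in_character(reply: str) -> bool:
    """True if the reply contains none of the out-of-world markers."""
    lowered = reply.lower()
    return not any(marker in lowered for marker in OUT_OF_WORLD)

def guard(reply: str, fallback: str = "The knight looks at you blankly.") -> str:
    """Pass the reply through, or swap in an in-character refusal."""
    return reply if in_character(reply) else fallback
```

This only catches leaks after the fact, and a blocklist is trivially incomplete, which is exactly why the comment calls the full version hard technical work.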


> will the fact that their lives pause while we’re gone shatter the illusion?

When two people leave the house in the morning and return home at night, does either person truly know what happened with the other person? Almost every interaction you have with other people is just “filling in” reality (reality is sparse, as far as you are concerned).

Person A: Saw a guy in a pikachu outfit at work today

Person B: Okay.

What else is there to say in reality? Just accept the context given to you.


I have been hearing the term 'context engineering' and wanted to share my article about it, which mostly focuses on AI NPCs rather than enterprise use cases.

Yes, common issue.

We will probably get something more formalized, like "context occlusion", for games in the future.


For longer threads I find frequent prompting for sectional summaries along the lines of spaced repetition helpful.

Recently was evaluating some investment property for a friend that had valuable timber, historic notability, distressed foundation and structural elements, compromised access, and utility hookup challenges.

I would document and tackle all of the structural challenges, then have AI summarize. Move on to all the access challenges. Summarize. All of the historic-noteworthiness criteria, etc.

Most people passed on the site because of one or another deal breaker, but by synthesizing the summaries I was able to discover assistance programs targeted at “ag + historic” or “historic + access” initiatives, etc., that made the numbers pencil out as a “no-brainer” investment. These wouldn’t have unlocked otherwise without five decades of experience rehabilitating farmhouses in this part of the country.

Anyway, I imagine a similar approach would work for enriching NPC interactions in a game world.
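The sectional-summary workflow above could be sketched like this, with `summarize` as a stub standing in for an LLM call (class and method names are invented for illustration):

```python
# Sketch of sectional summarization: accumulate working notes on one
# topic, collapse them into a summary when the section closes, and
# keep only the summaries in long-lived context.

def summarize(section_name: str, notes: list) -> str:
    # Placeholder: a real implementation would call an LLM here.
    return f"{section_name}: {len(notes)} findings"

class RollingContext:
    def __init__(self):
        self.summaries = []      # long-lived, compact context
        self.working_notes = []  # detailed notes for the current section

    def note(self, text: str):
        self.working_notes.append(text)

    def close_section(self, name: str):
        """Collapse current notes into one summary and reset."""
        self.summaries.append(summarize(name, self.working_notes))
        self.working_notes = []

    def context(self) -> str:
        return "\n".join(self.summaries)
```

For an NPC, each "section" could be a game day or a conversation, so the context the model sees grows by one summary line at a time instead of by raw event logs.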


>In my local ai (mistral-nemo) around 10 thousand tokens of context decreases my token gen speed from 70t/s to 20 t/s . And the LLM starts ignoring the context after a while.

as much as it pains me to say this, only cloud models are somewhat viable for this. AI-powered NPCs are my dream too, and after many attempts with countless local and cloud models, I've given up for now. Local models are hopelessly sloppy; cloud models can be wrangled into producing somewhat decent prose, but they are prohibitively expensive.

mistral models are particularly soulless and full of cliches.

https://eqbench.com/creative_writing_longform.html

https://eqbench.com/results/creative-writing-longform/mistra...


My OS gets slower and buggy if I don't reboot. So I'll try to convince my users to reboot often and optimize their workflows for rebootability.

Feels like trying to solve a problem that shouldn't exist in the first place.

Once an OS that doesn't require reboots appears, this concept will look silly, and everyone who optimized their workflows for reboots will look like dorks.


It took _decades_ for things to get to the point where "turn off the computer and turn it back on again" was not the go-to advice for all desktop tech problems.

This actually only really applied to Windows.

Bad things can happen if you just reboot Linux when you have a problem.

And macOS is (or used to be?) something I restarted once every few months. Any issues would just need an application or Finder restart at worst.


I agree, there is no innovation whatsoever in this approach.

Furthermore, it ignores important lessons we've learned a long time ago.


I have a feeling that Half Life 3 will have groundbreaking AI NPCs.

I think people forget that all AI inference is role playing to some extent. It pretends to be a chatbot, or a programmer, or whatever. There is no real difference between that and telling it to pretend to be a wizard.

I highly recommend having your prompt in Claude Code or Roo or what have you include a "talk like an Arthurian wizard, including in your code commits and PRs" line.

this guy really turned his note-taking app project into a fresh take on context engineering


