For general text, you run some kind of vector search against the full-text corpus to find relevant hits and where they occur. Then you feed that first round of results into a ranking/filtering stage that does pairwise comparison between the chunks that scored well in the vector search. Contract or expand the chunks until you reach the limit of your model's context window, then run against the original query.
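A minimal sketch of that retrieve-then-rerank loop. Everything here is a toy stand-in: `embed` is a bag-of-words counter rather than a real embedding model, the pairwise reranker is collapsed to a per-chunk cosine score, and the context window is approximated by a character budget.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=5):
    # First round: score every chunk against the query, keep the top k.
    q = embed(query)
    scored = sorted(corpus, key=lambda c: cosine(q, embed(c)), reverse=True)
    return scored[:k]

def rerank_and_pack(query, hits, budget_chars=200):
    # A real reranker would score (query, chunk) pairs with a
    # cross-encoder; here the pairwise step is collapsed to re-scoring.
    q = embed(query)
    ranked = sorted(hits, key=lambda c: cosine(q, embed(c)), reverse=True)
    packed, used = [], 0
    for chunk in ranked:  # expand until the context budget is spent
        if used + len(chunk) > budget_chars:
            break
        packed.append(chunk)
        used += len(chunk)
    return packed

corpus = [
    "etags generates an index of definitions in source files",
    "vector search finds semantically similar text chunks",
    "the context window limits how much text the model sees",
]
context = rerank_and_pack("vector search for text",
                          retrieve("vector search for text", corpus))
```

The point of the two-stage shape is that the cheap first pass casts a wide net and the expensive second pass only has to rank the survivors.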
For source code you are even luckier, since there are plenty of deterministic tools that provide solid grounding (e.g., etags), and the languages themselves enforce a hierarchical, tree-like structure on the source, viz. block statements. That means ranking and chunking strategies, which are a huge pain for general text, come already solved.
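To illustrate why chunking is free for code: the language's own block structure gives you the boundaries. A sketch using Python's stdlib `ast` module to split a module into one chunk per top-level definition (an etags- or tree-sitter-based version would work the same way for other languages):

```python
import ast

SOURCE = '''
def add(a, b):
    return a + b

class Greeter:
    def hello(self):
        return "hi"
'''

def chunk_by_definition(src):
    # Each top-level function/class becomes one retrieval chunk,
    # with the definition name as a deterministic key.
    tree = ast.parse(src)
    chunks = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            chunks[node.name] = ast.get_source_segment(SOURCE, node)
    return chunks

chunks = chunk_by_definition(SOURCE)
```

No heuristics about paragraph breaks or sliding windows: the parser hands you semantically complete units, and the symbol names double as exact-match keys for ranking.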
The vector search is then just an enrichment layer on top which brings in documentation and other soft grounding text that keeps the LLM from going berserk.
Of course, none of the commercial offerings come even close to letting you do this well. Even the dumb version of search needs to be a self-recursive agent equipped with a good set of vector embeddings and the ability to decide whether it has searched enough before it starts answering your questions.
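The "decide whether it has searched enough" part can be sketched as a loop with an explicit stopping rule. This is hypothetical scaffolding: `search` is a toy keyword matcher and `refine` just borrows a term from the best hit, where a real agent would ask the LLM for a follow-up query and stop when new rounds stop surfacing new evidence.

```python
def search(query, corpus):
    # Toy retrieval: any document sharing a term with the query.
    terms = set(query.lower().split())
    return [doc for doc in corpus if terms & set(doc.lower().split())]

def refine(query, hits):
    # Hypothetical refinement step: a real agent would have the LLM
    # propose the next query; here we append a term from the top hit.
    return query + " " + hits[0].split()[-1] if hits else query

def agentic_search(query, corpus, max_rounds=3):
    seen = set()
    for _ in range(max_rounds):
        hits = search(query, corpus)
        new = [h for h in hits if h not in seen]
        if not new:  # stopping rule: this round found nothing new
            break
        seen.update(new)
        query = refine(query, new)
    return sorted(seen)

results = agentic_search("vector",
                         ["vector search index", "index of symbols"])
```

The first round only matches the first document; the refined query picks up "index" and pulls in the second, and the loop stops once a round adds nothing.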
If you're interested, drop a line to the email on my profile.