19 replies

RAG at scale: handling tons of technical docs

🍼Noob

I’m looking for advice on the current best practices for building a RAG system over a large corpus of technical documents (think specs, manuals, internal docs, etc.).

Context:

* Very large document set (tens/hundreds of thousands of files)
* Mostly technical text (structured + unstructured)
* Need accurate retrieval, minimal hallucinations

Questions:

1. What architectures are people using in 2025/2026? (classic embeddings + vector DB vs hybrid vs graph-RAG?)
2. Recommended chunking strategies for technical docs?
3. How are you handling evaluation + grounding quality?

Would love real-world lessons learned or links to solid repos/blogs. The guidance available online feels scattered, and I haven’t found much high-quality research on this topic yet.

Theo's Typesafe Cult•2mo ago•

Plaster
