Based on a LinkedIn post by Anton Strukov: https://lnkd.in/ePMa2TC4
When teams plug LLM-based agents into real engineering work, the wall they hit first is almost never the model. It is context: how to get the right slice of knowledge to the right agent at the right step, without drowning the agent in noise or leaving it to hallucinate the missing parts.
OpenViking reframes that problem. Instead of treating documents as opaque blobs for a vector index, it exposes knowledge as a traversable file system tailored to multi-agent orchestration.
File system as memory — not RAG
Classical RAG collapses corpus structure into a flat sea of chunks. OpenViking keeps the hierarchy explicit: directories, files, and relationships. Agents navigate like engineers navigate a codebase — from `/` to concrete leaves — pulling exactly the subtree the current task needs.
That changes what we can say about an answer. Every response is traceable to a specific path, version, and lineage. The retrieval step becomes auditable, and the failure mode shifts from unknown to locatable.
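To make the idea concrete, here is a minimal sketch of knowledge stored as a tree, where each retrieved file carries its path and version. This is not OpenViking's actual API or schema: `Node`, `resolve`, `subtree`, the example paths, and the integer version counter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in a hypothetical context tree (illustrative, not OpenViking's schema)."""
    name: str
    content: str = ""   # left empty for directories
    version: int = 1    # illustrative version counter
    children: dict = field(default_factory=dict)

    def add(self, child: "Node") -> "Node":
        self.children[child.name] = child
        return child


def resolve(root: Node, path: str) -> Node:
    """Walk from the root to the node at an absolute path such as '/a/b'."""
    node = root
    for part in path.strip("/").split("/"):
        if part:  # the root path "/" produces one empty segment; skip it
            node = node.children[part]  # raises KeyError if the path does not exist
    return node


def subtree(node: Node, prefix: str):
    """Yield (path, version, content) for every file at or under a node.

    In this sketch a node without children is treated as a file.
    """
    if not node.children:
        yield prefix, node.version, node.content
    for child in node.children.values():
        yield from subtree(child, f"{prefix.rstrip('/')}/{child.name}")


# Build a tiny example tree (all paths and contents are made up).
root = Node("/")
services = root.add(Node("services"))
billing = services.add(Node("billing"))
billing.add(Node("README.md", "Billing service overview", version=3))
billing.add(Node("retries.md", "Retry policy: 3 attempts", version=2))
services.add(Node("search")).add(Node("README.md", "Search service overview"))

# Pull only the subtree the current task needs, with provenance attached.
result = list(subtree(resolve(root, "/services/billing"), "/services/billing"))
for path, version, _ in result:
    print(f"{path} (v{version})")
```

Because every item comes back with its path and version, an answer built from `result` can cite exactly which files it used, which is the traceability property described above.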
Recursive retrieval, not a flat query
The agent does not just ask 'find me the chunks most similar to this question'. It walks the directory tree, reads sibling context, inspects metadata, and widens or narrows the window based on what it actually finds. The retrieval trace is a path, not a vector list — something humans can read and review.
For long-running agents this matters. Debugging 'why did the agent answer that?' turns from forensic work into a simple lookup: you open the path it walked.
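The walk described in this section can be sketched as a small recursive function. It is a toy under stated assumptions, not OpenViking's retrieval algorithm: the nested-dict tree, the `_meta` key, the keyword-overlap `relevant` check, and the widening rule are all hypothetical stand-ins for whatever a real system would use.

```python
def relevant(text: str, terms: set) -> bool:
    """Toy relevance check: any query term appears as a word in the text."""
    return bool(set(text.lower().split()) & terms)


def retrieve(tree: dict, query: str, path: str = "", trace=None):
    """Recursively walk a nested-dict tree and return (hits, trace).

    Directories are dicts with an optional "_meta" description; files are strings.
    """
    trace = [] if trace is None else trace
    terms = set(query.lower().split())
    trace.append(f"enter {path or '/'}")
    hits, files = [], []
    for name, entry in tree.items():
        if name == "_meta":
            continue
        child = f"{path}/{name}"
        if isinstance(entry, dict):
            # Inspect directory metadata first; descend only if it looks relevant.
            if relevant(entry.get("_meta", ""), terms):
                hits += retrieve(entry, query, child, trace)[0]
            else:
                trace.append(f"skip {child}")
        else:
            files.append(child)
            if relevant(entry, terms):
                hits.append(child)
                trace.append(f"read {child}")
    # Widen: nothing matched in or below this directory, so fall back to all
    # of its files as sibling context.
    if not hits and files:
        trace.append(f"widen {path or '/'}")
        hits = files
    return hits, trace


# Made-up knowledge tree for illustration.
tree = {
    "_meta": "engineering knowledge base",
    "billing": {
        "_meta": "billing service payments invoices",
        "retries.md": "retry policy for failed payments",
        "schema.md": "invoice table columns",
    },
    "search": {
        "_meta": "search service indexing",
        "ranking.md": "ranking features",
    },
}

hits, trace = retrieve(tree, "payments retry")
print(hits)
print("\n".join(trace))
```

The returned `trace` is a plain list of enter, skip, read, and widen steps. That list is the reviewable artifact: a human can read it top to bottom and see why a file was or was not pulled into context.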
Two lenses we apply to every AI infra bet
The first lens is architecture. Is retrieval observable and composable, or is it a black box you can only tune through top-k and re-ranking? OpenViking leans into the observable side.
The second lens is unit economics. Every token re-injected into a prompt costs money and latency. A retrieval path that fetches the right 2 files instead of 200 noisy chunks is not just cleaner — it is cheaper and faster at runtime.
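A back-of-the-envelope calculation shows the shape of that argument. Every number here is an assumption picked for illustration, not a measurement: 300 tokens per chunk, 1,500 tokens per file, and a price of 3 USD per million input tokens.

```python
def prompt_tokens(units: int, tokens_per_unit: int) -> int:
    """Total tokens injected into the prompt by a retrieval result."""
    return units * tokens_per_unit


def prompt_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Input cost in USD for a given number of prompt tokens."""
    return tokens * usd_per_million_tokens / 1_000_000


# Hypothetical figures, chosen only to illustrate the comparison.
PRICE = 3.0                                 # USD per million input tokens (assumed)
chunk_tokens = prompt_tokens(200, 300)      # 200 noisy chunks of ~300 tokens each
file_tokens = prompt_tokens(2, 1_500)       # 2 targeted files of ~1,500 tokens each

chunk_cost = prompt_cost(chunk_tokens, PRICE)
file_cost = prompt_cost(file_tokens, PRICE)
token_ratio = chunk_tokens // file_tokens

print(f"chunks: {chunk_tokens} tokens, ${chunk_cost:.3f} per call")
print(f"files:  {file_tokens} tokens, ${file_cost:.3f} per call")
print(f"ratio:  {token_ratio}x fewer tokens")
```

Under these assumed figures the targeted path injects 3,000 tokens instead of 60,000 per call. The exact ratio depends entirely on real chunk sizes, file sizes, and pricing, but the gap compounds for an agent that retrieves at every step.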
The interesting part is not 'OpenViking vs vector DB'. It is the pattern: when you are building agents that live longer than a single prompt, context retrieval becomes an engineering system, not a library call. It wants a shape, a schema, and an audit trail.
For teams betting on agentic AI in production, that shift — from prompt engineering to context engineering — is where the next margin of reliability and cost efficiency is going to come from.
#AIEngineering #AgenticAI #DeveloperTools #LLM #RAG