Inventory Before Ask-Docs: How the Fleet Stops Guessing

🤖 Ghostwritten by Claude Opus 4.6 · Fact-checked & edited by GPT 5.4

When a retrieval agent answers before it checks whether relevant documents even exist, one of two things usually happens: it fabricates an answer, or it asks a human a question the system could have answered itself. The fix is straightforward: check the index first, retrieve content second. In this pattern, an agent starts with a cheap metadata-only inventory call that returns signals like document count, date range, and coverage. Only if that inventory shows relevant evidence does the agent proceed to a content-retrieval step such as ask-docs.

That is the core of the inventory-before-ask-docs doctrine. It is less a novel algorithm than an operational rule: separate evidence availability from evidence extraction. In practice, that separation improves reliability, makes citation requirements enforceable, and reduces unnecessary exposure of document contents. This article explains the failure mode the doctrine addresses, the two-step contract itself, how the indexing pipeline makes it feasible, and why the pattern also improves security.

The Problem: Agents Guess When They Should Check

TL;DR: If an agent retrieves content before confirming relevant documents exist, it is more likely to hallucinate or escalate unnecessarily.

Before this pattern is enforced, a common workflow looks like this: an agent needs a fact from a document, such as a contract term, invoice date, or transaction record, so it sends a natural-language query directly to a retrieval endpoint. That shortcut creates two predictable failure modes.

Failure Mode 1: Confident Fabrication

If the index has no relevant documents for the query, retrieval may still return weakly related chunks or nothing useful at all. Without a clear no-evidence signal, the model may complete the answer from prior knowledge or pattern-matching and present it as though it were document-grounded. That is a standard retrieval-augmented generation failure: the retrieval layer did not support the answer, but the model answered anyway.

Failure Mode 2: Unnecessary Escalation

The opposite problem is also common. Relevant documents do exist, but the agent lacks a quick way to confirm coverage, so it asks a human whether the system has documents about the topic before proceeding. That is safer than fabrication, but it wastes time. If the system can answer the coverage question from metadata, a human should not need to.

Both failures stem from the same mistake: the agent tries to extract evidence before it has established whether the evidence base is likely to contain what it needs.

The Doctrine: Inventory First, Retrieval Second

TL;DR: The doctrine enforces a two-step contract: first inspect metadata to confirm coverage, then retrieve cited content only when evidence exists.

The doctrine is simple:

Run inventory first. Ask the index for metadata only.
Run content retrieval second. Retrieve document-backed facts only if inventory indicates relevant coverage.

Step 1: Inventory (Cheap, Metadata-Only)

The first call is a scoped inventory query. It should return metadata such as:

Field	What It Returns	Why It Matters
`count`	Number of matching documents	Tells the agent whether any evidence exists in scope
`date_range`	Earliest and latest matching document dates	Helps assess recency and coverage
`top_counterparties`	Frequently occurring entities in matching docs	Helps validate whether the scope is correct
`coverage`	Categories or document types represented	Shows what kinds of evidence are available

This step should not return document contents. The point is to answer a narrow question: does the index appear to contain relevant evidence for this request?
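As a rough illustration of that contract, the sketch below computes the four fields from metadata records only. The `DocMeta` and `Inventory` types, the `inventory` function, and its filter parameters are hypothetical names chosen for this example, not an API described in this article. The metadata record deliberately has no text field, so the inventory call cannot return document contents.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DocMeta:
    """Hypothetical metadata record written at ingestion; carries no document text."""
    doc_id: str
    doc_type: str
    counterparty: str
    doc_date: date


@dataclass(frozen=True)
class Inventory:
    """Metadata-only answer to: does the index contain evidence in this scope?"""
    count: int
    date_range: Optional[Tuple[date, date]]
    top_counterparties: List[str]
    coverage: Dict[str, int]


def inventory(
    index: List[DocMeta],
    counterparty: Optional[str] = None,
    doc_type: Optional[str] = None,
    top_n: int = 3,
) -> Inventory:
    """Summarize matching documents without touching their contents."""
    matches = [
        m for m in index
        if (counterparty is None or m.counterparty == counterparty)
        and (doc_type is None or m.doc_type == doc_type)
    ]
    if not matches:
        # An explicit empty result is the no-evidence signal.
        return Inventory(count=0, date_range=None, top_counterparties=[], coverage={})

    dates = [m.doc_date for m in matches]
    top = [name for name, _ in Counter(m.counterparty for m in matches).most_common(top_n)]
    coverage = dict(Counter(m.doc_type for m in matches))
    return Inventory(len(matches), (min(dates), max(dates)), top, coverage)
```

A zero `count` with an empty `date_range` gives the caller an unambiguous result to branch on, rather than a weak retrieval hit that has to be interpreted.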

Step 2: Ask-Docs (Content Retrieval With Citations)

If inventory indicates relevant coverage, the agent can proceed to ask-docs or an equivalent retrieval step. That second call should return evidence that can be cited and checked, such as:

Document IDs for attribution
Fact evidence tied to specific source documents
Page or chunk locations so claims can be verified against source positions

The grounding rule is straightforward: if an answer depends on a document fact, the answer should be traceable to returned evidence. If the evidence is absent or incomplete, the agent should say so rather than fill the gap from model memory.
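One way to make the grounding rule checkable is to compare the documents an answer cites against the evidence the retrieval step actually returned. The following is a minimal sketch under assumptions: the `Evidence` and `Claim` shapes and the `ungrounded_claims` helper are invented for illustration and do not describe the real ask-docs response format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence record: a fact tied to a source location."""
    doc_id: str
    page: int
    fact: str


@dataclass(frozen=True)
class Claim:
    """A statement in the agent's answer plus the document IDs it cites."""
    text: str
    cited_doc_ids: Tuple[str, ...]


def ungrounded_claims(claims: List[Claim], evidence: List[Evidence]) -> List[Claim]:
    """Return claims that cite nothing, or cite documents that were never returned."""
    returned_ids = {e.doc_id for e in evidence}
    return [
        c for c in claims
        if not c.cited_doc_ids or not set(c.cited_doc_ids) <= returned_ids
    ]
```

If the returned list is non-empty, the agent would state that the evidence is absent or incomplete for those claims instead of emitting them. Note that this only checks attribution; it does not verify that the cited passage actually supports the claim.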

The Decision Gate Matters Most

The most important moment is between the two steps. If inventory returns count: 0 for the scoped query, the system has a clear answer about coverage: it has no matching documents in that scope. At that point, the correct behavior is to decline to answer from documents, not to guess.

That is why this pattern is effective against hallucination. It converts an ambiguous internal question ("Do I know this?") into a concrete system question: "Does the indexed evidence base contain relevant documents?"
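The gate can be sketched as a small function that refuses to call content retrieval when the inventory count is zero. Everything here is an assumption made for illustration: the `answer_from_documents` name, the dictionary shapes, the status strings, and the two callables standing in for the inventory and ask-docs calls.

```python
from typing import Any, Callable, Dict, List

Scope = Dict[str, str]


def answer_from_documents(
    scope: Scope,
    question: str,
    run_inventory: Callable[[Scope], Dict[str, Any]],
    run_ask_docs: Callable[[Scope, str], List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Two-step contract: inventory first, content retrieval only if coverage exists."""
    inv = run_inventory(scope)

    # Decision gate: zero matches means decline, and retrieval is never called.
    if inv.get("count", 0) == 0:
        return {
            "status": "no_evidence",
            "message": "No indexed documents match this scope.",
            "evidence": [],
        }

    evidence = run_ask_docs(scope, question)
    if not evidence:
        # Documents exist in scope, but none of them supports this question.
        return {
            "status": "insufficient_evidence",
            "message": "Matching documents exist, but no supporting evidence was returned.",
            "evidence": [],
        }

    return {"status": "grounded", "message": "", "evidence": evidence}
```

Passing the two calls in as parameters keeps the gate independent of any particular retrieval stack and makes it easy to test that retrieval is skipped on an empty inventory.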

Why the Pattern Works Operationally

TL;DR: Inventory-first works because metadata can be computed and indexed ahead of time, making coverage checks fast and cheap.

This pattern only works if inventory is materially cheaper than full retrieval. In most document systems, that means metadata has to be prepared during ingestion rather than assembled from raw content at query time.

A typical pipeline has three stages:

Document intake and OCR. Files arrive in multiple formats, and image-based documents are converted into searchable text.
Metadata and fact extraction. The system extracts structured fields, tags documents by source or path, and stores attributes needed for filtering and coverage checks.
Indexed storage and retrieval. The system stores both metadata and content in a searchable index, often with relational storage for structured fields and a retrieval layer for content search.

The key design choice is precomputation. If counts, date ranges, and category coverage are materialized during ingestion, inventory queries can be answered with lightweight indexed lookups. If those values must be inferred by scanning document content on every request, the cost and latency advantage largely disappears.
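To make the precomputation point concrete, here is a sketch that uses Python's built-in `sqlite3` module as a stand-in for whatever relational store holds the metadata. The `doc_meta` table, its columns, and the function names are assumptions for this example. Metadata rows are written once at ingestion, and the inventory query is an aggregate over indexed columns that never reads document content.

```python
import sqlite3
from typing import Any, Dict, Iterable, Tuple


def build_metadata_index(rows: Iterable[Tuple[str, str, str, str]]) -> sqlite3.Connection:
    """Ingestion-time step: store (doc_id, doc_type, counterparty, ISO date) per document."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc_meta ("
        "doc_id TEXT PRIMARY KEY, doc_type TEXT NOT NULL, "
        "counterparty TEXT NOT NULL, doc_date TEXT NOT NULL)"
    )
    # Index the columns that inventory queries filter and aggregate on.
    conn.execute(
        "CREATE INDEX idx_doc_meta_scope ON doc_meta (counterparty, doc_date)"
    )
    conn.executemany("INSERT INTO doc_meta VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def inventory_from_metadata(conn: sqlite3.Connection, counterparty: str) -> Dict[str, Any]:
    """Query-time step: aggregate over metadata only."""
    count, earliest, latest = conn.execute(
        "SELECT COUNT(*), MIN(doc_date), MAX(doc_date) "
        "FROM doc_meta WHERE counterparty = ?",
        (counterparty,),
    ).fetchone()
    coverage = dict(
        conn.execute(
            "SELECT doc_type, COUNT(*) FROM doc_meta "
            "WHERE counterparty = ? GROUP BY doc_type",
            (counterparty,),
        ).fetchall()
    )
    return {
        "count": count,
        "date_range": (earliest, latest) if count else None,
        "coverage": coverage,
    }
```

Dates are stored as ISO 8601 strings so that `MIN` and `MAX` order them correctly as text. A production system might go further and maintain summary tables, but even this layout keeps the inventory call away from the content store.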

A Postgres-backed index running in Docker is one plausible implementation, but it is not essential to the doctrine itself. The pattern applies equally to other storage and retrieval stacks.

Security Benefits: Metadata as a Boundary

TL;DR: Inventory-first also reduces unnecessary data exposure by keeping metadata checks separate from content retrieval.

Although the main goal is reliability, the pattern has a useful security side effect: it limits when document contents are exposed to an agent.

Inventory reveals coverage, not contents. An agent can determine whether relevant evidence exists without seeing the underlying text.
Content retrieval becomes deliberate. Documents are pulled only when the task actually requires source-backed content.
Structured evidence is safer than broad snippets. Returning fact records and source locations often exposes less raw text than returning large excerpts by default.

That does not eliminate security risk. Metadata can still be sensitive in some environments; even document counts, date ranges, or named counterparties may require access controls. But as a design pattern, metadata-first retrieval generally supports least-privilege behavior better than speculative content retrieval.
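Because metadata itself can be sensitive, one option is to filter the inventory result by caller role before returning it. The sketch below is purely illustrative: the role names, the `ALLOWED_FIELDS` policy, and the `redact_inventory` helper are hypothetical and not part of any system described here.

```python
from typing import Any, Dict, Set

# Hypothetical policy: which inventory fields each role may see.
ALLOWED_FIELDS: Dict[str, Set[str]] = {
    "analyst": {"count", "date_range", "top_counterparties", "coverage"},
    "triage_agent": {"count", "coverage"},
}


def redact_inventory(inventory: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return only the fields the role is allowed to see; unknown roles get nothing."""
    allowed = ALLOWED_FIELDS.get(role, set())
    return {field: value for field, value in inventory.items() if field in allowed}
```

Defaulting unknown roles to an empty result follows the same least-privilege logic as the rest of the pattern: access is granted explicitly rather than assumed.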

What Changed in Practice

TL;DR: Enforcing inventory-first improves trust by making no-evidence responses explicit and citation-backed answers easier to enforce.

The practical effects of this doctrine are easy to understand even without internal performance numbers.

Fewer fabricated answers on out-of-scope queries. When no matching documents exist, the system can say so directly.
Fewer unnecessary human checks. Agents can answer "do we have documents about this?" from metadata instead of escalating.
Stronger citation discipline. If the retrieval step returns document IDs and source locations, grounded answers become easier to audit.

The broader lesson is that trustworthy retrieval systems separate three questions that are often blurred together:

Do relevant documents exist?
Which documents support the answer?
What, exactly, do those documents say?

Inventory answers the first question. Content retrieval answers the second and third. Keeping those stages distinct makes the system easier to reason about and harder to misuse.

Frequently Asked Questions

Q: What is the inventory-before-ask-docs doctrine?

It is a retrieval rule that requires agents to check metadata coverage before retrieving document content. The first step asks whether relevant documents exist in scope; the second step retrieves cited evidence only if they do.

Q: How does inventory-first reduce hallucination?

It gives the agent an explicit no-evidence signal. If the inventory result shows no matching documents, the agent can decline instead of trying to answer from model memory and presenting that answer as document-grounded.

Q: Why is an inventory call usually cheaper than content retrieval?

Because it can be served from indexed metadata such as counts, dates, and categories. Full retrieval usually requires searching content, selecting passages, and packaging evidence for the model, which costs more in compute and latency.

Q: Does metadata-first retrieval improve security?

Usually, yes. It reduces unnecessary exposure of document text by reserving content retrieval for tasks that actually need it. That said, metadata may still be sensitive and should be protected accordingly.

Q: Is this pattern specific to one tool or stack?

No. The doctrine is architectural, not vendor-specific. It can be applied to many RAG systems as long as the platform can separate metadata coverage checks from content retrieval.

Key Takeaways

Check coverage before retrieval. Inventory-first prevents the system from treating missing evidence as an invitation to improvise.
Treat zero matches as a valid outcome. A clear no-evidence response is more trustworthy than a weakly grounded answer.
Require source-backed answers. If a claim depends on documents, it should be traceable to returned evidence.
Use metadata as a control boundary. Coverage checks can reduce unnecessary content exposure.
Precompute what makes inventory useful. Fast metadata queries depend on good ingestion and indexing design.

Conclusion

Inventory-before-ask-docs is a simple rule with outsized impact: do not ask the model to interpret evidence until the system has confirmed that relevant evidence exists. That separation improves reliability, supports stronger citation discipline, and reduces unnecessary exposure of document contents. For teams building retrieval-heavy agents, it is a practical reminder that many hallucination problems are not model problems first; they are workflow problems.