The two modes

GRAIL has one architecture and two ways to feed it. That's the design decision worth understanding before you start.

Direct comparison

| | Knowledge base | Agentic memory |
|---|---|---|
| Who writes | An LLM reads your documents and extracts entities | Your agent writes what it already knows |
| What goes in | input/ with PDFs, markdown, code | memories/<category>/ with markdown observations |
| Write cost | One LLM call per document chunk | Zero LLM calls: the agent already knows what it meant |
| When to process | Batch indexing (grail index) | Incremental, one observation at a time |
| Source of communities | Leiden algorithm over the graph | Declared folders + reviewed proposals |
| Extra search mode | (none) | recall: structural filter, no LLM |
| Init command | grail init my-project | grail init my-project --memory |
| Python API | GRAIL.from_config(...) | MemoryProject(...) |
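
To make the "Write cost" and "When to process" rows concrete, here is a minimal sketch of what an incremental memory write could look like: one markdown file dropped into memories/<category>/, with no LLM involved. The helper name write_observation, the timestamp-based file name, and the bare-markdown layout are assumptions for illustration only; GRAIL's actual on-disk format and the MemoryProject API may differ.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_observation(project_root: Path, category: str, text: str) -> Path:
    """Write one observation as a markdown file under memories/<category>/.

    Hypothetical helper: the file name and layout are illustrative
    assumptions, not GRAIL's documented format.
    """
    folder = project_root / "memories" / category
    folder.mkdir(parents=True, exist_ok=True)  # the category folder is created on demand
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    path = folder / f"{stamp}.md"
    path.write_text(text.strip() + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        saved = write_observation(Path(tmp), "decisions", "User prefers LanceDB over FAISS.")
        print(saved.relative_to(tmp))
```

Each call touches exactly one file, which is the property the table describes: the cost of a write does not grow with the size of the project.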

Same search layer on both sides

This is the critical part: the five LLM-backed search modes work identically on both, and the sixth, recall, applies to memory only. When you ask a natural-language question, it doesn't matter which door the facts came through.

  • local — find entities similar to your question, assemble their context, answer.
  • cascade — entity gate + BM25/cosine text rescue, combined ranking.
  • global — map-reduce over thematic community reports.
  • document — scope the search to a single source file.
  • agent — the LLM picks which mode to call per question, iterating up to 5 times.
  • recall — pure pandas filter over date, category, tags. Memory only. Zero LLM.
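
Since recall is a structural filter rather than a model call, its behaviour is easy to picture in code. The sketch below filters by category, tags, and date range in plain standard-library Python so that it runs without dependencies; the page describes the real mode as a pandas filter. The Observation fields and the recall signature are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Observation:
    # Hypothetical record shape; field names are illustrative assumptions.
    text: str
    category: str
    created: date
    tags: frozenset = field(default_factory=frozenset)


def recall(observations, category=None, tags=None, since=None, until=None):
    """Return observations matching every filter that was supplied, newest first."""
    wanted = frozenset(tags or ())
    hits = []
    for obs in observations:
        if category is not None and obs.category != category:
            continue
        if not wanted <= obs.tags:  # every requested tag must be present
            continue
        if since is not None and obs.created < since:
            continue
        if until is not None and obs.created > until:
            continue
        hits.append(obs)
    return sorted(hits, key=lambda o: o.created, reverse=True)


if __name__ == "__main__":
    notes = [
        Observation("chose LanceDB", "decisions", date(2024, 5, 1), frozenset({"storage"})),
        Observation("switched to FAISS", "decisions", date(2024, 6, 2), frozenset({"storage", "perf"})),
    ]
    for hit in recall(notes, category="decisions", tags=["storage"]):
        print(hit.created, hit.text)
```

No embedding lookup or LLM call appears anywhere in the function, which is why this mode costs nothing per query.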

See Search modes for each one in detail.
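
The agent mode is described above as a loop in which the LLM picks a mode per question, up to 5 times. A minimal sketch of that control flow follows, with a plain callable standing in for the LLM. The names run_agent, choose, and the evidence list are hypothetical, and GRAIL's real loop may decide and stop differently.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_ITERATIONS = 5  # the cap stated in the mode description

SearchFn = Callable[[str], str]
Evidence = List[Tuple[str, str]]
# The chooser stands in for the LLM: given the question and the evidence
# gathered so far, it names the next mode to call, or returns None to stop.
Chooser = Callable[[str, Evidence], Optional[str]]


def run_agent(question: str, modes: Dict[str, SearchFn], choose: Chooser) -> Evidence:
    """Call search modes chosen one at a time, stopping after at most 5 rounds."""
    evidence: Evidence = []
    for _ in range(MAX_ITERATIONS):
        mode = choose(question, evidence)
        if mode is None:
            break  # the chooser considers the evidence sufficient
        if mode not in modes:
            raise ValueError(f"unknown search mode: {mode!r}")
        evidence.append((mode, modes[mode](question)))
    return evidence


if __name__ == "__main__":
    stub_modes = {
        "local": lambda q: f"local result for {q!r}",
        "global": lambda q: f"global result for {q!r}",
    }
    # Stub policy: try local first, then global, then stop.
    plan = ["local", "global"]
    chooser = lambda q, ev: plan[len(ev)] if len(ev) < len(plan) else None
    for mode, result in run_agent("Who owns the billing service?", stub_modes, chooser):
        print(mode, "->", result)
```

The hard cap is what keeps the cost of a single question bounded even when the chooser never decides it has enough.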

When to pick which?

Pick knowledge base if…

  • You have an existing corpus you want queryable: a legal library, technical manuals, papers, legacy code.
  • The sources are authoritative (not written by your agent).
  • You're going to index once or periodically and query many times.
  • You need exact provenance back to source files so that citations can be verified.

Pick agentic memory if…

  • Your agent needs to remember across sessions what it decided, observed, or learned.
  • The "sources" don't exist as documents — they're agent observations about the conversation, the code, the user's decisions.
  • You want the agent itself to control which entities and relationships to create (not an intermediate LLM guessing).
  • You need writes to be incremental and cheap — no re-indexing cycle.

Or both at once

Nothing stops you from mixing them. A project can have input/ with reference PDFs and memories/ with observations the agent accumulates in use. Both paths feed the same graph, so the LLM-backed modes query both sources at once, while recall covers only the memory side.
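
A mixed project simply has both directories present. The sketch below walks input/ and memories/ under a project root and tags each file with the path it arrived through. collect_sources is a hypothetical helper for illustration and not part of GRAIL; it only shows the layout, not how GRAIL ingests the files.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def collect_sources(project_root: Path) -> List[Tuple[str, str]]:
    """List every file under input/ and memories/, tagged with its origin."""
    found = []
    for origin in ("input", "memories"):
        base = project_root / origin
        if not base.is_dir():
            continue  # a project may use only one of the two paths
        for path in sorted(base.rglob("*")):
            if path.is_file():
                found.append((origin, path.relative_to(project_root).as_posix()))
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "input").mkdir()
        (root / "input" / "manual.md").write_text("# Manual\n", encoding="utf-8")
        (root / "memories" / "decisions").mkdir(parents=True)
        (root / "memories" / "decisions" / "note.md").write_text("Chose LanceDB.\n", encoding="utf-8")
        for origin, rel_path in collect_sources(root):
            print(origin, rel_path)
```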

What they share

| Layer | Behaviour |
|---|---|
| Artefacts | Same parquet files (final_entities, final_relationships, final_text_units, …) |
| Graph | Same NetworkX underneath |
| Vector store | Same (FAISS / LanceDB / ChromaDB) |
| Embeddings | Same model |
| Provenance | Each text unit keeps a pointer to the source file |
| Search modes | The 5 LLM modes work identically; recall is the sixth, memory-only |
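
The provenance row is what makes document-scoped search and verifiable citations possible, whichever path a text unit came through. A minimal sketch, assuming each text unit carries the path of its source file: the TextUnit fields and both helper names are hypothetical, since this page does not list the columns of final_text_units.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TextUnit:
    # Hypothetical shape; the real final_text_units columns are not listed here.
    unit_id: str
    text: str
    source_path: str


def scope_to_document(units: Iterable[TextUnit], source_path: str) -> List[TextUnit]:
    """Keep only the text units that point back to one source file."""
    return [u for u in units if u.source_path == source_path]


def cited_sources(units: Iterable[TextUnit]) -> List[str]:
    """Distinct source files behind a set of text units, in first-seen order."""
    return list(dict.fromkeys(u.source_path for u in units))


if __name__ == "__main__":
    units = [
        TextUnit("u1", "Clause 4 covers termination.", "input/contract.pdf"),
        TextUnit("u2", "User asked to flag clause 4.", "memories/decisions/note.md"),
        TextUnit("u3", "Clause 9 covers liability.", "input/contract.pdf"),
    ]
    print([u.unit_id for u in scope_to_document(units, "input/contract.pdf")])
    print(cited_sources(units))
```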
