Cascade — when the fact isn't in the graph

Vanilla GraphRAG has a known weakness: sometimes the fact you want lives in a text chunk whose main entity wasn't extracted during indexing. The graph has no record of the fact, so graph-only retrieval never surfaces that chunk, and the answer comes out incomplete or wrong.

`cascade` solves this by searching the graph and the raw text together. It's probably the mode you'll use the most.

How it works

1. Entity gate. Embed your question and find the top-k most similar entities in the graph. These entities constrain the candidate pool.
2. Text rescue. In parallel, run BM25 and cosine similarity over all text chunks. This brings back the most relevant chunks even if their entities aren't in the top-k.
3. Combined ranking. Merge both rankings with a configurable weight. Chunks that appear in both rise sharply; chunks that appear in only one are still considered.
4. Synthesis. The LLM sees the combined context and answers.

                  Your question
                       │
        ┌──────────────┼──────────────┐
        ▼              ▼              ▼
   Top-k entities  BM25 over chunks  Cosine over chunks
        │              │              │
        └──────────────┼──────────────┘
                       ▼
              Combined ranking
                       │
                       ▼
                Context to LLM
                       │
                       ▼
                    Answer
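
The first three steps can be sketched in a few lines of Python. This is a toy illustration, not the tool's actual implementation: the function names, the `graph_weight` parameter, the min-max normalization, the binary "gated or not" graph score, and the sample data are all assumptions made for the example. The real merge formula may differ.

```python
import math
from collections import Counter

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Plain Okapi BM25 over tokenized docs; returns one score per doc."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def normalize(scores):
    """Min-max scale to [0, 1]; all-equal input maps to zeros."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def cascade_rank(q_vec, q_terms, entities, chunks, top_k=1, graph_weight=0.5):
    # 1. Entity gate: top-k entities by cosine similarity to the question.
    top = sorted(entities, key=lambda e: cosine(q_vec, e["vec"]), reverse=True)[:top_k]
    gated = {cid for e in top for cid in e["chunk_ids"]}
    # 2. Text rescue: BM25 and cosine over every chunk, gated or not.
    bm25 = normalize(bm25_scores(q_terms, [c["tokens"] for c in chunks]))
    cos = normalize([cosine(q_vec, c["vec"]) for c in chunks])
    # 3. Combined ranking: weighted sum of the graph signal and the text signal
    #    (assumed formula, chosen for illustration).
    ranked = []
    for c, bm, cs in zip(chunks, bm25, cos):
        graph_score = 1.0 if c["id"] in gated else 0.0
        text_score = (bm + cs) / 2
        ranked.append((graph_weight * graph_score + (1 - graph_weight) * text_score, c["id"]))
    return [cid for _, cid in sorted(ranked, reverse=True)]

# Hypothetical toy data: chunk c3 holds the fact but has no extracted entity.
entities = [
    {"name": "Law A", "vec": [1.0, 0.0], "chunk_ids": ["c1"]},
    {"name": "Law B", "vec": [0.0, 1.0], "chunk_ids": ["c2"]},
]
chunks = [
    {"id": "c1", "vec": [0.9, 0.1], "tokens": ["law", "a", "covers", "treatment"]},
    {"id": "c2", "vec": [0.1, 0.9], "tokens": ["law", "b", "covers", "transport"]},
    {"id": "c3", "vec": [0.8, 0.3], "tokens": ["coverage", "lasts", "twelve", "months"]},
]
q_vec = [1.0, 0.2]
q_terms = ["how", "many", "months"]

print(cascade_rank(q_vec, q_terms, entities, chunks))
```

In this toy corpus, `c3` is linked to no entity, yet its BM25 and cosine scores lift it above `c2`, which is the "text rescue" effect described in step 2.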

When to use `cascade`

It's the default option we recommend for factual questions. It works especially well when:

- The question has specific terms likely to appear in the literal text ("how many months does X last?", "what percentage covers Y?").
- The corpus has details that get lost in entity extraction (numbers, dates, legal clauses).
- You're not sure whether the answer lives in an entity or only in text.

When not to use `cascade`

Three cases where another mode is better:

| If your question is… | Use | Why |
| --- | --- | --- |
| About a single file you've already named | `document` | More precise when we limit scope |
| Broad thematic ("what's it all about") | `global` | Community reports are the answer |
| Compound ("compare X and Y, consider Z") | `agent` | Needs multiple iterated queries |

Tuning the balance

By default, `cascade` weights entities and text evenly. If your corpus is highly structured (lots of schema, few free-form notes), you can shift the weight toward the graph; if it's very narrative, shift it toward the text.

See the config glossary for the `cascade_*` flags under `search:` in your `grail.yaml`.
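
To build intuition for what shifting the balance does, here is a minimal sketch that sweeps a single weight between a graph score and a text score. The `graph_weight` name, the linear blend, and the scores themselves are hypothetical; check the config glossary for the real flag names and semantics.

```python
def blend(graph_score, text_score, graph_weight):
    """Linear blend of two normalized scores (assumed formula)."""
    return graph_weight * graph_score + (1 - graph_weight) * text_score

# Hypothetical normalized (graph_score, text_score) pairs for two candidates.
candidates = {
    "entity_linked_chunk": (0.9, 0.3),
    "text_only_chunk": (0.0, 0.8),
}

def top_candidate(graph_weight):
    return max(candidates, key=lambda name: blend(*candidates[name], graph_weight))

for w in (0.2, 0.5, 0.8):
    print(w, top_candidate(w))
```

With these made-up scores, a text-heavy weight (0.2) puts the text-only chunk first, while an even or graph-heavy weight puts the entity-linked chunk first.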

Benchmark result

In our internal benchmark (Chilean Oncology Laws), `cascade` and `agent` (which uses `cascade` as one of its tools) lead:

| Question category | `cascade` only | `agent` |
| --- | --- | --- |
| Single facts | 4.6 / 5 | 4.7 / 5 |
| Multi-chunk | 4.5 / 5 | 4.8 / 5 |
| Cross-source | 4.4 / 5 | 4.9 / 5 |
| Comparative | 4.0 / 5 | 4.9 / 5 |

`cascade` alone already beats vanilla RAG comfortably. `agent` improves especially on compound questions because it can run `cascade` twice for different subtopics.
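
The size of the `agent` advantage per category can be read straight off the table. A small sketch, using only the scores reported above:

```python
# Scores copied from the benchmark table: (cascade only, agent), both out of 5.
scores = {
    "Single facts": (4.6, 4.7),
    "Multi-chunk": (4.5, 4.8),
    "Cross-source": (4.4, 4.9),
    "Comparative": (4.0, 4.9),
}

# Gap = agent score minus cascade score, rounded to one decimal place.
gaps = {cat: round(agent - cascade, 1) for cat, (cascade, agent) in scores.items()}
largest = max(gaps, key=gaps.get)

for cat, gap in gaps.items():
    print(f"{cat}: +{gap}")
print("Largest gap:", largest)
```

The gap grows from 0.1 on single facts to 0.9 on comparative questions, which matches the observation that `agent` helps most on compound questions.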

Next step

- Search modes: the full panorama.
- Communities and Leiden: what `global` does underneath.
- KB quickstart: try `cascade` against your own corpus.
