Python SDK

GRAIL is a library first. The CLI is a wrapper over the same two public classes: GRAIL (KB mode) and MemoryProject (memory mode).

Install

uv pip install -e .

Common imports

from grail import (
    GRAIL,                     # KB orchestrator
    MemoryProject,             # Memory orchestrator
    Config, load_config,       # configuration
    LLMClient,                 # direct LLM client (advanced)
    EmbeddingClient,           # direct embedding client
    PromptRegistry,            # prompt registry
    Entity, Relationship,      # parquet row schemas
    TextUnit, Community,
    CommunityReport, Document,
    SearchResult,              # what search() returns
    Reply,                     # what MemoryProject methods return
)

Mode · Knowledge base
GRAIL class

Knowledge-base mode orchestrator.

Construction

from grail import GRAIL, load_config

config = load_config("./my-kb") # path, dict, or Config object
grail = GRAIL.from_config(config)

from_config wires up everything: storage, the endpoint registry, the LLM cache, the cost tracker, the LLM and embedding clients, prompts, and an optional reranker.

Public attributes after construction:

Attribute            Type
grail.config         Config
grail.storage        StorageBackend
grail.llm            LLMClient
grail.embeddings     EmbeddingClient
grail.prompts        PromptRegistry
grail.cost_tracker   CostTracker
grail.reranker       RerankerClient | None
grail.reporter       Reporter

Methods

All I/O methods are async. From a regular script, wrap with asyncio.run(...). From async code (FastAPI, Textual, Jupyter), await directly.

Method                  Signature                                             What for
index()                 async () -> dict                                      Full pipeline.
search()                async (query, *, mode="local", ...) -> SearchResult   A single search in any of 5 modes.
agent_search()          async (query, ...) -> SearchResult                    Agentic tool-call loop.
append()                async (new_files: list[str]) -> dict                  Add files incrementally.
edit()                  async (replacements: dict[str, str]) -> dict          Replace files.
delete()                async (file_names: list[str]) -> dict                 Delete files.
create_entity_types()   async (*, sample_chars=8000, ...) -> list[str]        Discover entity types.
status()                sync () -> dict                                       Which artefacts exist.
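
The async/sync split in the table above leads to two calling patterns. The snippet below is a minimal sketch of both; StubGrail is a hypothetical stand-in defined here so the example runs without the library or an API key, and only the method names search() and status() are taken from the table.

```python
import asyncio


class StubGrail:
    """Hypothetical stand-in for GRAIL; it fakes the I/O so the sketch runs offline."""

    async def search(self, query: str, *, mode: str = "local") -> str:
        await asyncio.sleep(0)  # placeholder for real network I/O
        return f"[{mode}] answer to: {query}"

    def status(self) -> dict:
        # Listed as sync in the table, so it is called without await.
        return {"indexed": True}


async def ask(grail: StubGrail, query: str) -> str:
    # Inside async code (FastAPI handler, Textual app, notebook cell): await directly.
    return await grail.search(query, mode="cascade")


def ask_blocking(grail: StubGrail, query: str) -> str:
    # From a regular script: asyncio.run() creates an event loop, runs the
    # coroutine to completion, and closes the loop again.
    return asyncio.run(ask(grail, query))


if __name__ == "__main__":
    grail = StubGrail()
    print(grail.status())
    print(ask_blocking(grail, "Who is FONASA?"))
```

asyncio.run() raises RuntimeError when an event loop is already running in the current thread, which is why async frameworks and notebooks should await the coroutine instead of wrapping it.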

End-to-end example

import asyncio
from grail import GRAIL, load_config

async def main():
    grail = GRAIL.from_config(load_config("./my-kb"))

    # 1. Index
    await grail.index()

    # 2. Search
    result = await grail.search(
        "Who is FONASA?",
        mode="cascade",
        use_reranker=True,
    )
    print(result.response)
    print(result.completion_time, "seconds")
    print(result.llm_calls, "LLM calls")

    # 3. Agent
    result = await grail.agent_search(
        "Compare AUGE vs Ricarte Soto Law coverage",
        max_iterations=5,
    )
    print(result.response)

    # 4. Incremental
    await grail.append(["new_paper.pdf"])

    # 5. Cost
    print(grail.cost_tracker.render_total_cost())

asyncio.run(main())

Common search() kwargs

kwarg                   Type            Description
mode                    string          "local" | "cascade" | "global" | "document"
conversation_history    list            [{"role": "user", "content": "..."}, ...]
document                string | None   Only for mode="document".
include_entity_names    list[str]       Restrict to these entities.
exclude_entity_names    list[str]       Exclude these entities.
use_reranker            bool | None     None = use config.
artifact_instructions   string          Extra text for the synthesis prompt.

SearchResult

@dataclass
class SearchResult:
    response: str | dict | list
    context_data: str | list[DataFrame] | dict[str, DataFrame]
    context_text: str | list[str] | dict[str, str]
    completion_time: float
    llm_calls: int

response is the human-readable answer. context_data and context_text are what the LLM saw (useful for debugging).
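
Because response and context_text can each take several shapes, calling code usually needs to normalise them before logging or display. The helpers below are a sketch of one way to do that; SearchResultStub is a local mirror of the fields listed above (not the library class), and the helper names are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class SearchResultStub:
    """Local mirror of the fields listed above, used only to exercise the helpers."""
    response: Any
    context_data: Any
    context_text: Any
    completion_time: float
    llm_calls: int


def response_as_text(result) -> str:
    """Return the answer as a string, serialising dict/list responses as JSON."""
    if isinstance(result.response, str):
        return result.response
    return json.dumps(result.response, ensure_ascii=False, indent=2)


def context_chunks(result) -> list:
    """Flatten context_text (str, list[str] or dict[str, str]) into a list of strings."""
    ctx = result.context_text
    if isinstance(ctx, str):
        return [ctx]
    if isinstance(ctx, dict):
        # Keep the dict key as a label in front of each chunk.
        return [f"{name}: {text}" for name, text in ctx.items()]
    return list(ctx)
```

Both helpers rely only on attribute access, so they should work on any object that exposes the same field names.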


Mode · Agentic memory
MemoryProject class

Agentic memory mode orchestrator. Same engine, different write path.

Construction

from grail import MemoryProject

mp = MemoryProject("./my-memory")
# Optionally: config=Config(...), embeddings=EmbeddingClient(...)

If the folder has no meta.json, one is created automatically. If meta.json already exists, the existing project is opened.

Main methods

Method                 Signature                                            What for
add_observation()      async (*, title, content, ...) -> Reply              Write an observation.
recall()               async (query=None, *, mode="recall", ...) -> Reply   Search with structural filters (any mode).
list_observations()    sync (*, category=None, since=None, ...) -> Reply    List observations by filter.
delete_observation()   sync (slug, reason=None) -> Reply                    Delete an observation.
consolidate()          sync () -> Reply                                     Generate proposals (no mutation).
list_proposals()       sync (*, status=None) -> Reply                       List pending/applied proposals.
apply_proposal()       sync (proposal_id, *, accept=True) -> Reply          Accept or reject a proposal.

add_observation: the core primitive

reply = await mp.add_observation(
    title="...",                          # required
    content="...",                        # markdown body
    category="work/clients/acme",         # path for folders-as-communities
    tags=["decision"],
    entities=[                            # the agent declares entities
        {"name": "Acme", "type": "ORGANIZATION", "description": "..."},
    ],
    relationships=[
        {"source": "Acme", "target": "Postgres", "relationship_type": "CHOSE",
         "description": "..."},
    ],
    observed_at="2026-06-02T14:00:00Z",   # default: now
    confidence=0.95,                      # 0.0–1.0
    source="architecture review",
    related_to=["abc123", "def456"],      # IDs of related observations
)

print(reply.ok) # True
print(reply.data["observation_id"]) # assigned ULID
print(reply.data["slug"]) # markdown file slug
print(reply.data["file_path"]) # absolute path
print(reply.data["new_entities"]) # names of new entities
print(reply.data["updated_entities"]) # existing entities that were updated

recall: structural filters

# Pure recall mode (no LLM, no embedding)
reply = await mp.recall(
    mode="recall",
    since="7d",
    category="work/clients/acme/**",
    tags=["decision"],
    entity_names=["Postgres"],
    type="ORGANIZATION",
    min_confidence=0.8,
    limit=20,
)

for obs in reply.data["observations"]:
    print(obs["observed_at"], obs["title"])

# Cascade with filters (LLM + structural filter)
reply = await mp.recall(
    "Why did we rule out DynamoDB?",
    mode="cascade",
    since="30d",
    category="work/clients/acme/**",
)
print(reply.data["response"])
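
To make the structural filters concrete, here is a rough sketch of how a relative window such as "7d" and a category glob such as "work/clients/acme/**" could be evaluated against one observation. This is not the library's implementation: the function names, the observation dict layout, the extra "h" and "w" units, and the rule that all requested tags must be present are assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatchcase

# Only the "d" suffix appears in the examples above; "h" and "w" are assumptions.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_since(spec: str, now: datetime) -> datetime:
    """Turn a relative window such as "7d" into an absolute cutoff."""
    match = re.fullmatch(r"(\d+)([hdw])", spec)
    if match is None:
        raise ValueError(f"unsupported since value: {spec!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return now - timedelta(**{_UNITS[unit]: amount})


def matches(obs: dict, *, cutoff=None, category=None, tags=(), min_confidence=0.0) -> bool:
    """Apply the filters to one observation dict (hypothetical field layout)."""
    # Replace a trailing "Z" so fromisoformat() also works before Python 3.11.
    observed_at = datetime.fromisoformat(obs["observed_at"].replace("Z", "+00:00"))
    if cutoff is not None and observed_at < cutoff:
        return False
    # fnmatch's "*" also matches "/", so "acme/**" covers nested folders.
    if category is not None and not fnmatchcase(obs["category"], category):
        return False
    # Assumption: every requested tag must be present on the observation.
    if not set(tags) <= set(obs["tags"]):
        return False
    return obs["confidence"] >= min_confidence


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(parse_since("7d", now).isoformat())
```

Filters like these need only the observation metadata, which is consistent with the pure recall mode above running without an LLM or embedding call.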

The Reply envelope

Every MemoryProject method returns a Reply:

@dataclass
class Reply:
    ok: bool
    data: Any = None
    warnings: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)
    error: str | None = None

It's the same JSON contract the skill scripts emit. SDK and skill read the same keys.
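
A common pattern with an envelope like this is to check ok once, surface warnings, and hand back data. The sketch below shows that pattern on a local mirror of the dataclass above, so it runs without the library; ReplyError, unwrap() and to_json() are hypothetical helpers, not part of the SDK.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, List, Optional


@dataclass
class Reply:
    """Local mirror of the envelope above so the sketch runs without the library."""
    ok: bool
    data: Any = None
    warnings: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)
    error: Optional[str] = None


class ReplyError(RuntimeError):
    """Hypothetical exception type; the library may not define one."""


def unwrap(reply: Reply) -> Any:
    """Return reply.data, raising if the call failed and echoing any warnings."""
    if not reply.ok:
        raise ReplyError(reply.error or "unknown error")
    for warning in reply.warnings:
        print(f"warning: {warning}")
    return reply.data


def to_json(reply: Reply) -> str:
    """Serialise the envelope with the same five keys shown above."""
    return json.dumps(asdict(reply), ensure_ascii=False)
```

With the real class, usage would look like data = unwrap(await mp.add_observation(...)), assuming the returned object exposes the same five fields.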


Programmatic configuration

Config is Pydantic. You can build it without YAML:

from grail import GRAIL, Config
from grail.config import (
    LLMConfig, EmbeddingsConfig, IndexingConfig, StorageConfig,
)

config = Config(
    project_name="my-project",
    root_dir="/tmp/grail-my-project",
    llm=LLMConfig(
        endpoint="openai",
        model="gpt-4o-mini",
        extra_pricing={"openai|gpt-4o-mini": [0.15, 0.60]},
    ),
    embeddings=EmbeddingsConfig(
        endpoint="openai",
        model="text-embedding-3-small",
    ),
    indexing=IndexingConfig(
        entity_types=["PERSON", "ORGANIZATION", "PRODUCT"],
        chunk_size=1500,
    ),
    storage=StorageConfig(backend="local", root="/tmp/grail-my-project"),
)

grail = GRAIL.from_config(config)

Or from a dict:

config = Config.model_validate({
    "project_name": "my-project",
    "llm": {"endpoint": "openai", "model": "gpt-4o-mini"},
    "embeddings": {"endpoint": "openai", "model": "text-embedding-3-small"},
})

Read artefacts directly

When you want raw parquet, not a search response:

from grail.query.retrieval import load_artifacts_for_search

artifacts = load_artifacts_for_search(grail.storage, grail._output_folder())

artifacts.documents # DataFrame
artifacts.text_units
artifacts.entities
artifacts.relationships
artifacts.communities
artifacts.community_reports
artifacts.nodes

Embed in FastAPI

from fastapi import FastAPI
from grail import GRAIL, load_config

app = FastAPI()

# Build the GRAIL instance once at startup and share it through app.state.
# Note: recent FastAPI releases deprecate on_event in favour of lifespan handlers.
@app.on_event("startup")
async def startup():
    app.state.grail = GRAIL.from_config(load_config("./my-kb"))

@app.post("/ask")
async def ask(question: str):
    result = await app.state.grail.search(question, mode="cascade")
    return {
        "answer": result.response,
        "llm_calls": result.llm_calls,
        "completion_time": result.completion_time,
    }

A GRAIL instance is safe to share across requests: the LLM and embedding clients are async-safe and internally rate-limited via concurrent_requests.
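
To illustrate what a concurrent_requests cap means when many requests share one instance, the sketch below bounds in-flight calls with asyncio.Semaphore. StubClient is a hypothetical stand-in written for this example; the library's actual limiter may be implemented differently.

```python
import asyncio


class StubClient:
    """Hypothetical client that allows at most `concurrent_requests` calls in flight."""

    def __init__(self, concurrent_requests: int):
        self._sem = asyncio.Semaphore(concurrent_requests)
        self.in_flight = 0
        self.peak = 0  # highest number of simultaneous calls observed

    async def call(self, prompt: str) -> str:
        async with self._sem:  # callers beyond the cap wait here
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            await asyncio.sleep(0.01)  # placeholder for the network round trip
            self.in_flight -= 1
            return prompt.upper()


async def fan_out(client: StubClient, prompts: list) -> list:
    # Many coroutines share one client; gather() preserves input order.
    return await asyncio.gather(*(client.call(p) for p in prompts))


def run_demo(n_prompts: int = 10, concurrent_requests: int = 3):
    async def _demo():
        # Create the client inside the running loop so the semaphore binds to it.
        client = StubClient(concurrent_requests)
        answers = await fan_out(client, [f"q{i}" for i in range(n_prompts)])
        return answers, client.peak

    return asyncio.run(_demo())


if __name__ == "__main__":
    answers, peak = run_demo()
    print(len(answers), "answers, peak concurrency", peak)
```

The point of the sketch is that callers need no locking of their own: the shared client queues excess calls, so request handlers can simply await it.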


Customise collaborators

After from_config, you can replace any collaborator:

from grail.storage import StorageBackend

class MyGCSBackend(StorageBackend):
    ...  # implement the 7 required methods

grail = GRAIL.from_config(config)
grail.storage = MyGCSBackend(bucket="my-bucket")

The same applies to the reporter, vector store, and reranker.
