
Communities and Leiden

Communities are the intermediate layer between "loose dots on a graph" and "smart answers to thematic questions". If the graph is the skeleton, communities are the organs: functional groupings that give the information structure.

The problem they solve

A graph of 5,000 entities is hard to reason about. A question like "what's this all about?" isn't answered by looking at 5,000 individual dots — it's answered by looking at densely connected clusters that represent coherent themes.

That's where the Leiden algorithm comes in.

What Leiden does

Leiden is a community-detection algorithm for graphs. It takes a graph as input (nodes are entities, edges are weighted relationships) and returns a hierarchical partition of the nodes into densely connected groups.

Densely connected means: many edges inside the group, few edges going out. That usually coincides with coherent semantic themes.
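To make "many edges inside, few going out" concrete, here is a minimal sketch that splits edge weight per community into internal and outgoing totals. The function name `edge_weight_split` and the toy graph are illustrative assumptions, not part of GRAIL; Leiden itself optimizes a quality function rather than counting edges this way.

```python
from collections import defaultdict

def edge_weight_split(edges, membership):
    """Return {community: (internal_weight, outgoing_weight)}.

    edges: iterable of (node_u, node_v, weight)
    membership: dict node -> community id
    """
    internal = defaultdict(float)
    outgoing = defaultdict(float)
    for u, v, w in edges:
        cu, cv = membership[u], membership[v]
        if cu == cv:
            internal[cu] += w
        else:
            # A crossing edge counts as "going out" for both communities.
            outgoing[cu] += w
            outgoing[cv] += w
    return {c: (internal[c], outgoing[c]) for c in set(membership.values())}

# Toy graph (made up for illustration): two groups joined by one weak edge.
edges = [
    ("FONASA", "GES", 3.0),
    ("GES", "Law 19.966", 2.0),
    ("FONASA", "Law 19.966", 1.0),
    ("Acme", "Zorp", 2.0),
    ("GES", "Acme", 0.5),  # the only edge crossing between groups
]
membership = {"FONASA": 0, "GES": 0, "Law 19.966": 0, "Acme": 1, "Zorp": 1}

split = edge_weight_split(edges, membership)
for community, (inside, outside) in sorted(split.items()):
    print(community, inside, outside)
```

A community whose internal weight dwarfs its outgoing weight is "dense" in the sense used above.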

Three important properties

  1. Hierarchical: Leiden produces multiple levels of clusters, from a coarse level (few large groups) down to finer levels (many small groups). In GRAIL you pick which level to use with community_level.

  2. Reproducible: with the same seed (community.seed in grail.yaml), the partition is deterministic. Useful for tests and benchmarks.

  3. Smart multi-level: unlike Louvain (its predecessor), Leiden guarantees that every community it returns is internally connected. The hierarchy also mitigates the "resolution limit" of modularity-based methods, where small groups that should stay separate get merged artificially: finer levels can split them apart again.
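The connectivity guarantee in the third property can be checked independently of any particular Leiden implementation. The following is a small sketch, with the hypothetical helper name `is_connected`, that tests whether the subgraph induced by a community's members forms a single connected piece:

```python
from collections import defaultdict, deque

def is_connected(members, edges):
    """True if the subgraph induced by `members` is one connected piece."""
    members = set(members)
    if not members:
        return True
    # Keep only edges whose two endpoints are both inside the community.
    adj = defaultdict(set)
    for u, v in edges:
        if u in members and v in members:
            adj[u].add(v)
            adj[v].add(u)
    # Breadth-first search from an arbitrary member.
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == members

# Toy edge list (made up for illustration).
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("c", "x"), ("x", "d")]

good = is_connected({"a", "b", "c"}, edges)      # a-b-c is a chain
bad = is_connected({"a", "b", "d", "e"}, edges)  # a-b and d-e only meet via c and x
```

The second group is the kind of internally disconnected community that Louvain can return and Leiden rules out.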

Three levels, one decision

After running Leiden, GRAIL lets you pick which granularity to use:

community_level       | Behaviour              | When it fits
"coarsest" (default)  | Few large communities  | Broad thematic global reports
"finest"              | Many small communities | Detailed analysis, dashboards
"all"                 | Every hierarchy        | When you want both views
An integer (e.g. 2)   | Specific level         | Advanced cases

Configurable in grail.yaml:

community:
  community_level: "coarsest"
  min_report_size: 3  # ignore communities of < 3 entities
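As a rough illustration of what these two settings mean, the sketch below selects levels from a hierarchy and then drops communities that are too small to report on. The function names, the dict layout, and the convention that level 0 is the coarsest are all assumptions made for this example; GRAIL's internal representation may differ.

```python
def select_levels(hierarchy, community_level="coarsest"):
    """Pick levels from `hierarchy`, a dict {level: {community_id: [entities]}}.

    Assumption: level 0 is the coarsest and higher numbers are finer.
    """
    levels = sorted(hierarchy)
    if community_level == "coarsest":
        chosen = levels[:1]
    elif community_level == "finest":
        chosen = levels[-1:]
    elif community_level == "all":
        chosen = levels
    elif isinstance(community_level, int) and community_level in hierarchy:
        chosen = [community_level]
    else:
        raise ValueError(f"unknown community_level: {community_level!r}")
    return {level: hierarchy[level] for level in chosen}

def reportable(communities, min_report_size=3):
    """Drop communities with fewer than `min_report_size` entities."""
    return {cid: members for cid, members in communities.items()
            if len(members) >= min_report_size}

# Toy two-level hierarchy (made up for illustration).
hierarchy = {
    0: {"c0": ["a", "b", "c", "d", "e"]},
    1: {"c0.0": ["a", "b", "c"], "c0.1": ["d", "e"]},
}

finest = select_levels(hierarchy, "finest")
kept = reportable(finest[1], min_report_size=3)
```

With min_report_size set to 3, the two-entity community at the finest level is skipped while the three-entity one is kept.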

Community reports

Once you have the communities, GRAIL asks the LLM to write a narrative report for each:

{
  "title": "GES System and health guarantees",
  "summary": "This community groups entities related to the GES system...",
  "findings": [
    {
      "summary": "FONASA covers 100% of the cost...",
      "explanation": "Under Law 19.966, FONASA has an obligation to..."
    }
  ],
  "rank": 8.5
}

These reports are what a global query consults. Without them, "what are the central themes?" could not be answered without reading the whole corpus.
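A report with this shape is easy to load into typed objects. The sketch below assumes only the fields shown in the example above; the class names, the helper `top_reports`, and the idea of ordering reports by rank before answering a global question are illustrative assumptions, not GRAIL's documented behaviour.

```python
import json
from dataclasses import dataclass

@dataclass
class Finding:
    summary: str
    explanation: str

@dataclass
class CommunityReport:
    title: str
    summary: str
    findings: list
    rank: float

def parse_report(raw):
    """Parse one JSON report string into a CommunityReport."""
    data = json.loads(raw)
    findings = [Finding(**item) for item in data["findings"]]
    return CommunityReport(data["title"], data["summary"], findings, float(data["rank"]))

def top_reports(reports, k):
    """Return the k highest-ranked reports (assumed selection rule)."""
    return sorted(reports, key=lambda report: report.rank, reverse=True)[:k]

# Placeholder reports: the second one is invented for the example.
raw_reports = [
    '{"title": "Minor topic", "summary": "s", "findings": [], "rank": 4.0}',
    '{"title": "GES System and health guarantees", "summary": "s", '
    '"findings": [{"summary": "f", "explanation": "e"}], "rank": 8.5}',
]

reports = [parse_report(raw) for raw in raw_reports]
best = top_reports(reports, 1)[0]
```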

Folders as communities (memory mode)

In memory mode, GRAIL doesn't need to run Leiden to identify communities — the folder structure under memories/ declares them:

memories/
├── work/
│   ├── clients/acme/        ← one community
│   └── clients/zorp/        ← another community
├── personal/
│   └── learning/python/     ← one community
└── decisions/
    └── 2026-q2/             ← one community

Each entity can belong to multiple communities at once (multi-membership). Acme can appear in work/clients/acme/ but also in decisions/2026-q2/.
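The folder-to-community mapping, including multi-membership, can be sketched in a few lines. This assumes that a community is identified by the folder that directly contains a memory file, and the file names are made up; how GRAIL actually links entities to files is not specified here.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def communities_from_paths(entity_files, root="memories"):
    """Map community (folder path relative to `root`) -> set of entities.

    entity_files: dict entity -> list of file paths mentioning that entity.
    """
    communities = defaultdict(set)
    for entity, files in entity_files.items():
        for file_path in files:
            folder = PurePosixPath(file_path).parent.relative_to(root)
            communities[str(folder)].add(entity)
    return dict(communities)

# Hypothetical files: Acme is mentioned in two different folders.
entity_files = {
    "Acme": [
        "memories/work/clients/acme/kickoff.md",
        "memories/decisions/2026-q2/renewal.md",
    ],
    "Zorp": ["memories/work/clients/zorp/notes.md"],
}

communities = communities_from_paths(entity_files)
```

Here Acme ends up in both work/clients/acme and decisions/2026-q2, which is the multi-membership case described above.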

When to run consolidate

Folders-as-communities works well up to a point. As the graph grows, it's worth combining the declared communities with discovered ones:

grail consolidate ./my-memory

This runs structural analysis over the graph and proposes:

  • discover_community: "These entities are densely connected but don't share a folder — they're an emergent community".
  • split_folder: "This folder has two distinct clusters — it'd be worth splitting".

You decide whether to accept or reject each proposal. Nothing mutates without your consent.
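The two proposal types can be pictured as a comparison between discovered clusters and declared folders. The sketch below is one possible reading of the descriptions above, not the logic `grail consolidate` actually runs: the function `propose`, its inputs, and the "more than one shared member" rule are assumptions.

```python
def propose(clusters, folders):
    """Compare discovered clusters with declared folders.

    clusters: list of sets of entities found by structural analysis.
    folders: dict folder -> set of entities filed there.
    Returns a list of (kind, payload) proposals; nothing is applied.
    """
    proposals = []
    # A cluster that no single folder contains is an emergent community.
    for cluster in clusters:
        shares_folder = any(cluster <= members for members in folders.values())
        if len(cluster) > 1 and not shares_folder:
            proposals.append(("discover_community", sorted(cluster)))
    # A folder holding two or more distinct clusters is a split candidate.
    for folder, members in folders.items():
        inside = [c & members for c in clusters if len(c & members) > 1]
        if len(inside) >= 2:
            proposals.append(("split_folder", folder))
    return proposals

# Toy data (made up for illustration).
clusters = [
    {"Acme", "Zorp", "Q2 pricing"},
    {"Python", "asyncio"},
    {"pytest", "CI"},
]
folders = {
    "work/clients/acme": {"Acme"},
    "work/clients/zorp": {"Zorp"},
    "decisions/2026-q2": {"Q2 pricing"},
    "personal/learning/python": {"Python", "asyncio", "pytest", "CI"},
}

proposals = propose(clusters, folders)
# The user stays in control: only explicitly accepted proposals go forward.
accepted = [p for p in proposals if p[0] == "split_folder"]
```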

Technical details

Config                        | Default | What for
community.max_cluster_size    | 10      | Upper bound on cluster size per level
community.use_lcc             | true    | Use only the largest connected component
community.min_community_size  | 1       | Smaller communities get dropped
community.embedding_merge_eps | 0.95    | DBSCAN threshold to merge near-duplicate entities before Leiden
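To give a feel for the near-duplicate merge, here is a simplified sketch. It assumes that embedding_merge_eps is a cosine-similarity threshold (the table does not say whether 0.95 is a similarity or a distance), and it uses union-find instead of a DBSCAN library call. With a minimum cluster size of one, DBSCAN clusters are the connected components of the "within threshold" graph, which is what this code computes. The vectors are invented for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def merge_near_duplicates(embeddings, eps=0.95):
    """Group entity names whose embeddings are linked by similarity >= eps."""
    names = list(embeddings)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= eps:
                parent[find(b)] = find(a)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())

# Invented 2-d vectors: two spellings of one entity and one unrelated entity.
embeddings = {
    "FONASA": [1.0, 0.0],
    "Fonasa": [0.99, 0.05],
    "GES": [0.0, 1.0],
}

merged = merge_near_duplicates(embeddings)
```

The pairwise loop is quadratic in the number of entities, so this is only meant to show the idea on small inputs.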

See the repo's docs/glossary.md for the full list.
