Knowledge Graph
Sibyl stores knowledge in a graph database, enabling rich relationships between entities and semantic search. This guide explains how the graph works.
Architecture Overview
Sibyl uses a two-database architecture:
| Database | Purpose | Technology |
|---|---|---|
| Graph DB | Entities, relationships, embeddings | FalkorDB |
| Relational DB | Users, sessions, API keys, crawled docs | PostgreSQL |
FalkorDB
FalkorDB is a Redis-compatible graph database that Sibyl uses via Graphiti, a Graph RAG framework. It provides:
- Cypher queries - Graph query language
- Vector search - Native embedding support
- High performance - Redis-speed operations
Graphiti Integration
Graphiti is the Graph RAG layer that powers Sibyl's knowledge operations:
# Sibyl uses Graphiti's node APIs
from graphiti_core.nodes import EntityNode, EpisodicNode
from graphiti_core.search.search_config_recipes import NODE_HYBRID_SEARCH_RRF
Node Types
Graphiti creates two fundamental node types in the graph:
Episodic Nodes
Created by add_episode() - temporal knowledge snapshots:
# When you add knowledge via the CLI or MCP
sibyl add "Redis insight" "Connection pool must be >= concurrent requests"
# Creates an EpisodicNode with entity_type property
Entity Nodes
Extracted entities or directly inserted nodes:
# Created via EntityNode.save() for structured data
# Tasks, projects, patterns with specific schemas
Query Both Node Types
When querying the graph, you must handle both node types:
MATCH (n)
WHERE (n:Episodic OR n:Entity) AND n.entity_type = $type
RETURN n
Entity Types
Sibyl supports many entity types (see Entity Types for full details):
| Type | Description |
|---|---|
| episode | Temporal learnings, discoveries |
| pattern | Reusable coding patterns |
| rule | Sacred constraints, invariants |
| task | Work items with workflow |
| project | Container for tasks/epics |
| epic | Feature-level grouping |
| document | Crawled content |
| source | Documentation sources |
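The type names in the table above map naturally onto an enumeration. The following is a minimal sketch of that idea, not Sibyl's actual definition: the class body and the `parse_entity_type` helper are assumptions made for illustration, and only the eight values come from the table.

```python
from enum import Enum


class EntityType(str, Enum):
    """Hypothetical mirror of the entity types listed in the table."""

    EPISODE = "episode"
    PATTERN = "pattern"
    RULE = "rule"
    TASK = "task"
    PROJECT = "project"
    EPIC = "epic"
    DOCUMENT = "document"
    SOURCE = "source"


def parse_entity_type(raw: str) -> EntityType:
    """Normalize user input and fail loudly on unknown types (illustrative helper)."""
    try:
        return EntityType(raw.strip().lower())
    except ValueError:
        known = ", ".join(t.value for t in EntityType)
        raise ValueError(f"unknown entity type {raw!r}; expected one of: {known}") from None
```

Validating the type before it reaches a query keeps a typo such as `"tasks"` from silently matching nothing in an `entity_type` filter.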
Relationships
Entities connect through typed relationships:
Knowledge Relationships
| Type | Usage |
|---|---|
| APPLIES_TO | Pattern applies to topic |
| REQUIRES | A requires B |
| CONFLICTS_WITH | Mutual exclusion |
| SUPERSEDES | A replaces B |
| RELATED_TO | Generic relationship |
| ENABLES | A enables B |
| BREAKS | A breaks B |
Task Relationships
| Type | Usage |
|---|---|
| BELONGS_TO | Task -> Project, Epic -> Project |
| DEPENDS_ON | Task -> Task (blocking) |
| BLOCKS | Task -> Task (inverse) |
| ASSIGNED_TO | Task -> Person |
| REFERENCES | Task -> Pattern/Rule |
Document Relationships
| Type | Usage |
|---|---|
| CRAWLED_FROM | Document -> Source |
| CHILD_OF | Document hierarchy |
| MENTIONS | Document -> Entity |
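To illustrate how typed relationships can be followed across several hops, here is a small in-memory sketch. It is not Sibyl's implementation and does not talk to FalkorDB: the `Edge` tuple, the `related` function, and the sample IDs are all hypothetical, and only the relationship type names come from the tables above.

```python
from collections import deque
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    rel_type: str
    target: str


def related(edges, start, rel_types, max_depth=1):
    """Breadth-first walk over outgoing edges whose type is in rel_types."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for edge in edges:
            if edge.source == node and edge.rel_type in rel_types and edge.target not in seen:
                seen.add(edge.target)
                found.append((edge.target, depth + 1))
                queue.append((edge.target, depth + 1))
    return found


# Hypothetical sample data using relationship types from the tables.
edges = [
    Edge("task_1", "BELONGS_TO", "proj_a"),
    Edge("task_1", "DEPENDS_ON", "task_2"),
    Edge("task_2", "DEPENDS_ON", "task_3"),
    Edge("task_1", "REFERENCES", "pattern_x"),
]
blockers = related(edges, "task_1", {"DEPENDS_ON"}, max_depth=2)
```

With `max_depth=2` the walk reaches `task_3` through `task_2`; with `max_depth=1` it stops at the direct dependency.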
Multi-Tenancy
Each organization gets its own isolated graph:
# Graph named by organization UUID
graph_name = str(org.id) # e.g., "550e8400-e29b-41d4-a716-446655440000"
# All operations require org context
manager = EntityManager(client, group_id=str(org.id))
Always Scope by Organization
Never query without org scope: it will hit the wrong graph or break isolation.
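One way to make the org-scope rule hard to violate is to validate the organization ID before deriving a graph name from it. The helper below is a sketch under that assumption; `graph_name_for` is a hypothetical name and is not part of Sibyl's API.

```python
import uuid


def graph_name_for(org_id) -> str:
    """Return the per-organization graph name, rejecting missing or malformed IDs."""
    if org_id is None or str(org_id).strip() == "":
        raise ValueError("org_id is required; refusing to build an unscoped graph name")
    # uuid.UUID raises ValueError for anything that is not a valid UUID string,
    # and str() of the result is the canonical lowercase hyphenated form.
    return str(uuid.UUID(str(org_id)))
```

Normalizing through `uuid.UUID` also means differently cased copies of the same ID resolve to the same graph name.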
Write Concurrency
FalkorDB requires serialized writes to prevent corruption. Sibyl uses a semaphore:
async with client.write_lock:
    await client.execute_write_org(org_id, query, **params)
The EntityManager handles this automatically for all write operations.
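The effect of serializing writes can be demonstrated without a database. The sketch below assumes the lock behaves like an `asyncio.Lock` (equivalent to a semaphore with a single permit); `SerializedWriter` is a hypothetical stand-in, not Sibyl's client, and the queries are only recorded, never executed.

```python
import asyncio


class SerializedWriter:
    """Hypothetical stand-in for a client that serializes writes with one lock."""

    def __init__(self) -> None:
        self.write_lock = asyncio.Lock()
        self.active = 0      # writers currently inside the critical section
        self.max_active = 0  # highest concurrency observed inside it
        self.log = []

    async def write(self, query: str) -> None:
        async with self.write_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            await asyncio.sleep(0)  # yield so other tasks get a chance to interleave
            self.log.append(query)
            self.active -= 1


async def main() -> SerializedWriter:
    writer = SerializedWriter()
    # Launch five writes concurrently; the lock lets only one proceed at a time.
    await asyncio.gather(*(writer.write(f"CREATE (n {{id: {i}}})") for i in range(5)))
    return writer


writer = asyncio.run(main())
```

Even though all five coroutines are scheduled together, `max_active` never exceeds one, which is the property the write lock is meant to guarantee.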
Hybrid Search
Search combines multiple techniques:
Vector Search
Embeddings generated by OpenAI's embedding model enable semantic similarity:
# Search uses NODE_HYBRID_SEARCH_RRF recipe
results = await client.search_(
    query=query,
    config=NODE_HYBRID_SEARCH_RRF,
    group_ids=[org_id],
)
BM25 Search
Keyword-based scoring for exact matches:
# RediSearch provides BM25 scoring
# Combined with vector search via RRF fusion
Reciprocal Rank Fusion (RRF)
Combines vector and keyword results:
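As a rough illustration of rank fusion, the sketch below merges two ranked lists of IDs using the RRF score given in this section. It is a standalone example, not Graphiti's implementation: the function name `rrf_fuse`, the sample hit lists, and the choice of `k = 60` (a commonly used default that the source does not specify) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists: score(id) = sum over rankings of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical result lists from the two retrievers.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = rrf_fuse([vector_hits, keyword_hits])
```

Items that appear in both lists accumulate two terms, so `b` and `a` outrank `d` and `c`, which each appear in only one list.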
RRF_score = sum(1 / (k + rank_i)) for each ranking
Here rank_i is the item's position in the i-th ranking and k is a smoothing constant.
Entity Storage
Entities store metadata as JSON in the metadata property:
# Core properties stored directly
n.uuid # Entity ID
n.name # Display name
n.entity_type # Type enum value
n.content # Full content
n.description # Summary
# Extended properties in metadata JSON
n.metadata = {
    "status": "doing",
    "priority": "high",
    "project_id": "proj_abc",
    "tags": ["backend", "auth"],
    ...
}
Graph Creation Paths
LLM-Powered (Slower, Richer)
# Uses Graphiti's entity extraction
await manager.create(entity)
- Analyzes content with an LLM
- Extracts additional entities
- Creates relationships automatically
- Best for unstructured knowledge
Direct Insertion (Faster)
# Bypasses LLM, uses EntityNode directly
await manager.create_direct(entity)
- Creates node immediately
- Generates embedding inline
- Best for structured entities (tasks, projects)
- Used for bulk operations
Querying the Graph
Using EntityManager
from sibyl_core.graph import GraphClient, EntityManager
client = await GraphClient.create()
manager = EntityManager(client, group_id=str(org_id))
# Search
results = await manager.search("OAuth patterns", limit=10)
# Get by ID
entity = await manager.get("entity_abc")
# List by type
tasks = await manager.list_by_type(
    EntityType.TASK,
    status="todo",
    project_id="proj_123",
)
Using RelationshipManager
from sibyl_core.graph import RelationshipManager
rel_manager = RelationshipManager(client, group_id=str(org_id))
# Get related entities
related = await rel_manager.get_related_entities(
    entity_id="pattern_abc",
    relationship_types=[RelationshipType.APPLIES_TO],
    max_depth=2,
)
Direct Cypher Queries
For complex queries, use Cypher directly:
result = await driver.execute_query(
    """
    MATCH (t:Entity)-[:BELONGS_TO]->(p:Entity)
    WHERE t.entity_type = 'task'
      AND p.uuid = $project_id
      AND t.status = 'doing'
    RETURN t.uuid, t.name, t.status
    """,
    project_id="proj_abc",
)
Best Practices
1. Always Use Org Context
# WRONG
manager = EntityManager(client, group_id="")
# RIGHT
manager = EntityManager(client, group_id=str(org.id))
2. Handle Both Node Labels
-- WRONG (misses Episodic nodes)
MATCH (n:Entity) WHERE n.entity_type = 'pattern'
-- RIGHT
MATCH (n) WHERE (n:Episodic OR n:Entity) AND n.entity_type = 'pattern'
3. Use Write Lock for Mutations
# EntityManager handles this, but for direct queries:
async with client.write_lock:
    await driver.execute_query("CREATE ...")
4. Filter Early in Queries
-- WRONG (fetches all, filters in Python)
MATCH (n) RETURN n
-- RIGHT (filters in DB)
MATCH (n)
WHERE n.entity_type = $type AND n.group_id = $org_id
RETURN n
LIMIT 100
Troubleshooting
Graph Corruption
If you encounter corruption errors:
# Connect to FalkorDB
redis-cli -p 6380
# Delete the corrupted graph
GRAPH.DELETE <org-uuid>
Slow Queries
- Add indexes for frequently queried properties
- Limit result sets
- Use specific node labels when possible
Missing Results
- Check both Episodic and Entity labels
- Verify org_id matches
- Check if entity_type filter is correct
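The three checks above can be folded into one query builder so that the label test, the org scope, and the type filter are never forgotten. This is a sketch only: `build_type_query` is a hypothetical helper, and the `group_id` and `entity_type` property names are taken from the queries shown earlier in this guide.

```python
def build_type_query(entity_type: str, org_id: str, limit: int = 100):
    """Return a parameterized Cypher query string and its parameter dict."""
    if not org_id:
        raise ValueError("org_id is required")
    if not entity_type:
        raise ValueError("entity_type is required")
    if limit <= 0:
        raise ValueError("limit must be positive")
    query = (
        "MATCH (n) "
        "WHERE (n:Episodic OR n:Entity) "  # cover both node labels
        "AND n.entity_type = $type AND n.group_id = $org_id "
        f"RETURN n LIMIT {int(limit)}"  # limit is validated and coerced to int
    )
    return query, {"type": entity_type, "org_id": org_id}


query, params = build_type_query(
    "pattern", "550e8400-e29b-41d4-a716-446655440000", limit=25
)
```

Passing the type and org ID as parameters rather than interpolating them keeps user-supplied values out of the query text.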
Next Steps
- Entity Types - All available entity types
- Semantic Search - Search in detail
- Multi-Tenancy - Organization scoping
