Fabric is four layers on top of one Postgres instance. Connectors feed an ingestion pipeline; the pipeline populates a knowledge graph, a vector store, and a memory graph; those power a unified query layer for chat, the REST API, and the graph explorer.
One database. No Elasticsearch to sync. No Neo4j to maintain. No vector DB sidecar. pgvector, tsvector, and recursive CTEs handle everything.
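The recursive-CTE claim is easy to demonstrate: multi-hop graph traversal over an edge table is plain SQL. A minimal sketch, using sqlite3 as a stand-in for Postgres (both support recursive CTEs) and a toy `graph_edges` table whose columns are assumed from the index list later on this page:

```python
import sqlite3

# Toy graph_edges table; column names (source_id, target_id, relation)
# are assumed from the storage-layer index list below.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE graph_edges (source_id INT, target_id INT, relation TEXT)")
con.executemany(
    "INSERT INTO graph_edges VALUES (?, ?, ?)",
    [(1, 2, "mentions"), (2, 3, "sent_by"), (3, 4, "works_at")],
)

# Walk up to 3 hops out from node 1.
rows = con.execute("""
    WITH RECURSIVE hops(node_id, depth) AS (
        SELECT target_id, 1 FROM graph_edges WHERE source_id = 1
        UNION
        SELECT e.target_id, h.depth + 1
        FROM graph_edges e JOIN hops h ON e.source_id = h.node_id
        WHERE h.depth < 3
    )
    SELECT node_id, depth FROM hops ORDER BY depth
""").fetchall()
print(rows)  # [(2, 1), (3, 2), (4, 3)]
```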

Data flow

Connectors → Ingestion → Storage → Query, all on Postgres

Connectors: Gmail, Drive, Slack, Fireflies, IMAP, Postgres, MySQL
Ingestion: Extract content · Chunk + embed · Extract graph · Extract memory
Storage: graph_nodes · graph_edges · rag_chunks · observations · connections
Query: Chat (SSE) · REST API · Graph UI

Connectors layer

Each connector is a Python adapter in backend/sources/ that implements three operations:

Discover

Enumerate what’s available — folders, channels, mailboxes, schemas.

Sync

Incremental fetch by cursor (message ID, file modification time, thread timestamp). Each connector persists its own cursor so runs only pull deltas.

Normalize

Turn raw source data into graph nodes and typed edges with metadata.
Connectors run on a schedule via Celery Beat.
Source             | Auth                    | Sync granularity
Gmail              | OAuth 2.0               | Per message / thread
Google Drive       | OAuth 2.0               | Per file modification time
Slack              | OAuth 2.0               | Per channel timestamp
Fireflies          | OAuth 2.0               | Per transcript ID
IMAP               | Credentials (encrypted) | Per message UID
PostgreSQL / MySQL | Credentials (encrypted) | On-demand (schema + query exec)

Ingestion pipeline

Extract content: strip formatting, unwrap HTML, handle attachments, normalize to plain text. The content adapter produces a SourceItem per content unit, with stable IDs for deduplication across syncs.
Each stage is independently re-runnable. Regenerate embeddings without rebuilding the graph. Re-extract observations without re-chunking. The pipeline is checkpointed per-item.

Storage layer

One Postgres database. Extensions: pgvector for embeddings, built-in tsvector for full-text search.
Table         | Role                                       | Key indexes
graph_nodes   | Typed entities extracted from content      | HNSW(embedding), GIN(search_vector), btree(source_date)
graph_edges   | Typed relationships with weight + timestamp | btree(source_id), btree(target_id), btree(relation)
rag_chunks    | Text chunks for retrieval                  | HNSW(embedding), GIN(search_vector)
observations  | Typed memory facts with importance decay   | HNSW(embedding), GIN(search_vector), btree(importance)
connections   | Per-tenant connector credentials (AES-256) | btree(tenant_id)
chat_sessions | Conversation history + SDK session mapping | btree(tenant_id, updated_at)
Every queryable table has tenant_id and project_id columns. Tenant isolation is enforced at query time — there’s no query path that doesn’t filter on tenant.
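The pattern looks roughly like this. The SQL below is illustrative, not the actual query: it assumes an `embedding`/`search_vector`/`content` column layout on `rag_chunks` and uses pgvector's `<=>` distance operator and Postgres `ts_rank`/`plainto_tsquery`. The tenant and project parameters are required arguments, so no call site can build the query without them:

```python
# Illustrative hybrid-retrieval query over rag_chunks (column names assumed).
HYBRID_SEARCH = """
SELECT id, content,
       1 - (embedding <=> %(query_vec)s::vector)                 AS vec_score,
       ts_rank(search_vector, plainto_tsquery(%(query_text)s))   AS text_score
FROM rag_chunks
WHERE tenant_id = %(tenant_id)s
  AND project_id = %(project_id)s
ORDER BY vec_score DESC
LIMIT 20
"""


def search_params(tenant_id: str, project_id: str,
                  query_vec: list[float], query_text: str) -> dict:
    """Tenant and project are positional and mandatory: the filter
    cannot be omitted by any caller."""
    return {"tenant_id": tenant_id, "project_id": project_id,
            "query_vec": query_vec, "query_text": query_text}
```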

Query layer

Chat

A Claude Agent SDK session with a tool surface — search_knowledge, follow_edges, query_database, http_request, and more. Streams as SSE with per-token deltas, tool calls, and cost accounting.
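The SSE wire format itself is simple. A minimal sketch of the framing, with hypothetical event names and payloads (the real event schema is not documented here):

```python
import json


def sse_event(event: str, data: dict) -> str:
    """Serialize one server-sent-event frame: event name + JSON payload,
    terminated by a blank line per the SSE spec."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_demo():
    """Hypothetical stream: token deltas, a tool call, then a cost summary."""
    yield sse_event("delta", {"text": "Hel"})
    yield sse_event("delta", {"text": "lo"})
    yield sse_event("tool_call", {"name": "search_knowledge", "args": {"q": "fabric"}})
    yield sse_event("done", {"cost_usd": 0.0042})
```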

REST API

72+ endpoints in 10 modules. FastAPI auto-generates OpenAPI docs at /docs (Swagger) and /redoc. Bearer token auth; API keys for programmatic access.

Graph Explorer

React SPA that renders graph_nodes and graph_edges via force-directed layout. Debug extraction, audit relationships, explore what Fabric has learned.

Event-driven processing

Heavy work — sync, extraction, embedding, agent execution — runs in Celery workers. The API never blocks on long operations.
  • Sync tasks fire on Celery Beat schedules configured per connection.
  • Extraction and embedding chain per item so they can be scaled and retried independently.
  • Agent runs execute SDK sessions inside worker processes with session persistence on a workspace filesystem.
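The per-item chaining above can be modeled with plain functions. This is a toy stand-in for what Celery's chains and retry policies provide, to show why independent retries matter: a flaky embed step retries on its own without re-running extraction:

```python
from typing import Any, Callable


def run_chain(item: Any, steps: list[Callable], max_retries: int = 2) -> Any:
    """Run steps in order; each step retries independently, so an earlier
    step's work is never redone when a later step fails transiently.
    (Toy model; the real system uses Celery task chains.)"""
    result = item
    for step in steps:
        for attempt in range(max_retries + 1):
            try:
                result = step(result)
                break
            except Exception:
                if attempt == max_retries:
                    raise
    return result
```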

Observability

Langfuse traces every LLM call, embedding operation, and tool invocation. Per-query cost surfaces in the chat UI. Structured logs with tenant_id + project_id context. Prometheus-compatible metrics at /metrics.
Most competitors are black boxes. Fabric shows its work.

Deployment

docker compose up
Brings up Postgres, Redis, API, worker, and frontend. Useful for dev or single-user deployments.

How It Works

The engineering behind each layer.

Knowledge Graph

Typed edges, multi-hop traversal, SQL you can run directly.