# Architecture

> Four layers on top of one Postgres instance (connectors, ingestion, storage, query), all tenant-scoped, all inspectable.

Fabric is four layers on top of **one Postgres instance**. Connectors feed an ingestion pipeline; the pipeline populates a knowledge graph, a vector store, and a memory graph; those power a unified query layer for chat, the REST API, and the graph explorer.

**One database.** No Elasticsearch to sync. No Neo4j to maintain. No vector DB sidecar. pgvector, tsvector, and recursive CTEs handle everything.

## Data flow

Connectors → Ingestion → Storage → Query, all on Postgres

* **Connectors:** Gmail, Drive, Slack, Fireflies, IMAP, Postgres, MySQL
* **Ingestion:** Extract content · Chunk + embed · Extract graph · Extract memory
* **Storage:** `graph_nodes` · `graph_edges` · `rag_chunks` · `observations` · `connections`
* **Query:** Chat (SSE) · REST API · Graph UI

## Connectors layer

Each connector is a Python adapter in `backend/sources/` that implements three operations:

* Enumerate what's available: folders, channels, mailboxes, schemas.
* Fetch incrementally by cursor (message ID, file modification time, thread timestamp). Each connector persists its own cursor so runs only pull deltas.
* Turn raw source data into graph nodes and typed edges with metadata.

Connectors run on a schedule via Celery Beat. Each tracks its own sync state so runs fetch only deltas.

| Source                 | Auth                    | Sync granularity                |
| ---------------------- | ----------------------- | ------------------------------- |
| **Gmail**              | OAuth 2.0               | Per message / thread            |
| **Google Drive**       | OAuth 2.0               | Per file modification time      |
| **Slack**              | OAuth 2.0               | Per channel timestamp           |
| **Fireflies**          | OAuth 2.0               | Per transcript ID               |
| **IMAP**               | Credentials (encrypted) | Per message UID                 |
| **PostgreSQL / MySQL** | Credentials (encrypted) | On-demand (schema + query exec) |

## Ingestion pipeline

* **Extract content.** Strip formatting, unwrap HTML, handle attachments, normalize to plain text. The content adapter produces a `SourceItem` per content unit with stable IDs for deduplication across syncs.
* **Extract graph.** The adapter emits typed graph nodes (`person`, `domain`, `email_thread`, `slack_message`, `meeting`, `file`, `folder`) and typed edges (`sent_by`, `attended`, `replied_to`, etc.) that upsert into `graph_nodes` and `graph_edges`.
* **Chunk + embed.** Content splits on structural boundaries and embeds via OpenAI `text-embedding-3-small` (1536 dims). Chunks land in `rag_chunks` with HNSW indexes and `tsvector` columns for BM25.
* **Extract memory.** On conversations only. Claude Haiku extracts typed observations from question-answer pairs. Observations land in `observations` with embeddings and initial importance 0.5.
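The stable-ID deduplication in the content-extraction step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Fabric's actual code: only the name `SourceItem` and the idea of stable IDs come from the text above. The field names, the `stable_id` and `make_item` helpers, the choice to hash tenant, source, and external ID together, and the in-memory `upsert` store are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


def stable_id(tenant_id: str, source: str, external_id: str) -> str:
    """Derive a deterministic ID from fields that do not change between syncs.

    Assumption: tenant, source name, and the source's own ID identify an item.
    """
    key = "\x1f".join((tenant_id, source, external_id))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:32]


@dataclass(frozen=True)
class SourceItem:
    # Hypothetical shape; the real SourceItem may carry different fields.
    id: str
    source: str
    external_id: str
    text: str


def make_item(tenant_id: str, source: str, external_id: str, text: str) -> SourceItem:
    return SourceItem(stable_id(tenant_id, source, external_id), source, external_id, text)


def upsert(store: dict, item: SourceItem) -> bool:
    """Insert or overwrite by stable ID. Returns True if the item was new."""
    is_new = item.id not in store
    store[item.id] = item
    return is_new


store = {}
first = make_item("tenant-a", "gmail", "msg-123", "Quarterly numbers attached.")
again = make_item("tenant-a", "gmail", "msg-123", "Quarterly numbers attached (edited).")
other = make_item("tenant-b", "gmail", "msg-123", "Same message ID, different tenant.")

# Re-syncing the same message overwrites instead of duplicating;
# the same external ID under another tenant stays a separate item.
results = [upsert(store, first), upsert(store, again), upsert(store, other)]
# results == [True, False, True]
```

Because the ID depends only on identifying fields and not on the content, a re-synced item whose text changed still maps to the same row, which is what makes an upsert safe to repeat on every run.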
Each stage is independently re-runnable. Regenerate embeddings without rebuilding the graph. Re-extract observations without re-chunking. The pipeline is checkpointed per item.

## Storage layer

One Postgres database. Extensions: `pgvector` for embeddings, built-in `tsvector` for full-text search.

| Table           | Role                                        | Key indexes                                               |
| --------------- | ------------------------------------------- | --------------------------------------------------------- |
| `graph_nodes`   | Typed entities extracted from content       | HNSW(embedding), GIN(search\_vector), btree(source\_date) |
| `graph_edges`   | Typed relationships with weight + timestamp | btree(source\_id), btree(target\_id), btree(relation)     |
| `rag_chunks`    | Text chunks for retrieval                   | HNSW(embedding), GIN(search\_vector)                      |
| `observations`  | Typed memory facts with importance decay    | HNSW(embedding), GIN(search\_vector), btree(importance)   |
| `connections`   | Per-tenant connector credentials (AES-256)  | btree(tenant\_id)                                         |
| `chat_sessions` | Conversation history + SDK session mapping  | btree(tenant\_id, updated\_at)                            |

Every queryable table has `tenant_id` and `project_id` columns. Tenant isolation is enforced at query time: there's no query path that doesn't filter on tenant.

## Query layer

* **Chat (SSE).** A Claude Agent SDK session with a tool surface: `search_knowledge`, `follow_edges`, `query_database`, `http_request`, and more. Streams as SSE with per-token deltas, tool calls, and cost accounting.
* **REST API.** 72+ endpoints in 10 modules. FastAPI auto-generates OpenAPI docs at `/docs` (Swagger) and `/redoc`. Bearer token auth; API keys for programmatic access.
* **Graph UI.** React SPA that renders `graph_nodes` and `graph_edges` via force-directed layout. Debug extraction, audit relationships, explore what Fabric has learned.

## Event-driven processing

Heavy work (sync, extraction, embedding, agent execution) runs in Celery workers. The API never blocks on long operations.

* **Sync tasks** fire on Celery Beat schedules configured per connection.
* **Extraction and embedding** chain per item so they can be scaled and retried independently.
* **Agent runs** execute SDK sessions inside worker processes with session persistence on a workspace filesystem.

## Observability

Langfuse traces every LLM call, embedding operation, and tool invocation. Per-query cost surfaces in the chat UI. Structured logs with `tenant_id` + `project_id` context. Prometheus-compatible metrics at `/metrics`.

Most competitors are black boxes. Fabric shows its work.

## Deployment

```bash
docker compose up
```

Brings up Postgres, Redis, API, worker, and frontend. Useful for dev or single-user deployments.

On AWS: ECS Fargate for API and worker, RDS Postgres, ElastiCache Redis, ALB fronting the API, CloudFront fronting the S3-hosted frontend. Secrets in SSM Parameter Store, injected at container start. Infrastructure-as-code via AWS Copilot manifests in `copilot/`. Deployment script is `scripts/deploy.sh`.