Fabric stacks four layers that build on each other. Each is a moat on its own. Together they produce cross-source, time-aware, source-cited answers that no single layer can produce alone.

Layer 1 — The Knowledge Graph

As content syncs from your connectors, an extraction pipeline identifies entities and relationships. Entities become typed nodes. Relationships become typed edges with weights and timestamps.

Edge types extracted today

| Relation | Meaning | Example |
| --- | --- | --- |
| sent_by | Message authored by a person | Slack message → user who posted it |
| replied_to | Response relationship | Email → the email it replies to |
| in_thread | Message belongs to a thread | Email → its parent thread node |
| posted_in | Message posted in a channel | Slack message → #billing channel |
| attended | Person attended a meeting | Fireflies transcript → person |
| organized_by | Person organized the meeting | Meeting → organizer |
| participant | Person involved in a thread | Email thread → each participant |
| in_folder | File lives in a folder | Drive file → parent folder |
| from_domain | Person’s email domain | Person → domain node |
| has_email | Person → their email address | Person → person:email@company.com |

Why typed edges

A vector store knows "Cole Smith" is near "Project Phoenix" in embedding space. It doesn’t know why. The graph knows Cole attended the Phoenix kickoff, authored (sent_by) three emails about the launch, and replied_to the legal review thread on April 7.
Typed edges turn “find similar text” into “reason about who, when, and why.”
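To make that concrete, here is a toy in-memory version: typed edges as tuples, and a helper that answers the "why" question by listing every edge touching a person. The node names and the `(source, relation, target, timestamp)` tuple shape are illustrative, not Fabric's actual schema.

```python
from datetime import date

# Illustrative edge list; not Fabric's storage format.
edges = [
    ("cole", "attended", "phoenix-kickoff", date(2024, 3, 12)),
    ("email-1", "sent_by", "cole", date(2024, 4, 1)),
    ("email-2", "replied_to", "legal-review", date(2024, 4, 7)),
    ("email-2", "sent_by", "cole", date(2024, 4, 7)),
]

def explain(person):
    """List every typed edge touching a person: the graph's answer to 'why'."""
    out = []
    for source, relation, target, ts in edges:
        if person in (source, target):
            other = target if source == person else source
            out.append((relation, other, ts))
    return out

print(explain("cole"))
```

A vector store can only say these nodes are nearby; the edge list carries the relation type and the timestamp alongside each connection.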

Multi-hop traversal

The reasoning loop walks up to 2–3 hops out from a seed node via a recursive CTE; this example caps traversal at two hops:
WITH RECURSIVE reachable AS (
  -- Anchor: the seed node at depth 0
  SELECT id, 0 AS depth FROM graph_nodes WHERE id = $seed
  UNION
  -- Step: follow edges in either direction, one hop deeper
  SELECT n.id, r.depth + 1
  FROM reachable r
  JOIN graph_edges e ON e.source_id = r.id OR e.target_id = r.id
  JOIN graph_nodes n ON n.id = CASE WHEN e.source_id = r.id
                                    THEN e.target_id ELSE e.source_id END
  WHERE r.depth < 2   -- cap at two hops out from the seed
)
SELECT * FROM reachable;
The graph lives in Postgres. You can SELECT against it, join it to your operational data, and inspect it in any Postgres client.
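For intuition, the same traversal can be sketched in plain Python as a breadth-first walk over an undirected edge list. This is an illustrative stand-in for the CTE, not Fabric's code; node names are made up.

```python
from collections import deque

def reachable(edges, seed, max_depth=2):
    """Breadth-first walk out from a seed node, mirroring the recursive CTE:
    edges are undirected pairs; each node is tagged with its hop depth."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if seen[node] == max_depth:
            continue
        for nxt in adj.get(node, ()):
            if nxt not in seen:  # dedupe, as UNION does in the CTE
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen

edges = [("cole", "phoenix-kickoff"), ("phoenix-kickoff", "alice"),
         ("alice", "budget-doc")]
print(reachable(edges, "cole"))  # → {'cole': 0, 'phoenix-kickoff': 1, 'alice': 2}
```

Note that "budget-doc" sits three hops out, so it falls outside the default two-hop frontier, exactly as the `r.depth < 2` guard works in the SQL.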

Layer 2 — Fused Retrieval via Reciprocal Rank Fusion

Every query runs two rankers in parallel over the graph_nodes and observations tables:

BM25 keyword relevance

Postgres full-text search with ts_rank_cd cover-density ranking on tsvector columns. Title weighted A, body weighted B. Parsed via websearch_to_tsquery for safe handling of punctuation.

Vector similarity

pgvector HNSW indexes with cosine distance. Embedding model: OpenAI text-embedding-3-small (1536 dimensions).
Results fuse by rank position via Reciprocal Rank Fusion with k = 60:
rrf_score = 1 / (60 + vec_rank) + 1 / (60 + bm25_rank)
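The fusion step is small enough to show directly. A minimal sketch of the formula above, fusing two ranked lists of document ids (the ids and lists here are illustrative):

```python
def rrf_fuse(vec_ranked, bm25_ranked, k=60):
    """Fuse two ranked id lists by reciprocal rank position.
    A document missing from one list contributes 0 from that ranker."""
    scores = {}
    for ranked in (vec_ranked, bm25_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# 'b' ranks well on both lists, so it rises above single-list winners.
print(rrf_fuse(["a", "b", "c"], ["b", "c", "d"]))  # → ['b', 'c', 'a', 'd']
```

Only the rank positions enter the score, which is what makes the fusion immune to the score-scale mismatch discussed below.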

Why RRF and not a weighted sum

Cosine similarity is bounded [0, 1] with most matches ~0.3–0.7. ts_rank_cd is unbounded and usually 0.01–0.3. Adding them with fixed weights means vector dominates almost every query — keyword matches on rare terms (names, table identifiers, proper nouns) get crowded out.
RRF sidesteps this entirely. It uses each ranker’s opinion about ordering, not the raw score. Documents that rank well on both lists naturally rise to the top. k = 60 is the standard constant from Cormack, Clarke & Büttcher (2009); it dampens the weight of top ranks slightly so a single ranker’s #1 doesn’t automatically win.
Vespa does it this way. Elasticsearch’s hybrid search does it this way. We do it this way.

Implementation sketch

async def search_nodes(tenant, project, query, embedding=None, limit=20):
    """Return graph_nodes ordered by RRF(vec_sim, bm25)."""
    # embedding must arrive in a pgvector-compatible form
    # (a registered codec or a '[0.1,0.2,...]' string literal).
    rows = await pool.fetch("""
      WITH candidates AS (
        SELECT *,
               1 - (embedding <=> $4::vector) AS vec_sim,
               ts_rank_cd(search_vector, websearch_to_tsquery('english', $3))
                 AS bm25
        FROM graph_nodes
        WHERE tenant_id = $1 AND project_id = $2
          AND (
            (embedding IS NOT NULL AND 1 - (embedding <=> $4::vector) > 0.2)
            OR (search_vector @@ websearch_to_tsquery('english', $3))
          )
      ),
      vec_ranked  AS (SELECT id, ROW_NUMBER() OVER (ORDER BY vec_sim DESC) r FROM candidates),
      bm25_ranked AS (SELECT id, ROW_NUMBER() OVER (ORDER BY bm25    DESC) r FROM candidates)
      SELECT c.*,
             COALESCE(1.0/(60 + v.r), 0) + COALESCE(1.0/(60 + b.r), 0) AS rrf
      FROM candidates c
      LEFT JOIN vec_ranked  v ON c.id = v.id
      LEFT JOIN bm25_ranked b ON c.id = b.id
      ORDER BY rrf DESC
      LIMIT $5
    """, tenant, project, query, embedding, limit)
    return rows

Layer 3 — Semantic Memory with Decay

Every user conversation produces observations — typed facts extracted by Claude Haiku from question-answer pairs:
| Type | Meaning |
| --- | --- |
| fact | Stable truths about people, systems, preferences |
| decision | Choices made and their rationale |
| commitment | Things someone said they’d do, with a deadline |
| risk | Concerns or blockers that were flagged |
| insight | Analytical conclusions drawn from data |
| pattern | Recurring behaviors or practices |

Importance math

1. Initial score: 0.5. Every observation starts at importance 0.5.
2. Strengthened on reference: × 1.1. If the observation is pulled into a later conversation, importance multiplies by 1.1 (capped at 1.0).
3. Decayed when unused: × 0.9. Per conversation it’s not referenced in, importance multiplies by 0.9.
4. Pruned below 0.05. Observations that decay past 0.05 are removed.
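The lifecycle can be simulated in a few lines. The constants come from the rules above; the loop itself is an illustrative sketch, not Fabric's implementation.

```python
IMPORTANCE_INIT, STRENGTHEN, DECAY, PRUNE_AT = 0.5, 1.1, 0.9, 0.05

def step(importance, referenced):
    """Apply one conversation's worth of the importance rules.
    Returns None once the observation decays past the prune threshold."""
    if referenced:
        importance = min(1.0, importance * STRENGTHEN)  # strengthen, capped at 1.0
    else:
        importance = importance * DECAY                 # decay when unused
    return importance if importance >= PRUNE_AT else None

# An observation referenced once, then never again:
imp, age = IMPORTANCE_INIT, 0
imp = step(imp, referenced=True)   # 0.5 * 1.1 = 0.55
while imp is not None:
    imp, age = step(imp, referenced=False), age + 1
print(age)  # conversations of disuse before pruning
```

With these constants, a once-referenced observation survives a few dozen conversations of disuse before it is pruned, so memory fades gradually rather than vanishing.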

Co-occurrence edges

When multiple observations are retrieved together enough times, a weighted edge forms between them. Over time, the memory graph encodes not just what Fabric knows but what knowledge travels together.
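A minimal sketch of co-occurrence counting, assuming a fixed threshold of 3 joint retrievals before an edge forms; the threshold and observation ids are illustrative, since the text above doesn't specify them.

```python
from collections import Counter
from itertools import combinations

CO_OCCURRENCE_THRESHOLD = 3  # assumption: the real threshold isn't specified above

counts = Counter()

def record_retrieval(observation_ids):
    """Count every unordered pair of observations retrieved together."""
    for pair in combinations(sorted(observation_ids), 2):
        counts[pair] += 1

for _ in range(3):
    record_retrieval(["obs-churn", "obs-pricing"])
record_retrieval(["obs-pricing", "obs-hiring"])

# Pairs past the threshold become weighted edges in the memory graph.
edges = {pair: n for pair, n in counts.items() if n >= CO_OCCURRENCE_THRESHOLD}
print(edges)  # → {('obs-churn', 'obs-pricing'): 3}
```

The count doubles as the edge weight, so knowledge that travels together is retrieved together more readily over time.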

Grounded in the knowledge graph

Every observation points back at the source content — the email thread, the meeting transcript, the Slack message where the fact originated. This is the difference between mem0 (floating memories with no provenance) and Fabric (facts with citations).

Layer 4 — Databases as First-Class Citizens

Fabric connects directly to PostgreSQL and MySQL. Not API wrappers — real connections with schema discovery.

Connect

Provide credentials once. Stored encrypted per-tenant with AES-256.

Discover

Fabric introspects the schema: tables, columns, types, primary keys, foreign keys. The schema becomes queryable context for the agent.

Query

Ask a natural-language question. Fabric generates SQL against your actual schema, executes it via asyncpg or aiomysql, and returns results in chat with the query visible for auditing.

Example: cross-source join

Q: Pull the top 10 customers by revenue from public.customers who opened a support ticket in the last 7 days, and show me any Slack #support threads that mention them.
Fabric generates:
SELECT c.name, c.revenue
FROM public.customers c
JOIN public.support_tickets t ON t.customer_id = c.id
WHERE t.created_at > NOW() - INTERVAL '7 days'
ORDER BY c.revenue DESC LIMIT 10;
Executes against your Postgres. Then searches the graph for #support Slack threads whose content matches any of those customer names. Returns a unified result with both.
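The second half of that flow, matching customer names against thread text, can be sketched as a simple word-boundary match. In Fabric the real lookup goes through the fused retrieval layer, so this is only an illustration; the thread and customer names are made up.

```python
import re

def threads_mentioning(customer_names, threads):
    """Map each customer name to the thread ids whose text mentions it,
    using whole-word, case-insensitive matching."""
    hits = {}
    for name in customer_names:
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        hits[name] = [t["id"] for t in threads if pattern.search(t["text"])]
    return hits

threads = [
    {"id": "th-1", "text": "Acme Corp asked about the invoice discrepancy"},
    {"id": "th-2", "text": "Follow-up on Globex renewal pricing"},
]
print(threads_mentioning(["Acme Corp", "Initech"], threads))
# → {'Acme Corp': ['th-1'], 'Initech': []}
```

Word boundaries and `re.escape` keep a short name like "Acme" from matching inside an unrelated token, which matters when customer names come straight out of a production table.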

Why the layers compound

| Layer | Provides | Can’t do alone |
| --- | --- | --- |
| Knowledge graph | Relational reasoning, typed edges, cross-source timelines | Retrieve content to ground an answer |
| Fused retrieval (RRF) | Keyword + semantic precision in one query | Relationships between entities |
| Semantic memory | Accumulated understanding that adapts over time | Grounding in live data |
| Database connections | Real operational data in the same reasoning loop | Structure across unstructured sources |
Graph alone is a CRM with extra steps. Search alone is Elasticsearch. Memory alone is mem0. Database connections alone is Metabase with a chat wrapper. The combination is Fabric.