Knowledge Graph

Fabric’s knowledge graph is a typed, weighted, timestamped model of what it knows. Nodes are people, threads, meetings, channels, folders, domains, customers. Edges are the relationships between them — ten typed edge kinds today. It lives in Postgres as two tables: graph_nodes and graph_edges. You can SELECT against it, join it to your operational data, and inspect it in any Postgres client.

No Neo4j. No graph database to sync. One database for everything.

Node structure

Each node is one row in graph_nodes:

id (text, required): Stable per-node identifier.
tenant_id (text, required): Tenant isolation; every query filters on this.
project_id (text, required): Project-level scoping within a tenant.
node_type (text, required): Typed category: person, email_thread, slack_message, meeting, file, folder, domain.
source_type (text, required): Origin connector: gmail, slack, fireflies, google_drive, imap, postgres, mysql.
source_ref (text, required): External ID at the source (e.g. Gmail thread ID, Slack message timestamp).
title (text): Display title. Weighted A in the BM25 index.
content (text): Full text. Weighted B in the BM25 index.
source_date (timestamptz): When the underlying event happened.
embedding (vector(1536)): pgvector column for semantic search.
search_vector (tsvector): Full-text search column maintained by trigger.
metadata (jsonb): Source-specific fields.
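For application code that reads these rows, it can help to mirror the columns in a typed record. The sketch below is one way to do that in Python. The class name GraphNode and the validate helper are illustrative assumptions, not part of Fabric, and the example values are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass
class GraphNode:
    """In-memory mirror of one graph_nodes row (illustrative, not Fabric's own type)."""
    # Required columns.
    id: str
    tenant_id: str
    project_id: str
    node_type: str
    source_type: str
    source_ref: str
    # Optional columns.
    title: Optional[str] = None
    content: Optional[str] = None
    source_date: Optional[datetime] = None
    embedding: Optional[list[float]] = None  # vector(1536) in Postgres
    metadata: dict[str, Any] = field(default_factory=dict)
    # search_vector is omitted: the database maintains it by trigger.

    def validate(self) -> None:
        """Raise ValueError if an embedding has the wrong dimension."""
        if self.embedding is not None and len(self.embedding) != 1536:
            raise ValueError("embedding must have 1536 dimensions")


# Hypothetical example row.
node = GraphNode(
    id="node_cole",
    tenant_id="tenant_1",
    project_id="project_1",
    node_type="person",
    source_type="gmail",
    source_ref="cole@example.com",
)
node.validate()
```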

Typed edges

Each edge is one row in graph_edges with source_id, target_id, relation, weight, source_date, and metadata.

sent_by — Message authored by a person

Email or Slack message → the person who wrote it. The most common edge type in a Gmail-heavy tenant.

replied_to — Response relationship

Message B is a reply to message A. Enables thread reconstruction without relying on provider threading.

in_thread — Message belongs to a thread

Individual messages link to their parent thread node. Lets “summarize this thread” work across sources.

posted_in — Slack message posted in a channel

Slack message → channel node. Enables #channel-scoped search.

attended — Person attended a meeting

Fireflies meeting → person. Foundation for “who was in the Phoenix kickoff?”

organized_by — Meeting organizer

Meeting node → organizer person. Different from attended — tracks initiative.

participant — Person involved in a conversation

Generic participation edge, used for email-thread recipients who aren’t the sender.

in_folder — File lives in a folder

Drive file → parent folder. Enables folder-scoped queries.

from_domain — Person's email domain

Person node → domain node. Surfaces cross-domain relationships (internal vs external).

has_email — Person linked to email address

Person entity → the person:email@company.com alias. Enables entity resolution.

Edges are directional but queried bidirectionally — follow_edges(node_id) returns all edges where the node is either source_id or target_id.
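The bidirectional lookup can be pictured with a small in-memory stand-in. This is a sketch only: the real follow_edges(node_id) runs against graph_edges, while the version here takes an explicit edge list, and the Edge type, neighbor_of helper, and sample ids are assumptions made for illustration.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source_id: str
    target_id: str
    relation: str
    weight: float


def follow_edges(edges: list[Edge], node_id: str) -> list[Edge]:
    """Return every edge that touches node_id, regardless of direction."""
    return [e for e in edges if node_id in (e.source_id, e.target_id)]


def neighbor_of(edge: Edge, node_id: str) -> str:
    """Return the node at the other end of an edge that touches node_id."""
    return edge.target_id if edge.source_id == node_id else edge.source_id


# Hypothetical sample data.
edges = [
    Edge("msg_1", "node_cole", "sent_by", 1.0),            # Cole is the target
    Edge("node_cole", "domain_acme", "from_domain", 1.0),  # Cole is the source
    Edge("msg_2", "node_dana", "sent_by", 1.0),            # unrelated to Cole
]
touching = follow_edges(edges, "node_cole")
neighbors = [neighbor_of(e, "node_cole") for e in touching]
```

Both of Cole's edges come back even though Cole sits on a different side of each one, which is what lets a single call expand a node's full neighborhood.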

Why typed edges matter

Glean’s Enterprise Graph is a closed system — you can’t query it, inspect it, or join it with other data. Fabric’s graph is two Postgres tables. The structure is the point.

A vector store knows “Cole Smith” is semantically near “Project Phoenix.” It doesn’t know why. The graph knows Cole attended the Phoenix kickoff on April 7, sent three emails about the launch (sent_by), and replied in the legal review thread (replied_to). Typed edges turn “find similar text” into “reason about who, when, and why.”

Querying the graph

There are three ways to query the graph: from chat, from the REST API, and from Postgres directly.

From chat, natural-language questions implicitly traverse the graph.

Q: Who has context on the billing migration?

Fabric finds the billing migration topic node, follows attended, participant, and sent_by edges outward, and ranks the returned people by edge weight and recency.
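How weight and recency combine is not specified here. As a rough sketch of one possible ranking, the function below sums edge weights per person and breaks ties by the most recent edge. The function name, the tie-break rule, and the sample data are all assumptions, not Fabric's actual scoring.

```python
from collections import defaultdict
from datetime import date


def rank_people(edges: list[tuple[str, float, date]]) -> list[str]:
    """Rank person ids by total edge weight, breaking ties by most recent edge.

    Each tuple is (person_id, edge_weight, edge_source_date).
    """
    total: dict[str, float] = defaultdict(float)
    latest: dict[str, date] = {}
    for person, weight, when in edges:
        total[person] += weight
        latest[person] = max(latest.get(person, when), when)
    # Highest total weight first; on equal weight, the fresher edge wins.
    return sorted(total, key=lambda p: (total[p], latest[p]), reverse=True)


# Hypothetical edges collected while walking out from the topic node.
hits = [
    ("node_cole", 1.0, date(2026, 3, 2)),
    ("node_dana", 0.5, date(2026, 3, 20)),
    ("node_cole", 0.5, date(2026, 1, 15)),
    ("node_eli", 1.5, date(2026, 3, 25)),
]
ranking = rank_people(hits)
```

Here node_eli and node_cole tie on total weight, so the more recent activity puts node_eli first.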

From the REST API, direct graph operations are exposed as endpoints:

GET /v1/graph/nodes/:id — inspect a single node
GET /v1/graph/nodes/:id/neighborhood — fetch all 1-hop edges and neighbors
POST /v1/graph/search — RRF search across graph_nodes
GET /v1/graph/stats — node/edge counts per type
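Assuming RRF in the search endpoint stands for Reciprocal Rank Fusion, and that it merges a full-text ranking (search_vector) with a semantic ranking (embedding), the fusion step can be sketched as below. The function name, the constant k=60, and the sample ids are assumptions. The endpoint's real parameters are not documented here.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked id lists with Reciprocal Rank Fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, not a value
    confirmed for Fabric.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] = scores.get(node_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the two search paths.
bm25_hits = ["thread_billing", "meeting_kickoff", "msg_invoice"]
vector_hits = ["meeting_kickoff", "file_pricing", "thread_billing"]
fused = rrf_fuse([bm25_hits, vector_hits])
fused_ids = [node_id for node_id, _ in fused]
```

Nodes that rank well in both lists rise to the top, and the fusion needs only ranks, so BM25 scores and cosine distances never have to be put on a common scale.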

From Postgres directly, the graph is just two tables, so you can query it as data:

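-- Find everyone Cole worked with on billing topics in Q1.
-- Cole -> shared item (thread, meeting, message) -> other people on that item.
SELECT DISTINCT person.title
FROM graph_edges e1
JOIN graph_nodes item
  ON item.id = CASE WHEN e1.source_id = 'node_cole'
                    THEN e1.target_id ELSE e1.source_id END
JOIN graph_edges e2
  ON item.id IN (e2.source_id, e2.target_id)
JOIN graph_nodes person
  ON person.id = CASE WHEN e2.source_id = item.id
                      THEN e2.target_id ELSE e2.source_id END
WHERE (e1.source_id = 'node_cole' OR e1.target_id = 'node_cole')
  AND person.node_type = 'person'
  AND person.id <> 'node_cole'
  AND (item.title ILIKE '%billing%' OR item.content ILIKE '%billing%')
  AND e2.source_date >= '2026-01-01'
  AND e2.source_date <  '2026-04-01';  -- half-open range keeps all of March 31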

Multi-hop traversal

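The reasoning loop walks 2–3 hops out from a seed node via a Postgres recursive CTE:

-- $1 is the seed node id, bound as a query parameter.
WITH RECURSIVE reachable AS (
  SELECT id, 0 AS depth FROM graph_nodes WHERE id = $1
  UNION
  SELECT n.id, r.depth + 1
  FROM reachable r
  JOIN graph_edges e ON e.source_id = r.id OR e.target_id = r.id
  JOIN graph_nodes n ON n.id = CASE WHEN e.source_id = r.id
                                    THEN e.target_id ELSE e.source_id END
  WHERE r.depth < 2  -- hop limit; raise to 3 for three-hop walks
)
-- A node can be reached at several depths; keep the shortest distance.
SELECT id, MIN(depth) AS depth
FROM reachable
GROUP BY id
ORDER BY depth;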

This enables questions like:

Who has worked with Cole on billing-related topics?

Cole → threads where Cole is participant → other people participant in those same threads.

What decisions were made about Phoenix?

Phoenix topic → meetings attended for Phoenix → observations linked to those meetings.
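The first path above (Cole → shared threads → other participants) can be reproduced in memory with a breadth-first walk that treats edges as undirected. This is a sketch for building intuition, not the reasoning loop itself. The function name and the sample ids are hypothetical.

```python
from collections import deque


def reachable(edges: list[tuple[str, str]], seed: str, max_hops: int = 2) -> dict[str, int]:
    """Breadth-first walk that treats edges as undirected.

    Returns {node_id: hops from seed} for every node within max_hops.
    """
    neighbors: dict[str, set[str]] = {}
    for source_id, target_id in edges:
        neighbors.setdefault(source_id, set()).add(target_id)
        neighbors.setdefault(target_id, set()).add(source_id)

    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if depth[current] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in neighbors.get(current, ()):
            if nxt not in depth:  # first visit is the shortest path
                depth[nxt] = depth[current] + 1
                queue.append(nxt)
    return depth


# Hypothetical (source_id, target_id) pairs; relation names in comments.
sample_edges = [
    ("thread_billing", "node_cole"),  # participant
    ("thread_billing", "node_dana"),  # participant
    ("thread_billing", "node_eli"),   # participant
    ("node_eli", "domain_acme"),      # from_domain
]
hops = reachable(sample_edges, "node_cole")
coworkers = sorted(n for n, d in hops.items() if d == 2)
```

With a two-hop limit, the walk reaches the other thread participants but stops before domain_acme, which sits three hops from Cole.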

Graph explorer

The explorer in the web UI is an interactive visualization — search for any entity, expand its neighborhood, filter by edge type, navigate by clicking.

Verify extraction

Confirm new connectors are producing the edges you expect.

Audit answers

Understand “why did the answer include this person?”

Explore unknowns

Find connections you didn’t know existed.

Graph maintenance

Entities and edges update on each sync. Deleted source content triggers node and edge removal on the next run. Edge weights decay slightly for each week an edge goes unseen, so old relationships fade unless new activity reinforces them.
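The decay formula and rate are not given, so the following is only a sketch of what per-week decay could look like, assuming a multiplicative model. The function name and the 5% weekly rate are placeholders, not Fabric's values.

```python
def decayed_weight(weight: float, weeks_unseen: int, rate: float = 0.05) -> float:
    """Shrink an edge weight by `rate` for each week it was not reinforced.

    rate=0.05 (5% per week) is a placeholder, not Fabric's actual value.
    """
    if weeks_unseen < 0:
        raise ValueError("weeks_unseen must be non-negative")
    return weight * (1.0 - rate) ** weeks_unseen


fresh = decayed_weight(1.0, weeks_unseen=0)   # seen this week: unchanged
stale = decayed_weight(1.0, weeks_unseen=12)  # roughly a quarter unseen
```

Under this assumed model an edge never drops to exactly zero, so a relationship that reappears is reinforced rather than rediscovered from scratch.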

The graph reflects the current state of your data, not a frozen snapshot.

Why Postgres and not Neo4j

Fabric uses Postgres with pgvector for everything — graph, memory, vector search, full-text search, operational data.

One database

No sync overhead between a graph DB and a relational DB.

Recursive CTEs

Fast enough for 2–3 hop traversal at current scales.

Mature pgvector

HNSW indexes, cosine distance, production-grade.

Inspectable

Any Postgres client, any BI tool, any ORM you already run.

When the graph outgrows recursive CTEs, Postgres is straightforward to pair with a dedicated graph store as a read replica — the two tables are simple enough to replicate without rework.