# Knowledge Graph

> A typed, weighted, timestamped model of what Fabric knows. Two tables in Postgres. SELECT-able, join-able, and auditable.

Fabric's knowledge graph is a **typed, weighted, timestamped model** of what it knows. Nodes are people, threads, meetings, channels, folders, domains, customers. Edges are the relationships between them — ten typed edge kinds today.

It lives in Postgres as two tables: `graph_nodes` and `graph_edges`. You can `SELECT` against it, join it to your operational data, and inspect it in any Postgres client.

<Info>
  **No Neo4j. No graph database to sync. One database for everything.**
</Info>

## Node structure

Each node is one row in `graph_nodes`:

<ParamField path="id" type="text" required>
  Stable per-node identifier.
</ParamField>

<ParamField path="tenant_id" type="text" required>
  Tenant isolation — every query filters on this.
</ParamField>

<ParamField path="project_id" type="text" required>
  Project-level scoping within a tenant.
</ParamField>

<ParamField path="node_type" type="text" required>
  Typed category: `person`, `email_thread`, `slack_message`, `meeting`, `file`, `folder`, `domain`.
</ParamField>

<ParamField path="source_type" type="text" required>
  Origin connector: `gmail`, `slack`, `fireflies`, `google_drive`, `imap`, `postgres`, `mysql`.
</ParamField>

<ParamField path="source_ref" type="text" required>
  External ID at the source (e.g. Gmail thread ID, Slack message timestamp).
</ParamField>

<ParamField path="title" type="text">
  Display title. Weighted `A` in the BM25 index.
</ParamField>

<ParamField path="content" type="text">
  Full text. Weighted `B` in the BM25 index.
</ParamField>

<ParamField path="source_date" type="timestamptz">
  When the underlying event happened.
</ParamField>

<ParamField path="embedding" type="vector(1536)">
  pgvector column for semantic search.
</ParamField>

<ParamField path="search_vector" type="tsvector">
  Full-text search column maintained by trigger.
</ParamField>

<ParamField path="metadata" type="jsonb">
  Source-specific fields.
</ParamField>
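
As a rough illustration of how these fields fit together, the sketch below mirrors a `graph_nodes` row as a Python dataclass and applies the tenant and project filter in memory. The class name `GraphNode`, the helper `visible_nodes`, and the sample values are hypothetical, and the `embedding` and `search_vector` columns are left out for brevity. Fabric's own client types may look different.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class GraphNode:
    """Hypothetical in-memory mirror of one graph_nodes row."""
    id: str
    tenant_id: str
    project_id: str
    node_type: str
    source_type: str
    source_ref: str
    title: Optional[str] = None
    content: Optional[str] = None
    source_date: Optional[datetime] = None
    metadata: dict = field(default_factory=dict)


def visible_nodes(nodes, tenant_id, project_id):
    """Keep only rows that belong to the given tenant and project."""
    return [n for n in nodes
            if n.tenant_id == tenant_id and n.project_id == project_id]


# Sample rows (made-up values) from two different tenants
nodes = [
    GraphNode("n1", "tenant_a", "proj_1", "person", "gmail",
              "cole@example.com", title="Cole Smith"),
    GraphNode("n2", "tenant_b", "proj_1", "person", "gmail",
              "dana@example.com", title="Dana"),
]
print([n.title for n in visible_nodes(nodes, "tenant_a", "proj_1")])
```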

## Typed edges

Each edge is one row in `graph_edges` with `source_id`, `target_id`, `relation`, `weight`, `source_date`, and `metadata`.

<AccordionGroup>
  <Accordion title="sent_by — Message authored by a person" icon="envelope">
    Email or Slack message → the person who wrote it. The most common edge type in a Gmail-heavy tenant.
  </Accordion>

  <Accordion title="replied_to — Response relationship" icon="reply">
    Message B is a reply to message A. Enables thread reconstruction without relying on provider threading.
  </Accordion>

  <Accordion title="in_thread — Message belongs to a thread" icon="comments">
    Individual messages link to their parent thread node. Lets "summarize this thread" work across sources.
  </Accordion>

  <Accordion title="posted_in — Slack message posted in a channel" icon="hashtag">
    Slack message → channel node. Enables `#channel`-scoped search.
  </Accordion>

  <Accordion title="attended — Person attended a meeting" icon="users">
    Fireflies meeting → person. Foundation for "who was in the Phoenix kickoff?"
  </Accordion>

  <Accordion title="organized_by — Meeting organizer" icon="user-tie">
    Meeting node → organizer person. Different from `attended` — tracks initiative.
  </Accordion>

  <Accordion title="participant — Person involved in a conversation" icon="user-group">
    Generic participation edge, used for email threads with recipients who aren't the sender.
  </Accordion>

  <Accordion title="in_folder — File lives in a folder" icon="folder">
    Drive file → parent folder. Enables folder-scoped queries.
  </Accordion>

  <Accordion title="from_domain — Person's email domain" icon="at">
    Person node → domain node. Surfaces cross-domain relationships (internal vs external).
  </Accordion>

  <Accordion title="has_email — Person linked to email address" icon="id-card">
    Person entity → the `person:email@company.com` alias. Enables entity resolution.
  </Accordion>
</AccordionGroup>

Edges are directional but queried bidirectionally — `follow_edges(node_id)` returns all edges where the node is either `source_id` or `target_id`.
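
The bidirectional lookup described above can be sketched in a few lines of Python. This is an illustration of the behavior, not Fabric's implementation: the `Edge` tuple and the sample rows are hypothetical, and this `follow_edges` takes the edge list as an explicit argument so the example stays self-contained.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source_id: str
    target_id: str
    relation: str
    weight: float


def follow_edges(edges, node_id):
    """Return (edge, neighbor_id) pairs for edges touching node_id in either direction."""
    hits = []
    for e in edges:
        if e.source_id == node_id:
            hits.append((e, e.target_id))   # node is the source: neighbor is the target
        elif e.target_id == node_id:
            hits.append((e, e.source_id))   # node is the target: neighbor is the source
    return hits


edges = [
    Edge("msg_1", "node_cole", "sent_by", 1.0),            # message -> person
    Edge("node_cole", "domain_acme", "from_domain", 1.0),  # person -> domain
    Edge("msg_2", "node_dana", "sent_by", 1.0),            # unrelated to Cole
]
neighbors = [neighbor for _, neighbor in follow_edges(edges, "node_cole")]
print(neighbors)  # ['msg_1', 'domain_acme']
```

Both of Cole's edges come back even though Cole is the target of one and the source of the other.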

## Why typed edges matter

<Note>
  Glean's Enterprise Graph is a closed system — you can't query it, inspect it, or join it with other data. Fabric's graph is two Postgres tables. The structure is the point.
</Note>

A vector store knows "Cole Smith" is semantically near "Project Phoenix." It doesn't know **why**. The graph knows Cole `attended` the Phoenix kickoff on April 7, `sent_by` three emails about the launch, and `replied_to` the legal review thread.

Typed edges turn "find similar text" into "reason about who, when, and why."

## Querying the graph

<Tabs>
  <Tab title="From chat">
    Natural-language questions implicitly traverse the graph.

    > **Q:** Who has context on the billing migration?

    Fabric finds the billing migration topic node, follows `attended`, `participant`, and `sent_by` edges outward, and ranks the people it reaches by edge weight and recency.
  </Tab>

  <Tab title="From the REST API">
    Direct graph operations exposed as endpoints:

    * `GET /v1/graph/nodes/:id` — inspect a single node
    * `GET /v1/graph/nodes/:id/neighborhood` — fetch all 1-hop edges and neighbors
    * `POST /v1/graph/search` — RRF search across `graph_nodes`
    * `GET /v1/graph/stats` — node/edge counts per type
  </Tab>

  <Tab title="From Postgres directly">
    It's just two tables, so you can query the graph as data:

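    ```sql
    -- People who share a billing-related thread, message, or meeting with Cole in Q1 2026.
    -- Edges are matched in either direction, as follow_edges() does.
    SELECT DISTINCT p.title
    FROM graph_edges e1
    JOIN graph_nodes shared
      ON shared.id = CASE WHEN e1.source_id = 'node_cole'
                          THEN e1.target_id ELSE e1.source_id END
    JOIN graph_edges e2 ON shared.id IN (e2.source_id, e2.target_id)
    JOIN graph_nodes p
      ON p.id = CASE WHEN e2.source_id = shared.id
                     THEN e2.target_id ELSE e2.source_id END
    WHERE 'node_cole' IN (e1.source_id, e1.target_id)
      AND shared.search_vector @@ plainto_tsquery('billing')  -- simple keyword filter
      AND p.node_type = 'person'
      AND p.id <> 'node_cole'
      AND e2.source_date >= '2026-01-01'
      AND e2.source_date <  '2026-04-01';  -- half-open range covers all of March 31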
    ```
  </Tab>
</Tabs>
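
The `POST /v1/graph/search` endpoint above is described as RRF search. Assuming RRF stands for Reciprocal Rank Fusion, and assuming the rankings being fused are the full-text and embedding results (the page does not say), the fusion step can be sketched as follows. The function name `rrf_fuse`, the constant `k=60`, and the sample node IDs are illustrative and not taken from Fabric.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of node IDs with Reciprocal Rank Fusion.

    Each node scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, not a Fabric setting.
    """
    scores = {}
    for ranking in rankings:
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] = scores.get(node_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by node ID for determinism
    return sorted(scores, key=lambda nid: (-scores[nid], nid))


bm25_hits = ["thread_7", "meeting_2", "thread_9"]    # hypothetical full-text ranking
vector_hits = ["meeting_2", "thread_4", "thread_7"]  # hypothetical embedding ranking
fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)  # ['meeting_2', 'thread_7', 'thread_4', 'thread_9']
```

Nodes that rank well in both lists rise to the top, which is why `meeting_2` beats `thread_7` even though `thread_7` is first in the full-text ranking.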

## Multi-hop traversal

The reasoning loop walks 2–3 hops out from a seed node using a Postgres recursive CTE. This example stops at two hops:

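```sql
WITH RECURSIVE reachable AS (
  -- Anchor: the seed node, passed as parameter $1, at depth 0
  SELECT id, 0 AS depth FROM graph_nodes WHERE id = $1
  UNION
  -- Step: follow each edge in either direction to the node on the other end
  SELECT n.id, r.depth + 1
  FROM reachable r
  JOIN graph_edges e ON e.source_id = r.id OR e.target_id = r.id
  JOIN graph_nodes n ON n.id = CASE WHEN e.source_id = r.id
                                    THEN e.target_id ELSE e.source_id END
  WHERE r.depth < 2  -- stop after two hops
)
-- A node can be reached along several paths; keep its shortest distance
SELECT id, MIN(depth) AS depth FROM reachable GROUP BY id;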
```

This enables questions like:

<CardGroup cols={1}>
  <Card title="Who has worked with Cole on billing-related topics?">
    Cole → threads where Cole is `participant` → other people `participant` in those same threads.
  </Card>

  <Card title="What decisions were made about Phoenix?">
    Phoenix topic → meetings `attended` for Phoenix → observations linked to those meetings.
  </Card>
</CardGroup>
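
For readers who prefer application code, the same bounded traversal can be approximated with a breadth-first search over an in-memory edge list. This is a sketch under stated assumptions rather than Fabric's reasoning loop: the function name `reachable`, the `max_depth` parameter, and the sample edges are hypothetical.

```python
from collections import deque


def reachable(edges, seed, max_depth=2):
    """Breadth-first walk over (source_id, target_id) pairs, ignoring direction.

    Returns a dict mapping node_id to its shortest hop count from seed,
    limited to max_depth hops.
    """
    adjacency = {}
    for source_id, target_id in edges:
        adjacency.setdefault(source_id, set()).add(target_id)
        adjacency.setdefault(target_id, set()).add(source_id)

    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(current, ()):
            if neighbor not in depth:  # first visit is the shortest path in BFS
                depth[neighbor] = depth[current] + 1
                queue.append(neighbor)
    return depth


edges = [
    ("thread_billing", "node_cole"),  # participant
    ("thread_billing", "node_dana"),  # participant
    ("node_dana", "domain_acme"),     # from_domain
]
hops = reachable(edges, "node_cole")
print(hops)
```

With a two-hop limit, Dana is reached through the shared billing thread, while `domain_acme` sits three hops away and is left out.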

## Graph explorer

The explorer in the web UI is an interactive visualization — search for any entity, expand its neighborhood, filter by edge type, navigate by clicking.

<CardGroup cols={3}>
  <Card title="Verify extraction" icon="badge-check">
    Confirm new connectors are producing the edges you expect.
  </Card>

  <Card title="Audit answers" icon="search">
    Understand "why did the answer include this person?"
  </Card>

  <Card title="Explore unknowns" icon="compass">
    Find connections you didn't know existed.
  </Card>
</CardGroup>

## Graph maintenance

Entities and edges update on each sync. When source content is deleted, the corresponding nodes and edges are removed on the next run. Edge weights decay slightly for each week an edge goes unobserved, so old relationships fade unless new activity reinforces them.

<Tip>
  The graph reflects the current state of your data, not a frozen snapshot.
</Tip>
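
The weekly decay described above could be modeled as a multiplicative factor applied once per whole week since an edge was last seen. The sketch below is one plausible reading, not Fabric's actual formula: the rate `WEEKLY_DECAY = 0.98`, the function name `decayed_weight`, and the choice of exponential decay are all assumptions.

```python
from datetime import datetime, timedelta, timezone

WEEKLY_DECAY = 0.98  # hypothetical rate: keep 98% of the weight per unseen week


def decayed_weight(weight, last_seen, now, weekly_decay=WEEKLY_DECAY):
    """Exponentially decay an edge weight by the number of whole weeks unseen."""
    weeks_unseen = max(0, (now - last_seen).days // 7)
    return weight * weekly_decay ** weeks_unseen


now = datetime(2026, 4, 1, tzinfo=timezone.utc)
fresh = decayed_weight(1.0, now - timedelta(days=3), now)    # seen this week
stale = decayed_weight(1.0, now - timedelta(weeks=10), now)  # unseen for ten weeks
print(fresh, round(stale, 4))
```

An edge seen within the last week keeps its full weight, while one left unreinforced for ten weeks has faded without disappearing.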

## Why Postgres and not Neo4j

Fabric uses Postgres with pgvector for everything — graph, memory, vector search, full-text search, operational data.

<CardGroup cols={2}>
  <Card title="One database" icon="database" color="#10a37f">
    No sync overhead between a graph DB and a relational DB.
  </Card>

  <Card title="Recursive CTEs" icon="diagram-successor">
    Fast enough for 2–3 hop traversal at current scales.
  </Card>

  <Card title="pgvector mature" icon="vector-square">
    HNSW indexes, cosine distance, production-grade.
  </Card>

  <Card title="Inspectable" icon="eye">
    Any Postgres client, any BI tool, any ORM you already run.
  </Card>
</CardGroup>

When the graph outgrows recursive CTEs, Postgres is straightforward to pair with a dedicated graph store as a read replica — the two tables are simple enough to replicate without rework.
