
Session Ingestion

Session ingestion captures design decisions from your AI coding conversations (Claude Code transcripts) and writes them into the knowledge graph.

How It Works

The pipeline has three phases:

Phase 0 — Preprocess. Reads the raw JSONL transcript, compresses turns (strips tool calls, collapses long outputs), and identifies which files were touched during the session.
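The compression step can be sketched as follows. This is a minimal sketch, not the pipeline's actual code: the turn shape and the truncation threshold are assumptions.

```typescript
// Sketch of Phase 0 turn compression (turn shape and threshold are
// assumptions — the real transcript schema may differ).
interface Turn {
  role: "user" | "assistant" | "tool";
  content: string;
}

const MAX_CONTENT_CHARS = 2000; // hypothetical collapse threshold

function compressTurns(turns: Turn[]): Turn[] {
  return turns
    // Strip tool-call turns entirely
    .filter((t) => t.role !== "tool")
    // Collapse long outputs to a truncated preview
    .map((t) =>
      t.content.length > MAX_CONTENT_CHARS
        ? { ...t, content: t.content.slice(0, MAX_CONTENT_CHARS) + " …[truncated]" }
        : t
    );
}
```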

Phase 1 — Segment. Sends the compressed conversation to Claude, which splits it into logical segments — each representing a distinct task or discussion topic. Each segment is tagged with whether it likely contains design decisions.
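A segment coming out of Phase 1 might look like the following; the field names here are illustrative, not the pipeline's actual output schema.

```typescript
// Illustrative shape of a Phase 1 segment (field names assumed).
interface Segment {
  startTurn: number;
  endTurn: number;
  summary: string;
  hasDecisions: boolean;   // the "likely contains design decisions" tag
  decisionHints: string[]; // e.g. "chose JWT over session cookies"
}

// --auto-approve effectively selects every segment tagged with decisions.
function autoApprove(segments: Segment[]): Segment[] {
  return segments.filter((s) => s.hasDecisions);
}
```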

Phase 2 — Extract. For each approved segment, the system pulls in code structure context from the graph (callers, callees, file structure), then asks Claude to extract specific decisions with anchoring information.
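Assembling that graph context into a prompt might be sketched like this; the lookup result shape and the rendered format are assumptions.

```typescript
// Sketch of rendering code-structure context for one segment
// (the graph lookup shape is an assumption).
interface FileContext {
  path: string;
  callers: string[]; // functions that call into this file's symbols
  callees: string[]; // functions this file's symbols call
}

function renderGraphContext(files: FileContext[]): string {
  return files
    .map(
      (f) =>
        `## ${f.path}\n` +
        `callers: ${f.callers.join(", ") || "(none)"}\n` +
        `callees: ${f.callees.join(", ") || "(none)"}`
    )
    .join("\n\n");
}
```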

```
~/.claude/projects/*.jsonl
        ↓
Phase 0: Parse & compress turns
        ↓
Phase 1: LLM segments the conversation
        ↓
User approves which segments to analyze
        ↓
Phase 2: Per-segment deep extraction + graph context
        ↓
Write DecisionContext nodes to Memgraph
        ↓
Create PENDING_COMPARISON edges
        ↓
(later) npm run connect → build relationship edges
```

Where Transcripts Come From

Claude Code stores conversation transcripts as JSONL files in:

~/.claude/projects/<hashed-project-dir>/<session-id>.jsonl

Each line is a message (user, assistant, or tool call). The ingestion pipeline reads these directly — no export step needed.
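Reading a transcript is then just line-by-line JSON parsing; a minimal sketch, with the message schema assumed:

```typescript
// Minimal JSONL reader: one JSON message per line (schema assumed).
interface TranscriptMessage {
  type: string; // "user", "assistant", or a tool-call variant
  [key: string]: unknown;
}

function parseTranscript(jsonl: string): TranscriptMessage[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0) // skip blank lines
    .map((line) => JSON.parse(line) as TranscriptMessage);
}
```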

Usage

```bash
# Process all new sessions across all projects
npm run ingest:sessions:v2

# Process only sessions from a specific project
npm run ingest:sessions:v2 -- --project bite-me-website

# Process a specific session by ID
npm run ingest:sessions:v2 -- --session abc123

# Auto-approve all segments that have decisions (skip interactive prompt)
npm run ingest:sessions:v2 -- --auto-approve

# Dry run — Phase 0 only, no LLM calls (useful for previewing)
npm run ingest:sessions:v2 -- --dry-run

# Re-process a previously ingested session
npm run ingest:sessions:v2 -- --force --session abc123

# Control concurrency for Phase 2 extraction
npm run ingest:sessions:v2 -- --concurrency 3
```

Interactive Approval

By default, after Phase 1 segments the conversation, you'll see a list like:

```
[abc12345] bite-me-website
    42 turns | 5 files | ~12000 tokens
    🔍 Phase 1: Segmenting...
    ✓ 4 segments (2 with decisions):
    [1] ✅ Turn 1-12:  Refactored auth middleware to use JWT
        Hints: chose JWT over session cookies, trade-off discussion
    [2] ❌ Turn 13-20: Fixed CSS layout bug
    [3] ✅ Turn 21-35: Designed rate limiting strategy
        Hints: Redis vs in-memory, sliding window approach
    [4] ❌ Turn 36-42: Updated README

    Analyze which? (all / 1,3 / none):
```

You can select specific segments by number, analyze all, or skip. Use --auto-approve to automatically analyze all segments tagged with decisions.
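Parsing that answer ("all", a comma-separated list, or "none") can be sketched as:

```typescript
// Sketch of parsing the approval prompt answer: "all", "1,3", or "none".
// Returns 1-based segment numbers, with out-of-range entries dropped.
function parseApproval(answer: string, segmentCount: number): number[] {
  const a = answer.trim().toLowerCase();
  if (a === "none" || a === "") return [];
  if (a === "all") {
    return Array.from({ length: segmentCount }, (_, i) => i + 1);
  }
  return a
    .split(",")
    .map((part) => Number.parseInt(part.trim(), 10))
    .filter((n) => Number.isInteger(n) && n >= 1 && n <= segmentCount);
}
```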

State Tracking

Processed sessions are tracked in data/ingested-sessions-v2.json. Each entry records:

  • Session ID
  • Number of segments found and approved
  • Number of decisions extracted
  • Decision IDs (for re-processing with --force)

On subsequent runs, only new (unprocessed) sessions are picked up. Use --force to re-process a session — old decisions are deleted and replaced.
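The skip/re-process selection can be sketched as below; the entry shape is a simplification of the fields listed above.

```typescript
// Simplified state entry — mirrors the fields listed above.
interface IngestedSession {
  sessionId: string;
  segmentsFound: number;
  segmentsApproved: number;
  decisionIds: string[];
}

// Without --force, only sessions absent from the state file are processed.
function selectSessions(
  discovered: string[],
  state: IngestedSession[],
  force: boolean
): string[] {
  if (force) return discovered;
  const done = new Set(state.map((e) => e.sessionId));
  return discovered.filter((id) => !done.has(id));
}
```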

Large Sessions

Sessions exceeding ~80,000 tokens are automatically split into overlapping chunks for Phase 1 segmentation. The overlap (5 turns) prevents decisions at chunk boundaries from being missed. Segments from different chunks are deduplicated before Phase 2.
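The chunking can be sketched as follows. The real pipeline splits by a token budget (~80k tokens); this sketch splits by turn count for simplicity, keeping the 5-turn overlap.

```typescript
const OVERLAP_TURNS = 5; // per the docs: 5-turn overlap between chunks

// Split turns into overlapping windows so a decision spanning a chunk
// boundary appears whole in at least one chunk.
function chunkTurns<T>(turns: T[], chunkSize: number): T[][] {
  if (turns.length <= chunkSize) return [turns];
  const chunks: T[][] = [];
  let start = 0;
  while (start < turns.length) {
    chunks.push(turns.slice(start, start + chunkSize));
    if (start + chunkSize >= turns.length) break;
    start += chunkSize - OVERLAP_TURNS;
  }
  return chunks;
}
```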

After Ingestion

Ingestion creates DecisionContext nodes and PENDING_COMPARISON edges. To build the relationship graph (CAUSED_BY, DEPENDS_ON, etc.), run:

```bash
npm run connect
```

See CLI Reference for details.
