Rebuild · Build Atlas DRAWING 04 / 06
STAGE 04 / SMART · Runner P4–5 · ~48 credits · highest risk

Intelligence.

Tools, retrieval, reasoning. The model can search the web, fetch URLs, read & write the vault, and pull relevant context from a vector store. Research mode goes live in chat.

Goal

Tool-using model + RAG over the vault.

Stage 03's SearXNG-on-Tailscale gate must be green. LangGraph is the wildcard — budget two iteration cycles.

Exit criteria
  • fastmcp server registers 4 tools; each callable from chat
  • Search uses SearXNG primary, Tavily fallback; raises if both fail
  • Vault notes chunked, embedded (nomic-embed-text), upserted into Qdrant
  • Three Qdrant collections: vault, web, memory
  • LangGraph: decompose → retrieve → synthesise produces a useful answer
  • Nightly re-index cron live
Drawing 04.A — MCP + RAG Pipeline

Tool calls and retrieval graph

[Diagram: tool calls and retrieval graph]
  • Chat (research mode): model + tool descriptions → fastmcp server · :8001
  • tool: web_search → SearXNG · Tailscale, Tavily (fallback)
  • tool: fetch_url → trafilatura → md
  • tool: vault_read / write → vault filesystem
  • tool: code_exec (sandboxed)
  • LangGraph retrieval graph: decompose query → parallel retrieve → rerank → synthesise answer + citations
  • Qdrant: 3 collections (vault · web · memory)
  • indexer: chunk · nomic-embed · nightly cron

Tools live in MCP. Retrieval lives in LangGraph. Embeddings live in Qdrant. Each module is independent.
Drawing 04.B — Build Detail

Files & commands

  • mcp/server.py · fastmcp · all tools registered
  • tools/search.py · SearXNG primary, Tavily fallback, raises on dual-fail
  • tools/fetch.py · trafilatura → clean markdown · optional vault save
  • tools/vault.py · read by path · write creates/updates note
  • tools/code.py · sandboxed Python exec
  • Research mode wired into chat · model receives tool descriptions
  • tests/test_tools/* · mocked HTTP per tool
  • rag/indexer.py · chunk + nomic-embed-text + Qdrant upsert
  • rag/retriever.py · query → search → rerank → top N
  • rag/graph.py · LangGraph: decompose → retrieve → synthesise
  • Collection vault · Obsidian notes
  • Collection web · fetched URLs
  • Collection memory · user facts
  • Nightly re-index cron for vault
  • tests/test_rag/* · indexer chunking · retriever · graph nodes
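
The chunking step in rag/indexer.py could look roughly like the sketch below. The plan does not specify a chunking strategy, so fixed-size character windows with overlap are an assumption, and `chunk_text` with its default sizes is a hypothetical helper rather than the spec.

```python
from typing import List


def chunk_text(text: str, size: int = 800, overlap: int = 100) -> List[str]:
    """Split text into character windows of at most `size` characters.

    Each window shares `overlap` characters with the previous one so that a
    sentence cut at a boundary still appears whole in one of the two chunks.
    Sizes are illustrative defaults, not values taken from the plan.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece.strip():  # skip windows that are only whitespace
            chunks.append(piece)
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

A test in tests/test_rag/ can pin the window boundaries on a tiny string before any embedding call is involved.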
## Runner — Phase 4 (MCP)
Build the fastmcp server + 4 tools per spec. Search must call
SearXNG via the Tailscale URL set in env. Tavily is fallback.
If both fail, raise — never fall back to a stub. All tools tested
with mocked HTTP.
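
A minimal sketch of the "primary, then fallback, then raise" rule for tools/search.py. The two backends are passed in as plain callables so the example needs no HTTP. `search_with_fallback`, `SearchUnavailable` and the result shape are hypothetical names for illustration, and the sketch assumes "fail" means the backend raised an exception.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str], List[Dict[str, str]]]


class SearchUnavailable(RuntimeError):
    """Raised when every configured search backend has failed."""


def search_with_fallback(
    query: str, primary: SearchFn, fallback: SearchFn
) -> List[Dict[str, str]]:
    """Try the primary backend, then the fallback; raise if both fail.

    There is deliberately no stub result: a dual failure surfaces as an error.
    """
    errors: List[str] = []
    for name, backend in (("searxng", primary), ("tavily", fallback)):
        try:
            return backend(query)
        except Exception as exc:  # assumption: any exception counts as a failure
            errors.append(f"{name}: {exc!r}")
    raise SearchUnavailable("all search backends failed: " + "; ".join(errors))
```

In the mocked-HTTP tests, each backend can be replaced by a function that either returns canned results or raises, which covers all three branches.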

## Runner — Phase 5 (RAG)  ⚠ HIGHEST RISK
LangGraph retrieval graph. Start by reading the LangGraph quickstart
together. Build indexer first, prove an upsert works, THEN write the
graph. If the graph node signatures fight you, simplify — start with
linear flow before parallel retrieve.
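
The "linear flow first" advice can be prototyped without the library at all. The sketch below is plain Python, not LangGraph: each node takes the current state and returns a partial update, which is the node convention LangGraph uses, so the functions should be portable once the flow is proven. The node bodies, `FAKE_CORPUS` and `run_linear` are placeholders invented for illustration.

```python
from typing import Callable, List, Sequence, TypedDict


class ResearchState(TypedDict, total=False):
    question: str
    sub_queries: List[str]
    passages: List[str]
    answer: str


Node = Callable[[ResearchState], dict]

# Stand-in for Qdrant: keyword -> passage. Purely illustrative.
FAKE_CORPUS = {
    "qdrant": "Qdrant stores the vault, web and memory collections.",
    "searxng": "SearXNG is the primary search backend.",
}


def decompose(state: ResearchState) -> dict:
    # Placeholder split on " and "; a real node would ask the model.
    parts = [p.strip() for p in state["question"].split(" and ") if p.strip()]
    return {"sub_queries": parts}


def retrieve(state: ResearchState) -> dict:
    # Sequential on purpose: parallel retrieve comes later.
    passages: List[str] = []
    for query in state["sub_queries"]:
        for keyword, passage in FAKE_CORPUS.items():
            if keyword in query.lower() and passage not in passages:
                passages.append(passage)
    return {"passages": passages}


def synthesise(state: ResearchState) -> dict:
    if not state["passages"]:
        return {"answer": "No supporting passages found."}
    cited = [f"{text} [{i}]" for i, text in enumerate(state["passages"], start=1)]
    return {"answer": " ".join(cited)}


def run_linear(
    question: str, nodes: Sequence[Node] = (decompose, retrieve, synthesise)
) -> ResearchState:
    """Run the nodes in order, merging each partial update into the state."""
    state: ResearchState = {"question": question}
    for node in nodes:
        state.update(node(state))
    return state
```

Calling `run_linear("What is Qdrant and what is SearXNG")` walks all three nodes in order. Each node can also be unit-tested on its own with a hand-built state dict.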
Risks

Wildcards

+12 buffer

LangGraph

Newest library in the stack. Budget 28→40 credits. Build linear first, then parallelise.

WATCH

code_exec sandbox

Use a real sandbox (subprocess + tmpdir + timeout). Never exec() in-process.
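
One way to read "subprocess + tmpdir + timeout" is sketched below using only the standard library. `run_sandboxed` and its return shape are hypothetical. This isolates the process, working directory and run time, but it does not restrict network or filesystem access, so treat it as a starting point and not a complete security boundary.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_sandboxed(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a separate Python process inside a throwaway directory.

    The child is killed if it exceeds `timeout_s`. Nothing is exec()'d in the
    calling process.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "timed out", "returncode": None}
        return {
            "ok": proc.returncode == 0,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
            "returncode": proc.returncode,
        }
```

The temporary directory is deleted when the call returns, so anything the snippet wrote there is discarded.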

WATCH

Embedding throughput

First vault index could be slow. Batch upserts, log progress, don't block the worker.
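
Batching with progress logging might be sketched as follows. The `upsert` argument is a stand-in for whatever function actually writes a batch to Qdrant. `batched`, `upsert_in_batches` and the batch size of 64 are assumptions, not values from the plan.

```python
import logging
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
log = logging.getLogger("indexer")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def upsert_in_batches(
    chunks: Sequence[T],
    upsert: Callable[[List[T]], None],
    batch_size: int = 64,
) -> int:
    """Send chunks to `upsert` one batch at a time, logging running totals."""
    done = 0
    for batch in batched(chunks, batch_size):
        upsert(batch)
        done += len(batch)
        log.info("indexed %d/%d chunks", done, len(chunks))
    return done
```

To keep the worker responsive, a call like this would typically run in a background thread or task; that wiring is left out here.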
