STAGE 04 / SMART
Runner P4–5
~48 credits · highest risk
Intelligence.
Tools, retrieval, reasoning. The model can search the web, fetch URLs, read & write the vault, and pull relevant context from a vector store. Research mode goes live in chat.
Goal
Tool-using model + RAG over the vault.
Stage 03's SearXNG-on-Tailscale gate must be green. LangGraph is the wildcard — budget two iteration cycles.
Exit criteria
- fastmcp server registers 4 tools; each callable from chat
- Search uses SearXNG primary, Tavily fallback; raises if both fail
- Vault notes chunked, embedded (nomic-embed-text), upserted into Qdrant
- Three Qdrant collections: vault, web, memory
- LangGraph: decompose → retrieve → synthesise produces a useful answer
- Nightly re-index cron live
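The search criterion above (SearXNG primary, Tavily fallback, raise if both fail) could look roughly like the sketch below. The provider callables, `search_with_fallback`, and `SearchUnavailableError` are hypothetical names for illustration; the real tool would wrap the actual SearXNG and Tavily HTTP calls behind them.

```python
from typing import Callable, List


class SearchUnavailableError(RuntimeError):
    """Raised when every configured search provider has failed."""


def search_with_fallback(
    query: str,
    primary: Callable[[str], List[dict]],   # assumed: wraps the SearXNG call
    fallback: Callable[[str], List[dict]],  # assumed: wraps the Tavily call
) -> List[dict]:
    """Try the primary provider, then the fallback; raise if both fail."""
    errors = []
    for name, provider in (("searxng", primary), ("tavily", fallback)):
        try:
            return provider(query)
        except Exception as exc:  # any provider error moves on to the next option
            errors.append(f"{name}: {exc}")
    # No stub results: surface the combined failure to the caller.
    raise SearchUnavailableError("; ".join(errors))
```

Passing the providers in as callables keeps the fallback logic testable with mocked HTTP, since a test can hand in functions that return canned results or raise.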
Tool calls and retrieval graph
Files & commands
## Runner: Phase 4 (MCP)
Build the fastmcp server + 4 tools per spec. Search must call SearXNG via the Tailscale URL set in env. Tavily is fallback. If both fail, raise; never fall back to a stub. All tools tested with mocked HTTP.

## Runner: Phase 5 (RAG) ⚠ HIGHEST RISK
LangGraph retrieval graph. Start by reading the LangGraph quickstart together. Build indexer first, prove an upsert works, THEN write the graph. If the graph node signatures fight you, simplify: start with linear flow before parallel retrieve.
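One way to prove the linear flow before touching the graph library is to write the three steps as plain functions over a shared state dict. The sketch below does not use LangGraph itself; `ResearchState`, `run_linear`, and the node bodies are hypothetical placeholders, and the " and " split stands in for a real model-driven decompose step.

```python
from typing import Callable, List, TypedDict


class ResearchState(TypedDict, total=False):
    question: str
    sub_questions: List[str]
    context: List[str]
    answer: str


def decompose(state: ResearchState) -> ResearchState:
    # Placeholder: a real node would ask the model to split the question.
    parts = [p.strip() for p in state["question"].split(" and ") if p.strip()]
    return {"sub_questions": parts}


def make_retrieve(lookup: Callable[[str], List[str]]):
    """Build a retrieve node around any lookup (e.g. a vector-store query)."""
    def retrieve(state: ResearchState) -> ResearchState:
        context: List[str] = []
        for sub in state["sub_questions"]:  # sequential for now, not parallel
            context.extend(lookup(sub))
        return {"context": context}
    return retrieve


def synthesise(state: ResearchState) -> ResearchState:
    # Placeholder: a real node would prompt the model with the context.
    return {"answer": " ".join(state["context"])}


def run_linear(nodes, state: ResearchState) -> ResearchState:
    """Run nodes in order, merging each partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state
```

Each node takes the state and returns only the keys it changed, so the functions should port to graph nodes with little rework once the linear version produces useful answers.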
Wildcards
+12 buffer
LangGraph
Newest library in the stack. Budget 28→40 credits. Build linear first, then parallelise.
WATCH
code_exec sandbox
Use a real sandbox (subprocess + tmpdir + timeout). Never exec() in-process.
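A minimal sketch of the subprocess + tmpdir + timeout pattern, using only the standard library. `run_sandboxed` and its return shape are hypothetical; the spec's actual tool signature may differ.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_sandboxed(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a child Python process inside a throwaway directory."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,           # relative paths land in the tmpdir
                capture_output=True,
                text=True,
                timeout=timeout_s,     # child is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "timed out",
                    "timed_out": True}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout,
            "stderr": proc.stderr, "timed_out": False}
```

This bounds run time and keeps the code out of the server process, but it does not restrict network or filesystem access on its own; OS-level isolation would be a separate layer.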
WATCH
Embedding throughput
First vault index could be slow. Batch upserts, log progress, don't block the worker.
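Batching with progress logging might be sketched as below. The `embed` and `upsert` callables are assumptions standing in for the nomic-embed-text call and the Qdrant upsert; `index_chunks` and the batch size of 64 are illustrative, not taken from the spec.

```python
import logging
from typing import Callable, Iterator, List, Sequence, Tuple

log = logging.getLogger("indexer")


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def index_chunks(
    chunks: Sequence[str],
    embed: Callable[[List[str]], List[List[float]]],            # assumed embedder
    upsert: Callable[[List[Tuple[str, List[float]]]], None],    # assumed store write
    batch_size: int = 64,
) -> int:
    """Embed and upsert chunks batch by batch, logging progress as it goes."""
    total = len(chunks)
    done = 0
    for batch in batched(chunks, batch_size):
        vectors = embed(list(batch))              # one embedding call per batch
        upsert(list(zip(batch, vectors)))         # one write per batch
        done += len(batch)
        log.info("indexed %d/%d chunks", done, total)
    return done
```

To keep the worker responsive, a function like this would typically be handed to a background thread or job queue rather than called inline.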