STAGE 04 / SMART
Runner P4–5
~48 credits · highest risk
Intelligence.
Tools, retrieval, reasoning. The model can search the web, fetch URLs, read & write the vault, and pull relevant context from a vector store. Research mode goes live in chat.
Goal
Tool-using model + RAG over the vault.
Stage 03's SearXNG-on-Tailscale gate must be green. LangGraph is the wildcard — budget two iteration cycles.
Exit criteria
- fastmcp server registers 4 tools; each callable from chat
- Search uses SearXNG primary, Tavily fallback; raises if both fail
- Vault notes chunked, embedded (nomic-embed-text), upserted into Qdrant
- Three Qdrant collections: vault, web, memory
- LangGraph: decompose → retrieve → synthesise produces a useful answer
- Nightly re-index cron live
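The search criterion (SearXNG primary, Tavily fallback, raise if both fail) could be sketched roughly as below. This is a minimal illustration, not the project's actual `tools/search.py`: the backends are passed in as plain callables, and the names `search_with_fallback` and `SearchUnavailable` are hypothetical.

```python
from typing import Callable

# A backend takes a query string and returns a list of result dicts.
SearchFn = Callable[[str], list[dict]]


class SearchUnavailable(RuntimeError):
    """Raised when every configured search backend has failed."""


def search_with_fallback(query: str, primary: SearchFn, fallback: SearchFn) -> list[dict]:
    """Try the primary backend, then the fallback; raise if both fail.

    Only exceptions count as failure here. Whether an empty result list
    should also trigger the fallback is a design decision left open.
    """
    errors: list[str] = []
    for name, backend in (("searxng", primary), ("tavily", fallback)):
        try:
            return backend(query)
        except Exception as exc:  # collect the error and move to the next backend
            errors.append(f"{name}: {exc!r}")
    # No stub results: surface the dual failure to the caller.
    raise SearchUnavailable("; ".join(errors))
```

Injecting the backends as callables keeps the fallback logic testable with mocked HTTP, since a test can pass a function that raises in place of a real client.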
Tool calls and retrieval graph
Files & commands
- mcp/server.py · fastmcp · all tools registered
- tools/search.py · SearXNG primary, Tavily fallback, raises on dual-fail
- tools/fetch.py · trafilatura → clean markdown · optional vault save
- tools/vault.py · read by path · write creates/updates note
- tools/code.py · sandboxed Python exec
- Research mode wired into chat · model receives tool descriptions
- tests/test_tools/* · mocked HTTP per tool
- rag/indexer.py · chunk + nomic-embed-text + Qdrant upsert
- rag/retriever.py · query → search → rerank → top N
- rag/graph.py · LangGraph: decompose → retrieve → synthesise
- Collection vault · Obsidian notes
- Collection web · fetched URLs
- Collection memory · user facts
- Nightly re-index cron for vault
- tests/test_rag/* · indexer chunking · retriever · graph nodes
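For the chunking step in the indexer, a minimal sketch might look like the following. The character-based window, the default sizes, and the helper names `chunk_text` and `chunk_id` are all assumptions for illustration; the plan does not specify a chunking strategy. The deterministic UUID is there so that a nightly re-index would overwrite existing points rather than duplicate them (Qdrant accepts UUIDs as point IDs).

```python
import uuid


def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece.strip():  # skip whitespace-only windows
            chunks.append(piece)
        if start + size >= len(text):  # this window reached the end of the text
            break
    return chunks


def chunk_id(note_path: str, index: int) -> str:
    """Stable ID per (note, chunk index), so re-indexing upserts in place."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{note_path}#{index}"))
```

A heading-aware splitter would likely suit Obsidian notes better; the fixed window is only the simplest thing that proves an upsert works.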
## Runner: Phase 4 (MCP)
Build the fastmcp server + 4 tools per spec. Search must call SearXNG via the Tailscale URL set in env. Tavily is fallback. If both fail, raise. Never fall back to a stub. All tools tested with mocked HTTP.

## Runner: Phase 5 (RAG) ⚠ HIGHEST RISK
LangGraph retrieval graph. Start by reading the LangGraph quickstart together. Build indexer first, prove an upsert works, THEN write the graph. If the graph node signatures fight you, simplify: start with linear flow before parallel retrieve.
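One way to de-risk the graph is to prove the linear decompose → retrieve → synthesise flow in plain Python first. The sketch below deliberately does not import LangGraph. Each node takes the shared state and returns a partial update, which should map onto LangGraph's node shape once the flow is wired into the real graph. The state fields, the naive " and " split, and the injected `search` callable are all placeholder assumptions.

```python
from typing import Callable, TypedDict


class ResearchState(TypedDict, total=False):
    question: str
    sub_questions: list[str]
    passages: list[str]
    answer: str


Node = Callable[[ResearchState], dict]


def decompose(state: ResearchState) -> dict:
    # Placeholder: a real node would ask the model to split the question.
    parts = [p.strip() for p in state["question"].split(" and ") if p.strip()]
    return {"sub_questions": parts}


def make_retrieve(search: Callable[[str], list[str]]) -> Node:
    # `search` stands in for the retriever (query → search → rerank → top N).
    def retrieve(state: ResearchState) -> dict:
        passages: list[str] = []
        for sub_question in state["sub_questions"]:
            passages.extend(search(sub_question))
        return {"passages": passages}
    return retrieve


def synthesise(state: ResearchState) -> dict:
    # Placeholder: a real node would prompt the model with the passages.
    return {"answer": " ".join(state["passages"])}


def run_linear(nodes: list[Node], state: ResearchState) -> ResearchState:
    """Run nodes in order, merging each partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state
```

With the node functions working in isolation, moving to LangGraph is mostly a matter of registering them and adding edges. Parallel retrieve can come later.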
Wildcards
+12 buffer
LangGraph
Newest library in the stack. Budget 28→40 credits. Build linear first, then parallelise.
WATCH
code_exec sandbox
Use a real sandbox (subprocess + tmpdir + timeout). Never exec() in-process.
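A minimal sketch of the subprocess + tmpdir + timeout pattern, using only the standard library. The function name `run_python`, the result shape, and the 5-second default are assumptions, and the stripped-down environment assumes a POSIX host.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a child interpreter inside a throwaway directory."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,                          # relative writes land in the tmpdir
                capture_output=True,
                text=True,
                timeout=timeout_s,                    # child is killed on expiry
                # Pass only PATH so API keys in the parent env are not inherited
                # (POSIX assumption; Windows would need more variables).
                env={"PATH": os.environ.get("PATH", "")},
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout_s}s"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
```

This gives process isolation and a hard time limit, but it is not a security boundary on its own: the child can still reach the network and the wider filesystem. Containers or OS-level limits would be needed for untrusted code.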
WATCH
Embedding throughput
First vault index could be slow. Batch upserts, log progress, don't block the worker.
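Batching with progress logging could be sketched as below. The `embed` and `upsert` callables are hypothetical stand-ins for the nomic-embed-text call and the Qdrant upsert, and the batch size of 64 is an arbitrary starting point, not a measured value.

```python
import logging
from itertools import islice
from typing import Callable, Iterable, Iterator

log = logging.getLogger("indexer")


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield successive lists of up to `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def index_in_batches(
    chunks: Iterable[str],
    embed: Callable[[list[str]], list[list[float]]],
    upsert: Callable[[list[tuple[str, list[float]]]], None],
    batch_size: int = 64,
) -> int:
    """Embed and upsert chunks batch by batch; return the number indexed."""
    done = 0
    for batch in batched(chunks, batch_size):
        vectors = embed(batch)             # one embedding call per batch
        upsert(list(zip(batch, vectors)))  # one upsert call per batch
        done += len(batch)
        log.info("indexed %d chunks", done)
    return done
```

To avoid blocking the worker, a function like this would typically run in a background thread or a separate job rather than inside a request handler.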