Rebuild · Build Atlas REV 2026.05.06 · DRAWING 00 / 06
Drawing 00 — Project Atlas

LLM Runner + Homelab IaC.

Two parallel infrastructure projects, built from scratch in five macro stages. Each stage is a self-contained drawing — paste it into Kiro one phase at a time.

SCOPE
2 Repos
5 Stages
11 Phases
550 Credit Budget
llm-runner · homelab-iac · Kiro · Auto model · Terraform + Ansible + Compose · Created 2026-05-05
Drawings 01 — 05

Five Stages, In Order.

Each tile opens a full-detail drawing.
01 / Gate
Precheck
S3 backups, vault snapshot, restore spot-check. The gate before any wipe.
12 sign-offs
02 / Base
Foundations
Terraform infra, Ansible roles, Runner scaffolding + providers. Compose starts clean.
Homelab P1–2 · Runner P1
03 / Online
Services
Homelab stack live. Runner worker + chat UI streaming. SearXNG over Tailscale.
Homelab P3 · Runner P2–3
04 / Smart
Intelligence
MCP tools, RAG library on Qdrant, LangGraph retrieval graph. Research mode armed.
Runner P4–5
05 / Hardened
Polish
Agents, backup + tested restore, GitLab CI. Reproducible from zero.
Runner P6 · Homelab P4–5
Live Progress

Project Telemetry

llm-runner
0% 0 / 38
6 phases · ~148 credits projected
homelab-iac
0% 0 / 29
5 phases · ~62 credits projected
precheck
0% 0 / 12
Sign-off gate · run first
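
The telemetry figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the `Track` class and helper names are hypothetical, the counts are read as generic "items" because the drawing does not name the unit, and precheck is treated as having no credit projection because none is listed.

```python
from dataclasses import dataclass
from typing import List, Optional

CREDIT_BUDGET = 550  # from the SCOPE block


@dataclass(frozen=True)
class Track:
    """One telemetry row (hypothetical structure, not from the drawing)."""
    name: str
    done: int
    total: int
    projected_credits: Optional[int] = None  # None: no projection listed

    @property
    def percent(self) -> int:
        # Guard against an empty track to avoid dividing by zero.
        return 0 if self.total == 0 else round(100 * self.done / self.total)


TRACKS: List[Track] = [
    Track("llm-runner", done=0, total=38, projected_credits=148),
    Track("homelab-iac", done=0, total=29, projected_credits=62),
    Track("precheck", done=0, total=12),
]


def projected_total(tracks: List[Track]) -> int:
    """Sum of the projected credits across tracks that list one."""
    return sum(t.projected_credits or 0 for t in tracks)


def headroom(tracks: List[Track], budget: int = CREDIT_BUDGET) -> int:
    """Credits left in the budget after the projected spend."""
    return budget - projected_total(tracks)


if __name__ == "__main__":
    for t in TRACKS:
        print(f"{t.name}: {t.percent}% ({t.done} / {t.total})")
    print(f"projected {projected_total(TRACKS)} of {CREDIT_BUDGET} credits")
```

Under these figures the two repos project 148 + 62 = 210 credits, which leaves 340 of the 550-credit budget unallocated.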
Drawing 00.A — Gantt Schedule

Build Sequence (Phase, not Calendar)

Phase ticks, not dates. Homelab Phase 3 must complete before Runner Phase 4 (the SearXNG-on-Tailscale dependency).

Phase axis: P0 · P1 · P2 · P3 · P4 · P5 · P6 · DONE

PRECHECK
P0 · precheck · Infisical → Terraform → backup

HOMELAB IAC
P1 · Terraform
P2 · Ansible Roles
P3 · Service Deployment ★
P4 · Backup + Restore
P5 · GitLab CI

LLM RUNNER
P1 · Foundation
P2 · Worker
P3 · Web + Chat
P4 · MCP Server †
P5 · RAG Library
P6 · Agents

Legend: gate · core build · intelligence · polish · ★ → † critical handoff
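
The sequencing rule in this schedule (phases in order, with Homelab P3 gating Runner P4) can be expressed as a small dependency graph. This is a sketch under stated assumptions, not the project's tooling: the node names are hypothetical, phases within a track are assumed strictly sequential, and precheck is assumed to gate the first phase of both tracks because the drawing says to run it first.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set

PRECHECK = "precheck-P0"
HOMELAB = [f"homelab-P{i}" for i in range(1, 6)]  # P1..P5
RUNNER = [f"runner-P{i}" for i in range(1, 7)]    # P1..P6


def build_graph() -> Dict[str, Set[str]]:
    """Map each phase to the set of phases that must finish before it."""
    deps: Dict[str, Set[str]] = {PRECHECK: set()}
    for track in (HOMELAB, RUNNER):
        previous = PRECHECK  # ASSUMPTION: precheck gates both tracks
        for phase in track:
            deps[phase] = {previous}  # ASSUMPTION: strictly sequential
            previous = phase
    # The one cross-track edge the drawing states explicitly.
    deps["runner-P4"].add("homelab-P3")
    return deps


def build_order() -> List[str]:
    """Return one valid order in which the phases could be run."""
    return list(TopologicalSorter(build_graph()).static_order())


if __name__ == "__main__":
    for step, phase in enumerate(build_order(), start=1):
        print(step, phase)
```

Any order the sorter returns places `homelab-P3` before `runner-P4`, and it would raise `graphlib.CycleError` if a later edit introduced a circular dependency.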
Drawing 00.B — Decision Trees

Routing & Restore Logic

Model Routing · runtime

Job arrives (model: ___)
model in router map?
  no → Ollama (Contabo) · qwen2.5:14b
  yes → cloud provider?
    yes → rate limit OK?
      yes → Call cloud API (Groq · Gemini · Anth.)
      429 → Hold + retry · Discord alert
    no → requires_davas or gemma4?
      yes → davas healthy?
        ok → Davas · Gemma4:27b · project mode
        down → (target not shown)
      no → (target not shown)
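
The routing tree above maps naturally onto a small pure function. The following is a minimal sketch, not the runner's actual code: `Target`, `Route`, `ROUTER_MAP`, and the example model names are all hypothetical, and the fallback to Ollama on the "down" and final "no" branches is an assumption, since the drawing does not show where those branches lead.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Target(Enum):
    OLLAMA = "Ollama (Contabo) qwen2.5:14b"
    CLOUD = "cloud API"
    HOLD = "hold + retry, Discord alert"
    DAVAS = "Davas Gemma4:27b"


@dataclass(frozen=True)
class Route:
    """Router-map entry (hypothetical shape)."""
    cloud: bool = False
    requires_davas: bool = False


# Hypothetical router map; the real entries are not listed in the drawing.
ROUTER_MAP: Dict[str, Route] = {
    "example-cloud-model": Route(cloud=True),
    "example-project-model": Route(requires_davas=True),
    "gemma4:27b": Route(),
    "example-local-model": Route(),
}


def route(model: str, router_map: Dict[str, Route],
          rate_limit_ok: bool, davas_healthy: bool) -> Target:
    """Walk the decision tree for one incoming job."""
    entry = router_map.get(model)
    if entry is None:
        return Target.OLLAMA  # model not in router map
    if entry.cloud:
        # A 429 from the provider means hold, retry later, and alert.
        return Target.CLOUD if rate_limit_ok else Target.HOLD
    wants_davas = entry.requires_davas or model.startswith("gemma4")
    if wants_davas and davas_healthy:
        return Target.DAVAS
    # ASSUMPTION: both unlabelled branches fall back to Ollama.
    return Target.OLLAMA


if __name__ == "__main__":
    print(route("gemma4:27b", ROUTER_MAP, rate_limit_ok=True, davas_healthy=True))
```

Keeping the health and rate-limit checks as plain boolean arguments keeps the function free of I/O, so each branch of the tree can be tested in isolation.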
Restore · emergency path

Loss event: what is missing? (scope of loss)
  Vault note (single file) → aws s3 cp (fast bucket)
  Service / config (runner, gitea, …) → restore.yml --tags vault | gitea | dbs
  Whole VM or hardware loss → terraform apply + ansible site.yml
Then: spot-check restored data (read · diff · open in app)
Restore complete: log it in vault session note
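
The emergency path above amounts to choosing a plan from the scope of loss and always finishing with the same two checks. The sketch below only builds a checklist of strings and runs nothing. The `Loss` enum and `restore_plan` are hypothetical names, the use of `ansible-playbook` to run `restore.yml` and `site.yml` is an assumption, and the `aws s3 cp` arguments are left as placeholders because the drawing does not give them.

```python
from enum import Enum
from typing import List, Optional


class Loss(Enum):
    VAULT_NOTE = "single vault note"
    SERVICE = "service or config"
    WHOLE_VM = "whole VM or hardware"


RESTORE_TAGS = ("vault", "gitea", "dbs")  # tags named in the drawing


def restore_plan(loss: Loss, tag: Optional[str] = None) -> List[str]:
    """Return an ordered checklist for the given scope of loss."""
    if loss is Loss.VAULT_NOTE:
        # Source and destination are placeholders, not real paths.
        steps = ["aws s3 cp <object in fast bucket> <local path>"]
    elif loss is Loss.SERVICE:
        if tag not in RESTORE_TAGS:
            raise ValueError(f"tag must be one of {RESTORE_TAGS}")
        # ASSUMPTION: restore.yml is run with ansible-playbook.
        steps = [f"ansible-playbook restore.yml --tags {tag}"]
    else:
        # ASSUMPTION: site.yml is run with ansible-playbook.
        steps = ["terraform apply", "ansible-playbook site.yml"]
    # Every path ends with the same verification and logging steps.
    steps.append("spot-check restored data: read, diff, open in app")
    steps.append("log the restore in the vault session note")
    return steps


if __name__ == "__main__":
    for line in restore_plan(Loss.SERVICE, tag="gitea"):
        print("-", line)
```

Returning a checklist rather than executing commands keeps a human in the loop during an incident.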
Drawing 00.C — Choices Still Open

Defaults Set, Doors Left Ajar

Every row has a chosen default. The alternates are listed for the day you regret it.

| Decision | Default ▸ | Alternates | Why |
|---|---|---|---|
| Chat history | SQLite | DynamoDB · PostgreSQL | One file, zero ops |
| Vector store | Qdrant | ChromaDB · pgvector | Stable, REST API, good docs |
| Reverse proxy | nginx | Traefik · Caddy | Already known |
| MCP framework | fastmcp | Official MCP SDK | Pythonic, less boilerplate |
| Embedding model | nomic-embed-text (Ollama) | OpenAI · Gemini | Free, local, no rate limit |
| CI registry | GitLab Registry | AWS ECR | Already integrated |
Drawing 00.D — Out of Scope

Doors Closed (For Now)

DEFERRED · K3s · Role exists, flag off. Flip when basics work.
DEFERRED · Immich · One Ansible role away. Add on demand.
DEFERRED · Voice UI · Future project. Not MVP.
DEFERRED · DynamoDB · SQLite first, upgrade path exists.
REJECTED · HuggingFace · Free tier too unreliable.
REJECTED · Image gen · Off-goal entirely.
REJECTED · Fine-tuning · Too complex. Out of scope.
FUTURE · Multi-agent · Architecture supports it, build toward it.