# Project Brief — Bob

> Version: 1.0 | Status: Living Document | Last updated: 2026-03-23

## Problem Statement

Modern families generate and depend on vast amounts of knowledge — schedules, medical records, financial data, home maintenance history, recipes, shared memories, and more — scattered across cloud services they don't control. Smart home systems are powerful but lack semantic understanding and operate as isolated silos. There is no unified, family-controlled platform that combines knowledge management with home awareness and AI-agent capabilities while keeping data sovereign, local, and private.

## Vision

Bob is a self-hosted, agentic family knowledge management and home automation platform. It runs entirely on family-owned hardware (rig.lan, 3x RTX 3090), keeping all data local and encrypted. AI agents continuously organize, enrich, and make family knowledge queryable through natural voice and text interfaces. Semantic-first situational awareness integrates smart home devices, family activities, and environmental context into a coherent operational picture — all under family control, with no cloud dependencies for core functionality.
## Stakeholders

| Role | Interest |
|------|----------|
| Family Members (all) | Voice-based interaction with smart home; natural language queries over family knowledge; privacy |
| System Administrator (user) | Declarative, agent-manageable infrastructure; reproducible deployments; low maintenance burden |
| AI Agents (Bob runtime) | Structured access to knowledge graphs, device state, and system configuration; MCP integration |

## Success Criteria

| ID | Criterion | Measure |
|----|-----------|---------|
| SC-1 | All family knowledge is queryable via natural language | Family members can ask questions and get accurate answers from the knowledge graph |
| SC-2 | Voice interaction works in key rooms | Wake word → STT → LLM → TTS round-trip < 1 second in at least 3 rooms |
| SC-3 | Smart home devices are controllable via Bob | Bob can read/write HA device state and execute automations semantically |
| SC-4 | Zero cloud dependency for core functions | All STT, TTS, LLM, knowledge graph, and automation run on rig.lan |
| SC-5 | System is agent-manageable | AI agents can read/modify NixOS configuration and propose infrastructure changes |
| SC-6 | Data sovereignty | All family data encrypted at rest, CRDT-synced across family devices, no third-party access |

## Constraints

- **Hardware**: Single server (rig.lan) with 3x RTX 3090 (72 GB VRAM total), plus system RAM (TBD, recommend 128 GB+)
- **GPU budget**: Must share VRAM between LLM inference (~50-57 GB), STT (~4-7 GB), and TTS (~4-7 GB)
- **Power**: 3x RTX 3090 under load draws ~1000 W; an adequate PSU and circuit are required
- **OS**: NixOS (declarative, agent-manageable) — learning curve is a factor
- **Network**: Local network (rig.lan), with optional Tailscale/WireGuard for remote family access
- **Maturity**: Some key components (NextGraph, SemStreams, TrustGraph) are pre-production; the architecture must tolerate swapping components
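The GPU budget above can be sanity-checked with a few lines of arithmetic. This is a planning sketch only: the ranges come from the constraints list, and the headroom figures assume the three workloads are the only VRAM consumers.

```python
# Rough VRAM budget check for rig.lan (3x RTX 3090 = 72 GB total).
# Ranges are taken from the Constraints section above; treat them as
# planning estimates, not measured allocations.
TOTAL_VRAM_GB = 72

budget = {
    "llm": (50, 57),  # DeepSeek R1 70B Q5 via vLLM (min, max)
    "stt": (4, 7),    # faster-whisper + speaker diarization
    "tts": (4, 7),    # Kokoro / Chatterbox / Orpheus
}

best_case = TOTAL_VRAM_GB - sum(lo for lo, _ in budget.values())
worst_case = TOTAL_VRAM_GB - sum(hi for _, hi in budget.values())

print(f"Headroom: {worst_case}-{best_case} GB")  # → Headroom: 1-14 GB
```

The worst-case figure shows how tight the budget is: with all three pipelines at the top of their ranges, only about 1 GB of headroom remains, which is why component choices like INT8 STT and sub-2 GB TTS models matter.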
## Architecture Overview

```
rig.lan (NixOS + Docker) | 3x RTX 3090 (72 GB VRAM)
================================================================
GPU 0: LLM (DeepSeek R1 70B Q5)        GPU 1: STT Pipeline
GPU 2: TTS Pipeline (voice models share GPU space)
================================================================
NATS JetStream  ←→  Semantic Event Bus
================================================================
Data Sovereignty        Intelligence         Situational Awareness
(Automerge + Oxigraph   (TrustGraph +        (HA bridge + NATS +
 → NextGraph)            Graphiti)            semantic home model)
================================================================
Voice Interface                     Clients
(Wyoming + Pipecat)                 (Web, ATAK/iTAK, Mobile)
```

Detailed architecture in `_bmad/architecture.md`.

## Technology Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Host OS | NixOS | Declarative text-file config is agent-manageable; atomic rollbacks; reproducible; validated by real agent-managed infra projects (Stapelberg, Levin) |
| Service hosting | Docker on NixOS | NixOS manages host/drivers/networking; Docker runs application services; avoids fighting Nix packaging for every app |
| Primary LLM | DeepSeek R1 Distill 70B (Q5_K_M) | Best reasoning in the 70B class; fits in ~49 GB, leaving room for voice; chain-of-thought visibility |
| LLM serving | vLLM with tensor parallelism | High throughput, PagedAttention, multi-GPU support, OpenAI-compatible API |
| Knowledge management | Automerge + Oxigraph (→ NextGraph) | CRDT sync + RDF/SPARQL now with proven libs; migrate to NextGraph when production-ready |
| Intelligence layer | TrustGraph | Automated knowledge graph construction, Ontology RAG, local LLM support, RDF-native |
| Agent memory | Graphiti (Zep) | Temporal knowledge graphs for real-time interaction memory; complements TrustGraph |
| Event backbone | NATS JetStream | Lightweight, built-in MQTT bridge, semantic subject routing, state store |
| Device integration | Home Assistant (bridged) | 2000+ device integrations; Bob consumes the HA WebSocket API as a data source |
| STT | faster-whisper large-v3-turbo (INT8) | ~2 GB VRAM, 7.75% WER, ~800x real-time, MIT license |
| TTS (primary) | Kokoro-82M | <2 GB VRAM, <0.3 s latency, Apache 2.0, 54 voices |
| TTS (voice cloning) | Chatterbox-Turbo | 350M params, MIT, sub-200 ms latency, 23 languages |
| TTS (expressive) | Orpheus 1B | Emotion tags, Apache 2.0 |
| Wake word | openWakeWord | CPU-only, trainable, HA-integrated |
| Speaker ID | pyannote.audio 3.1 | Production-grade diarization |
| Voice framework | Wyoming Protocol + Pipecat | HA integration + custom voice-agent pipelines |
| SA interop | CoT bridge | Cursor on Target for TAK-ecosystem interoperability |
| Secrets | sops-nix | Encrypted secrets in git, SSH key integration |
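The "semantic subject routing" cited for the event backbone refers to NATS's hierarchical, dot-separated subjects, where `*` matches exactly one token and `>` matches one or more trailing tokens. A minimal pure-Python sketch of that matching rule follows; the subject names (`home.kitchen.motion`, etc.) are invented for illustration, not part of Bob's actual subject scheme.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style subject matching: '*' matches exactly one token,
    '>' (last token only) matches one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must be the final token and must swallow at least one token
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)

# A consumer on 'home.*.motion' sees motion events from any room,
# while 'home.>' sees every home event.
print(subject_matches("home.*.motion", "home.kitchen.motion"))  # → True
print(subject_matches("home.>", "home.kitchen.temp.celsius"))   # → True
print(subject_matches("home.*.motion", "home.kitchen.door"))    # → False
```

Encoding house semantics into the subject hierarchy (area, device class, event kind) is what lets agents subscribe to meaningful slices of home state rather than raw device topics.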
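The CoT bridge row refers to Cursor on Target, a small XML event schema used throughout the TAK ecosystem. A hedged stdlib-only sketch of emitting a minimal CoT position event is below; the `uid`, type code, and coordinates are illustrative placeholders, and a real bridge would map Bob's semantic events onto appropriate CoT types.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

def make_cot_event(uid: str, lat: float, lon: float,
                   cot_type: str = "a-f-G") -> str:
    """Build a minimal Cursor-on-Target event as an XML string.
    Field names follow the public CoT base schema; values are illustrative."""
    now = datetime.now(timezone.utc)
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    event = ET.Element("event", {
        "version": "2.0",
        "uid": uid,
        "type": cot_type,                         # CoT type hierarchy code
        "time": now.strftime(fmt),
        "start": now.strftime(fmt),
        "stale": (now + timedelta(minutes=5)).strftime(fmt),
        "how": "m-g",                             # machine-generated
    })
    ET.SubElement(event, "point", {
        "lat": str(lat), "lon": str(lon),
        "hae": "0.0", "ce": "9999999.0", "le": "9999999.0",
    })
    return ET.tostring(event, encoding="unicode")

print(make_cot_event("bob-sensor-1", 51.5, -0.12))
```

Events like this, pushed to a TAK server or multicast group, are what would let ATAK/iTAK clients render Bob-tracked entities on a shared map.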