PRD.md
# Operational AI Demo-in-a-Box (SA SME)

## Final Architecture + PRD + Requirements + Tech Spec (Cursor Build Pack)

### Document purpose

A single build specification for Cursor/Antigravity to implement a **Dockerised, decoupled**, production-reusable prototype using **Google Gemini on Vertex + Vertex AI Search + Xero via MCP**, with **trust controls** (citations, approvals, read-only finance), **auditability**, and **incremental delivery**. The initial **Streamlit “demo shell”** must be replaceable later by **Node/Next.js UI** without rewriting backend logic.

---

## 1) Objectives

### Primary objective

Deliver a **visual, product-like demo** in a browser that proves value in minutes:

* **Module A:** “Ask Your Business”: cited answers over messy documents (“no source = no answer”)
* **Module B:** “Inbox Triage”: classify + extract structured fields + draft-only + human approval
* **Module C:** “Xero Finance Lens”: natural language finance queries via Xero MCP, **read-only**, with drill-down verification tables

### Secondary objectives

* The prototype must be reusable as a foundation for client PoCs and production builds.
* Enforce a security-first posture aligned to POPIA expectations (least privilege, audit trail, controlled actions).
* Keep it simple: prove “art of the possible” without boiling the ocean.

---

## 2) Non-goals (explicit exclusions)

* No model training / fine-tuning / custom foundation models.
* No autonomous execution of high-risk actions (payments, refunds, journal posting, contract signing).
* No full email integration (MVP is upload-based only; connectors later).
* No ERP/CRM reimplementation; no full data warehouse; no master data remediation.
* No guarantee of “AI processing in South Africa only” (some managed AI/search capabilities may require global/EU endpoints depending on service availability).
  Host the app/infra in Johannesburg where feasible; treat endpoint location as a compliance design choice.

---

## 3) Key decisions (must be enforced)

### 3.1 Stack decisions (MVP)

* **Docs retrieval/grounding:** Vertex AI Search (Discovery Engine / Vertex AI Search)
* **LLM:** Gemini on Vertex via `google-genai` SDK

  * **No deprecated model versions.** Do not reference Gemini 1.5.
  * Use configurable model IDs: primary + fallback.
* **Finance integration:** Xero via MCP server running behind an MCP bridge/proxy
* **UI:** Streamlit initially (thin client only)
* **Backend:** API-first (FastAPI recommended)
* **Async processing:** Worker service + Redis queue (ingestion/indexing/email parsing)
* **Persistence:** Postgres for metadata/audit/approvals; named volumes for dev
* **Storage:** Storage abstraction: local volume vs GCS (`STORAGE_BACKEND=local|gcs`)

### 3.2 Replaceable UI principle (non-negotiable)

* Streamlit **must never**:

  * call Google Vertex APIs directly
  * call Xero/MCP directly
  * connect to Postgres directly
* Streamlit **only** calls the `api-gateway` via HTTP.
* All domain logic + integrations live behind stable API contracts (OpenAPI/Swagger).

### 3.3 OAuth ownership (critical for decoupling)

* OAuth callbacks **must not** terminate in Streamlit.
* Xero OAuth must be handled by the **mcp-bridge** (recommended) or `api-gateway`.
* Redirect URIs must target `mcp-bridge` (e.g., `http://localhost:3000/oauth/xero/callback`), not the Streamlit port.
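The redirect rule in 3.3 lends itself to an automated check at startup. The following is a minimal sketch, not part of the specified design: the function name `check_redirect_uri` and the constant `STREAMLIT_PORT` are hypothetical, and the port value is assumed from the service matrix later in this document.

```python
from urllib.parse import urlparse

# Assumption: Streamlit listens on 8501, as listed in the service matrix.
STREAMLIT_PORT = 8501


def check_redirect_uri(uri: str, forbidden_port: int = STREAMLIT_PORT) -> None:
    """Raise ValueError if the OAuth redirect URI would terminate in Streamlit."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"redirect URI must be http(s): {uri!r}")
    if not parsed.hostname:
        raise ValueError(f"redirect URI has no host: {uri!r}")
    if parsed.port == forbidden_port:
        raise ValueError(
            f"redirect URI must not target the Streamlit port {forbidden_port}: {uri!r}"
        )


# Passes silently: the callback is owned by mcp-bridge on port 3000.
check_redirect_uri("http://localhost:3000/oauth/xero/callback")
```

A check like this could run alongside the other environment validation, so a misconfigured `XERO_REDIRECT_URI` stops the bridge from starting rather than failing mid-demo.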

---

## 4) Product experience (what the prospect sees)

### 4.1 Landing page (mandatory)

A cohesive landing page with:

* 3 tiles: **Docs / Inbox / Finance**
* “Try it now” per tile
* Pre-canned demo prompts per module (no improvisation required)
* Status badges: Docs indexed / Xero connected / Emails uploaded
* Trust banner: **“Cited answers only • Human approval required • Read-only finance”**

### 4.2 Common UI patterns (all modules)

* Evidence panel (citations/snippets, tool calls summary where relevant)
* “Draft” watermark on any generated action
* One-click export:

  * Module A: export answer + citations
  * Module C: export drill-down table (CSV)

---

## 5) Functional requirements

### Module A: Ask Your Business (Docs / RAG)

**Goal:** Trusted answers over messy documents.

**Inputs**

* Upload PDFs (required) + optionally DOCX/TXT.
* Persist originals across container rebuilds.
* Index into Vertex AI Search datastore.

**Query behavior (hard trust rule)**

* **No source = no answer** (default ON, cannot be bypassed in client demos).
* If retrieval returns insufficient evidence:

  * Return exactly: **“Information not found in internal records.”**

**Grounding policy**

* Use a retrieval threshold policy expressed in supported Vertex AI Search terms:

  * Either a strict threshold mode (e.g., HIGH)
  * Or `semanticRelevanceThreshold=0.7` via filter spec (as supported)
    (Expose as config `DOCS_RELEVANCE_MODE` and `DOCS_SEMANTIC_THRESHOLD`.)

**Outputs**

* Answer text
* Citations array: `{doc_name, snippet, page_or_section?, uri_or_id}`
* Evidence panel must show snippets and source identifiers.
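The “no source = no answer” rule and the citation shape above can be enforced in one small gate between retrieval and the response. This is a sketch under assumptions: `Citation`, `DocsAnswer`, `enforce_grounding`, and the `refused` flag are hypothetical names, and the retrieval call itself (Vertex AI Search) is left out.

```python
from dataclasses import dataclass
from typing import Optional

# Exact refusal text required by the hard trust rule.
REFUSAL_TEXT = "Information not found in internal records."


@dataclass(frozen=True)
class Citation:
    doc_name: str
    snippet: str
    uri_or_id: str
    page_or_section: Optional[str] = None  # optional, as in the spec


@dataclass(frozen=True)
class DocsAnswer:
    answer: str
    citations: tuple = ()
    refused: bool = False  # hypothetical flag so the UI can style refusals


def enforce_grounding(draft_answer: str, citations: list) -> DocsAnswer:
    """Return the drafted answer only if at least one usable citation backs it."""
    # A citation is usable only if it has both a snippet and a source identifier.
    usable = [c for c in citations if c.snippet.strip() and c.uri_or_id.strip()]
    if not usable or not draft_answer.strip():
        return DocsAnswer(answer=REFUSAL_TEXT, citations=(), refused=True)
    return DocsAnswer(answer=draft_answer.strip(), citations=tuple(usable))
```

Keeping the gate separate from the model call means the refusal text is produced by code, not by the model, so it stays exact regardless of prompt wording.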

**Admin**

* Document list and indexing status (pending/indexing/ready/failed)
* Delete doc (soft delete) + reindex capability

---

### Module B: Inbox Triage (Upload-based, human approval)

**Goal:** Turn emails into structured, reviewable work, draft-only.

**Inputs**

* Upload `.eml` / `.msg` / `.txt` (MVP)
* Parse thread content into canonical text

**Processing**

* Classify into: `Invoices | Sales | HR | Ops | Other`
* Extract **structured JSON** using Gemini (Flash family) with a fixed schema:

Schema (minimum):

* `category`
* `vendor_or_customer_name`
* `amount` (number)
* `currency`
* `vat_number`
* `invoice_number`
* `due_date` (ISO date)
* `action_recommendation` (enum: `draft_reply | create_task | request_missing_info | escalate`)
* `confidence` (0..1)
* `evidence_snippets[]` (strings copied from email text)

**Validation rules**

* `amount` must parse and be positive if present
* `due_date` must parse if present
* If required fields missing for “invoice-like” messages:

  * set `action_recommendation=request_missing_info`
  * lower confidence
  * highlight missing fields in UI

**Hard trust rule**

* System outputs are **draft-only**.
* Human approval required before any external action (sending mail, creating tickets, writing to any system). MVP stops at “approved draft”.

**UI**

* Queue view: new → extracted → awaiting approval → approved/rejected
* Approval page shows:

  * original email
  * extracted JSON
  * draft reply/task
  * approve/reject + comment
* Audit record written for every transition

---

### Module C: Xero Finance Lens (Read-only via MCP)

**Goal:** Answer finance questions from live accounting data with drill-down verification.

**Connection**

* OAuth 2.0 with PKCE handled by `mcp-bridge`.
* Tokens persisted (encrypted) across restarts.
* Streamlit never handles OAuth.

**Integration**

* `mcp-bridge` runs the Xero MCP server.
* `api-gateway` calls `mcp-bridge` over internal HTTP.
* Enforce **deny-by-default tool policy** at the gateway:

  * allow-list read-only actions only
  * block write-capable tools even if exposed by MCP server

**Query rules (hard trust)**

* Every response must include a drill-down table with source records used.
* If tool calls fail or return no data:

  * Return “Insufficient data to answer” and show “what data was queried”.

**Outputs**

* Answer narrative
* Drill-down table rows (invoices/contacts/ageing)
* Export CSV

---

## 6) Non-functional requirements (NFRs)

### 6.1 Decoupling and API stability

* `api-gateway` publishes OpenAPI spec; UI consumes only HTTP endpoints.
* All module logic is in backend services; UI is replaceable.

### 6.2 Docker-first microservices (compose)

All components run via Docker Compose and are independently buildable:

* `frontend` (Streamlit demo shell)
* `api-gateway` (FastAPI)
* `worker` (async jobs)
* `mcp-bridge` (Node: MCP over HTTP + OAuth PKCE)
* `postgres`
* `redis`

### 6.3 Persistence (must survive rebuild/restart)

Named volumes required:

* `pgdata` → Postgres
* `uploads` → local storage backend for docs/emails (dev)
* `sessions` → encrypted token/session material (dev only; production uses Secret Manager/KMS)

Redis is **not** a system of record. It is queue/caching only.

### 6.4 Environment-driven configuration (externalised)

* No hardcoded config values.
* Provide `.env.example` and a startup validator that fails fast if required env vars are missing.
* Separate “app region” (hosting) from “AI feature location” (Vertex endpoints) via env vars.

### 6.5 Versioning and deprecation control

* Maintain `VERSIONS.md` with pinned:

  * base images
  * MCP server package versions
  * Gemini model IDs (primary + fallback)
  * Vertex AI Search configuration
* No floating tags for critical dependencies; pin major/minor at minimum.

### 6.6 Container build best practices

* Slim base images
* Multi-stage builds where relevant
* Non-root user in every container
* Health checks for all services
* Dependency startup ordering based on health checks

### 6.7 Security posture

* Secrets never committed; `.env` ignored.
* Local dev may use key files; production must use runtime identity (no long-lived key files).
* Least privilege IAM for GCP service accounts.
* Full audit event capture for:

  * queries, retrieval sources, tool calls, approvals, errors
* MCP supply-chain controls:

  * pin MCP server versions
  * allow-list tools
  * log all tool calls

---

## 7) Service architecture

### 7.1 Service matrix

| Service       | Tech                    | Port | Purpose                                                             |
| ------------- | ----------------------- | ---: | ------------------------------------------------------------------- |
| `frontend`    | Streamlit (Python 3.12) | 8501 | Thin UI shell; calls `api-gateway` only                             |
| `api-gateway` | FastAPI (Python 3.12)   | 8000 | Core orchestration; Vertex/Gemini calls; trust enforcement; audit   |
| `worker`      | Python (Celery/RQ)      |  n/a | Background jobs: doc ingest/index trigger, email parsing/extraction |
| `mcp-bridge`  | Node (pinned major)     | 3000 | Runs MCP servers; OAuth PKCE; exposes MCP tools over HTTP           |
| `postgres`    | Postgres                | 5432 | System-of-record: metadata, approvals, audit events                 |
| `redis`       | Redis                   | 6379 | Queue and caching                                                   |

### 7.2 Responsibilities

* **frontend**

  * UI navigation, upload forms, display evidence, approvals
* **api-gateway**

  * stable REST API + OpenAPI
  * Module A routing to Vertex AI Search (with thresholds)
  * Module B Gemini extraction (schema enforcement) + approval workflow
  * Module C query orchestration to `mcp-bridge` + tool allow-list enforcement
  * audit writing to Postgres
* **worker**

  * ingestion pipelines
  * retries/backoff
  * indexing triggers to Vertex AI Search
* **mcp-bridge**

  * OAuth PKCE callbacks for Xero
  * MCP server process management
  * exposes a controlled HTTP interface to MCP tool calls (internal only)

---

## 8) Data model (Postgres)

### Tables (minimum)

* `audit_event`

  * id, ts, module, user_id/session_id, request_id, prompt_hash, sources_json, tool_calls_json, decision_json, status, error
* `doc_asset`

  * id, filename, storage_uri, uploaded_at, indexed_status, datastore_ref, deleted_at
* `email_asset`

  * id, filename, storage_uri, uploaded_at, parsed_text_ref, classification, extracted_json, approval_status, approver_id, approved_at
* `xero_tenant`

  * id, tenant_id, connected_at, token_ref (encrypted), last_used_at

---

## 9) API contracts (UI-to-backend)

### Common rules

* All responses return `request_id` for traceability.
* All module endpoints write an audit event (success/failure).

### Module A

* `POST /docs/upload`
* `POST /docs/index` (trigger indexing)
* `GET /docs/status`
* `POST /docs/query` → returns `{answer, citations[]}`

  * Must refuse if citations are empty.
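The two common rules at the top of this section (a `request_id` on every response, an audit event on both success and failure) can be applied uniformly by wrapping each module handler. The sketch below is illustrative only: `with_audit` and `AUDIT_LOG` are hypothetical, and the in-memory list stands in for inserts into the Postgres `audit_event` table.

```python
import uuid
from typing import Any, Callable

# Stand-in for the Postgres `audit_event` table; a real service would insert a row.
AUDIT_LOG: list = []


def with_audit(module: str, handler: Callable[[dict], dict], payload: dict) -> dict:
    """Run a module handler, attach a request_id, and record an audit event."""
    request_id = str(uuid.uuid4())
    try:
        body = handler(payload)
    except Exception as exc:
        # Failures are audited too; the caller gets a generic error plus the id.
        AUDIT_LOG.append(
            {"request_id": request_id, "module": module, "status": "error", "error": str(exc)}
        )
        return {"request_id": request_id, "status": "error", "error": "request failed"}
    AUDIT_LOG.append(
        {"request_id": request_id, "module": module, "status": "ok", "error": None}
    )
    # request_id and status are written last so a handler cannot overwrite them.
    return {**body, "request_id": request_id, "status": "ok"}


def docs_query(payload: dict) -> dict:
    """Hypothetical handler used only to demonstrate the wrapper."""
    return {"answer": "Net 30 days.", "citations": [{"doc_name": "terms.pdf"}]}


response = with_audit("docs", docs_query, {"question": "What are our payment terms?"})
```

In a FastAPI service the same idea would more likely live in middleware or a dependency; the plain function form is used here to keep the sketch framework-free.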

### Module B

* `POST /inbox/upload`
* `POST /inbox/process/{id}`
* `GET /inbox/queue`
* `GET /inbox/{id}`
* `POST /inbox/{id}/approve` (approve/reject + comment)

### Module C

* `GET /finance/status` (connected yes/no)
* `POST /finance/query` → `{answer, rows[], query_trace}`

  * Must include verification rows.

### Admin

* `GET /audit` (filters by module, status, date)
* `GET /health`

---

## 10) Environment parameters (`.env.example`)

```bash
# App basics
APP_ENV=local
APP_REGION=africa-south1
API_BASE_URL=http://api-gateway:8000
STORAGE_BACKEND=local  # local|gcs

# GCP / Vertex
GOOGLE_CLOUD_PROJECT="your-project-id"
GOOGLE_GENAI_USE_VERTEXAI=True
VERTEX_LOCATION="global"  # or your chosen Vertex GenAI location
DISCOVERY_ENGINE_LOCATION="global"  # Vertex AI Search location (commonly global)
GCS_BUCKET_NAME="sme-ops-center-uploads"

# Vertex AI Search / Discovery Engine (from console Steps 13-15)
DATA_STORE_ID="your-data-store-id"
ENGINE_ID="your-engine-id"

# Vertex AI Search / RAG controls
DOCS_RELEVANCE_MODE="HIGH"  # or FILTER_SPEC
DOCS_SEMANTIC_THRESHOLD=0.7  # used if FILTER_SPEC
DOCS_MAX_RESULTS=5

# Gemini models (no deprecated IDs)
GEMINI_MODEL_PRIMARY="gemini-2.5-flash"
GEMINI_MODEL_FALLBACK="gemini-2.0-flash"
GEMINI_MAX_OUTPUT_TOKENS=1024

# Xero (OAuth handled by mcp-bridge)
XERO_CLIENT_ID="your-client-id"
XERO_CLIENT_SECRET="your-client-secret"
XERO_REDIRECT_URI="http://localhost:3000/oauth/xero/callback"

# Security
SECRET_KEY="generate-a-secure-key"
ENCRYPTION_SALT="for-token-storage"
ALLOWED_ORIGINS="http://localhost:8501"

# Database
POSTGRES_HOST=postgres
POSTGRES_DB=smeops
POSTGRES_USER=smeops
POSTGRES_PASSWORD="change-me"

# Redis
REDIS_URL=redis://redis:6379/0
```

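A startup validator over the variables above might look like the following sketch. The `REQUIRED_VARS` tuple is an assumed subset chosen for illustration (the document does not state which variables are mandatory), and `validate_env` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

# Assumption: an illustrative subset of the variables in .env.example.
REQUIRED_VARS = (
    "APP_ENV", "API_BASE_URL", "STORAGE_BACKEND", "GOOGLE_CLOUD_PROJECT",
    "VERTEX_LOCATION", "DATA_STORE_ID", "GEMINI_MODEL_PRIMARY",
    "GEMINI_MODEL_FALLBACK", "POSTGRES_HOST", "POSTGRES_DB",
    "POSTGRES_USER", "POSTGRES_PASSWORD", "REDIS_URL",
)
ALLOWED_STORAGE_BACKENDS = {"local", "gcs"}


def validate_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the validated settings, or raise RuntimeError listing what is wrong."""
    env = os.environ if env is None else env
    # Treat empty or whitespace-only values as missing.
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing required env vars: " + ", ".join(missing))
    backend = env["STORAGE_BACKEND"].strip()
    if backend not in ALLOWED_STORAGE_BACKENDS:
        raise RuntimeError(f"STORAGE_BACKEND must be local or gcs, got {backend!r}")
    # The bucket name only matters when the GCS backend is selected.
    if backend == "gcs" and not env.get("GCS_BUCKET_NAME", "").strip():
        raise RuntimeError("GCS_BUCKET_NAME is required when STORAGE_BACKEND=gcs")
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

Calling `validate_env()` as the first statement of each service entry point makes a container exit immediately on bad configuration, which pairs naturally with health-check-based startup ordering.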
Startup validation must fail-fast if required vars are absent.

---

## 11) Docker / Compose requirements

### Compose requirements

* Use named volumes:

  * `pgdata` (Postgres)
  * `uploads` (local storage dev)
  * `sessions` (dev token/session persistence for bridge, encrypted)
  * `redis-data` (Redis persistence)
* All application services run as non-root (UID 1000 for Python/Node containers).
* **Postgres/Redis**: Do NOT override user; use official image default non-root users (they handle initialization).
* Health checks for Postgres/Redis/API.
* **Important**: For Node services with bind mounts, use anonymous volume for `node_modules` to preserve installed dependencies (e.g., `/app/node_modules`).

### Persistence rules

* Uploaded docs/emails must survive rebuilds.
* Token/session data must survive restarts (encrypted at rest for dev volume; production uses managed secrets).
* `node_modules` in Node services should be preserved via anonymous volume when source is bind-mounted.

### Base Images and Versions

* Python services: `python:3.12-slim`
* Node services: `node:20-slim` (pinned in package.json engines)
* Postgres: `postgres:16-alpine`
* Redis: `redis:7-alpine`

### Known Configuration Issues Resolved (Milestone 0)

* **Node UID conflict**: Node images already have `node` user (UID 1000); use existing user instead of creating new one.
* **Empty requirements.txt**: Use conditional pip install to handle empty/comment-only requirements files.
* **npm ci vs npm install**: Use `npm install` for scaffold phase (no package-lock.json yet); switch to `npm ci` once lock file exists.
* **Volume mount overwrites**: Anonymous volumes preserve `node_modules` when source code is bind-mounted for development.
* **Postgres/Redis permissions**: Official images initialize correctly with their default users; do not override with custom UIDs.

---

## 12) Incremental build milestones (must follow)

### Milestone 0: Scaffold

* repo + docker compose boots cleanly
* API health endpoint
* Postgres migrations run
* Audit event write works
* Streamlit loads and can call API

### Milestone 1: Module A

* Upload docs → persist → index job → query returns citations or refuses
* Evidence panel works
* Index status view works

### Milestone 2: Module B

* Upload email → parse → classify/extract JSON (schema-validated)
* Approval workflow works end-to-end
* Draft replies/tasks shown only (no external actions)

### Milestone 3: Module C

* Xero OAuth via bridge works
* Finance queries return narrative + drill-down rows
* Read-only tool allow-list enforced and tested

### Milestone 4: Demo hardening

* Guided demo mode with pre-canned prompts
* Export CSV (finance) + export answer/citations (docs)
* Robust error states and status badges

---

## 13) Repository structure (recommended)

```text
repo/
  README.md
  PRD.md
  VERSIONS.md
  .env.example
  .gitignore
  docker-compose.yml
  frontend/
    Dockerfile
    app.py
    requirements.txt
  api-gateway/
    Dockerfile
    app/
      main.py
      routes/
      services/
      schemas/
      security/
      storage/
    requirements.txt
  worker/
    Dockerfile
    app/
      worker.py
      jobs/
    requirements.txt
  mcp-bridge/
    Dockerfile
    package.json
    src/
      server.ts
      oauth/
      mcp/
      allowlist/
  db/
    migrations/
  .cursor/
    rules/
      architecture.mdc
```

---

## 14) Cursor rules (`.cursor/rules/architecture.mdc`): required content

Add a rule file that enforces:

* Docker-only execution
* UI calls only API (no direct provider/system calls)
* Strict milestones (no big-bang)
* No secrets in repo
* No deprecated model IDs
* Read-only allow-list for Xero tools
* Audit logging mandatory per request
* Persistent volumes for Postgres/uploads/sessions

---

## 15) “Interrogate requirements” checklist (Cursor must follow)

Before implementing any feature:

* Identify module + acceptance criteria
* Identify data needed + where it is persisted
* Identify API contract impact
* Identify security implications (secrets, OAuth, audit, tool allow-list)
* Confirm Dockerisation + restart survival
* Confirm no deprecated model/package usage
* Confirm simplest path (MVP first)

---

# Copy-paste “Master Cursor Instructions”

> Act as a Senior Lead Architect. Initialize a monorepo `sme-ops-center` with a decoupled microservices architecture.
>
> 1. Create `docker-compose.yml` defining `frontend` (Streamlit), `api-gateway` (FastAPI), `worker`, `mcp-bridge` (Node), `postgres`, and `redis`, all running as non-root with named volumes for persistence.
> 2. Enforce decoupling: frontend must only call REST endpoints on `api-gateway`; it must not call GCP, Xero, or Postgres directly.
> 3. Implement environment validation on startup from `.env` and provide `.env.example`. No secrets committed.
> 4. Implement Module A using Vertex AI Search with strict grounding and “no source = no answer”.
> 5. Implement Module B using Gemini on Vertex (no deprecated model IDs) with schema-validated JSON extraction + human approval workflow; generate drafts only.
> 6. Implement Module C using Xero MCP server behind `mcp-bridge`. OAuth callback must be owned by bridge/gateway (not Streamlit). Enforce read-only tool allow-list and always return drill-down tables.
> 7. Build incrementally: Milestone 0 → Module A → Module B → Module C → hardening. Do not build everything at once.
> 8. Generate OpenAPI for `api-gateway` and keep UI replaceable by a future Node/Next.js frontend without backend rewrite.

---