# Operational AI Demo-in-a-Box (SA SME)

## Final Architecture + PRD + Requirements + Tech Spec (Cursor Build Pack)

### Document purpose

A single build specification for Cursor/Antigravity to implement a **Dockerised, decoupled**, production-reusable prototype using **Google Gemini on Vertex + Vertex AI Search + Xero via MCP**, with **trust controls** (citations, approvals, read-only finance), **auditability**, and **incremental delivery**. The initial **Streamlit “demo shell”** must be replaceable later by **Node/Next.js UI** without rewriting backend logic.

---

## 1) Objectives

### Primary objective

Deliver a **visual, product-like demo** in a browser that proves value in minutes:

* **Module A:** “Ask Your Business” — cited answers over messy documents (“no source = no answer”)
* **Module B:** “Inbox Triage” — classify + extract structured fields + draft-only + human approval
* **Module C:** “Xero Finance Lens” — natural language finance queries via Xero MCP, **read-only**, with drill-down verification tables

### Secondary objectives

* The prototype must be reusable as a foundation for client PoCs and production builds.
* Enforce a security-first posture aligned to POPIA expectations (least privilege, audit trail, controlled actions).
* Keep it simple: prove “art of the possible” without boiling the ocean.

---

## 2) Non-goals (explicit exclusions)

* No model training / fine-tuning / custom foundation models.
* No autonomous execution of high-risk actions (payments, refunds, journal posting, contract signing).
* No full email integration (MVP is upload-based only; connectors later).
* No ERP/CRM reimplementation; no full data warehouse; no master data remediation.
* No guarantee of “AI processing in South Africa only” (some managed AI/search capabilities may require global/EU endpoints depending on service availability). Host the app/infra in Johannesburg where feasible; treat endpoint location as a compliance design choice.

---

## 3) Key decisions (must be enforced)

### 3.1 Stack decisions (MVP)

* **Docs retrieval/grounding:** Vertex AI Search (Discovery Engine API)
* **LLM:** Gemini on Vertex via `google-genai` SDK

  * **No deprecated model versions.** Do not reference Gemini 1.5.
  * Use configurable model IDs: primary + fallback.
* **Finance integration:** Xero via MCP server running behind an MCP bridge/proxy
* **UI:** Streamlit initially (thin client only)
* **Backend:** API-first (FastAPI recommended)
* **Async processing:** Worker service + Redis queue (ingestion/indexing/email parsing)
* **Persistence:** Postgres for metadata/audit/approvals; named volumes for dev
* **Storage:** Storage abstraction over a local volume or GCS (`STORAGE_BACKEND=local|gcs`)
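
The storage abstraction in the last bullet could look roughly like the sketch below. It is illustrative only: the `Storage` protocol, `LocalStorage`, `get_storage`, and the `/data/uploads` default path are hypothetical names that do not appear in this spec, and the GCS backend is deliberately left unimplemented.

```python
from pathlib import Path
from typing import Protocol


class Storage(Protocol):
    """Hypothetical minimal interface that both backends would satisfy."""

    def save(self, key: str, data: bytes) -> str: ...

    def load(self, key: str) -> bytes: ...


class LocalStorage:
    """Keeps objects under a root directory (the `uploads` volume in dev)."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def save(self, key: str, data: bytes) -> str:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return str(path)  # would be stored as storage_uri

    def load(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


def get_storage(backend: str, local_root: str = "/data/uploads") -> Storage:
    """Select a backend from the STORAGE_BACKEND value (local|gcs)."""
    if backend == "local":
        return LocalStorage(local_root)
    if backend == "gcs":
        # Assumption: a GCS-backed class with the same two methods goes here.
        raise NotImplementedError("GCS backend is not part of this sketch")
    raise ValueError(f"Unsupported STORAGE_BACKEND: {backend!r}")
```

Callers would depend only on `save` and `load`, so switching `STORAGE_BACKEND` should not touch route or worker code.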

### 3.2 Replaceable UI principle (non-negotiable)

* Streamlit **must never**:

  * call Google Vertex APIs directly
  * call Xero/MCP directly
  * connect to Postgres directly
* Streamlit **only** calls the `api-gateway` via HTTP.
* All domain logic + integrations live behind stable API contracts (OpenAPI/Swagger).

### 3.3 OAuth ownership (critical for decoupling)

* OAuth callbacks **must not** terminate in Streamlit.
* Xero OAuth must be handled by the **mcp-bridge** (recommended) or `api-gateway`.
* Redirect URIs must target `mcp-bridge` (e.g., `http://localhost:3000/oauth/xero/callback`), not the Streamlit port.

---

## 4) Product experience (what the prospect sees)

### 4.1 Landing page (mandatory)

A cohesive landing page with:

* 3 tiles: **Docs / Inbox / Finance**
* “Try it now” per tile
* Pre-canned demo prompts per module (no improvisation required)
* Status badges: Docs indexed / Xero connected / Emails uploaded
* Trust banner: **“Cited answers only • Human approval required • Read-only finance”**

### 4.2 Common UI patterns (all modules)

* Evidence panel (citations/snippets, tool calls summary where relevant)
* “Draft” watermark on any generated action
* One-click export:

  * Module A: export answer + citations
  * Module C: export drill-down table (CSV)

---

## 5) Functional requirements

### Module A — Ask Your Business (Docs / RAG)

**Goal:** Trusted answers over messy documents.

**Inputs**

* Upload PDFs (required) + optionally DOCX/TXT.
* Persist originals across container rebuilds.
* Index into Vertex AI Search datastore.

**Query behavior (hard trust rule)**

* **No source = no answer** (default ON, cannot be bypassed in client demos).
* If retrieval returns insufficient evidence:

  * Return exactly: **“Information not found in internal records.”**

**Grounding policy**

* Use a retrieval threshold policy expressed in supported Vertex AI Search terms:

  * Either a strict threshold mode (e.g., HIGH)
  * Or `semanticRelevanceThreshold=0.7` via filter spec (as supported)
    (Expose as config `DOCS_RELEVANCE_MODE` and `DOCS_SEMANTIC_THRESHOLD`.)

**Outputs**

* Answer text
* Citations array: `{doc_name, snippet, page_or_section?, uri_or_id}`
* Evidence panel must show snippets and source identifiers.
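
A minimal sketch of how the gateway might enforce “no source = no answer” on the output shape above. The names `Citation`, `DocsAnswer`, and `enforce_grounding` are hypothetical, and treating a citation as usable only when it has a non-empty snippet and identifier is an assumption, not something this spec states.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Exact refusal text required by the hard trust rule.
REFUSAL_TEXT = "Information not found in internal records."


@dataclass
class Citation:
    doc_name: str
    snippet: str
    uri_or_id: str
    page_or_section: Optional[str] = None


@dataclass
class DocsAnswer:
    answer: str
    citations: List[Citation] = field(default_factory=list)


def enforce_grounding(draft_answer: str, citations: List[Citation]) -> DocsAnswer:
    """Return the drafted answer only when at least one usable citation backs it."""
    # Assumption: a citation counts only if it carries a snippet and an identifier.
    usable = [c for c in citations if c.snippet.strip() and c.uri_or_id.strip()]
    if not usable or not draft_answer.strip():
        return DocsAnswer(answer=REFUSAL_TEXT, citations=[])
    return DocsAnswer(answer=draft_answer, citations=usable)
```

Running this check in `api-gateway` after retrieval, rather than in the UI, keeps the rule in force when the Streamlit shell is replaced.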

**Admin**

* Document list and indexing status (pending/indexing/ready/failed)
* Delete doc (soft delete) + reindex capability

---

### Module B — Inbox Triage (Upload-based, human approval)

**Goal:** Turn emails into structured, reviewable work—draft-only.

**Inputs**

* Upload `.eml` / `.msg` / `.txt` (MVP)
* Parse thread content into canonical text

**Processing**

* Classify into: `Invoices | Sales | HR | Ops | Other`
* Extract **structured JSON** using Gemini (Flash family) with a fixed schema:

Schema (minimum):

* `category`
* `vendor_or_customer_name`
* `amount` (number)
* `currency`
* `vat_number`
* `invoice_number`
* `due_date` (ISO date)
* `action_recommendation` (enum: `draft_reply | create_task | request_missing_info | escalate`)
* `confidence` (0..1)
* `evidence_snippets[]` (strings copied from email text)

**Validation rules**

* `amount` must parse and be positive if present
* `due_date` must parse if present
* If required fields missing for “invoice-like” messages:

  * set `action_recommendation=request_missing_info`
  * lower confidence
  * highlight missing fields in UI
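
The validation rules above can be sketched as a post-processing step on the model's JSON. Several details here are assumptions made for illustration: which fields count as required for an invoice (`INVOICE_REQUIRED`), treating `category == "Invoices"` as “invoice-like”, the confidence cap of 0.5, and the extra `missing_fields` / `validation_errors` keys used to drive UI highlighting.

```python
import math
from datetime import date

# Assumption: the spec does not list the required invoice fields; this set is illustrative.
INVOICE_REQUIRED = ("vendor_or_customer_name", "amount", "currency",
                    "invoice_number", "due_date")
CONFIDENCE_CAP = 0.5  # assumption: "lower confidence" is modelled as a cap


def validate_extraction(data: dict) -> dict:
    """Apply the validation rules to one extracted record and return an annotated copy."""
    result = dict(data)
    errors = []

    amount = result.get("amount")
    if amount is not None:
        try:
            value = float(amount)
            if not math.isfinite(value) or value <= 0:
                errors.append("amount must be positive")
        except (TypeError, ValueError):
            errors.append("amount does not parse as a number")

    due = result.get("due_date")
    if due is not None:
        try:
            date.fromisoformat(str(due))
        except ValueError:
            errors.append("due_date is not an ISO date")

    missing = []
    if result.get("category") == "Invoices":
        missing = [f for f in INVOICE_REQUIRED if result.get(f) in (None, "")]
    if missing:
        result["action_recommendation"] = "request_missing_info"
        result["confidence"] = min(float(result.get("confidence", 0.0)), CONFIDENCE_CAP)

    result["missing_fields"] = missing      # the UI can highlight these
    result["validation_errors"] = errors
    return result
```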

**Hard trust rule**

* System outputs are **draft-only**.
* Human approval required before any external action (sending mail, creating tickets, writing to any system). MVP stops at “approved draft”.

**UI**

* Queue view: new → extracted → awaiting approval → approved/rejected
* Approval page shows:

  * original email
  * extracted JSON
  * draft reply/task
  * approve/reject + comment
* Audit record written for every transition
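
One way to keep the queue states honest is an explicit transition table that also emits an audit record on every change, as in the sketch below. The status strings, the `transition` helper, and the in-memory `audit_log` list (a stand-in for the `audit_event` table) are hypothetical.

```python
from datetime import datetime, timezone

# Allowed moves through the queue: new -> extracted -> awaiting approval -> approved/rejected.
ALLOWED_TRANSITIONS = {
    "new": {"extracted"},
    "extracted": {"awaiting_approval"},
    "awaiting_approval": {"approved", "rejected"},
    "approved": set(),   # terminal: MVP stops at "approved draft"
    "rejected": set(),   # terminal
}


def transition(email: dict, new_status: str, actor: str,
               audit_log: list, comment: str = "") -> dict:
    """Move an email to new_status if allowed, recording an audit entry."""
    current = email["approval_status"]
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition {current!r} -> {new_status!r}")
    audit_log.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "module": "inbox",
        "email_id": email["id"],
        "from_status": current,
        "to_status": new_status,
        "actor": actor,
        "comment": comment,
    })
    # Return a new record rather than mutating the caller's dict.
    return {**email, "approval_status": new_status}
```

Because `approved` and `rejected` have no outgoing transitions, no code path in this sketch can move a draft on to an external action.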

---

### Module C — Xero Finance Lens (Read-only via MCP)

**Goal:** Answer finance questions from live accounting data with drill-down verification.

**Connection**

* OAuth 2.0 with PKCE handled by `mcp-bridge`.
* Tokens persisted (encrypted) across restarts.
* Streamlit never handles OAuth.

**Integration**

* `mcp-bridge` runs the Xero MCP server.
* `api-gateway` calls `mcp-bridge` over internal HTTP.
* Enforce **deny-by-default tool policy** at the gateway:

  * allow-list read-only actions only
  * block write-capable tools even if exposed by MCP server
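
A deny-by-default policy of this kind can be sketched as a small gate in front of every MCP tool call. The tool names below are placeholders chosen for illustration; the real allow-list has to be built from the tool names the pinned Xero MCP server version actually exposes. `authorize_tool_call` and `visible_tools` are hypothetical helpers.

```python
import logging

logger = logging.getLogger("tool_policy")

# Assumption: placeholder names; replace with the read-only tools of the pinned MCP server.
READ_ONLY_TOOLS = frozenset({
    "list-invoices",
    "list-contacts",
    "list-aged-receivables",
})


def authorize_tool_call(tool_name: str, allow_list: frozenset = READ_ONLY_TOOLS) -> None:
    """Raise PermissionError unless the tool is explicitly allow-listed."""
    allowed = tool_name in allow_list
    # Every decision is logged so it can be copied into the audit trail.
    logger.info("tool_call tool=%s allowed=%s", tool_name, allowed)
    if not allowed:
        raise PermissionError(f"Tool {tool_name!r} is not on the read-only allow-list")


def visible_tools(exposed: list, allow_list: frozenset = READ_ONLY_TOOLS) -> list:
    """Filter the server's advertised tools down to the allow-listed subset."""
    return sorted(t for t in exposed if t in allow_list)
```

Anything the server adds in a later release stays blocked until someone adds it to the list, which is the point of deny-by-default.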

**Query rules (hard trust)**

* Every response must include a drill-down table with source records used.
* If tool calls fail or return no data:

  * Return “Insufficient data to answer” and show “what data was queried”.

**Outputs**

* Answer narrative
* Drill-down table rows (invoices/contacts/ageing)
* Export CSV

---

## 6) Non-functional requirements (NFRs)

### 6.1 Decoupling and API stability

* `api-gateway` publishes OpenAPI spec; UI consumes only HTTP endpoints.
* All module logic is in backend services; UI is replaceable.

### 6.2 Docker-first microservices (compose)

All components run via Docker Compose and are independently buildable:

* `frontend` (Streamlit demo shell)
* `api-gateway` (FastAPI)
* `worker` (async jobs)
* `mcp-bridge` (Node: MCP over HTTP + OAuth PKCE)
* `postgres`
* `redis`

### 6.3 Persistence (must survive rebuild/restart)

Named volumes required:

* `pgdata` → Postgres
* `uploads` → local storage backend for docs/emails (dev)
* `sessions` → encrypted token/session material (dev only; production uses Secret Manager/KMS)

Redis is **not** a system of record. It is queue/caching only.

### 6.4 Environment-driven configuration (externalised)

* No hardcoded config values.
* Provide `.env.example` and a startup validator that fails fast if required env vars are missing.
* Separate “app region” (hosting) from “AI feature location” (Vertex endpoints) via env vars.

### 6.5 Versioning and deprecation control

* Maintain `VERSIONS.md` with pinned:

  * base images
  * MCP server package versions
  * Gemini model IDs (primary + fallback)
  * Vertex AI Search configuration
* No floating tags for critical dependencies; pin major/minor at minimum.

### 6.6 Container build best practices

* Slim base images
* Multi-stage builds where relevant
* Non-root user in every container
* Health checks for all services
* Dependency startup ordering based on health checks

### 6.7 Security posture

* Secrets never committed; `.env` ignored.
* Local dev may use key files; production must use runtime identity (no long-lived key files).
* Least privilege IAM for GCP service accounts.
* Full audit event capture for:

  * queries, retrieval sources, tool calls, approvals, errors
* MCP supply-chain controls:

  * pin MCP server versions
  * allow-list tools
  * log all tool calls

---

## 7) Service architecture

### 7.1 Service matrix

| Service       | Tech                    | Port | Purpose                                                             |
| ------------- | ----------------------- | ---: | ------------------------------------------------------------------- |
| `frontend`    | Streamlit (Python 3.12) | 8501 | Thin UI shell; calls `api-gateway` only                             |
| `api-gateway` | FastAPI (Python 3.12)   | 8000 | Core orchestration; Vertex/Gemini calls; trust enforcement; audit   |
| `worker`      | Python (Celery/RQ)      |    — | Background jobs: doc ingest/index trigger, email parsing/extraction |
| `mcp-bridge`  | Node (pinned major)     | 3000 | Runs MCP servers; OAuth PKCE; exposes MCP tools over HTTP           |
| `postgres`    | Postgres                | 5432 | System-of-record: metadata, approvals, audit events                 |
| `redis`       | Redis                   | 6379 | Queue and caching                                                   |

### 7.2 Responsibilities

* **frontend**

  * UI navigation, upload forms, display evidence, approvals
* **api-gateway**

  * stable REST API + OpenAPI
  * Module A routing to Vertex AI Search (with thresholds)
  * Module B Gemini extraction (schema enforcement) + approval workflow
  * Module C query orchestration to `mcp-bridge` + tool allow-list enforcement
  * audit writing to Postgres
* **worker**

  * ingestion pipelines
  * retries/backoff
  * indexing triggers to Vertex AI Search
* **mcp-bridge**

  * OAuth PKCE callbacks for Xero
  * MCP server process management
  * exposes a controlled HTTP interface to MCP tool calls (internal only)

---

## 8) Data model (Postgres)

### Tables (minimum)

* `audit_event`

  * id, ts, module, user_id/session_id, request_id, prompt_hash, sources_json, tool_calls_json, decision_json, status, error
* `doc_asset`

  * id, filename, storage_uri, uploaded_at, indexed_status, datastore_ref, deleted_at
* `email_asset`

  * id, filename, storage_uri, uploaded_at, parsed_text_ref, classification, extracted_json, approval_status, approver_id, approved_at
* `xero_tenant`

  * id, tenant_id, connected_at, token_ref (encrypted), last_used_at

---

## 9) API contracts (UI-to-backend)

### Common rules

* All responses return `request_id` for traceability.
* All module endpoints write an audit event (success/failure).
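
The two common rules could be applied uniformly with a wrapper around each module handler, roughly as sketched here. This is framework-agnostic and illustrative only: the `audited` decorator and the in-memory `AUDIT_LOG` list (standing in for inserts into `audit_event`) are hypothetical, and a real FastAPI build would more likely use middleware or a dependency.

```python
import hashlib
import uuid
from datetime import datetime, timezone
from functools import wraps

AUDIT_LOG = []  # stand-in for the audit_event table


def audited(module: str):
    """Wrap a handler so every call gets a request_id and an audit event."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(prompt: str) -> dict:
            request_id = str(uuid.uuid4())
            event = {
                "request_id": request_id,
                "ts": datetime.now(timezone.utc).isoformat(),
                "module": module,
                # Store a hash rather than the raw prompt text.
                "prompt_hash": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
                "status": "ok",
                "error": None,
            }
            try:
                body = handler(prompt)
                return {**body, "request_id": request_id}
            except Exception as exc:
                event["status"] = "error"
                event["error"] = str(exc)
                raise
            finally:
                # Written on both success and failure.
                AUDIT_LOG.append(event)
        return wrapper
    return decorator
```

A handler decorated with `@audited("docs")` returns its own payload plus `request_id`, and leaves exactly one audit entry per call whether it succeeds or raises.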

### Module A

* `POST /docs/upload`
* `POST /docs/index` (trigger indexing)
* `GET /docs/status`
* `POST /docs/query` → returns `{answer, citations[]}`

  * Must refuse if citations are empty.

### Module B

* `POST /inbox/upload`
* `POST /inbox/process/{id}`
* `GET /inbox/queue`
* `GET /inbox/{id}`
* `POST /inbox/{id}/approve` (approve/reject + comment)

### Module C

* `GET /finance/status` (connected yes/no)
* `POST /finance/query` → `{answer, rows[], query_trace}`

  * Must include verification rows.

### Admin

* `GET /audit` (filters by module, status, date)
* `GET /health`

---

## 10) Environment parameters (`.env.example`)

```bash
# App basics
APP_ENV=local
APP_REGION=africa-south1
API_BASE_URL=http://api-gateway:8000
STORAGE_BACKEND=local   # local|gcs

# GCP / Vertex
GOOGLE_CLOUD_PROJECT="your-project-id"
GOOGLE_GENAI_USE_VERTEXAI=True
VERTEX_LOCATION="global"              # or your chosen Vertex GenAI location
DISCOVERY_ENGINE_LOCATION="global"    # Vertex AI Search location (commonly global)
GCS_BUCKET_NAME="sme-ops-center-uploads"

# Vertex AI Search / Discovery Engine (from console Steps 13-15)
DATA_STORE_ID="your-data-store-id"
ENGINE_ID="your-engine-id"

# Vertex AI Search / RAG controls
DOCS_RELEVANCE_MODE="HIGH"            # or FILTER_SPEC
DOCS_SEMANTIC_THRESHOLD=0.7           # used if FILTER_SPEC
DOCS_MAX_RESULTS=5

# Gemini models (no deprecated IDs)
GEMINI_MODEL_PRIMARY="gemini-2.5-flash"
GEMINI_MODEL_FALLBACK="gemini-2.0-flash"
GEMINI_MAX_OUTPUT_TOKENS=1024

# Xero (OAuth handled by mcp-bridge)
XERO_CLIENT_ID="your-client-id"
XERO_CLIENT_SECRET="your-client-secret"
XERO_REDIRECT_URI="http://localhost:3000/oauth/xero/callback"

# Security
SECRET_KEY="generate-a-secure-key"
ENCRYPTION_SALT="for-token-storage"
ALLOWED_ORIGINS="http://localhost:8501"

# Database
POSTGRES_HOST=postgres
POSTGRES_DB=smeops
POSTGRES_USER=smeops
POSTGRES_PASSWORD="change-me"

# Redis
REDIS_URL=redis://redis:6379/0
```

Startup validation must fail-fast if required vars are absent.
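
A fail-fast validator along these lines would satisfy that requirement. The choice of which variables are mandatory (`REQUIRED_VARS`) and the conditional rule for `GCS_BUCKET_NAME` are assumptions; `ConfigError` and `validate_env` are hypothetical names.

```python
import os
from typing import Mapping, Optional, Sequence

# Assumption: an illustrative subset; each service would declare its own list.
REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "VERTEX_LOCATION",
    "DATA_STORE_ID",
    "GEMINI_MODEL_PRIMARY",
    "POSTGRES_HOST",
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "REDIS_URL",
)


class ConfigError(RuntimeError):
    """Raised at startup when configuration is incomplete or inconsistent."""


def validate_env(env: Optional[Mapping[str, str]] = None,
                 required: Sequence[str] = REQUIRED_VARS) -> dict:
    """Return the required settings, or raise ConfigError naming every problem found."""
    env = os.environ if env is None else env
    # Empty or whitespace-only values count as missing.
    missing = [name for name in required if not str(env.get(name, "")).strip()]
    if missing:
        raise ConfigError("Missing required environment variables: " + ", ".join(missing))

    backend = env.get("STORAGE_BACKEND", "local")
    if backend not in ("local", "gcs"):
        raise ConfigError(f"STORAGE_BACKEND must be 'local' or 'gcs', got {backend!r}")
    if backend == "gcs" and not str(env.get("GCS_BUCKET_NAME", "")).strip():
        raise ConfigError("GCS_BUCKET_NAME is required when STORAGE_BACKEND=gcs")

    return {name: env[name] for name in required}
```

Calling `validate_env()` before the app object is created makes a misconfigured container exit immediately, which the Compose health checks then surface.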

---

## 11) Docker / Compose requirements

### Compose requirements

* Use named volumes:

  * `pgdata` (Postgres)
  * `uploads` (local storage dev)
  * `sessions` (dev token/session persistence for bridge, encrypted)
  * `redis-data` (Redis persistence)
* All application services run as non-root (UID 1000 for Python/Node containers).
* **Postgres/Redis**: Do NOT override user; use official image default non-root users (they handle initialization).
* Health checks for Postgres/Redis/API.
* **Important**: For Node services with bind mounts, use anonymous volume for `node_modules` to preserve installed dependencies (e.g., `/app/node_modules`).

### Persistence rules

* Uploaded docs/emails must survive rebuilds.
* Token/session data must survive restarts (encrypted at rest for dev volume; production uses managed secrets).
* `node_modules` in Node services should be preserved via anonymous volume when source is bind-mounted.

### Base Images and Versions

* Python services: `python:3.12-slim`
* Node services: `node:20-slim` (pinned in package.json engines)
* Postgres: `postgres:16-alpine`
* Redis: `redis:7-alpine`

### Known Configuration Issues Resolved (Milestone 0)

* **Node UID conflict**: Node images already have `node` user (UID 1000); use existing user instead of creating new one.
* **Empty requirements.txt**: Use conditional pip install to handle empty/comment-only requirements files.
* **npm ci vs npm install**: Use `npm install` for scaffold phase (no package-lock.json yet); switch to `npm ci` once lock file exists.
* **Volume mount overwrites**: Anonymous volumes preserve `node_modules` when source code is bind-mounted for development.
* **Postgres/Redis permissions**: Official images initialize correctly with their default users; do not override with custom UIDs.

---

## 12) Incremental build milestones (must follow)

### Milestone 0 — Scaffold

* repo + docker compose boots cleanly
* API health endpoint
* Postgres migrations run
* Audit event write works
* Streamlit loads and can call API

### Milestone 1 — Module A

* Upload docs → persist → index job → query returns citations or refuses
* Evidence panel works
* Index status view works

### Milestone 2 — Module B

* Upload email → parse → classify/extract JSON (schema-validated)
* Approval workflow works end-to-end
* Draft replies/tasks shown only (no external actions)

### Milestone 3 — Module C

* Xero OAuth via bridge works
* Finance queries return narrative + drill-down rows
* Read-only tool allow-list enforced and tested

### Milestone 4 — Demo hardening

* Guided demo mode with pre-canned prompts
* Export CSV (finance) + export answer/citations (docs)
* Robust error states and status badges

---

## 13) Repository structure (recommended)

```text
repo/
  README.md
  PRD.md
  VERSIONS.md
  .env.example
  .gitignore
  docker-compose.yml
  frontend/
    Dockerfile
    app.py
    requirements.txt
  api-gateway/
    Dockerfile
    app/
      main.py
      routes/
      services/
      schemas/
      security/
      storage/
    requirements.txt
  worker/
    Dockerfile
    app/
      worker.py
      jobs/
    requirements.txt
  mcp-bridge/
    Dockerfile
    package.json
    src/
      server.ts
      oauth/
      mcp/
      allowlist/
  db/
    migrations/
  .cursor/
    rules/
      architecture.mdc
```

---

## 14) Cursor rules (`.cursor/rules/architecture.mdc`) — required content

Add a rule file that enforces:

* Docker-only execution
* UI calls only API (no direct provider/system calls)
* Strict milestones (no big-bang)
* No secrets in repo
* No deprecated model IDs
* Read-only allow-list for Xero tools
* Audit logging mandatory per request
* Persistent volumes for Postgres/uploads/sessions

---

## 15) “Interrogate requirements” checklist (Cursor must follow)

Before implementing any feature:

* Identify module + acceptance criteria
* Identify data needed + where it is persisted
* Identify API contract impact
* Identify security implications (secrets, OAuth, audit, tool allow-list)
* Confirm Dockerisation + restart survival
* Confirm no deprecated model/package usage
* Confirm simplest path (MVP first)

---

# Copy-paste “Master Cursor Instructions”

> Act as a Senior Lead Architect. Initialize a monorepo `sme-ops-center` with a decoupled microservices architecture.
>
> 1. Create `docker-compose.yml` defining `frontend` (Streamlit), `api-gateway` (FastAPI), `worker`, `mcp-bridge` (Node), `postgres`, and `redis`, all running as non-root with named volumes for persistence.
> 2. Enforce decoupling: frontend must only call REST endpoints on `api-gateway`; it must not call GCP, Xero, or Postgres directly.
> 3. Implement environment validation on startup from `.env` and provide `.env.example`. No secrets committed.
> 4. Implement Module A using Vertex AI Search with strict grounding and “no source = no answer”.
> 5. Implement Module B using Gemini on Vertex (no deprecated model IDs) with schema-validated JSON extraction + human approval workflow; generate drafts only.
> 6. Implement Module C using Xero MCP server behind `mcp-bridge`. OAuth callback must be owned by bridge/gateway (not Streamlit). Enforce read-only tool allow-list and always return drill-down tables.
> 7. Build incrementally: Milestone 0 → Module A → Module B → Module C → hardening. Do not build everything at once.
> 8. Generate OpenAPI for `api-gateway` and keep UI replaceable by a future Node/Next.js frontend without backend rewrite.

---
604  
605  ---