# SME Ops-Center

**Operational AI Demo-in-a-Box (SA SME)**

A Dockerized, decoupled, production-reusable prototype using **Google Gemini on Vertex + Vertex AI Search + Xero via MCP**, with trust controls (citations, approvals, read-only finance), auditability, and incremental delivery.

## Quick Start

### Prerequisites
- Docker Desktop or Docker Engine with Docker Compose v2+
- Git

### Setup

1. **Clone the repository:**
   ```bash
   git clone <repo-url>
   cd sme-ops-center
   ```

2. **Copy environment template:**
   ```bash
   cp .env.example .env
   ```

3. **Edit `.env` with your actual values:**
   - `STORAGE_BACKEND=gcs` (required for Vertex AI Search document ingestion)
   - `GOOGLE_CLOUD_PROJECT` - your GCP project ID
   - `GCS_BUCKET_NAME` - your GCS bucket name
   - `DATA_STORE_ID` and `ENGINE_ID` - from Vertex AI Search console (Steps 13–15 in GCP checklist)
   - `DISCOVERY_ENGINE_LOCATION=global`
   - Xero OAuth credentials (for Module C)
   - Database passwords
   - Other configuration as needed

   **Data Store import prefix:** When creating the Vertex AI Search Data Store, set the import prefix to `gs://<bucket>/docs/` so it matches where the app uploads documents.

4. **Configure Google Cloud Storage credentials:**
   - Place your GCP service account JSON file at `E:\sme-ops-center-secrets\smeops-api-sa.json`
   - The file will be automatically mounted to the `api-gateway` container at `/run/secrets/gcp-sa.json` (read-only)
   - Set `GOOGLE_APPLICATION_CREDENTIALS=/run/secrets/gcp-sa.json` in `.env` (already configured in docker-compose.yml)

5. **Start all services:**
   ```bash
   docker compose up --build
   ```

   **Startup Sequence:**
   Docker Compose automatically handles the startup dependencies:
   1. **Postgres & Redis** start first and wait for health checks
   2. **MCP Bridge** starts independently
   3. **API Gateway** waits for Postgres/Redis to be healthy, then:
      - Runs database migrations automatically
      - Waits for database to be ready (retry logic)
      - Starts the FastAPI server
   4. **Worker** waits for Postgres, Redis, and API Gateway to start
   5. **Frontend** waits for API Gateway to start

   **No manual build sequence needed**: just run `docker compose up --build` and all services will start in the correct order.

6. **Access services:**
   - Frontend: http://localhost:8501
   - API Gateway: http://localhost:8000
   - MCP Bridge: http://localhost:3000
   - Postgres: localhost:5432
   - Redis: localhost:6379

### Health Checks

- API Gateway: http://localhost:8000/health
- MCP Bridge: http://localhost:3000/health
- GCS Smoke Test: http://localhost:8000/gcs/smoke (tests Google Cloud Storage connectivity)

## Architecture

### Services

| Service       | Technology      | Port | Purpose                                      |
|---------------|-----------------|------|----------------------------------------------|
| `frontend`    | Streamlit       | 8501 | Thin UI shell; calls `api-gateway` only      |
| `api-gateway` | FastAPI         | 8000 | Core orchestration; trust enforcement; audit |
| `worker`      | Python          | -    | Background jobs (doc ingest, email parsing)  |
| `mcp-bridge`  | Node.js         | 3000 | MCP servers; OAuth PKCE; HTTP interface      |
| `postgres`    | PostgreSQL 16   | 5432 | System-of-record: metadata, approvals, audit |
| `redis`       | Redis 7         | 6379 | Queue and caching                            |

### Key Principles

- **Docker-first**: Everything runs via Docker Compose
- **Strict decoupling**: Frontend only calls `api-gateway` over HTTP
- **API-first**: All business logic behind stable REST endpoints
- **Incremental delivery**: Milestone 0 → Module A → Module B → Module C → hardening
- **Security**: Non-root containers, no secrets in repo, audit logging
- **No deprecated models**: Gemini 2.x/2.5 via Vertex only

## Project Status

### Milestone 0: ✅ Complete
- Docker Compose scaffold with all services
- Health endpoints for API Gateway and MCP Bridge
- Named volumes for persistence
- Environment configuration aligned with PRD

See [MILESTONE0_STATUS.md](./MILESTONE0_STATUS.md) for detailed status and issues resolved.

### Milestone 1: 🟡 In Progress (95% Complete)
- ✅ Task 1: Database migrations and core tables (`doc_asset`, `audit_event`)
- ✅ Task 2: Module A API endpoints (upload, status, index, query stub)
- ✅ Task 3: Frontend UI implementation (end-to-end flow with trust surface)
- ✅ Task 4: GCS smoke test endpoint (Google Cloud Storage integration test)
- ✅ Task 5a: Vertex AI Search document ingestion (GCS `docs/` path + Discovery Engine import)
- ⏳ Task 5b: Vertex AI Search query integration (replace stub with real retrieval)

**Implemented APIs:**
- `POST /docs/upload` - Upload documents; when `STORAGE_BACKEND=gcs`, saves to `gs://bucket/docs/` and triggers Vertex AI Search import
- `POST /docs/index` - Trigger indexing for PENDING docs in GCS (optional `doc_id` to index specific doc)
- `GET /docs/status` - Get document status list (including `indexed_status`: pending/indexing/ready/failed)
- `POST /docs/query` - Query stub (returns refusal until Vertex AI Search query API integrated)
- `GET /gcs/smoke` - GCS smoke test (uploads, verifies, and deletes a test blob)

**Implemented Frontend:**
- Landing page with 3 module tiles (Docs enabled, Inbox/Finance coming soon)
- Docs module with Upload, Status, and Query tabs
- Request ID panel (trust surface) showing last request_id for each operation
- Environment variable configuration (`API_BASE_URL`) - no hardcoding

See [MILESTONE1_STATUS.md](./MILESTONE1_STATUS.md) for detailed status.
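
The read-only endpoints listed under Implemented APIs can be exercised from a small script once the stack is up. The following is a minimal sketch, not the project's own client: the helper names (`endpoint`, `get_json`, `summarize_indexed_status`, `main`) are hypothetical, and it assumes that `GET /docs/status` returns a JSON list of objects that each carry an `indexed_status` field. Check the actual response schema in `api-gateway/app/schemas.py` before relying on it.

```python
import json
import os
import urllib.request
from collections import Counter

# API_BASE_URL mirrors the frontend's environment variable; the default
# is the API Gateway address from the Quick Start section.
API_BASE_URL = os.environ.get("API_BASE_URL", "http://localhost:8000")


def endpoint(path, base_url=API_BASE_URL):
    """Join a base URL and a path without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path, base_url=API_BASE_URL, timeout=10):
    """Issue a GET request and decode the JSON response body."""
    with urllib.request.urlopen(endpoint(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_indexed_status(docs):
    """Count documents per indexed_status value.

    ASSUMPTION: `docs` is a list of dicts with an `indexed_status` key
    (pending/indexing/ready/failed). Entries without the key are
    counted under "unknown".
    """
    return Counter(doc.get("indexed_status", "unknown") for doc in docs)


def main():
    """Call the health, smoke-test, and status endpoints in turn."""
    print("health:", get_json("/health"))
    print("gcs smoke:", get_json("/gcs/smoke"))
    # ASSUMPTION: the status endpoint returns a bare JSON list.
    docs = get_json("/docs/status")
    print("indexed_status counts:", dict(summarize_indexed_status(docs)))
```

Calling `main()` after `docker compose up --build` gives a quick view of whether the gateway is healthy, whether GCS is reachable, and how many documents are in each indexing state. The `POST` endpoints are left out on purpose, since their request payloads are not documented here.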

### Next Milestones
- **Milestone 1** (remaining): Vertex AI Search query API integration (replace query stub)
- **Milestone 2**: Module B - Email triage and approval workflow
- **Milestone 3**: Module C - Xero Finance Lens with read-only MCP
- **Milestone 4**: Demo hardening

## Documentation

- **[Product Requirements Document (PRD)](./docs/PRD.md)** - Complete specification
- **[Architecture Rules](./.cursor/rules/architecture.mdc)** - Development guidelines
- **[Milestone 0 Status](./MILESTONE0_STATUS.md)** - Docker Compose scaffold status
- **[Milestone 1 Status](./MILESTONE1_STATUS.md)** - Module A APIs status (in progress)
- **[Frontend UI Implementation](./FRONTEND_UI_IMPLEMENTATION.md)** - Frontend UI implementation summary
- **[Versions](./VERSIONS.md)** - Pinned dependency versions

## Development

### Project Structure

```
sme-ops-center/
├── docker-compose.yml      # Service orchestration
├── .env.example            # Environment template
├── frontend/               # Streamlit UI
│   ├── app.py              # Main UI application
│   ├── utils.py            # API client utilities
│   └── requirements.txt    # Python dependencies
├── api-gateway/            # FastAPI backend
│   ├── app/                # Application code
│   │   ├── routes/         # API routes
│   │   │   ├── docs.py     # Module A routes
│   │   │   └── gcs.py      # GCS smoke test routes
│   │   ├── models.py       # Database models
│   │   ├── schemas.py      # Pydantic schemas
│   │   └── services.py     # Business logic
│   └── migrations/         # Alembic migrations
├── worker/                 # Background jobs
├── mcp-bridge/             # Node.js MCP server
├── db/                     # Database migrations (legacy)
└── docs/                   # Documentation
```

### Key Configuration Files

- `.env.example` - Environment variable template (see PRD Section 10)
- `docker-compose.yml` - Service definitions and volumes
- `VERSIONS.md` - Pinned versions for reproducibility

## Important Notes

### Security
- Never commit `.env` file (it's gitignored)
- All containers run as non-root users
- Secrets must use environment variables or secret management
- Production deployments should use managed secrets (not `.env` files)

### Container User IDs
- **Application containers** (frontend, api-gateway, worker, mcp-bridge): UID 1000
- **Postgres/Redis**: Use official image default non-root users (do not override)

### Startup Dependencies

All startup dependencies are configured in `docker-compose.yml`:

- **API Gateway** → waits for Postgres and Redis to be healthy (health checks)
- **Worker** → waits for Postgres, Redis (healthy), and API Gateway (started)
- **Frontend** → waits for API Gateway (started)
- **MCP Bridge** → no dependencies (starts independently)

**Health Checks:**
- Postgres: `pg_isready` check (10s interval, 5 retries)
- Redis: `redis-cli ping` check (10s interval, 5 retries)
- API Gateway: HTTP health endpoint check (10s interval, 5 retries, 30s start period)

**Automatic Migrations:**
API Gateway automatically runs database migrations on startup (via `app/migrations.py`) with retry logic to wait for Postgres to be ready. No manual migration step required.

### Volume Mounts
- Source code is bind-mounted for development (`./service:/app`)
- `mcp-bridge` uses anonymous volume for `node_modules` to preserve dependencies
- Named volumes (`pgdata`, `uploads`, `sessions`, `redis-data`) persist across rebuilds
- `api-gateway` mounts GCP service account credentials: `E:\sme-ops-center-secrets\smeops-api-sa.json:/run/secrets/gcp-sa.json:ro`

## Troubleshooting

### Common Issues

1. **Permission errors**: Ensure Docker has proper permissions on host directories
2. **Port conflicts**: Check if ports 8501, 8000, 3000, 5432, 6379 are available
3. **Volume initialization**: If Postgres/Redis fail to start, try `docker compose down -v` to reset volumes

See [MILESTONE0_STATUS.md](./MILESTONE0_STATUS.md) for detailed issue resolution.

## License

[License information]