  1  # SME Ops-Center
  2  
  3  **Operational AI Demo-in-a-Box (SA SME)**
  4  
  5  A Dockerized, decoupled, production-reusable prototype using **Google Gemini on Vertex + Vertex AI Search + Xero via MCP**, with trust controls (citations, approvals, read-only finance), auditability, and incremental delivery.
  6  
  7  ## Quick Start
  8  
  9  ### Prerequisites
 10  - Docker Desktop or Docker Engine with Docker Compose v2+
 11  - Git
 12  
 13  ### Setup
 14  
 15  1. **Clone the repository:**
 16  ```bash
 17  git clone <repo-url>
 18  cd sme-ops-center
 19  ```
 20  
 21  2. **Copy environment template:**
 22  ```bash
 23  cp .env.example .env
 24  ```
 25  
 26  3. **Edit `.env` with your actual values:**
 27     - `STORAGE_BACKEND=gcs` (required for Vertex AI Search document ingestion)
 28     - `GOOGLE_CLOUD_PROJECT` — your GCP project ID
 29     - `GCS_BUCKET_NAME` — your GCS bucket name
 30     - `DATA_STORE_ID` and `ENGINE_ID` — from Vertex AI Search console (Steps 13–15 in GCP checklist)
 31     - `DISCOVERY_ENGINE_LOCATION=global`
 32     - Xero OAuth credentials (for Module C)
 33     - Database passwords
 34     - Other configuration as needed
 35  
 36     **Data Store import prefix:** When creating the Vertex AI Search Data Store, set the import prefix to `gs://<bucket>/docs/` so it matches where the app uploads documents.
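
A quick pre-flight check can catch missing settings before the stack starts. The following is a minimal sketch, not code from this repository: the helper names `missing_settings` and `import_prefix` are hypothetical, and the assumption is that exactly the variables named in step 3 are required.

```python
import os

# Settings named in step 3; treating all of them as required is an assumption.
REQUIRED_SETTINGS = (
    "STORAGE_BACKEND",
    "GOOGLE_CLOUD_PROJECT",
    "GCS_BUCKET_NAME",
    "DATA_STORE_ID",
    "ENGINE_ID",
    "DISCOVERY_ENGINE_LOCATION",
)


def missing_settings(env=None):
    """Return the required setting names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


def import_prefix(bucket_name):
    """Build the Data Store import prefix that matches the app's upload path."""
    return f"gs://{bucket_name}/docs/"
```

Calling `missing_settings()` with no arguments reads from the current process environment, so it only reflects `.env` if those values have already been exported or loaded.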
 37  
 38  4. **Configure Google Cloud Storage credentials:**
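 39     - Place your GCP service account JSON file at `E:\sme-ops-center-secrets\smeops-api-sa.json` (the host path mounted by `docker-compose.yml`; if you keep the file elsewhere, update that mount to match)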
 40     - The file will be automatically mounted to the `api-gateway` container at `/run/secrets/gcp-sa.json` (read-only)
 41     - `GOOGLE_APPLICATION_CREDENTIALS` must be `/run/secrets/gcp-sa.json`; `docker-compose.yml` already configures this value, so keep any `.env` entry consistent with it
 42  
 43  5. **Start all services:**
 44  ```bash
 45  docker compose up --build
 46  ```
 47  
 48  **Startup Sequence:**
 49  Docker Compose automatically handles the startup dependencies:
 50  1. **Postgres & Redis** start first and wait for health checks
 51  2. **MCP Bridge** starts independently
 52  3. **API Gateway** waits for Postgres/Redis to be healthy, then:
 53     - Waits for the database to accept connections (retry logic)
 54     - Runs database migrations automatically
 55     - Starts the FastAPI server
 56  4. **Worker** waits for Postgres and Redis to be healthy and for API Gateway to start
 57  5. **Frontend** waits for API Gateway to start
 58  
 59  **No manual build sequence needed** — just run `docker compose up --build` and all services will start in the correct order.
 60  
 61  6. **Access services:**
 62     - Frontend: http://localhost:8501
 63     - API Gateway: http://localhost:8000
 64     - MCP Bridge: http://localhost:3000
 65     - Postgres: localhost:5432
 66     - Redis: localhost:6379
 67  
 68  ### Health Checks
 69  
 70  - API Gateway: http://localhost:8000/health
 71  - MCP Bridge: http://localhost:3000/health
 72  - GCS Smoke Test: http://localhost:8000/gcs/smoke (tests Google Cloud Storage connectivity)
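
The same checks can be scripted. This is a minimal sketch using only the standard library; it assumes each endpoint answers with HTTP 200 when healthy, which the README does not state, and the names `is_healthy` and `check_all` are hypothetical.

```python
import urllib.error
import urllib.request

# URLs taken from the Health Checks list above.
HEALTH_URLS = {
    "api-gateway": "http://localhost:8000/health",
    "mcp-bridge": "http://localhost:3000/health",
    "gcs-smoke": "http://localhost:8000/gcs/smoke",
}


def is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses.
        return False


def check_all(urls=HEALTH_URLS, opener=urllib.request.urlopen):
    """Map each service name to its health result."""
    return {name: is_healthy(url, opener=opener) for name, url in urls.items()}
```

Once `docker compose up --build` has finished starting the services, `print(check_all())` gives a one-line summary. The `opener` parameter exists only so the functions can be exercised without a running stack.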
 73  
 74  ## Architecture
 75  
 76  ### Services
 77  
 78  | Service       | Technology      | Port | Purpose                                      |
 79  |---------------|-----------------|------|----------------------------------------------|
 80  | `frontend`    | Streamlit       | 8501 | Thin UI shell; calls `api-gateway` only      |
 81  | `api-gateway` | FastAPI         | 8000 | Core orchestration; trust enforcement; audit |
 82  | `worker`      | Python          | —    | Background jobs (doc ingest, email parsing)  |
 83  | `mcp-bridge`  | Node.js         | 3000 | MCP servers; OAuth PKCE; HTTP interface      |
 84  | `postgres`    | PostgreSQL 16   | 5432 | System-of-record: metadata, approvals, audit |
 85  | `redis`       | Redis 7         | 6379 | Queue and caching                            |
 86  
 87  ### Key Principles
 88  
 89  - **Docker-first**: Everything runs via Docker Compose
 90  - **Strict decoupling**: Frontend only calls `api-gateway` over HTTP
 91  - **API-first**: All business logic behind stable REST endpoints
 92  - **Incremental delivery**: Milestone 0 → Module A → Module B → Module C → hardening
 93  - **Security**: Non-root containers, no secrets in repo, audit logging
 94  - **No deprecated models**: Gemini 2.x/2.5 via Vertex only
 95  
 96  ## Project Status
 97  
 98  ### Milestone 0: ✅ Complete
 99  - Docker Compose scaffold with all services
100  - Health endpoints for API Gateway and MCP Bridge
101  - Named volumes for persistence
102  - Environment configuration aligned with PRD
103  
104  See [MILESTONE0_STATUS.md](./MILESTONE0_STATUS.md) for detailed status and issues resolved.
105  
106  ### Milestone 1: 🟡 In Progress (95% Complete)
107  - ✅ Task 1: Database migrations and core tables (`doc_asset`, `audit_event`)
108  - ✅ Task 2: Module A API endpoints (upload, status, index, query stub)
109  - ✅ Task 3: Frontend UI implementation (end-to-end flow with trust surface)
110  - ✅ Task 4: GCS smoke test endpoint (Google Cloud Storage integration test)
111  - ✅ Task 5a: Vertex AI Search document ingestion (GCS `docs/` path + Discovery Engine import)
112  - ⏳ Task 5b: Vertex AI Search query integration (replace stub with real retrieval)
113  
114  **Implemented APIs:**
115  - `POST /docs/upload` - Upload documents; when `STORAGE_BACKEND=gcs`, saves to `gs://bucket/docs/` and triggers Vertex AI Search import
116  - `POST /docs/index` - Trigger indexing for PENDING docs in GCS (optional `doc_id` to index specific doc)
117  - `GET /docs/status` - Get document status list (including `indexed_status`: pending/indexing/ready/failed)
118  - `POST /docs/query` - Query stub (returns refusal until Vertex AI Search query API integrated)
119  - `GET /gcs/smoke` - GCS smoke test (uploads, verifies, and deletes a test blob)
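
To illustrate the upload, verify, delete pattern behind the smoke test, here is a minimal sketch. It is not the code in `routes/gcs.py`: the function name `gcs_round_trip` and the `smoke/` prefix are hypothetical. The `bucket` argument is assumed to behave like a `google.cloud.storage.Bucket`, whose `blob()`, `upload_from_string()`, `download_as_bytes()`, and `delete()` methods are the only calls used.

```python
import uuid


def gcs_round_trip(bucket, prefix="smoke/"):
    """Upload a small blob, read it back, then delete it.

    `bucket` is expected to behave like google.cloud.storage.Bucket, for
    example storage.Client().bucket(os.environ["GCS_BUCKET_NAME"]).
    The "smoke/" prefix is an assumption, not the app's actual path.
    """
    token = uuid.uuid4().hex
    payload = token.encode("ascii")
    blob = bucket.blob(f"{prefix}{token}.txt")
    blob.upload_from_string(payload)
    try:
        verified = blob.download_as_bytes() == payload
    finally:
        # Always clean up so failed checks do not leave test blobs behind.
        blob.delete()
    return verified
```

Using a random blob name per run keeps concurrent checks from colliding, and the `finally` block keeps the bucket clean even when verification raises.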
120  
121  **Implemented Frontend:**
122  - Landing page with 3 module tiles (Docs enabled, Inbox/Finance coming soon)
123  - Docs module with Upload, Status, and Query tabs
124  - Request ID panel (trust surface) showing last request_id for each operation
125  - Environment variable configuration (`API_BASE_URL`) - no hardcoding
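
One way to keep the base URL out of the code is to resolve it from the environment on each call. This is a minimal sketch, assuming the frontend fails fast when `API_BASE_URL` is unset; the helper name `api_url` is hypothetical and is not necessarily what `frontend/utils.py` does.

```python
import os


def api_url(path, env=None):
    """Join API_BASE_URL with an endpoint path, failing loudly if unset."""
    env = os.environ if env is None else env
    base = env.get("API_BASE_URL", "").strip()
    if not base:
        raise RuntimeError("API_BASE_URL is not set")
    # Normalize slashes so "http://host/" + "/docs/status" has no double slash.
    return f"{base.rstrip('/')}/{path.lstrip('/')}"
```

With `API_BASE_URL=http://localhost:8000`, `api_url("/docs/status")` resolves to `http://localhost:8000/docs/status`.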
126  
127  See [MILESTONE1_STATUS.md](./MILESTONE1_STATUS.md) for detailed status.
128  
129  ### Next Milestones
130  - **Milestone 1** (remaining): Vertex AI Search query API integration (replace query stub)
131  - **Milestone 2**: Module B - Email triage and approval workflow
132  - **Milestone 3**: Module C - Xero Finance Lens with read-only MCP
133  - **Milestone 4**: Demo hardening
134  
135  ## Documentation
136  
137  - **[Product Requirements Document (PRD)](./docs/PRD.md)** - Complete specification
138  - **[Architecture Rules](./.cursor/rules/architecture.mdc)** - Development guidelines
139  - **[Milestone 0 Status](./MILESTONE0_STATUS.md)** - Docker Compose scaffold status
140  - **[Milestone 1 Status](./MILESTONE1_STATUS.md)** - Module A APIs status (in progress)
141  - **[Frontend UI Implementation](./FRONTEND_UI_IMPLEMENTATION.md)** - Frontend UI implementation summary
142  - **[Versions](./VERSIONS.md)** - Pinned dependency versions
143  
144  ## Development
145  
146  ### Project Structure
147  
148  ```
149  sme-ops-center/
150  ├── docker-compose.yml      # Service orchestration
151  ├── .env.example            # Environment template
152  ├── frontend/               # Streamlit UI
153  │   ├── app.py             # Main UI application
154  │   ├── utils.py           # API client utilities
155  │   └── requirements.txt   # Python dependencies
156  ├── api-gateway/            # FastAPI backend
157  │   ├── app/                # Application code
158  │   │   ├── routes/         # API routes
159  │   │   │   ├── docs.py     # Module A routes
160  │   │   │   └── gcs.py      # GCS smoke test routes
161  │   │   ├── models.py       # Database models
162  │   │   ├── schemas.py      # Pydantic schemas
163  │   │   └── services.py     # Business logic
164  │   └── migrations/         # Alembic migrations
165  ├── worker/                 # Background jobs
166  ├── mcp-bridge/             # Node.js MCP server
167  ├── db/                     # Database migrations (legacy)
168  └── docs/                   # Documentation
169  ```
170  
171  ### Key Configuration Files
172  
173  - `.env.example` - Environment variable template (see PRD Section 10)
174  - `docker-compose.yml` - Service definitions and volumes
175  - `VERSIONS.md` - Pinned versions for reproducibility
176  
177  ## Important Notes
178  
179  ### Security
180  - Never commit the `.env` file (it is gitignored)
181  - All containers run as non-root users
182  - Secrets must use environment variables or secret management
183  - Production deployments should use managed secrets (not `.env` files)
184  
185  ### Container User IDs
186  - **Application containers** (frontend, api-gateway, worker, mcp-bridge): UID 1000
187  - **Postgres/Redis**: Use official image default non-root users (do not override)
188  
189  ### Startup Dependencies
190  
191  All startup dependencies are configured in `docker-compose.yml`:
192  
193  - **API Gateway** → waits for Postgres and Redis to be healthy (health checks)
194  - **Worker** → waits for Postgres, Redis (healthy), and API Gateway (started)
195  - **Frontend** → waits for API Gateway (started)
196  - **MCP Bridge** → no dependencies (starts independently)
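
The dependency list above forms a small directed graph, and any valid start order is a topological sort of it. The sketch below models only the ordering, not the difference between "healthy" and "started"; the names `DEPENDS_ON` and `start_order` are illustrative and do not appear in the repository.

```python
from graphlib import TopologicalSorter

# Each service maps to the services it waits for, as listed above.
DEPENDS_ON = {
    "postgres": set(),
    "redis": set(),
    "mcp-bridge": set(),
    "api-gateway": {"postgres", "redis"},
    "worker": {"postgres", "redis", "api-gateway"},
    "frontend": {"api-gateway"},
}


def start_order(depends_on=DEPENDS_ON):
    """Return one valid start order in which dependencies come first."""
    return list(TopologicalSorter(depends_on).static_order())
```

Several orders satisfy the constraints, so the exact sequence returned may differ from what Docker Compose does; only the relative positions of dependent services are guaranteed. `graphlib` requires Python 3.9 or later.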
197  
198  **Health Checks:**
199  - Postgres: `pg_isready` check (10s interval, 5 retries)
200  - Redis: `redis-cli ping` check (10s interval, 5 retries)
201  - API Gateway: HTTP health endpoint check (10s interval, 5 retries, 30s start period)
202  
203  **Automatic Migrations:**
204  API Gateway automatically runs database migrations on startup (via `app/migrations.py`) with retry logic to wait for Postgres to be ready. No manual migration step required.
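
The wait-then-migrate pattern can be sketched as follows. This is an illustration under stated assumptions, not the contents of `app/migrations.py`: the function names, the attempt count, and the delay are all hypothetical, and `check_db` and `run_migrations` stand in for whatever the application actually calls.

```python
import time


def wait_until_ready(check, attempts=10, delay=2.0, sleep=time.sleep):
    """Call `check` until it returns True or attempts run out.

    Returns True when the dependency became ready, False otherwise.
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return True
        except Exception:
            # Treat connection errors as "not ready yet".
            pass
        if attempt < attempts:
            sleep(delay)
    return False


def start_api(check_db, run_migrations, attempts=10, delay=2.0, sleep=time.sleep):
    """Wait for the database, then migrate; raise if it never comes up."""
    if not wait_until_ready(check_db, attempts, delay, sleep):
        raise RuntimeError("database not ready; migrations skipped")
    run_migrations()
```

Raising when the database never becomes ready makes the container exit with an error instead of serving requests against an unmigrated schema.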
205  
206  ### Volume Mounts
207  - Source code is bind-mounted for development (`./<service>:/app` for each service)
208  - `mcp-bridge` uses an anonymous volume for `node_modules` so the bind mount does not hide the installed dependencies
209  - Named volumes (`pgdata`, `uploads`, `sessions`, `redis-data`) persist across rebuilds
210  - `api-gateway` mounts GCP service account credentials: `E:\sme-ops-center-secrets\smeops-api-sa.json:/run/secrets/gcp-sa.json:ro`
211  
212  ## Troubleshooting
213  
214  ### Common Issues
215  
216  1. **Permission errors**: Ensure Docker has proper permissions on host directories
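217  2. **Port conflicts**: Check that host ports 8501, 8000, 3000, 5432, and 6379 are not already in use
218  3. **Volume initialization**: If Postgres/Redis fail to start, try `docker compose down -v` to reset volumes. This deletes the data in the project's volumes, including the Postgres database.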
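
For the port conflict case, a short script can report which of the listed ports are already taken. This is a minimal sketch using the standard `socket` module; the helper names are hypothetical. Run it before starting the stack, since the services themselves occupy these ports once they are up.

```python
import socket

# Host ports listed in the "Port conflicts" item above.
PORTS = (8501, 8000, 3000, 5432, 6379)


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def busy_ports(ports=PORTS, host="127.0.0.1"):
    """Return the ports from `ports` that are already in use."""
    return [port for port in ports if not port_is_free(port, host)]
```

An empty list from `busy_ports()` means none of the five ports is bound on the loopback interface at the moment of the check.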
219  
220  See [MILESTONE0_STATUS.md](./MILESTONE0_STATUS.md) for detailed issue resolution.
221  
222  ## License
223  
224  [License information]