/ docs / architecture_evolution.md
architecture_evolution.md
  1  # AI-Infra-Guard Architecture Evolution
  2  
  3  This document describes the architectural evolution of A.I.G (AI-Infra-Guard) across three major stages: v0.1, v2.6, and v3.6.0+. It focuses on responsibility boundaries, capability growth, and runtime topology at each stage.
  4  
  5  ---
  6  
  7  ## v0.1 — Single-Binary Infrastructure Scanner
  8  
  9  **Purpose**: Lightweight, zero-dependency AI infrastructure vulnerability scanner. Ship fast, scan fast.
 10  
 11  ### Capability Summary
 12  
 13  - Network-based fingerprinting of AI services (Ollama, vLLM, Dify, etc.)
 14  - Rule-driven CVE/GHSA vulnerability matching against detected components
 15  - CLI-first task dispatch; WebUI for result visualization
 16  
 17  ### Technical Stack & Delivery
 18  
 19  | Aspect | Detail |
 20  |---|---|
 21  | Language | Go |
 22  | Delivery | Single compiled binary |
 23  | Entry point | `cmd/cli` |
 24  | Config | CLI flags / YAML rule files |
 25  
 26  ### Architecture Layers
 27  
 28  ```
 29  ┌────────────────────────────────────────┐
 30  │         Entry & Interaction            │
 31  │    CLI (cmd/cli)  ·  WebUI (static)    │
 32  ├────────────────────────────────────────┤
 33  │          Task Scheduling               │
 34  │   Target splitting · Concurrency ctl   │
 35  ├────────────────────────────────────────┤
 36  │           Rule Engine                  │
 37  │  Fingerprint match · Vuln scoring      │
 38  ├────────────────────────────────────────┤
 39  │         Probe Execution                │
 40  │  HTTP probes · Service detection       │
 41  ├────────────────────────────────────────┤
 42  │           Rule Data Layer              │
 43  │  data/fingerprints/  · data/vuln/      │
 44  └────────────────────────────────────────┘
 45  ```
 46  
 47  ### Runtime Flow
 48  
 49  ```
 50  CLI/WebUI
 51    └─▶ Scheduler (runner.go)
 52          └─▶ Fingerprint engine  ──▶  Version extraction
 53                └─▶ Vuln matcher   ──▶  Structured result output
 54  ```
 55  
 56  ---
 57  
 58  ## v2.6 — Dual-Engine: Infrastructure Scan + MCP Code Analysis
 59  
 60  **Purpose**: Extend v0.1 with MCP Server security analysis. A single binary now ships both a rule-driven scanner and an LLM-assisted code auditor.
 61  
 62  ### New Capabilities
 63  
 64  - MCP Server code security analysis (static rules + LLM reasoning)
 65  - Unified CLI sub-commands (`scan` / `mcp`)
 66  - Integrated WebUI covering both scan types
 67  - Task management API via WebSocket/REST
 68  
 69  ### Technical Stack & Delivery
 70  
 71  | Aspect | Detail |
 72  |---|---|
 73  | Language | Go |
 74  | Delivery | Single compiled binary |
 75  | New modules | `internal/mcp/`, `common/websocket/` |
 76  | LLM integration | OpenAI-compatible API (`common/utils/models/openai.go`) |
 77  
 78  ### Architecture Layers
 79  
 80  ```
 81  ┌────────────────────────────────────────────────┐
 82  │              Entry & Interaction               │
 83  │   CLI sub-commands  ·  WebUI  ·  REST/WS API   │
 84  ├──────────────────────┬─────────────────────────┤
 85  │   AI Infra Scan      │   MCP Security Analysis  │
 86  │  Rule-driven engine  │  Agent-driven code audit │
 87  ├──────────────────────┴─────────────────────────┤
 88  │         Task & Utility Layer                   │
 89  │  Orchestration · Result aggregation · Utils    │
 90  ├────────────────────────────────────────────────┤
 91  │           Rule & Data Layer                    │
 92  │  Fingerprints · Vuln rules · MCP rule YAMLs    │
 93  └────────────────────────────────────────────────┘
 94  ```
 95  
 96  ### Runtime Flow
 97  
 98  ```
 99  CLI
100    ├─ scan ──▶ Rule Engine ──▶ Vuln Matcher ──▶ Report
101    └─ mcp  ──▶ MCP Agent   ──▶ Code Audit   ──▶ Risk Report
102  
103  WebUI / REST API
104    └─▶ Unified task interface ──▶ Result display
105  ```
106  
107  ---
108  
109  ## v3.6.0+ — Multi-Module AI Red Teaming Platform
110  
111  **Purpose**: Platform-scale AI security system. Go + Python hybrid. Three coordinated security modules with hot-pluggable scan engines and multi-deployment profiles (on-prem, SaaS, open-source).
112  
113  ### New Capabilities
114  
115  - **Prompt Security Evaluation** (`AIG-PromptSecurity/`): LLM jailbreak red-teaming with 10+ attack strategies (encoding, stego, role-play, etc.)
116  - **Agent Scan** (`common/agent/`): Multi-agent automated security assessment of AI agent workflows (Dify, Coze, etc.) — covers indirect prompt injection, SSRF, system prompt leakage
117  - **MCP Scan v2** (`mcp-scan/`): Rewritten as standalone Python agent with dynamic verification, code audit pipeline, and multi-stage LLM reasoning
118  - Docker Compose one-click deployment; multi-image architecture
119  - Plugin/feature extension framework
120  
121  ### Technical Stack & Delivery
122  
123  | Aspect | Detail |
124  |---|---|
125  | Languages | Go (core) + Python (agents, eval) |
126  | Delivery | Docker images + Compose, or single binary |
127  | Core service | `cmd/` + `common/` + `internal/` (Go) |
128  | MCP scan agent | `mcp-scan/` (Python, standalone) |
129  | Prompt eval | `AIG-PromptSecurity/` (Python, standalone) |
130  | LLM integration | OpenAI-compatible; configurable per module |
131  
132  ### Architecture Layers
133  
134  ```
135  ┌──────────────────────────────────────────────────────────────┐
136  │                   Deployment Profiles                        │
137  │    On-prem (SSO/intranet)  ·  SaaS  ·  Open-source release  │
138  ├──────────────────────────────────────────────────────────────┤
139  │                    Entry & Orchestration                     │
140  │         CLI / WebUI / REST API / WebSocket                   │
141  │              Agent runtime  ·  Task scheduler                │
142  ├───────────────────┬───────────────────┬──────────────────────┤
143  │  AI Infra Scan    │  MCP Security Scan │  Prompt Eval         │
144  │  Go rule engine   │  Python agent +   │  Python DeepTeam +   │
145  │  Fingerprint +    │  static rules +   │  10+ attack types    │
146  │  Vuln matching    │  LLM verification │  Dataset-driven      │
147  ├───────────────────┴───────────────────┴──────────────────────┤
148  │                  Task & Report Layer                         │
149  │     Task scheduling · Result aggregation · PDF report gen    │
150  ├──────────────────────────────────────────────────────────────┤
151  │                   Rule & Data Layer                          │
152  │  data/fingerprints/  ·  data/vuln(_en)/  ·  data/mcp/       │
153  │  data/eval/          ·  MCP rule YAMLs   ·  Eval datasets    │
154  └──────────────────────────────────────────────────────────────┘
155  ```
156  
157  ### Runtime Flow
158  
159  ```
160  WebUI / CLI / API
161    └─▶ Task Scheduler (common/websocket/)
162          ├─▶ AI Infra Scan (Go)
163          │     └─▶ Fingerprint engine ──▶ Vuln rules ──▶ CVE report
164165          ├─▶ MCP Scan (Python agent)
166          │     └─▶ Code audit ──▶ Static rules ──▶ LLM verification ──▶ Risk report
167168          ├─▶ Agent Scan (Go agent)
169          │     └─▶ Multi-agent pipeline ──▶ Injection/SSRF/leak tests ──▶ Report
170171          └─▶ Prompt Eval (Python)
172                └─▶ Attack generation ──▶ LLM target ──▶ Judge ──▶ Safety score
173  ```
174  
175  ---
176  
177  ## Version Comparison
178  
179  | Version | Core Capabilities | Tech Stack | Architecture Pattern |
180  |---|---|---|---|
181  | v0.1 | AI infra vuln scan + WebUI | Go single binary | Monolithic scanner, rule-driven |
182  | v2.6 | Infra scan + MCP code analysis | Go single binary | Dual-engine, unified WebUI |
183  | v3.6.0+ | Infra scan + MCP scan + Agent scan + Prompt eval | Go + Python, Docker | Platform, multi-module, pluggable engines |
184  
185  ---
186  
187  ## Key Design Decisions
188  
189  **Rule-first, LLM-augmented**: Core detection remains deterministic (YAML fingerprint + vuln rules) for speed and consistency. LLM reasoning is layered on top for complex cases (MCP code audit, agent simulation).
190  
191  **Language split by concern**: Go handles high-concurrency network probing and the web platform. Python handles LLM agent loops, evaluation frameworks, and ML-adjacent tooling.
192  
193  **Data as source of truth**: All detection rules live in `data/` as version-controlled YAML. No rules are embedded in compiled binaries. This enables out-of-band rule updates and CI validation.
194  
195  **Pluggable scan engines**: From v3.6.0+, each scan module (infra, MCP, agent, prompt) is independently deployable and versioned, connected through the task API.