# Data Auto-Sync API

## Overview

A.I.G detects AI infrastructure vulnerabilities using rule files in the `data/` directory. The **Data Auto-Sync** API allows you to pull the latest rules from the official GitHub repository (`Tencent/AI-Infra-Guard`) without restarting the server or rebuilding the Docker image.

- **Base URL**: `http://localhost:8088` (adjust to your deployment address)
- **Authentication**: No authentication required

The sync is performed by cloning the `main` branch into a temporary directory using `git clone --depth 1`, then copying all `data/` sub-directories into the working directory. No GitHub token is needed.

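The clone-and-copy step described above can be sketched in Python. This is an illustration only, not the server's actual implementation: the function names `copy_data_dirs` and `sync_rules`, the clone URL (derived from the repository name), and the way files are counted are all assumptions made for the example. The sketch overwrites existing files and does not delete local files that were removed upstream; whether the real sync behaves the same way is not stated here.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Assumed clone URL, derived from the repository name given above.
REPO_URL = "https://github.com/Tencent/AI-Infra-Guard.git"


def copy_data_dirs(src_data: Path, dst_data: Path) -> int:
    """Copy every sub-directory of src_data into dst_data; return files copied."""
    copied = 0
    for sub in sorted(p for p in src_data.iterdir() if p.is_dir()):
        for src_file in sub.rglob("*"):
            if src_file.is_file():
                target = dst_data / src_file.relative_to(src_data)
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src_file, target)  # overwrites an existing file
                copied += 1
    return copied


def sync_rules(dst_data: Path, ref: str = "main") -> int:
    """Shallow-clone the repository into a temp dir, then copy data/ over."""
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(
            ["git", "clone", "--depth", "1", "--branch", ref, REPO_URL, tmp],
            check=True,            # raise CalledProcessError if git fails
            capture_output=True,   # keep git's output out of the console
        )
        return copy_data_dirs(Path(tmp) / "data", dst_data)
```
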
---

## API Endpoints

Both operations share the same path `/api/v1/system/update-data` and are distinguished by HTTP method.

### 1. Trigger Data Sync — `POST /api/v1/system/update-data`

| Item | Value |
|---|---|
| URL | `/api/v1/system/update-data` |
| Method | `POST` |
| Request Body | Not required |

No request parameters. The sync always pulls from the `main` branch and updates all data directories.

#### Response Fields (`data` object)

| Field | Type | Description |
|---|---|---|
| `running` | bool | Whether a sync is currently in progress |
| `success` | bool or null | Whether the last sync succeeded (`null` if never run) |
| `started_at` | string | ISO-8601 timestamp when the sync started |
| `finished_at` | string or null | ISO-8601 timestamp when the sync finished (`null` if still running) |
| `message` | string | Human-readable status message |
| `files_updated` | int | Number of files written to disk |
| `ref` | string | Branch used for this sync (always `"main"`) |

In the example responses in this document, `success` and `finished_at` are absent, rather than `null`, while a sync is running, so clients should treat a missing field the same as `null`.

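A client may find it convenient to load the `data` object into a typed structure. The following is a minimal sketch, assuming the field set in the table above; the `SyncStatus` class and `_parse_ts` helper are hypothetical names, not part of A.I.G. It also assumes whole-second timestamps like those in the examples, since `datetime.fromisoformat` on older Python versions rejects some fractional-second forms.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def _parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO-8601 timestamp; a trailing 'Z' is rewritten as +00:00."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class SyncStatus:
    running: bool
    success: Optional[bool]
    started_at: Optional[datetime]
    finished_at: Optional[datetime]
    message: str
    files_updated: int
    ref: str

    @classmethod
    def from_dict(cls, data: dict) -> "SyncStatus":
        # .get() is used so that absent and null fields are treated alike.
        return cls(
            running=bool(data.get("running", False)),
            success=data.get("success"),
            started_at=_parse_ts(data.get("started_at")),
            finished_at=_parse_ts(data.get("finished_at")),
            message=data.get("message", ""),
            files_updated=int(data.get("files_updated", 0)),
            ref=data.get("ref", "main"),
        )


status = SyncStatus.from_dict({
    "running": True,
    "started_at": "2026-04-20T10:00:00Z",
    "message": "cloning repository…",
    "files_updated": 0,
    "ref": "main",
})
print(status.running, status.success, status.finished_at)  # True None None
```

Keeping `success` as `Optional[bool]` preserves the distinction between "failed" (`False`) and "no result yet" (`None`).
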
#### cURL Example

```bash
curl -X POST http://localhost:8088/api/v1/system/update-data
```

#### Example Response (sync started)

```json
{
  "status": 0,
  "message": "sync started",
  "data": {
    "running": true,
    "started_at": "2026-04-20T10:00:00Z",
    "message": "cloning repository…",
    "files_updated": 0,
    "ref": "main"
  }
}
```

---

### 2. Get Sync Status — `GET /api/v1/system/update-data`

| Item | Value |
|---|---|
| URL | `/api/v1/system/update-data` |
| Method | `GET` |

#### Response Fields

Same envelope `{status, message, data}` as the trigger endpoint. See above for `data` field definitions.

#### cURL Example

```bash
curl http://localhost:8088/api/v1/system/update-data
```

#### Example Response (sync in progress)

```json
{
  "status": 0,
  "message": "copying data directories…",
  "data": {
    "running": true,
    "started_at": "2026-04-20T10:00:00Z",
    "message": "copying data directories…",
    "files_updated": 0,
    "ref": "main"
  }
}
```

#### Example Response (sync complete)

```json
{
  "status": 0,
  "message": "sync complete — 312 file(s) updated from ref \"main\"",
  "data": {
    "running": false,
    "success": true,
    "started_at": "2026-04-20T10:00:00Z",
    "finished_at": "2026-04-20T10:00:45Z",
    "message": "sync complete — 312 file(s) updated from ref \"main\"",
    "files_updated": 312,
    "ref": "main"
  }
}
```

#### Example Response (sync failed)

```json
{
  "status": 1,
  "message": "git clone failed: exit status 128\nfatal: unable to access 'https://github.com/...'",
  "data": {
    "running": false,
    "success": false,
    "started_at": "2026-04-20T10:00:00Z",
    "finished_at": "2026-04-20T10:00:05Z",
    "message": "git clone failed: exit status 128\nfatal: unable to access 'https://github.com/...'",
    "files_updated": 0,
    "ref": "main"
  }
}
```
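
The example responses above correspond to a small set of states that a client can derive from `running` and `success`. A minimal sketch follows; the helper names and the state labels (`running`, `never_run`, `succeeded`, `failed`) are hypothetical and chosen only for illustration.

```python
from datetime import datetime
from typing import Optional


def _parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp; a trailing 'Z' is rewritten as +00:00."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def sync_state(data: dict) -> str:
    """Map a `data` object onto one of four coarse states."""
    if data.get("running"):
        return "running"
    success = data.get("success")
    if success is None:  # absent or null: no sync has finished yet
        return "never_run"
    return "succeeded" if success else "failed"


def sync_duration_seconds(data: dict) -> Optional[float]:
    """Seconds between started_at and finished_at, or None if either is missing."""
    start, end = data.get("started_at"), data.get("finished_at")
    if not start or not end:
        return None
    return (_parse_ts(end) - _parse_ts(start)).total_seconds()


completed = {
    "running": False,
    "success": True,
    "started_at": "2026-04-20T10:00:00Z",
    "finished_at": "2026-04-20T10:00:45Z",
    "files_updated": 312,
}
print(sync_state(completed), sync_duration_seconds(completed))  # succeeded 45.0
```

Checking `running` before `success` matters: while a new sync is in progress, `success` may still describe the previous run or be missing.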

---

## Typical Workflow

1. **Trigger sync** — call `POST /api/v1/system/update-data`; returns immediately.
2. **Poll for completion** — call `GET /api/v1/system/update-data` until `data.running` is `false`.
3. **Check result** — inspect `data.success` and `data.message`.
4. **No restart needed** — updated rules take effect on the next scan.

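#### Python Example

```python
import time

import requests

BASE_URL = "http://localhost:8088"  # adjust to your deployment address
ENDPOINT = f"{BASE_URL}/api/v1/system/update-data"
POLL_INTERVAL = 3  # seconds between status checks
MAX_WAIT = 600     # arbitrary upper bound (seconds) so the loop cannot hang forever

# Trigger the sync; the call returns immediately
resp = requests.post(ENDPOINT, timeout=10)
print(resp.json())

# Poll until the sync finishes or the deadline passes
deadline = time.monotonic() + MAX_WAIT
while True:
    status = requests.get(ENDPOINT, timeout=10).json()
    data = status["data"]
    print(f"[{data['message']}] files_updated={data['files_updated']}")
    if not data["running"]:
        break
    if time.monotonic() > deadline:
        raise TimeoutError(f"sync still running after {MAX_WAIT} seconds")
    time.sleep(POLL_INTERVAL)

# `success` may be null or absent, so anything other than true counts as failure
if data.get("success"):
    print(f"Sync complete — {data['files_updated']} file(s) updated")
else:
    print(f"Sync failed: {data['message']}")
```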

## Notes

- Only one sync can run at a time. A concurrent trigger returns the current status.
- The `git` binary must be available in the server's `PATH`.
- The server must be able to reach `github.com` on port 443.
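
The last two prerequisites can be checked before triggering a sync. The following is a rough preflight sketch, assuming it is run on the same host (or inside the same container) as the server; `git_available` and `can_reach` are hypothetical helper names. A successful TCP connection shows only that the port is reachable, not that an HTTPS clone will succeed, for example behind a proxy that requires extra configuration.

```python
import shutil
import socket


def git_available() -> bool:
    """True if a `git` executable is found on this process's PATH."""
    return shutil.which("git") is not None


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failure, refusal, and timeout
        return False


if __name__ == "__main__":
    print("git on PATH:", git_available())
    print("github.com:443 reachable:", can_reach("github.com", 443))
```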