# Cerastes API: Documentation and Deployment Guide

## Table of Contents

- [Overview](#overview)
- [Project Structure](#project-structure)
- [API Endpoints](#api-endpoints)
- [Integrated AI Models](#integrated-ai-models)
- [Deployment Guide](#deployment-guide)
- [Troubleshooting](#troubleshooting)
- [Contribution](#contribution)
- [License](#license)
- [Advanced Features](#advanced-features)

## Overview

Cerastes API is an AI-based video and audio analysis platform offering advanced processing capabilities for videos, including:

- Audio transcription (monologue and multi-speaker)
- Manipulation strategy analysis
- Non-verbal behavior analysis
- Batch processing for large content volumes

The API is designed to be highly configurable, scalable, and usable in a variety of environments (development, production, cloud).

## Project Structure

```
Cerastes_Public_API/
├── api/                        # All FastAPI routers
│   ├── __init__.py             # Routers entry point
│   ├── auth_router.py          # Authentication and API key management
│   ├── error_handlers.py       # Error handlers
│   ├── health_router.py        # Monitoring endpoint
│   ├── inference_router.py     # Generic inference endpoint
│   ├── response_models.py      # Pydantic response models
│   ├── subscription_router.py  # Subscription management
│   ├── task_router.py          # Task and status management
│   ├── transcription_router.py # Transcription endpoints
│   └── video_router.py         # Video analysis endpoints
├── db/                         # Database management
│   ├── __init__.py             # SQLAlchemy configuration
│   ├── init_db.py              # Database initialization
│   ├── migrations/             # Alembic migrations
│   └── models.py               # SQLAlchemy models
├── transcription_models/       # Transcription processing logic
├── video_models/               # Video processing logic
├── prompts/                    # System prompts in text format
├── inference_results/          # Inference results storage
├── results/                    # General results
├── uploads/                    # Temporary storage for uploaded files
├── tests/                      # Automated tests
├── auth.py                     # Authentication functions
├── auth_models.py              # Authentication models
├── config.py                   # Centralized configuration
├── database.py                 # Database interface
├── inference_engine.py         # Centralized inference engine
├── main.py                     # Main entry point
├── middleware.py               # FastAPI middleware
├── model_manager.py            # AI model manager
├── Dockerfile                  # Docker configuration
├── docker-compose.yml          # Docker services composition
├── requirements.txt            # Python dependencies
└── startup.sh                  # Startup script
```

## API Endpoints

### Authentication

```
POST   /auth/register                  # Register a new user
POST   /auth/login                     # Log in and generate a token
GET    /auth/me                        # Current user information
POST   /auth/api-keys                  # Create a new API key
GET    /auth/api-keys                  # List the user's API keys
PUT    /auth/api-keys/{id}/activate    # Activate an API key
PUT    /auth/api-keys/{id}/deactivate  # Deactivate an API key
DELETE /auth/api-keys/{id}             # Delete an API key
```

### Transcription

```
POST /transcription/monologue     # Start a monologue transcription
POST /transcription/multispeaker  # Start a multi-speaker transcription
GET  /transcription/tasks/{id}    # Transcription task status
GET  /transcription/tasks         # List transcription tasks
```

### Video Analysis

```
POST /video/manipulation-analysis  # Start a manipulation strategy analysis
POST /video/nonverbal-analysis     # Start a non-verbal behavior analysis
GET  /video/tasks/{id}             # Video analysis task status
GET  /video/tasks                  # List video analysis tasks
```

### Generic Inference

```
POST /inference/start       # Start a generic inference
POST /inference/batch       # Start a batch inference
GET  /inference/tasks/{id}  # Inference task status
GET  /inference/tasks       # List inference tasks
```

### Tasks

```
GET    /tasks       # List all tasks
GET    /tasks/{id}  # Specific task details
DELETE /tasks/{id}  # Delete a task
```

### Subscriptions

```
GET  /subscriptions/plans     # List subscription plans
POST /subscriptions/checkout  # Create a payment session
POST /subscriptions/webhook   # Webhook for Stripe events
GET  /subscriptions/success   # Redirect after successful payment
GET  /subscriptions/cancel    # Redirect after cancellation
```

### Monitoring

```
GET /health   # API health status
GET /metrics  # Prometheus metrics (if enabled)
```

## Integrated AI Models

- **Audio transcription**: Whisper (tiny, base, small, medium, large)
- **Diarization**: PyAnnote Speaker Diarization
- **LLM**: DeepSeek, Llama 2, and other vLLM-compatible models
- **Vision**: InternVideo for video analysis
- **Segmentation**: Sentence Transformers for intelligent text splitting

## Deployment Guide

### Prerequisites

- Docker and Docker Compose
- GPU with CUDA (recommended for performance)
- At least 16 GB of RAM (32 GB recommended)
- PostgreSQL 14+ (for persistent storage)
- Configured environment variables

### Deployment with Docker

Clone the repository:

```bash
git clone https://github.com/your-repo/Cerastes_Public_API.git
cd Cerastes_Public_API
```

Create an `.env` file:

```bash
cp .env.example .env
# Edit the .env file with your configuration
```

Minimum required configuration in `.env`:

```bash
# Database
DB_USER=postgres
DB_PASSWORD=your_password
DB_HOST=postgres  # service name in docker-compose
DB_NAME=cerastes

# Security
SECRET_KEY=your_complex_secret_key

# External APIs
HUGGINGFACE_TOKEN=your_huggingface_token

# Stripe options (optional)
STRIPE_API_KEY=your_stripe_key
STRIPE_WEBHOOK_SECRET=your_webhook_secret
```

Launch with Docker Compose:

```bash
docker-compose up -d
```

Initialize the database (first run only):

```bash
docker-compose exec api python -m db.init_db
```

### Configuration Parameters

All parameters are configurable via the `config.py` file and can be overridden by environment variables.
The main sections include:

- `app`: General application configuration
- `database`: Database configuration
- `models`: AI model configuration
- `video`: Video processing parameters
- `audio`: Audio processing parameters
- `segmentation`: Text segmentation configuration
- `inference`: Inference parameters
- `api`: API configuration
- `auth`: Authentication configuration
- `services`: External services configuration

### Scaling

For high-load environments, you can:

Increase the number of workers:

```bash
# In .env
API_WORKERS=4
```

Use a load balancer such as Nginx or Traefik in production.

Use tensor parallelism for large models:

```bash
# In .env
TENSOR_PARALLEL_SIZE=2  # Use 2 GPUs for a single model
```

### Monitoring and Logs

- Logs are stored in the `logs/` folder.
- Prometheus metrics are available at `/metrics` if enabled.
- Health status can be monitored via `/health`.

## Troubleshooting

### Common Issues

**Database connection error**
- Check the connection parameters in `.env`
- Verify that PostgreSQL is running
- Check the logs with `docker-compose logs postgres`

**CUDA out of memory error**
- Reduce `GPU_MEMORY_UTILIZATION` (e.g., to 0.8)
- Use a smaller model
- Increase available GPU memory

**Slow API**
- Check system resources (CPU, RAM, GPU)
- Increase `MAX_CONCURRENT_TASKS` if resources allow
- Check the segmentation of large texts

**Authentication errors**
- Verify that `SECRET_KEY` is properly set
- Check JWT token validity
- Verify user permissions

For more assistance, consult the detailed logs in the `logs/` folder or open a ticket on the GitHub repository.

## Contribution

Contributions are welcome!
Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/my-feature`)
3. Commit your changes (`git commit -am 'Add my feature'`)
4. Push to the branch (`git push origin feature/my-feature`)
5. Create a Pull Request

## License

### Dual License Model

This project uses a dual-license model:

**Main License (GPL v2)**

The majority of the source code and system prompts are distributed under the GNU General Public License version 2 (GPL v2). This license allows use, modification, and distribution of the source code, provided that all modifications and derivative code are also distributed under GPL v2.

**Video Manipulation Analysis Component (AGPL v3)**

The `videomanipulation_analyzer` module is distributed under the GNU Affero General Public License version 3 (AGPL v3). In addition to the GPL requirements, the AGPL requires that the complete source code be made available to users who interact with the software over a network.

**Commercial Licenses**

Commercial licenses are available for organizations that wish to use Cerastes API without the restrictions of the GPL/AGPL licenses:

- **Standard Commercial License**: Allows use of the software without the obligation to share the source code of modifications.
- **Extended Commercial License**: Includes commercial usage rights for the video manipulation analysis module and dedicated technical support.

### Note on AI Model Usage

Third-party AI models integrated into this platform (such as Whisper, InternVideo, etc.) are subject to their own licenses. Please consult the respective licenses before any commercial use.

## Advanced Features

### JSONSimplifier Post-processor

The JSONSimplifier is a post-processor that automatically converts complex JSON results into clear, easy-to-understand textual explanations. It uses an LLM to transform structured data into natural language.
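The post-processing flow described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name `simplify_result` and the `llm_generate` callable (a stand-in for the real LLM client) are hypothetical.

```python
import json


def simplify_result(response: dict, llm_generate,
                    template: str = "Translate this JSON {text} into simple English") -> dict:
    """Attach a plain-language explanation to a structured API result.

    `llm_generate` is any callable mapping a prompt string to a
    completion string (hypothetical stand-in for the real LLM client).
    """
    # Serialize the structured result into the prompt template.
    prompt = template.format(text=json.dumps(response["result"]))
    enriched = dict(response)  # keep the original structured result intact
    enriched["plain_explanation"] = llm_generate(prompt)
    return enriched


# Demonstration with a trivial stand-in "LLM":
out = simplify_result(
    {"result": {"analysis": {"sentiment": "positive"}}},
    llm_generate=lambda prompt: "The text has a positive sentiment.",
)
print(out["plain_explanation"])  # → The text has a positive sentiment.
```

The key design point is that the structured `result` is never discarded: the explanation is added alongside it, so clients that parse the JSON programmatically are unaffected.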
#### Configuration

The JSONSimplifier can be configured via environment variables or via options in the `startup.sh` script.

#### Usage

Enable the JSONSimplifier via environment variables:

```bash
export JSON_SIMPLIFIER_ENABLED=true
export JSON_SIMPLIFIER_MODEL="huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2"
export JSON_SIMPLIFIER_SYSTEM_PROMPT="Translate this JSON {text} into simple English"
export JSON_SIMPLIFIER_APPLY_TO="inference,video"
```

Or directly on the command line:

```bash
JSON_SIMPLIFIER_ENABLED=true JSON_SIMPLIFIER_APPLY_TO="inference,video" ./startup.sh
```

#### Results

When the JSONSimplifier is enabled, API responses include an additional `plain_explanation` field containing the natural-language explanation of the JSON results.

#### Examples

Instead of receiving only structured JSON results like:

```json
{
  "result": {
    "analysis": {
      "sentiment": "positive",
      "key_points": ["Point 1", "Point 2", "Point 3"],
      "complexity_score": 0.75
    }
  }
}
```

you will also receive a simplified explanation:

```json
{
  "result": {
    "analysis": { ... }
  },
  "plain_explanation": "The text has a positive sentiment. It includes 3 key points and has a complexity score of 0.75."
}
```

This feature is particularly useful for end-user applications or for quick interpretation of analysis results.

### Prompt Management System

The prompt management system provides a flexible way to use different types of prompts for your inference tasks. You can use predefined prompts or create custom ones with placeholders.
#### Available Prompts

The system includes several built-in prompts:

- `system_1`: Basic analysis prompt
- `system_2`: Jungian analysis prompt
- `system_3`: Ethical analysis prompt
- and others for specific use cases

#### Modular Design

Prompts are completely modular: you can use them individually or in sequence.

```python
# Use a single prompt
inference_data = {
    "text": "Your text here",
    "prompt_name": "system_2"  # Use only the system_2 prompt
}

# Use a sequence of prompts
inference_data = {
    "text": "Your text here",
    "prompt_sequence": ["system_1", "system_2", "system_final"]
}
```

#### Custom Placeholders

You can use custom placeholders in prompts and provide their values at inference time:

```python
# Use custom placeholders
inference_data = {
    "text": "Your text here",
    "prompt_name": "system_3",
    "language": "english",
    "context": "Academic analysis"
}
```
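Putting it together, a payload like the one above would be sent to the `POST /inference/start` endpoint documented earlier. The sketch below only builds the request without sending it; the base URL, the bearer-token header scheme, and the placeholder API key are assumptions, not confirmed details of the API.

```python
import json
from urllib import request

API_BASE = "http://localhost:8000"  # assumed local deployment URL
API_KEY = "your_api_key"            # placeholder credential

inference_data = {
    "text": "Your text here",
    "prompt_name": "system_3",
    "language": "english",
    "context": "Academic analysis",
}

# Build the HTTP request for the generic inference endpoint.
req = request.Request(
    f"{API_BASE}/inference/start",
    data=json.dumps(inference_data).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
    },
    method="POST",
)

# To actually send it (requires a running server):
# with request.urlopen(req) as resp:
#     task = json.load(resp)  # poll GET /inference/tasks/{id} for the result
```

Since inference runs asynchronously, the response would typically contain a task identifier to poll via the `GET /inference/tasks/{id}` endpoint listed above.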