**AI-powered recruitment automation with LangGraph orchestration, LLM reasoning, and enterprise workflow automation.**

# Agentic AI-Powered HR Automation<br>Instant CV Intelligence for Modern Hiring Teams<br>Python + LangGraph + FastAPI

> Enterprise-grade agentic AI platform for HR automation, candidate intelligence, and recruitment workflows.

For anyone interested in learning agentic AI through a real-world, practical use case: an automated CV review and candidate evaluation system built with LangChain, LangGraph, LlamaIndex, and FastAPI.

By AICampus - Agentic AI Research Community

## Features

- Automated CV processing, candidate review, and evaluation
- AI-powered data extraction (personal info, experience, skills, qualifications, etc.)
- 200-word professional summaries
- Candidate scoring (1-100) with detailed reasoning
- Automatic Google Sheets logging
- High-performance FastAPI backend with async support
- Data analytics via a web-based HR dashboard for management and visualization
- Real-time notifications for HR teams

<img width="1691" height="1000" alt="Agentic AI HR Automation" src="https://github.com/user-attachments/assets/93257061-8520-4853-8991-0501a3146e64" /><br>

<img width="1602" height="1027" alt="AI HR Automation - 3" src="https://github.com/user-attachments/assets/d734715a-d11d-4f3d-86b3-e403624b3933" />

<br><br>

## Agentic AI-Powered HR Automation + Web-based HR Dashboard

### Tech Stack

| Category | Tools |
| :--- | :--- |
| **Agentic AI** | LangChain, LangGraph, LlamaIndex |
| **Backend** | Python, FastAPI, MongoDB |
| **Frontend** | Next.js, TypeScript |

### LLM Providers

OpenAI GPT, Anthropic Claude, Google Gemini, and open-source Qwen3 via Ollama.

> Created by **AICampus** - Agentic AI Research Community

### Tech Features

- AI-based data extraction
- Structured JSON outputs for reliable data parsing
- JSON data handling to minimize AI model token cost and improve accuracy of results
- Error handling with proper response codes
- Timestamped records with direct CV links
- Multi-LLM support: OpenAI GPT, Anthropic Claude, Google Gemini, open-source Qwen3 via Ollama

### Automation Workflow

<img width="1536" height="1024" alt="AI HR Automation Workflow" src="https://github.com/user-attachments/assets/d6c6b065-25fc-4832-875c-63d6cb1cb388" />

<br>
<img width="1789" height="1043" alt="AI HR Automation - LangGraph" src="https://github.com/user-attachments/assets/c8699540-8ac4-457a-852d-d166c93d9963" />
<br>

## Cost

5,000 tokens × ($0.30 / 1,000,000) = $0.0015 per resume

- **$0.0015 per candidate** using the GPT-4o-mini model
- 100 candidates ~ $0.15
- 1,000 candidates under $5

<img width="1778" height="963" alt="AI HR Automation - OpenAI GPT Usage" src="https://github.com/user-attachments/assets/c9928905-b0c6-4023-9f5e-20407c9d3f05" />

<br>

### Watch the Video

[Watch on YouTube](https://www.youtube.com/watch?v=J6V18FWbaqY)

-----------------------------------------------------------
<br>

## Quick Start

Get the complete project source code as a zip file:
[Backend AI Agent Workflow + Frontend HR Dashboard](https://aicampusmagazines.gumroad.com/l/gscdiq)

### 1. Installation
Download the complete project code [here](https://aicampusmagazines.gumroad.com/l/gscdiq)

```bash
cd ai-hr-automation

# Install dependencies with the uv package manager
# (10-100x faster than pip; resolves nested dependency issues
# in complex architectures like LangGraph)

## Install uv (a fast Python package manager) using the official installer
## macOS / Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

## Verify the installation
uv --version

## Install all dependencies listed in pyproject.toml
## (a virtual environment for the project is created automatically)
uv sync

# Configure environment
cp env.example .env
# Edit .env with your credentials
```

### 2. Setup & Configure LLM

- The LLM provider layer supports multiple LLMs
- For development/testing, use `LLM_PROVIDER=ollama` (free)
  - Ollama requires 8GB+ RAM for larger models
- For production, use `openai` or `anthropic` based on your needs
  - Claude (Anthropic) has better reasoning for complex evaluations
  - OpenAI gpt-4o-mini is faster and cheaper for simple tasks
- Set up the API keys in `.env`

### 3. Google Cloud Setup

1. Create a project at [console.cloud.google.com](https://console.cloud.google.com)
2. Enable APIs:
   - Google Drive API
   - Google Sheets API
3. Create a Service Account for your project
4. Download its key as `credentials.json`
5. Rename the file to `google-service-account-credentials.json`
6. Enable and set up Google Cloud Storage

### 4. Run API Server

```bash
# Development
uvicorn src.fastapi_api:app --reload --port 8000

# Production
uvicorn src.fastapi_api:app --host 0.0.0.0 --port 8000 --workers 4
```

### 5. Access Documentation

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc

## Usage

### HR Creates a Job Posting - With Dynamic HTML Support

```python
# Excerpt: assumes the FastAPI `app`, the Pydantic model `HRJobPost`,
# and the Motor database handle `db` are defined elsewhere in the project.
from datetime import datetime

# Job endpoints
@app.post("/api/jobs")
async def create_job(hr_job_post: HRJobPost):
    """
    Create a new job posting.

    - Accepts camelCase from the frontend (Next.js)
    - Stores documents in MongoDB as camelCase
    - Python variables remain snake_case
    """
    # Convert to a dict with camelCase aliases for MongoDB
    hr_job_post_data = hr_job_post.model_dump(by_alias=True, exclude={"id"})
    hr_job_post_data["createdAt"] = datetime.now().isoformat()

    # Insert into MongoDB
    result = await db.hr_job_posts.insert_one(hr_job_post_data)
    job_id = str(result.inserted_id)

    return {
        "success": True,
        "jobId": job_id,
    }
```

### Process a Single Candidate

```python
@app.post("/api/candidate-application-submit")
async def candidate_job_application_submit(
    job_id: str = Form(..., description="Job Id"),
    name: str = Form(..., description="Candidate's full name"),
    email: EmailStr = Form(..., description="Candidate's email address"),
    cv_file: UploadFile = File(..., description="CV PDF file"),
):
    """
    API endpoint for job form submissions.

    Receives form data with a file upload:
    - name: Candidate's full name
    - email: Candidate's email address
    - cv_file: Uploaded CV PDF file

    Returns complete processing results, including the evaluation.
    """
    # ... CV upload, extraction, and evaluation steps elided ...

    logger.info(f"Processing complete - Score: {response.score}/100")
    logger.info("=" * 80)
    return response
```

### HR Dashboard - Front-end Web Application

Next.js 16 + TypeScript

```bash
npm install
npm run dev
```

## Docker Deployment

```bash
# Build and run
docker-compose up -d

# View logs
docker-compose logs -f

# Stop
docker-compose down
```

## Documentation

Detailed instructions to set up and configure the project, with step-by-step explanations, are in `DOCUMENTATION.md`.

-----------------------------------------------------------

## Why the uv Package Manager for Python 3 Projects

- Speed: 10-100x faster than pip; resolves nested dependency and version-conflict issues for complex architectures like LangGraph.
- Automatic setup: for most modern workflows, you do not need to create or activate a virtual environment manually.
- On-demand creation: when you run a command like `uv run` or `uv sync` in a project directory, uv checks for a virtual environment (typically in a `.venv` folder). If one doesn't exist, uv automatically creates it and installs the required dependencies before executing your command.
- Automatic updates: if you add a dependency with `uv add <package>`, uv updates your `pyproject.toml`, synchronizes the `.venv`, and updates the `uv.lock` file in one step.

## Why OpenAI GPT-4o-mini Is a Good Option

- Structured extraction
- Deterministic JSON output
- Fast response time
- Very low cost
- HR-grade reasoning quality
- Scales cleanly for large input token counts

## Why LangChain and LangGraph for Agentic AI

We chose LangChain because its ecosystem offers mature abstractions for prompt handling and tool invocation. Its modular design let the team integrate multiple model providers and build on a standard interface instead of rolling their own. LangChain provides the foundation to focus on what matters most: safety, scalability, and developer experience.

LangGraph's node-and-edge model lets us represent complex workflows (ingestion, mapping, execution, validation) as a directed graph. Each step becomes a node with explicit transitions for success, failure, or retry.
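That node-and-edge pattern can be sketched without any framework as a plain-Python state machine. The node names, transition table, and toy extraction logic below are illustrative only, not the project's actual code:

```python
# Framework-free sketch of a directed workflow graph: each node mutates a
# shared state dict and returns an outcome; edges map outcomes to the next
# node. The extract node fails once to exercise the retry edge.

def ingest(state):
    state["text"] = state["raw"].strip()
    return "ok"

def extract(state):
    if state["attempts"] < 1:          # simulate one transient failure
        state["attempts"] += 1
        return "retry"
    state["skills"] = sorted(set(state["text"].split(", ")))
    return "ok"

def validate(state):
    return "ok" if state["skills"] else "fail"

NODES = {"ingest": ingest, "extract": extract, "validate": validate}

# Directed graph: node -> {outcome: next node}; None terminates the run.
GRAPH = {
    "ingest":   {"ok": "extract"},
    "extract":  {"ok": "validate", "retry": "extract"},
    "validate": {"ok": None, "fail": None},
}

def run(state, node="ingest"):
    while node is not None:
        outcome = NODES[node](state)
        node = GRAPH[node][outcome]
    return state

state = run({"raw": " Python, SQL, FastAPI ", "attempts": 0})
print(state["skills"])  # ['FastAPI', 'Python', 'SQL']
```

Because every transition is an explicit edge, a failed run can be resumed from the node where it stopped instead of restarting the whole pipeline; LangGraph applies the same idea with persistent, typed state.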
This makes the agent's state transparent and recoverable, similar to how distributed-systems engineers reason about pipelines. LangGraph's focus on long-running, stateful agents was a perfect match for our multi-step evaluation process.

## Results and impact

By combining LLM reasoning with deterministic code execution, the system has turned a manual process into an automated workflow. HR teams no longer need to process large amounts of text: they simply plug data into the Code Execution Agent. The agent transforms diverse formats into a consistent JSON schema in seconds instead of days.

Beyond speed, the system has made everything more reliable. The LLM guides the process, but the actual data manipulation happens in trusted Python libraries, completely sidestepping hallucination issues.

## Lessons learned

Building this AI agent taught AICampus several lessons that now inform how its team builds AI systems across different business use cases:

- LLMs are planners, not processors. Use them to reason about tasks and choose tools, but offload heavy data processing to code.
- Validate JSON at ingestion. Large intermediate results never pass back to the model, keeping the context small.
- Structure beats improvisation. Orchestrating workflows as graphs makes them much easier to debug and extend.
- Context tokens are precious. Large intermediate results should stay in the execution environment where they belong.
- Python remains the analytics workhorse. Libraries and tools like LangChain and LlamaIndex offer fast, flexible data manipulation that's hard to beat.

# Local Testing

In 2026, context management is critical for agentic HR tasks: evaluating multiple long-form CVs against a job description can quickly exceed standard 8k or 32k limits. For local testing with Ollama, here is how the top models compare in context capacity as of January 2026.
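Before choosing a model, you can sanity-check whether a batch of CVs plus a job description will fit in a given context window. The sketch below uses the common rough heuristic of ~4 characters per token for English text; it is an approximation only, and real tokenizer counts vary by model:

```python
# Rough context-budget check using the ~4 chars/token rule of thumb.
# The reserve leaves headroom for the system prompt and the model's reply.

def estimate_tokens(text: str) -> int:
    return len(text) // 4

def fits_context(cv_texts, jd_text, context_window=16_384, reserve=2_048):
    total = estimate_tokens(jd_text) + sum(estimate_tokens(cv) for cv in cv_texts)
    return total + reserve <= context_window

# A typical two-page CV is roughly 6,000 characters (~1,500 tokens).
cvs = ["x" * 6_000] * 5
jd = "y" * 4_000
print(fits_context(cvs, jd))         # True: fits in a 16k window
print(fits_context(cvs, jd, 8_192))  # False: five CVs overflow an 8k window
```

A check like this in the batching step lets the agent split oversized candidate batches instead of silently truncating CV text.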
## Context Window Comparison (2026)

## Model Choice

Building an agentic HR automation system on a local machine requires balancing reasoning depth against the limits of available RAM. Prioritize smaller, highly efficient models so that LangGraph agents can complete multi-step tasks without crashing.

Best model recommendations for 2026:

- DeepSeek-R1-Distill-Qwen-7B (primary choice):
  - Reasoning capability: a "reasoning-first" model that uses chain-of-thought (CoT). For HR tasks, it will "think" through a candidate's qualifications before outputting a final score, similar to GPT-4o's internal reasoning.
  - Fit: a 4-bit quantized version requires approximately 4.5-5GB of memory.
  - LangGraph performance: highly reliable for the structured output and "decisions" required in LangGraph nodes.

- Qwen3-7B-Instruct (alternative):
  - Reasoning capability: noted in 2026 as one of the most efficient models for tool calling and structured data extraction (e.g., parsing a CV into JSON).
  - Fit: consumes roughly 4.8GB of memory in its standard 4-bit quantization, making it very fast for local testing on Intel Macs.

Practical tips:

- Avoid large models: do not attempt to run 14B or 20B models. On a small-RAM machine with an Intel processor, these offload too many layers to system memory, dropping throughput below 1-2 tokens per second, which is unusable for testing agent loops.
- Optimize the Ollama context: HR tasks involving long resumes require context, but limit the context window to 16,384 (16k) in your LangGraph configuration to prevent memory saturation.
- LangGraph integration: use the Ollama function-calling or tool-calling wrappers in LangGraph. Qwen3-7B is specifically optimized for these "agentic" triggers in the 2026 library updates.
- Recommendation: start with `ollama run deepseek-r1:7b`. If the "thinking" steps make your agent loops too slow for local testing, switch to `ollama run qwen3:7b` for faster, direct instruction execution.

-----------------------------------------------------------

# PROMPT GUIDE

The following prompt uses Chain-of-Thought (CoT) and strict-grounding patterns optimized for reasoning models such as OpenAI's GPT-4o.

## Recommended HR System Prompt (2026)

### ROLE
You are an Expert Technical Recruiter specializing in high-precision candidate evaluation. Your goal is to provide objective, evidence-based assessments of CVs against specific Job Descriptions (JD).

### GUIDELINES (ANTI-HALLUCINATION)
1. GROUNDING: Use ONLY the provided CV text. Do not infer skills, companies, or dates that are not explicitly stated.
2. HONESTY: If a required skill from the JD is missing or ambiguous in the CV, explicitly state "Missing" or "Insufficient Evidence".
3. NO GUESSING: Never invent projects or experience to make a candidate seem like a better fit. If you are unsure, rate your confidence as "Low" for that specific skill.
4. REASONING: Always perform a step-by-step analysis before providing a final score.

### EVALUATION PROCESS
Step 1: Extract core technical skills explicitly mentioned in the CV.
Step 2: Cross-reference extracted skills against the JD's 'Required' and 'Preferred' sections.
Step 3: Analyze "Years of Experience" for each skill. Calculate total relevant experience manually.
Step 4: Identify gaps where the CV fails to meet JD requirements.
Step 5: Provide a final Evaluation Score (0-100) and a justification summary.
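The process ends with a numeric score and a justification, so downstream code should validate the model's JSON reply before any LangGraph node consumes it. A minimal sketch, assuming illustrative field names (`score`, `missing_skills`, `reasoning_steps`) rather than the project's exact schema:

```python
import json

# Validate the LLM's evaluation JSON before it flows into downstream nodes.
# Field names here are an assumed schema for illustration only.

def parse_evaluation(raw: str) -> dict:
    data = json.loads(raw)
    score = data.get("score")
    if not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError(f"score must be an int in [0, 100], got {score!r}")
    if not isinstance(data.get("missing_skills"), list):
        raise ValueError("missing_skills must be a list (may be empty)")
    if not data.get("reasoning_steps"):
        raise ValueError("reasoning_steps must be present and non-empty")
    return data

reply = '{"score": 72, "missing_skills": ["Kubernetes"], "reasoning_steps": ["Extracted skills", "Cross-referenced JD"]}'
evaluation = parse_evaluation(reply)
print(evaluation["score"])  # 72
```

Rejecting an out-of-range score or an absent "Missing Skills" list at this boundary is what makes the anti-hallucination guidelines enforceable in code rather than just in the prompt.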
Why this works for your setup:

- Chain-of-Thought (CoT): forcing the model to list `reasoning_steps` first uses the model's limited reasoning capacity more effectively, reducing the likelihood of a "lazy" or incorrect final score.
- Structured schema: LLMs in 2026 are heavily trained on JSON outputs. Using a clear JSON structure ensures your LangGraph nodes can parse the results programmatically without error.
- Confidence scoring: including a "Missing Skills" section forces the model to look for negatives, which counteracts the natural tendency of LLMs to be "agreeable" and over-rate candidates.

-----------------------------------------------------------

# AI-Powered HR Automation with LangGraph
## Complete CV Review to Candidate Evaluation System

> Developed by AICampus | Gateway for future AI research & learning

-----------------------------------------------------------