/ docs / api_integrations.md
api_integrations.md
  1  # API Integrations
  2  
  3  This section details the external API integrations used by the AI-enhanced CV system, including authentication methods, key endpoints, and important considerations for their usage.
  4  
  5  ## GitHub API
  6  
  7  The GitHub API is central to the system's ability to collect real-time activity data and professional metrics. The `activity-analyzer.js` script interacts with this API.
  8  
  9  ### Authentication
 10  
 11  Authentication to the GitHub API is performed using a Personal Access Token (PAT) provided via the `GITHUB_TOKEN` environment variable. This token should have the necessary scopes to read user and repository data.
 12  
 13  *   **Environment Variable**: `GITHUB_TOKEN`
 14  *   **Scope Requirements**: Typically `public_repo` or `repo` (for private repositories) and `read:user`.
 15  
 16  ### Key Endpoints Used
 17  
 18  The `GitHubApiClient` in `activity-analyzer.js` makes requests to the following primary endpoints:
 19  
 20  *   **`/users/:username`**: Retrieves public profile information for a specified user.
 21      *   Example: `https://api.github.com/users/adrianwedd`
 22  *   **`/users/:username/repos`**: Fetches a list of public repositories for a user.
 23      *   Parameters: `per_page=100`, `sort=updated`.
 24      *   Example: `https://api.github.com/users/adrianwedd/repos?per_page=100&sort=updated`
 25  *   **`/users/:username/events/public`**: Retrieves public activity events for a user.
 26      *   Parameters: `per_page=100`.
 27      *   Example: `https://api.github.com/users/adrianwedd/events/public?per_page=30`
 28  
 29  ### Rate Limiting Considerations
 30  
 31  The GitHub API has strict rate limits. The `GitHubApiClient` includes built-in logic to handle rate limiting gracefully:
 32  
 33  *   **`x-ratelimit-remaining`**: Tracks the number of requests remaining in the current rate limit window.
 34  *   **`x-ratelimit-reset`**: Indicates the Unix timestamp when the current rate limit window resets.
 35  *   **Automatic Backoff**: The client pauses requests if the remaining limit is low, waiting until the reset time to avoid hitting the limit and incurring errors.
 36  
 37  ## Claude API
 38  
 39  The Claude API is utilized by the `claude-enhancer.js` script to perform AI-powered content optimization and generate strategic insights for the CV.
 40  
 41  ### Authentication
 42  
 43  Authentication to the Claude API is performed using an API key provided via the `ANTHROPIC_API_KEY` environment variable.
 44  
 45  *   **Environment Variable**: `ANTHROPIC_API_KEY`
 46  
 47  ### Key Endpoints Used
 48  
 49  The `ClaudeApiClient` in `claude-enhancer.js` primarily interacts with the following endpoint:
 50  
 51  *   **`/v1/messages`**: The main endpoint for interacting with Claude models to generate text completions based on a series of messages.
 52      *   Method: `POST`
 53      *   Headers: `Content-Type: application/json`, `x-api-key: <API_KEY>`, `anthropic-version: 2023-06-01`.
 54      *   Body: Contains `model`, `max_tokens`, `temperature`, and `messages` array.
 55  
 56  ### Token Usage and Cost Management
 57  
 58  Interactions with the Claude API consume tokens, which have associated costs. The `ClaudeApiClient` includes mechanisms to track and optimize token usage:
 59  
 60  *   **Token Tracking**: `input_tokens`, `output_tokens`, `cache_creation_tokens`, and `cache_read_tokens` are tracked.
 61  *   **Caching**: Responses from identical API requests are cached to reduce redundant API calls and save on token usage. The cache key is content-aware, ensuring that changes in source data invalidate the cache.
 62  *   **AI Budget**: The workflow incorporates an AI budget mechanism (`AI_BUDGET` environment variable) to control the scope of AI enhancement based on available token budget.
 63  
 64  ### Prompt Engineering Principles
 65  
 66  The `claude-enhancer.js` script employs sophisticated prompt engineering to guide the Claude AI in generating high-quality, relevant CV content. Key principles include:
 67  
 68  *   **Role-Playing**: Assigning specific roles to the AI (e.g., "professional CV enhancement specialist," "technical skills assessment expert").
 69  *   **Context Provision**: Providing rich context from GitHub activity metrics and existing CV data to inform the AI's responses.
 70  *   **Clear Requirements**: Explicitly defining the desired output format, tone, and content constraints.
 71  *   **Quantifiable Achievements**: Encouraging the AI to incorporate measurable impacts where possible.
 72  *   **Creativity Control**: Utilizing the `CREATIVITY_LEVEL` parameter to adjust the AI's generation style.
 73  
 74  ## Python External API Wrappers
 75  
 76  This section details the Python wrappers for external APIs, located in `src/python/api_wrappers/external_apis.py`. These wrappers provide a standardized way to interact with third-party services for data such as firmographics and funding information.
 77  
 78  ### Abstract API (Firmographics)
 79  
 80  **Purpose**: Used to retrieve firmographics data (e.g., company size, industry, location) based on a company domain.
 81  
 82  **Class**: `AbstractApiWrapper`
 83  
 84  **Authentication**: Requires an API key, which can be provided during initialization or set as an environment variable.
 85  
 86  *   **Environment Variable**: `ABSTRACT_API_KEY`
 87  
 88  **Key Method**: `get_company_info(domain)`
 89  
 90  *   **Description**: Fetches company information for the given domain.
 91  *   **Parameters**: `domain` (string) - The domain name of the company (e.g., "google.com").
 92  *   **Returns**: A dictionary containing company information, or `None` if the request fails.
 93  
 94  ### Intellizence API (Funding Data)
 95  
 96  **Purpose**: Used to retrieve startup funding data.
 97  
 98  **Class**: `IntellizenceApiWrapper`
 99  
100  **Authentication**: Requires an API key, which can be provided during initialization or set as an environment variable.
101  
102  *   **Environment Variable**: `INTELLIZENCE_API_KEY`
103  
104  **Key Method**: `get_funding_data(query_params=None)`
105  
106  *   **Description**: Fetches funding data based on specified query parameters.
107  *   **Parameters**: `query_params` (dictionary, optional) - A dictionary of parameters to filter the funding data (e.g., `{'country': 'USA', 'limit': 10}`).
108  *   **Returns**: A dictionary containing funding data, or `None` if the request fails.
109  
110  **Important Note**: For both wrappers, it is highly recommended to manage API keys securely using environment variables rather than hardcoding them in the application code.