# Development Guide

This guide explains how to extend SMPTE-Copilot by adding new components to any module.

## Table of Contents

- [How to Add New Components](#how-to-add-new-components)
- [Process Summary](#process-summary)
- [Module-Specific Notes](#module-specific-notes)

## How to Add New Components

To add a new component to any module, follow these steps (we'll use the `embeddings` module as an example, but the process is identical for all modules):

### Step 1: Add the new type to the Enum

Edit `src/embeddings/types.py` and add the new type:

```python
class EmbeddingModelType(str, Enum):
    HUGGINGFACE = "huggingface"
    OPENAI = "openai"
    COHERE = "cohere"  # New type
```
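
Because `EmbeddingModelType` subclasses `str`, a string read from configuration can be converted to a member by value. The following is a minimal sketch of that lookup; `parse_embedding_type` is a hypothetical helper used only for illustration and is not part of the project.

```python
from enum import Enum


class EmbeddingModelType(str, Enum):
    HUGGINGFACE = "huggingface"
    OPENAI = "openai"
    COHERE = "cohere"


def parse_embedding_type(name: str) -> EmbeddingModelType:
    """Hypothetical helper: map a config string to an Enum member."""
    try:
        # Enum lookup by value: "cohere" -> EmbeddingModelType.COHERE
        return EmbeddingModelType(name)
    except ValueError:
        valid = ", ".join(member.value for member in EmbeddingModelType)
        raise ValueError(
            f"Unknown embedding type {name!r}; expected one of: {valid}"
        ) from None


embedding_type = parse_embedding_type("cohere")
```

A misspelled value such as `"coher"` raises a `ValueError` that lists the valid names, which is why the string in `config.yaml` has to match the Enum value exactly.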

### Step 2: Create the implementation file

Create a new file, for example `src/embeddings/cohere.py`:

```python
"""Cohere embedding model implementation."""
from __future__ import annotations

from typing import Any, Dict

from langchain_cohere import CohereEmbeddings

from .protocol import Embeddings


def create_cohere_embedding(config: Dict[str, Any]) -> Embeddings:
    """Create a Cohere embedding model.

    Parameters
    ----------
    config
        Configuration dictionary. Common parameters include:
        - model: str (optional) - Model name
        - cohere_api_key: str (optional) - API key
        - Other parameters supported by the CohereEmbeddings constructor.

    Returns
    -------
    Embeddings
        The configured embedding model instance.

    Raises
    ------
    ValueError
        If the model cannot be constructed from ``config``.
    """
    try:
        return CohereEmbeddings(**config)
    except Exception as e:
        # Re-raise with context so configuration mistakes are easy to trace.
        raise ValueError(f"Failed to create Cohere embedding model: {e}") from e
```

**Important**: The function must:
- Accept a single `Dict[str, Any]` configuration parameter
- Return an instance that implements the module's Protocol (`Embeddings` in this case)
- Handle errors explicitly, for example by re-raising construction failures as a `ValueError` with a descriptive message
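
The three requirements above can be checked without calling a real provider. The sketch below assumes the `Embeddings` protocol exposes `embed_documents` and `embed_query` (the method names used by LangChain embedding classes); the actual definition in `src/embeddings/protocol.py` may differ. `FakeEmbeddings` and `create_fake_embedding` are hypothetical names used only for illustration.

```python
from typing import Any, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class Embeddings(Protocol):
    """Assumed protocol shape; see src/embeddings/protocol.py for the real one."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...

    def embed_query(self, text: str) -> List[float]: ...


class FakeEmbeddings:
    """Hypothetical stand-in that satisfies the protocol without network calls."""

    def __init__(self, dimension: int = 3) -> None:
        self.dimension = dimension

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        # Toy vector: the text length repeated `dimension` times.
        return [float(len(text))] * self.dimension


def create_fake_embedding(config: Dict[str, Any]) -> Embeddings:
    """Same contract as create_cohere_embedding: dict in, Embeddings out."""
    try:
        return FakeEmbeddings(**config)
    except Exception as e:
        raise ValueError(f"Failed to create fake embedding model: {e}") from e


model = create_fake_embedding({"dimension": 2})
conforms = isinstance(model, Embeddings)  # structural check: methods are present
```

Note that `isinstance` against a `runtime_checkable` protocol only verifies that the methods exist, not their signatures, so a static type checker is still the stronger check.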

### Step 3: Register the implementation in the Factory

Edit `src/embeddings/factory.py` and add the import and registration:

```python
from .cohere import create_cohere_embedding  # Add import

# At the end of the file, register the new implementation in the registry
EmbeddingModelFactory.register(EmbeddingModelType.COHERE)(create_cohere_embedding)
```

The registration happens automatically at **module import time** (see [Dynamic Factory Pattern with Registry](architecture.md#dynamic-factory-pattern-with-registry) for details on how the registry works).
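
To make the `register(...)(...)` call form concrete, here is a minimal sketch of how a registry-backed factory of this kind could work. It is an illustration under assumptions, not the project's implementation: the `_registry` attribute, the `create` method, and `create_stub_embedding` are hypothetical names, and only `register` appears in the snippet above.

```python
from enum import Enum
from typing import Any, Callable, Dict, Union

Creator = Callable[[Dict[str, Any]], Any]


class EmbeddingModelType(str, Enum):
    HUGGINGFACE = "huggingface"
    COHERE = "cohere"


class EmbeddingModelFactory:
    # Hypothetical storage: maps each Enum member to its creator function.
    _registry: Dict[EmbeddingModelType, Creator] = {}

    @classmethod
    def register(cls, model_type: EmbeddingModelType) -> Callable[[Creator], Creator]:
        """Return a decorator that stores the creator under model_type."""
        def decorator(creator: Creator) -> Creator:
            cls._registry[model_type] = creator
            return creator
        return decorator

    @classmethod
    def create(cls, model_type: Union[str, EmbeddingModelType], config: Dict[str, Any]) -> Any:
        """Hypothetical entry point: look up the creator and call it."""
        key = EmbeddingModelType(model_type)  # accepts a member or its string value
        if key not in cls._registry:
            raise ValueError(f"No implementation registered for {key.value!r}")
        return cls._registry[key](config)


def create_stub_embedding(config: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for create_cohere_embedding so the sketch runs offline.
    return {"provider": "cohere", **config}


# Runs when the module is imported, which is what populates the registry.
EmbeddingModelFactory.register(EmbeddingModelType.COHERE)(create_stub_embedding)

model = EmbeddingModelFactory.create("cohere", {"model": "embed-english-v3.0"})
```

Since `register` returns a decorator, the same registration could also be written as `@EmbeddingModelFactory.register(EmbeddingModelType.COHERE)` above the function definition.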

### Step 4: Update exports (optional)

If necessary, update `src/embeddings/__init__.py` to export any constants or helpers related to the new component.

### Step 5: Configure in `config.yaml`

Add the configuration for the new component in `config.yaml`:

```yaml
embedding:
  embed_name: cohere # Use the Enum value (must match the string value in types.py)
  embed_config:
    model: "embed-english-v3.0"
    cohere_api_key: "${COHERE_API_KEY}" # Can use environment variables
```
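
As an illustration of what the `${COHERE_API_KEY}` placeholder implies, the sketch below expands `${VAR}` references in an already-parsed configuration. How SMPTE-Copilot actually resolves these placeholders is not shown here, so `expand_env` is a hypothetical helper and the dictionary is a hand-written stand-in for the parsed YAML (see the [Configuration](configuration.md) guide for the real behavior).

```python
import os
import re
from typing import Any

# Matches ${NAME} where NAME is a valid environment variable identifier.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any) -> Any:
    """Hypothetical helper: recursively replace ${VAR} in strings."""
    if isinstance(value, dict):
        return {key: expand_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    if isinstance(value, str):
        def lookup(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in os.environ:
                raise KeyError(f"Environment variable {name} is not set")
            return os.environ[name]
        return _ENV_PATTERN.sub(lookup, value)
    return value  # numbers, booleans, None pass through unchanged


# Parsed form of the YAML above, written as a dict to avoid a YAML dependency.
raw_config = {
    "embedding": {
        "embed_name": "cohere",
        "embed_config": {
            "model": "embed-english-v3.0",
            "cohere_api_key": "${COHERE_API_KEY}",
        },
    }
}

os.environ["COHERE_API_KEY"] = "dummy-key-for-illustration"  # placeholder value
resolved = expand_env(raw_config)
```

Failing fast on an unset variable, as this sketch does, surfaces a missing API key at load time rather than on the first embedding request.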

## Process Summary

1. Add type to Enum in `types.py`
2. Create implementation file with `create_*` function
3. Import and register in `factory.py` (registry populated automatically)
4. Update exports in `__init__.py` (optional)
5. Configure in `config.yaml` (if applicable)

## Module-Specific Notes

This same process applies to:
- **`chunkers/`**: Add new chunking algorithms
- **`loaders/`**: Add new loader types (DOCX, HTML, etc.)
- **`retrievers/`**: Add new retrieval strategies
- **`vector_stores/`**: Add new vector stores (Pinecone, Weaviate, etc.)

For more information about the architecture patterns used, see the [Architecture](architecture.md) documentation. To understand how to configure new components, see the [Configuration](configuration.md) guide.