# Development Guide

This guide explains how to extend SMPTE-Copilot by adding new components to any module.

## Table of Contents

- [How to Add New Components](#how-to-add-new-components)
- [Process Summary](#process-summary)
- [Module-Specific Notes](#module-specific-notes)

## How to Add New Components

To add a new component to any module, follow these steps. The `embeddings` module is used as the example, but the process is identical for all modules.

### Step 1: Add the new type to the Enum

Edit `src/embeddings/types.py` and add the new type:

```python
class EmbeddingModelType(str, Enum):
    HUGGINGFACE = "huggingface"
    OPENAI = "openai"
    COHERE = "cohere"  # New type
```

### Step 2: Create the implementation file

Create a new file, for example `src/embeddings/cohere.py`:

```python
"""Cohere embedding model implementation."""
from __future__ import annotations

from typing import Dict, Any
from langchain_cohere import CohereEmbeddings

from .protocol import Embeddings

def create_cohere_embedding(config: Dict[str, Any]) -> Embeddings:
    """Create Cohere embedding model.

    Parameters
    ----------
    config
        Configuration dictionary. Common parameters include:
        - model: str (optional) - Model name
        - cohere_api_key: str (optional) - API key
        - Other parameters supported by CohereEmbeddings constructor.

    Returns
    -------
    Embeddings instance.
    """
    try:
        return CohereEmbeddings(**config)
    except Exception as e:
        raise ValueError(f"Failed to create Cohere embedding model: {e}") from e
```

**Important**: The function must:
- Accept a `Dict[str, Any]` as its parameter
- Return an instance that implements the module's Protocol (`Embeddings` in this case)
- Handle errors appropriately (the example above wraps constructor failures in a `ValueError`)

### Step 3: Register the implementation in the Factory

Edit `src/embeddings/factory.py` and add the import and registration:

```python
from .cohere import create_cohere_embedding  # Add import

# At the end of the file, register the new implementation in the registry
EmbeddingModelFactory.register(EmbeddingModelType.COHERE)(create_cohere_embedding)
```

The registration happens automatically at **module import time** (see [Dynamic Factory Pattern with Registry](architecture.md#dynamic-factory-pattern-with-registry) for details on how the registry works).

### Step 4: Update exports (optional)

If necessary, update `src/embeddings/__init__.py` to export any constants or helpers related to the new component.

### Step 5: Configure in `config.yaml`

Add the configuration for the new component in `config.yaml`:

```yaml
embedding:
  embed_name: cohere  # Use the Enum value (must match the string value in types.py)
  embed_config:
    model: "embed-english-v3.0"
    cohere_api_key: "${COHERE_API_KEY}"  # Can use environment variables
```

## Process Summary

1. Add type to Enum in `types.py`
2. Create implementation file with `create_*` function
3. Import and register in `factory.py` (registry populated automatically)
4. Update exports in `__init__.py` (optional)
5. Configure in `config.yaml` (if applicable)

## Module-Specific Notes

This same process applies to:
- **`chunkers/`**: Add new chunking algorithms
- **`loaders/`**: Add new loader types (DOCX, HTML, etc.)
- **`retrievers/`**: Add new retrieval strategies
- **`vector_stores/`**: Add new vector stores (Pinecone, Weaviate, etc.)

For more information about the architecture patterns used, see the [Architecture](architecture.md) documentation. To understand how to configure new components, see the [Configuration](configuration.md) guide.
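
As a rough illustration of why the Step 3 registration call has the shape `register(type)(function)`, the following is a minimal sketch of a registry-backed factory. It is not the project's actual `factory.py`: the `_registry` attribute, the `create` method, and the `create_fake_cohere` stand-in are assumptions made for this example, and the real implementation may differ.

```python
from enum import Enum
from typing import Any, Callable, Dict


class EmbeddingModelType(str, Enum):
    HUGGINGFACE = "huggingface"
    COHERE = "cohere"


# A creator takes a config dict and returns some component instance.
Creator = Callable[[Dict[str, Any]], Any]


class EmbeddingModelFactory:
    """Illustrative registry-backed factory (assumed shape, not the real one)."""

    # Hypothetical storage: maps each enum member to its creator function.
    _registry: Dict[EmbeddingModelType, Creator] = {}

    @classmethod
    def register(cls, model_type: EmbeddingModelType) -> Callable[[Creator], Creator]:
        """Return a decorator that stores the creator under model_type."""
        def decorator(creator: Creator) -> Creator:
            cls._registry[model_type] = creator
            return creator
        return decorator

    @classmethod
    def create(cls, name: str, config: Dict[str, Any]) -> Any:
        """Look up the creator by its string value and call it (hypothetical method)."""
        model_type = EmbeddingModelType(name)  # ValueError if name is not in the Enum
        if model_type not in cls._registry:
            raise ValueError(f"No implementation registered for {model_type.value!r}")
        return cls._registry[model_type](config)


def create_fake_cohere(config: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in creator so the sketch runs without external packages."""
    return {"provider": "cohere", **config}


# Same call shape as in Step 3; it runs when the module is imported.
EmbeddingModelFactory.register(EmbeddingModelType.COHERE)(create_fake_cohere)

model = EmbeddingModelFactory.create("cohere", {"model": "embed-english-v3.0"})
```

Because `register` returns a decorator, the same call could also be written as `@EmbeddingModelFactory.register(EmbeddingModelType.COHERE)` above the creator function. In this sketch, a name that exists in the Enum but has no registered creator fails with a `ValueError`, which is one way to surface a missing import in `factory.py` early.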