# DriftKit Common Module

## Overview

The `driftkit-common` module serves as the foundational layer for the DriftKit AI ETL framework, providing shared domain objects, utilities, and core services that other modules depend on. This module contains the essential building blocks for AI-powered applications, including chat management, document processing, text analysis, and model integration.

## Spring Boot Initialization

The common module doesn't require special Spring Boot configuration, as it provides only domain objects and utilities. Simply include it as a dependency:

```java
@SpringBootApplication
public class YourApplication {
    public static void main(String[] args) {
        SpringApplication.run(YourApplication.class, args);
    }
}
```

The module provides:
- **Domain objects**: Pure POJOs with no Spring annotations
- **Services**: Stateless utility services that can be instantiated directly
- **Configuration**: `EtlConfig` can be used with `@ConfigurationProperties`

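For the configuration point, wiring `EtlConfig` to externalized properties might look like the sketch below. This is an assumption about your setup, not a prescribed pattern: the `driftkit` prefix matches the YAML examples later in this README, and the bean method presumes `EtlConfig` exposes a no-args constructor and setters for binding.

```java
@Configuration
public class DriftKitConfigProperties {

    // Hypothetical wiring: binds everything under the "driftkit" prefix
    // in application.yml onto the EtlConfig POJO via setter-based binding.
    @Bean
    @ConfigurationProperties(prefix = "driftkit")
    public EtlConfig etlConfig() {
        return new EtlConfig();
    }
}
```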
## Architecture

### Module Structure

```
driftkit-common/
├── src/main/java/ai/driftkit/
│   ├── common/
│   │   ├── domain/           # Core domain objects
│   │   ├── service/          # Core services
│   │   └── utils/            # Utility classes
│   └── config/               # Configuration classes
```

### Key Dependencies

- **Lombok** - Code generation and boilerplate reduction
- **Jackson** - JSON serialization with JSR310 support
- **Apache Commons Lang3** - String and general utilities
- **Apache Commons Collections4** - Enhanced collection operations
- **SLF4J** - Logging facade
- **Jakarta Validation** - Bean validation annotations

## Core Domain Objects

### Chat Management

#### Chat
Represents a chat session with comprehensive metadata:

```java
@Data
@Builder
public class Chat {
    private String id;
    private String name;
    private String systemMessage;
    private Language language;
    private Integer memoryLength;
    private ModelRole modelRole;
    private boolean hidden;
    private LocalDateTime createdAt;
    private LocalDateTime updatedAt;
}
```

**Key Features:**
- Unique identification and naming
- System message configuration for AI behavior
- Language specification (GENERAL, SPANISH, ENGLISH)
- Memory length control for conversation context
- Model role assignment (MAIN, ABTEST, CHECKER, NONE)
- Visibility control via the hidden flag
- Automatic timestamp tracking

**Usage Example:**
```java
Chat chat = Chat.builder()
    .id("chat-123")
    .name("Customer Support Session")
    .systemMessage("You are a helpful customer support agent")
    .language(Language.ENGLISH)
    .memoryLength(50)
    .modelRole(ModelRole.MAIN)
    .hidden(false)
    .createdAt(LocalDateTime.now())
    .build();
```

#### Message
Comprehensive message representation supporting multi-modal content:

```java
@Data
@Builder
public class Message implements ChatItem {
    private String id;
    private String chatId;
    private String parentId;
    private String content;
    private ChatMessageType messageType;
    private MessageType contentType;
    private Grade grade;
    private Map<String, Object> workflowContext;
    private LogProbs logProbs;
    private LocalDateTime requestInitTime;
    private LocalDateTime responseTime;
    private Map<String, Object> variables;
    private List<String> imageUrls;
    private String audioUrl;
    private String videoUrl;
    private String fileUrl;
}
```

**Key Features:**
- Multi-modal support (TEXT, IMAGE, AUDIO, VIDEO, FILE)
- Hierarchical message structure with parent-child relationships
- Grading system for quality assessment
- Workflow context integration
- Token log probabilities for advanced analysis
- Performance timing tracking
- Variable and media URL storage

**Usage Example:**
```java
Message message = Message.builder()
    .id("msg-456")
    .chatId("chat-123")
    .content("Hello, how can I help you today?")
    .messageType(ChatMessageType.AI)
    .contentType(MessageType.TEXT)
    .grade(Grade.EXCELLENT)
    .requestInitTime(LocalDateTime.now())
    .build();
```

#### AITask
Comprehensive task representation for AI operations:

```java
@Data
@Builder
public class AITask {
    private String id;
    private String chatId;
    private String workflowId;
    private String prompt;
    private Map<String, Object> variables;
    private List<String> imageUrls;
    private String audioUrl;
    private String videoUrl;
    private String fileUrl;
    private Map<String, Object> workflowContext;
    private Grade grade;
    private LocalDateTime createdAt;
    private LocalDateTime completedAt;
    private String errorMessage;
    private Map<String, Object> metadata;
}
```

**Key Features:**
- Multi-modal input support
- Workflow integration with context preservation
- Variable substitution support
- Performance and error tracking
- Metadata extensibility
- Quality grading system

### Model Client Abstraction

#### ModelClient
Abstract base class for AI model integrations. The ModelClient provides unified access to different AI capabilities, including text generation, image generation, and function calling.

**Supported Capabilities:**
- `TEXT_TO_TEXT` - Text completion and chat
- `TEXT_TO_IMAGE` - Image generation from text
- `IMAGE_TO_TEXT` - Image analysis and description
- `FUNCTION_CALLING` - Tool use and function execution

**Configuration Parameters:**
- `temperature` - Randomness control (0.0-2.0)
- `top_p` - Nucleus sampling threshold
- `max_tokens` - Maximum response length
- `frequency_penalty` - Repetition penalty
- `presence_penalty` - Topic diversity control
- `stop_sequences` - Generation stopping conditions

**Usage Example:**
```java
ModelTextRequest request = ModelTextRequest.builder()
    .prompt("Explain quantum computing in simple terms")
    .temperature(0.7)
    .maxTokens(500)
    .build();

ModelTextResponse response = modelClient.generateText(request);
System.out.println(response.getContent());
```

## Core Services

### Chat Memory Management

#### ChatMemory
Interface for managing conversation history, with methods to add, retrieve, clear, and filter messages by ID or type.

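The shape of these operations can be sketched with a minimal, self-contained analogue. This is illustrative only: `SimpleChatMemory`, `ListChatMemory`, and the string-typed messages here stand in for the real `ChatMemory`/`ChatItem` API, whose exact signatures live in driftkit-common.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative stand-in for the ChatMemory contract described above.
interface SimpleChatMemory {
    void add(String type, String content);   // add a message
    List<String> messages();                 // retrieve all messages
    List<String> findByType(String type);    // filter by message type
    void clear();                            // clear the history
}

// Trivial list-backed implementation, analogous to an in-memory store.
class ListChatMemory implements SimpleChatMemory {
    private record Item(String type, String content) {}
    private final List<Item> items = new ArrayList<>();

    public void add(String type, String content) {
        items.add(new Item(type, content));
    }

    public List<String> messages() {
        return items.stream().map(Item::content).collect(Collectors.toList());
    }

    public List<String> findByType(String type) {
        return items.stream()
            .filter(i -> i.type().equals(type))
            .map(Item::content)
            .collect(Collectors.toList());
    }

    public void clear() {
        items.clear();
    }
}
```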
#### TokenWindowChatMemory
Advanced memory management with token-based capacity control:

```java
@Builder
public class TokenWindowChatMemory implements ChatMemory {
    private final int maxTokens;
    private final Tokenizer tokenizer;
    private final ChatMemoryStore store;

    @Override
    public void add(ChatItem message) {
        store.add(message);
        ensureTokenLimit();
    }

    private void ensureTokenLimit() {
        // Evict older messages while preserving system messages
        // and maintaining conversation context
    }
}
```

**Key Features:**
- Token-based capacity management
- Automatic message eviction
- System message preservation
- Context-aware pruning
- Multiple tokenizer support

**Configuration Example:**
```java
ChatMemory memory = TokenWindowChatMemory.builder()
    .maxTokens(4000)
    .tokenizer(new SimpleTokenizer())
    .store(new InMemoryChatMemoryStore())
    .build();

// Add messages - automatic pruning when limit exceeded
memory.add(systemMessage);
memory.add(userMessage);
memory.add(aiResponse);
```

#### InMemoryChatMemoryStore
Simple in-memory implementation for development and testing:

```java
public class InMemoryChatMemoryStore implements ChatMemoryStore {
    private final Map<String, List<ChatItem>> conversations = new HashMap<>();

    @Override
    public void add(String chatId, ChatItem message) {
        conversations.computeIfAbsent(chatId, k -> new ArrayList<>()).add(message);
    }

    @Override
    public List<ChatItem> getMessages(String chatId) {
        return conversations.getOrDefault(chatId, new ArrayList<>());
    }
}
```

## Utility Classes

### Document Processing

#### DocumentSplitter
Intelligent document chunking for RAG applications with configurable chunk sizes and overlap. The splitter uses a sentence-first splitting strategy to maintain context boundaries.

**Key Features:**
- Sentence-first splitting strategy
- Configurable overlap between chunks
- Token count validation
- Oversized content handling
- Context preservation

**Usage Example:**
```java
DocumentSplitter splitter = DocumentSplitter.builder()
    .maxChunkSize(512)
    .overlapSize(50)
    .tokenizer(new SimpleTokenizer())
    .build();

List<String> chunks = splitter.split(longDocument);
// Each chunk ≤ 512 tokens with 50-token overlap
```

### Text Analysis

#### TextSimilarityUtil
Text similarity calculations offering several algorithms: Levenshtein distance, Jaccard similarity, cosine similarity, and a weighted combined similarity metric.

**Supported Algorithms:**
- **Levenshtein Distance** - Character-level edit distance
- **Jaccard Similarity** - Set-based similarity using word overlap
- **Cosine Similarity** - Vector-based similarity with TF-IDF
- **Combined Similarity** - Weighted combination of multiple metrics

**Usage Example:**
```java
String text1 = "The quick brown fox jumps over the lazy dog";
String text2 = "A quick brown fox leaps over a lazy dog";

double similarity = TextSimilarityUtil.combinedSimilarity(text1, text2);
// Returns weighted score: 0.4 * levenshtein + 0.3 * jaccard + 0.3 * cosine
```

#### VariableExtractor
Template variable extraction for prompt templates, supporting simple variables, conditional blocks, list iterations, and nested properties with dot notation.

**Supported Features:**
- Simple variable extraction: `{{variable}}`
- Conditional blocks: `{{#if condition}}...{{/if}}`
- List iterations: `{{#each items}}...{{/each}}`
- Nested properties: `{{user.profile.name}}`
- Escape sequences: `{{{{literal}}}}`

**Usage Example:**
```java
String template = """
    Hello {{user.name}}!
    {{#if user.isPremium}}
        Welcome to our premium service.
    {{/if}}
    Your recent orders:
    {{#each orders}}
        - {{this.product}} ({{this.price}})
    {{/each}}
    """;

Set<String> variables = VariableExtractor.extractVariables(template);
// Returns: ["user.name", "user.isPremium", "orders", "this.product", "this.price"]
```

### JSON Processing

#### JsonUtils
Robust JSON parsing and repair utilities with support for malformed JSON, relaxed parsing (comments, trailing commas, single quotes), JSON extraction from mixed text, and safe type conversion.

**Key Features:**
- Automatic JSON repair for malformed input
- Relaxed parsing with comments and trailing commas
- JSON extraction from mixed text content
- Safe type conversion with error handling
- Support for single quotes and unquoted keys

**Usage Example:**
```java
String malformedJson = """
    {
        name: 'John Doe',  // User's full name
        age: 30,
        "active": true,    // Trailing comma
    }
    """;

Optional<JsonNode> parsed = JsonUtils.parseJsonRelaxed(malformedJson);
if (parsed.isPresent()) {
    String name = parsed.get().get("name").asText();
    int age = parsed.get().get("age").asInt();
}
```

### Tokenization

#### SimpleTokenizer
Basic tokenization for estimation:

```java
public class SimpleTokenizer implements Tokenizer {
    private final double tokenCostMultiplier;

    public SimpleTokenizer() {
        this(0.7); // Default multiplier: ~0.7 tokens per character
    }

    @Override
    public int estimateTokenCount(String text) {
        return (int) (text.length() * tokenCostMultiplier);
    }

    @Override
    public int estimateTokenCount(List<ChatItem> messages) {
        return messages.stream()
            .mapToInt(msg -> estimateTokenCount(msg.getContent()))
            .sum();
    }
}
```

## Configuration

### EtlConfig
Central configuration for the entire framework:

```java
@Data
@Builder
public class EtlConfig {
    private List<VectorStoreConfig> vectorStores;
    private List<EmbeddingServiceConfig> embeddingServices;
    private List<PromptServiceConfig> promptServices;
    private List<VaultConfig> vault;
    private YoutubeProxyConfig youtubeProxy;

    // Nested configuration classes
    @Data
    public static class VectorStoreConfig {
        private String name;
        private String type; // "inmemory", "filebased", "pinecone"
        private String url;
        private String apiKey;
        private String environment;
        private String index;
        private Integer dimension;
        private String metric;
        private String filePath;
    }

    @Data
    public static class EmbeddingServiceConfig {
        private String name;
        private String type; // "openai", "cohere", "local"
        private String apiKey;
        private String model;
        private String url;
        private String modelPath;
        private Integer maxTokens;
        private Double temperature;
    }

    @Data
    public static class VaultConfig {
        private String name;
        private String type; // "openai", "anthropic", "google"
        private String apiKey;
        private String model;
        private String baseUrl;
        private Double temperature;
        private Integer maxTokens;
        private Double topP;
        private Double frequencyPenalty;
        private Double presencePenalty;
        private List<String> stopSequences;
    }
}
```

**Configuration Example (application.yml):**
```yaml
driftkit:
  vault:
    - name: "primary-openai"
      type: "openai"
      apiKey: "${OPENAI_API_KEY}"
      model: "gpt-4"
      temperature: 0.7
      maxTokens: 2000
    - name: "claude"
      type: "anthropic"
      apiKey: "${CLAUDE_API_KEY}"
      model: "claude-sonnet-4-20250514"
      temperature: 0.7
      maxTokens: 2000

  vectorStores:
    - name: "main-vector-store"
      type: "pinecone"
      apiKey: "${PINECONE_API_KEY}"
      environment: "us-west1-gcp"
      index: "driftkit-vectors"
      dimension: 1536
      metric: "cosine"

  embeddingServices:
    - name: "primary-embedding"
      type: "openai"
      apiKey: "${OPENAI_API_KEY}"
      model: "text-embedding-ada-002"

  promptServices:
    - name: "file-prompts"
      type: "filesystem"
      basePath: "./prompts"

  youtubeProxy:
    proxyUrl: "http://proxy.example.com:8080"
    username: "proxyuser"
    password: "${PROXY_PASSWORD}"
```

## Usage Patterns

### Basic Chat Implementation

```java
@Service
public class ChatService {
    private final ChatMemory memory;
    private final ModelClient modelClient;

    public ChatService() {
        this.memory = TokenWindowChatMemory.builder()
            .maxTokens(4000)
            .tokenizer(new SimpleTokenizer())
            .store(new InMemoryChatMemoryStore())
            .build();

        this.modelClient = new OpenAIModelClient();
    }

    public String processMessage(String chatId, String userMessage) {
        // Add user message to memory
        Message userMsg = Message.builder()
            .id(UUID.randomUUID().toString())
            .chatId(chatId)
            .content(userMessage)
            .messageType(ChatMessageType.USER)
            .build();
        memory.add(userMsg);

        // Generate AI response
        ModelTextRequest request = ModelTextRequest.builder()
            .messages(memory.messages())
            .temperature(0.7)
            .maxTokens(1000)
            .build();

        ModelTextResponse response = modelClient.generateText(request);

        // Add AI response to memory
        Message aiMsg = Message.builder()
            .id(UUID.randomUUID().toString())
            .chatId(chatId)
            .content(response.getContent())
            .messageType(ChatMessageType.AI)
            .build();
        memory.add(aiMsg);

        return response.getContent();
    }
}
```

### Document Processing Pipeline

```java
@Service
public class DocumentProcessor {
    private final DocumentSplitter splitter;

    public DocumentProcessor() {
        this.splitter = DocumentSplitter.builder()
            .maxChunkSize(512)
            .overlapSize(50)
            .tokenizer(new SimpleTokenizer())
            .build();
    }

    public List<String> processDocument(String document) {
        // Split document into chunks
        List<String> chunks = splitter.split(document);

        // Remove near-duplicate chunks based on similarity
        List<String> uniqueChunks = new ArrayList<>();
        for (String chunk : chunks) {
            boolean isDuplicate = uniqueChunks.stream()
                .anyMatch(existing ->
                    TextSimilarityUtil.combinedSimilarity(chunk, existing) > 0.9);

            if (!isDuplicate) {
                uniqueChunks.add(chunk);
            }
        }

        return uniqueChunks;
    }
}
```

### Template Processing

```java
@Service
public class TemplateProcessor {
    public String processTemplate(String template, Map<String, Object> variables) {
        // Extract required variables
        Set<String> requiredVars = VariableExtractor.extractVariables(template);

        // Validate all variables are provided
        for (String var : requiredVars) {
            if (!variables.containsKey(var)) {
                throw new IllegalArgumentException("Missing variable: " + var);
            }
        }

        // Process template (simplified - use actual template engine)
        String result = template;
        for (Map.Entry<String, Object> entry : variables.entrySet()) {
            result = result.replace("{{" + entry.getKey() + "}}",
                                  String.valueOf(entry.getValue()));
        }

        return result;
    }
}
```

## Testing

### Unit Test Examples

```java
@Test
public void testDocumentSplitter() {
    SimpleTokenizer tokenizer = new SimpleTokenizer();
    DocumentSplitter splitter = DocumentSplitter.builder()
        .maxChunkSize(100)
        .overlapSize(20)
        .tokenizer(tokenizer)
        .build();

    String document = "This is a long document that needs to be split...";
    List<String> chunks = splitter.split(document);

    assertThat(chunks).isNotEmpty();
    // maxChunkSize is a token limit, so verify against the tokenizer's estimate
    assertThat(tokenizer.estimateTokenCount(chunks.get(0))).isLessThanOrEqualTo(100);
}

@Test
public void testTextSimilarity() {
    String text1 = "The quick brown fox";
    String text2 = "A quick brown fox";

    double similarity = TextSimilarityUtil.combinedSimilarity(text1, text2);
    assertThat(similarity).isBetween(0.7, 1.0);
}

@Test
public void testVariableExtraction() {
    String template = "Hello {{name}}! You have {{count}} messages.";
    Set<String> variables = VariableExtractor.extractVariables(template);

    assertThat(variables).containsExactlyInAnyOrder("name", "count");
}
```

## Performance Considerations

### Memory Management
- Use `TokenWindowChatMemory` for production to prevent memory leaks
- Configure appropriate token limits based on model context windows
- Consider persistent storage for long-term conversation history

### Text Processing
- `TextSimilarityUtil` operations are O(n²) for large texts
- Use caching for repeated similarity calculations
- Consider approximate algorithms for very large document sets

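The caching suggestion above can be sketched as a small wrapper. This is hypothetical, not part of driftkit-common: `SimilarityCache` is an illustrative name, and in real code the pluggable scorer would be `TextSimilarityUtil::combinedSimilarity`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Hypothetical cache for repeated pairwise similarity calls.
class SimilarityCache {
    private final Map<String, Double> cache = new ConcurrentHashMap<>();
    private final BiFunction<String, String, Double> scorer;

    SimilarityCache(BiFunction<String, String, Double> scorer) {
        this.scorer = scorer;
    }

    double similarity(String a, String b) {
        // Similarity is symmetric, so build an order-independent key:
        // similarity(a, b) and similarity(b, a) share one cache entry.
        String key = a.compareTo(b) <= 0 ? a + "\u0000" + b : b + "\u0000" + a;
        return cache.computeIfAbsent(key, k -> scorer.apply(a, b));
    }

    int size() {
        return cache.size();
    }
}
```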
### JSON Processing
- `JsonUtils.repairJson()` has overhead - use sparingly
- Cache parsed JSON for repeated access
- Validate JSON structure before processing

## Integration with Other Modules

### With driftkit-clients
```java
// Use common domain objects with model clients
ModelTextRequest request = ModelTextRequest.builder()
    .messages(memory.messages()) // ChatItem list
    .temperature(0.7)
    .build();
```

### With driftkit-workflows
```java
// AITask integrates with workflow context
AITask task = AITask.builder()
    .workflowId("rag-workflow")
    .workflowContext(workflowContext)
    .variables(templateVariables)
    .build();
```

### With driftkit-vector
```java
// Document processing for vector storage
List<String> chunks = documentSplitter.split(document);
// Chunks can be embedded and stored in vector databases
```

## Error Handling

### Common Exceptions
- `IllegalArgumentException` - Invalid configuration or parameters
- `JsonProcessingException` - JSON parsing failures
- `ValidationException` - Bean validation failures
- `TokenLimitExceededException` - Memory capacity exceeded

### Best Practices
```java
// Always validate inputs
ValidationUtils.requireNonNull(chatId, "Chat ID cannot be null");

// Handle JSON parsing gracefully
Optional<JsonNode> json = JsonUtils.parseJson(inputJson);
if (json.isEmpty()) {
    log.warn("Failed to parse JSON: {}", inputJson);
    return defaultResponse;
}

// Use try-with-resources for cleanup
try (var resource = acquireResource()) {
    // Process resource
} catch (Exception e) {
    log.error("Processing failed", e);
    throw new ProcessingException("Failed to process request", e);
}
```

## Migration Guide

### From Version 1.x to 2.x
1. Update imports from `javax.annotation` to `jakarta.annotation`
2. Replace deprecated `@Getter/@Setter` with `@Data` where appropriate
3. Update `EtlConfig` usage to use Spring Boot configuration properties
4. Replace manual JSON parsing with `JsonUtils` methods

### Configuration Changes
```yaml
# Old format
etl:
  openai:
    apiKey: "sk-..."

# New format
driftkit:
  vault:
    - name: "primary"
      type: "openai"
      apiKey: "sk-..."
```

## Real-World Demo Examples

### 1. Building a Customer Support Chatbot

This example demonstrates building a complete customer support chatbot using the common module's chat management features.

```java
@Service
public class CustomerSupportBot {
    private final ChatMemory memory;
    private final ModelClient modelClient;

    public CustomerSupportBot(ModelClient modelClient) {
        this.modelClient = modelClient;
        // Initialize with 4K token window for GPT-4
        this.memory = TokenWindowChatMemory.builder()
            .maxTokens(4000)
            .tokenizer(new SimpleTokenizer())
            .store(new InMemoryChatMemoryStore())
            .build();
    }

    public String handleCustomerQuery(String chatId, String query) {
        // Add system message if it's a new conversation
        if (memory.messages().isEmpty()) {
            Message systemMsg = Message.builder()
                .id(UUID.randomUUID().toString())
                .chatId(chatId)
                .content("You are a helpful customer support agent for an e-commerce platform. Be professional, empathetic, and solution-oriented.")
                .messageType(ChatMessageType.SYSTEM)
                .build();
            memory.add(systemMsg);
        }

        // Check for similar previous queries
        String similarResponse = findSimilarResponse(query);
        if (similarResponse != null) {
            return similarResponse;
        }

        // Add user message
        Message userMsg = Message.builder()
            .id(UUID.randomUUID().toString())
            .chatId(chatId)
            .content(query)
            .messageType(ChatMessageType.USER)
            .requestInitTime(LocalDateTime.now())
            .build();
        memory.add(userMsg);

        // Generate response
        ModelTextRequest request = ModelTextRequest.builder()
            .messages(memory.messages())
            .temperature(0.7)
            .maxTokens(500)
            .build();

        ModelTextResponse response = modelClient.textToText(request);
        String aiResponse = response.getChoices().get(0).getMessage().getContent();

        // Add AI response to memory
        Message aiMsg = Message.builder()
            .id(UUID.randomUUID().toString())
            .chatId(chatId)
            .content(aiResponse)
            .messageType(ChatMessageType.AI)
            .responseTime(LocalDateTime.now())
            .build();
        memory.add(aiMsg);

        return aiResponse;
    }

    private String findSimilarResponse(String query) {
        // Look for similar queries in past conversations
        List<ChatItem> userMessages = memory.findByType(ChatMessageType.USER);

        for (ChatItem msg : userMessages) {
            double sim = TextSimilarityUtil.combinedSimilarity(query, msg.getContent());
            if (sim > 0.85) {
                // Find the AI response that followed this message
                // Return it as a cached response
                return "Based on a similar query...";
            }
        }
        return null;
    }
}
```

### 2. Document Intelligence System for Legal Contracts

This example shows how to build a document analysis system for legal contracts using document splitting and AI processing.

```java
@Service
public class LegalContractAnalyzer {
    private final DocumentSplitter splitter;
    private final ModelClient modelClient;
    private final Map<String, List<String>> contractClauses = new HashMap<>();

    public LegalContractAnalyzer(ModelClient modelClient) {
        this.modelClient = modelClient;
        this.splitter = DocumentSplitter.builder()
            .maxChunkSize(1024)  // Larger chunks for better context
            .overlapSize(100)    // Overlap to maintain clause continuity
            .tokenizer(new SimpleTokenizer())
            .build();
    }

    public ContractAnalysis analyzeContract(String contractId, String contractText) {
        // Split contract into analyzable chunks
        List<String> chunks = splitter.split(contractText);
        contractClauses.put(contractId, chunks);

        ContractAnalysis analysis = new ContractAnalysis();
        analysis.setContractId(contractId);
        analysis.setTotalClauses(chunks.size());

        // Analyze each chunk for specific legal elements
        for (int i = 0; i < chunks.size(); i++) {
            String chunk = chunks.get(i);

            // Extract key information from each chunk
            ClauseAnalysis clauseAnalysis = analyzeClause(chunk, i);
            analysis.addClause(clauseAnalysis);

            // Check for risk factors
            if (containsRiskIndicators(chunk)) {
                analysis.addRiskFlag(new RiskFlag(i, chunk, assessRiskLevel(chunk)));
            }
        }

        // Generate executive summary
        analysis.setSummary(generateExecutiveSummary(analysis));

        return analysis;
    }

    private ClauseAnalysis analyzeClause(String clause, int index) {
        // Use AI to categorize and extract key information
        String prompt = """
            Analyze this legal clause and provide:
            1. Clause type (e.g., payment terms, liability, termination)
            2. Key obligations
            3. Important dates or deadlines
            4. Parties involved

            Clause: """ + clause;

        ModelTextRequest request = ModelTextRequest.builder()
            .prompt(prompt)
            .temperature(0.1)  // Low temperature for factual analysis
            .maxTokens(300)
            .jsonResponse(true)
            .build();

        ModelTextResponse response = modelClient.textToText(request);

        // Parse the structured response
        return parseClauseAnalysis(response.getChoices().get(0).getMessage().getContent());
    }

    private boolean containsRiskIndicators(String text) {
        String[] riskKeywords = {
            "unlimited liability", "indemnification", "penalty",
            "liquidated damages", "non-compete", "exclusivity"
        };

        String normalizedText = text.toLowerCase();
        return Arrays.stream(riskKeywords)
            .anyMatch(normalizedText::contains);
    }

    public List<String> findSimilarClauses(String contractId, String searchClause) {
        List<String> clauses = contractClauses.get(contractId);
        if (clauses == null) return Collections.emptyList();

        return clauses.stream()
            .filter(clause -> TextSimilarityUtil.combinedSimilarity(searchClause, clause) > 0.7)
            .collect(Collectors.toList());
    }
}

@Data
class ContractAnalysis {
    private String contractId;
    private int totalClauses;
    private List<ClauseAnalysis> clauses = new ArrayList<>();
    private List<RiskFlag> riskFlags = new ArrayList<>();
 950      private String summary;
 951      
 952      public void addClause(ClauseAnalysis clause) {
 953          clauses.add(clause);
 954      }
 955      
 956      public void addRiskFlag(RiskFlag flag) {
 957          riskFlags.add(flag);
 958      }
 959  }
 960  ```
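
The similarity filter in `findSimilarClauses` can be sketched in isolation. The snippet below is a minimal, self-contained illustration of the same keep-clauses-above-a-threshold shape; `jaccard` (word-set Jaccard similarity) is a stand-in assumption for `TextSimilarityUtil.combinedSimilarity`, which combines several metrics internally.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Minimal sketch of the threshold-based clause matching used above.
public class ClauseSimilaritySketch {

    // Jaccard similarity over lowercase word sets: |A ∩ B| / |A ∪ B|.
    static double jaccard(String a, String b) {
        Set<String> setA = tokenize(a);
        Set<String> setB = tokenize(b);
        if (setA.isEmpty() && setB.isEmpty()) return 1.0;
        Set<String> intersection = new HashSet<>(setA);
        intersection.retainAll(setB);
        Set<String> union = new HashSet<>(setA);
        union.addAll(setB);
        return (double) intersection.size() / union.size();
    }

    static Set<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
            .filter(t -> !t.isEmpty())
            .collect(Collectors.toSet());
    }

    // Same shape as findSimilarClauses: keep clauses scoring above a threshold.
    static List<String> findSimilar(List<String> clauses, String query, double threshold) {
        return clauses.stream()
            .filter(c -> jaccard(query, c) >= threshold)
            .collect(Collectors.toList());
    }
}
```

The 0.7 threshold in the analyzer plays the same role as `threshold` here: raising it trades recall for precision when matching near-duplicate clauses.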

### 3. Multi-Language Content Processing Pipeline

This example demonstrates handling multi-language content with automatic translation and cultural adaptation.

```java
@Service
public class MultiLanguageContentProcessor {
    private final VariableExtractor extractor;
    private final ModelClient modelClient;
    private final Map<Language, Map<String, String>> translations = new HashMap<>();

    public MultiLanguageContentProcessor(VariableExtractor extractor, ModelClient modelClient) {
        this.extractor = extractor;
        this.modelClient = modelClient;
    }

    public ProcessedContent processContent(String template, Language targetLanguage, Map<String, Object> data) {
        // Extract all variables from template
        Set<String> variables = extractor.extractVariables(template);
        Set<String> conditionals = extractor.extractConditionalVariables(template);

        // Validate all required variables are present
        for (String var : variables) {
            if (!data.containsKey(var) && !conditionals.contains(var)) {
                throw new IllegalArgumentException("Missing required variable: " + var);
            }
        }

        // Process template with language-specific adaptations
        String processed = processTemplate(template, targetLanguage, data);

        // Apply cultural adaptations
        processed = applyCulturalAdaptations(processed, targetLanguage);

        // Generate language-specific metadata
        ProcessedContent content = new ProcessedContent();
        content.setContent(processed);
        content.setLanguage(targetLanguage);
        content.setVariablesUsed(variables);
        content.setProcessingTime(LocalDateTime.now());

        return content;
    }

    private String processTemplate(String template, Language language, Map<String, Object> data) {
        // First, handle language-specific number and date formatting
        Map<String, Object> localizedData = localizeData(data, language);

        // Process the template
        String processed = TemplateEngine.renderTemplate(template, localizedData);

        // Translate if needed
        if (language != Language.ENGLISH) {
            processed = translateContent(processed, language);
        }

        return processed;
    }

    private Map<String, Object> localizeData(Map<String, Object> data, Language language) {
        Map<String, Object> localized = new HashMap<>(data);

        // Format numbers and dates based on locale
        data.forEach((key, value) -> {
            if (value instanceof Number) {
                localized.put(key, formatNumber((Number) value, language));
            } else if (value instanceof LocalDateTime) {
                localized.put(key, formatDate((LocalDateTime) value, language));
            }
        });

        return localized;
    }

    private String translateContent(String content, Language targetLanguage) {
        // Use AI for context-aware translation
        String prompt = String.format(
            "Translate the following content to %s. Maintain the tone and context:\n\n%s",
            targetLanguage.name(), content
        );

        ModelTextRequest request = ModelTextRequest.builder()
            .prompt(prompt)
            .temperature(0.3)
            .maxTokens(1000)
            .build();

        ModelTextResponse response = modelClient.textToText(request);
        return response.getContent();
    }

    public void preloadTranslations(String key, Map<Language, String> entries) {
        entries.forEach((lang, translation) ->
            this.translations.computeIfAbsent(lang, k -> new HashMap<>())
                .put(key, translation));
    }
}
```
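
The `formatNumber`/`formatDate` helpers referenced by `localizeData` are left to the implementer. A minimal sketch using the JDK's own locale support is shown below; mapping DriftKit's `Language` enum to a `java.util.Locale` is an assumption here, since the enum may expose its own locale handling.

```java
import java.text.NumberFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

// Sketch of per-locale formatting, assuming the caller has already
// resolved a Language value to a java.util.Locale.
public class LocalizationSketch {

    // Grouping and decimal separators follow the locale, e.g.
    // 1234.5 -> "1,234.5" (US) vs "1.234,5" (Germany).
    static String formatNumber(Number value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Medium-style localized date, e.g. "Jan 15, 2024" for en-US.
    static String formatDate(LocalDate date, Locale locale) {
        return date.format(
            DateTimeFormatter.ofLocalizedDate(FormatStyle.MEDIUM).withLocale(locale));
    }
}
```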

### 4. Intelligent Task Routing System

This example shows how to build an intelligent task routing system using AITask and workflow context.

```java
@Service
public class IntelligentTaskRouter {
    private final Map<String, WorkflowHandler> handlers = new HashMap<>();
    private final ModelClient modelClient;

    public IntelligentTaskRouter(ModelClient modelClient) {
        this.modelClient = modelClient;
    }

    public TaskResult routeTask(AITask task) {
        // Analyze task to determine best workflow
        String workflowId = determineWorkflow(task);
        task.setWorkflowId(workflowId);

        // Get appropriate handler
        WorkflowHandler handler = handlers.get(workflowId);
        if (handler == null) {
            task.setErrorMessage("No handler found for workflow: " + workflowId);
            task.setCompletedAt(LocalDateTime.now());
            return TaskResult.failure(task);
        }

        try {
            // Execute task with context preservation
            TaskResult result = handler.execute(task);

            // Update task with results
            task.setCompletedAt(LocalDateTime.now());
            task.setGrade(evaluateResult(result));

            return result;
        } catch (Exception e) {
            task.setErrorMessage("Task execution failed: " + e.getMessage());
            task.setCompletedAt(LocalDateTime.now());
            task.setGrade(Grade.POOR);
            return TaskResult.failure(task);
        }
    }

    private String determineWorkflow(AITask task) {
        // Use AI to classify the task
        String classificationPrompt = buildClassificationPrompt(task);

        ModelTextRequest request = ModelTextRequest.builder()
            .prompt(classificationPrompt)
            .temperature(0.1)
            .maxTokens(50)
            .jsonResponse(true)
            .build();

        ModelTextResponse response = modelClient.textToText(request);

        // Parse workflow ID from response, falling back to "default"
        JsonNode result = JsonUtils.parseJson(response.getContent()).orElse(null);
        return result != null && result.hasNonNull("workflow")
            ? result.get("workflow").asText()
            : "default";
    }

    private String buildClassificationPrompt(AITask task) {
        return String.format("""
            Classify this task into one of the following workflows:
            - customer-service: Customer inquiries and support
            - content-generation: Creating marketing or educational content
            - data-analysis: Analyzing data and generating reports
            - document-processing: Processing and extracting information from documents

            Task details:
            Prompt: %s
            Has Images: %s
            Has Audio: %s
            Variables: %s

            Respond with JSON: {"workflow": "workflow-id", "confidence": 0.0-1.0}
            """,
            task.getPrompt(),
            task.getImageUrls() != null && !task.getImageUrls().isEmpty(),
            task.getAudioUrl() != null,
            task.getVariables()
        );
    }

    private Grade evaluateResult(TaskResult result) {
        if (!result.isSuccess()) return Grade.POOR;

        double score = result.getConfidenceScore();
        if (score >= 0.9) return Grade.EXCELLENT;
        if (score >= 0.7) return Grade.GOOD;
        if (score >= 0.5) return Grade.AVERAGE;
        return Grade.POOR;
    }

    public void registerHandler(String workflowId, WorkflowHandler handler) {
        handlers.put(workflowId, handler);
    }
}

interface WorkflowHandler {
    TaskResult execute(AITask task);
}

@Data
class TaskResult {
    private boolean success;
    private String output;
    private double confidenceScore;
    private Map<String, Object> metadata;

    public static TaskResult failure(AITask task) {
        TaskResult result = new TaskResult();
        result.setSuccess(false);
        result.setOutput(task.getErrorMessage());
        result.setConfidenceScore(0.0);
        return result;
    }
}
```
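
The registry-plus-fallback pattern at the heart of the router can be shown in a self-contained form. In this sketch the handler is reduced to a plain `Function<String, String>` (the real `WorkflowHandler` operates on `AITask`), and unknown workflow IDs fall through to a default handler rather than failing, mirroring the `"default"` branch in `determineWorkflow`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Simplified sketch of the router's registry-plus-fallback dispatch.
public class RoutingSketch {
    private final Map<String, Function<String, String>> handlers = new HashMap<>();
    private final Function<String, String> fallback;

    public RoutingSketch(Function<String, String> fallback) {
        this.fallback = fallback;
    }

    public void register(String workflowId, Function<String, String> handler) {
        handlers.put(workflowId, handler);
    }

    // Dispatch to the registered handler, or to the fallback when the
    // classified workflow ID is unknown.
    public String route(String workflowId, String input) {
        return handlers.getOrDefault(workflowId, fallback).apply(input);
    }
}
```

Keeping a fallback handler registered makes the router robust to model misclassifications: a bad workflow ID degrades to default handling instead of a hard failure.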

### 5. Intelligent Chat Memory Optimization

This example shows advanced memory management for long-running conversations.

```java
@Service
public class OptimizedChatMemoryService {
    private final Map<String, TokenWindowChatMemory> userMemories = new ConcurrentHashMap<>();
    private final ModelClient modelClient;

    public OptimizedChatMemoryService(ModelClient modelClient) {
        this.modelClient = modelClient;
    }

    public String processUserMessage(String userId, String message) {
        TokenWindowChatMemory memory = getUserMemory(userId);

        // Add user message
        Message userMsg = Message.builder()
            .id(UUID.randomUUID().toString())
            .chatId(userId)
            .content(message)
            .messageType(ChatMessageType.USER)
            .requestInitTime(LocalDateTime.now())
            .build();
        memory.add(userMsg);

        // Check if we need to summarize old messages
        if (shouldSummarize(memory)) {
            summarizeOldMessages(memory);
        }

        // Generate response with optimized context
        String response = generateResponse(memory);

        // Add AI response
        Message aiMsg = Message.builder()
            .id(UUID.randomUUID().toString())
            .chatId(userId)
            .content(response)
            .messageType(ChatMessageType.AI)
            .responseTime(LocalDateTime.now())
            .build();
        memory.add(aiMsg);

        return response;
    }

    private TokenWindowChatMemory getUserMemory(String userId) {
        return userMemories.computeIfAbsent(userId, k ->
            TokenWindowChatMemory.builder()
                .maxTokens(3500)  // Leave room for response
                .tokenizer(new SimpleTokenizer())
                .store(new InMemoryChatMemoryStore())
                .build()
        );
    }

    private boolean shouldSummarize(TokenWindowChatMemory memory) {
        // Summarize once we exceed 10 message pairs (20 messages)
        long messageCount = memory.messages().stream()
            .filter(m -> m.getMessageType() != ChatMessageType.SYSTEM)
            .count();
        return messageCount > 20;
    }

    private void summarizeOldMessages(TokenWindowChatMemory memory) {
        List<ChatItem> messages = memory.messages();

        // Summarize everything except the system message and the last 10 messages
        List<ChatItem> toSummarize = messages.stream()
            .filter(m -> m.getMessageType() != ChatMessageType.SYSTEM)
            .limit(messages.size() - 10)
            .collect(Collectors.toList());

        if (toSummarize.isEmpty()) return;

        // Generate summary
        String summaryPrompt = buildSummaryPrompt(toSummarize);
        ModelTextRequest request = ModelTextRequest.builder()
            .prompt(summaryPrompt)
            .temperature(0.3)
            .maxTokens(500)
            .build();

        ModelTextResponse response = modelClient.textToText(request);
        String summary = response.getContent();

        // Create summary message
        Message summaryMsg = Message.builder()
            .id(UUID.randomUUID().toString())
            .chatId(memory.messages().get(0).getChatId())
            .content("Previous conversation summary: " + summary)
            .messageType(ChatMessageType.SYSTEM)
            .createdTime(System.currentTimeMillis())
            .build();

        // Clear old messages and add summary
        memory.clear();

        // Re-add system message if exists
        messages.stream()
            .filter(m -> m.getMessageType() == ChatMessageType.SYSTEM)
            .findFirst()
            .ifPresent(memory::add);

        // Add summary
        memory.add(summaryMsg);

        // Add recent messages
        messages.stream()
            .skip(Math.max(0, messages.size() - 10))
            .forEach(memory::add);
    }

    private String buildSummaryPrompt(List<ChatItem> messages) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Summarize the following conversation, keeping key points and context:\n\n");

        for (ChatItem msg : messages) {
            String role = msg.getMessageType() == ChatMessageType.USER ? "User" : "Assistant";
            prompt.append(role).append(": ").append(msg.getContent()).append("\n\n");
        }

        prompt.append("Provide a concise summary that preserves important information and context.");
        return prompt.toString();
    }
}
```
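
The sliding-window behavior that `TokenWindowChatMemory` provides can be illustrated with a minimal stand-in. The sketch below approximates token counting as whitespace-separated words (the real memory delegates to its configured tokenizer, e.g. `SimpleTokenizer`) and evicts the oldest messages once the budget is exceeded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Minimal sketch of a token-bounded sliding message window.
public class TokenWindowSketch {
    private final Deque<String> messages = new ArrayDeque<>();
    private final int maxTokens;

    public TokenWindowSketch(int maxTokens) {
        this.maxTokens = maxTokens;
    }

    // Crude token count: whitespace-separated words.
    static int countTokens(String text) {
        return text.isBlank() ? 0 : text.trim().split("\\s+").length;
    }

    // Append a message, then evict the oldest messages until the
    // window fits the token budget again (always keeping at least one).
    public void add(String message) {
        messages.addLast(message);
        while (totalTokens() > maxTokens && messages.size() > 1) {
            messages.removeFirst();
        }
    }

    public int totalTokens() {
        return messages.stream().mapToInt(TokenWindowSketch::countTokens).sum();
    }

    public List<String> messages() {
        return new ArrayList<>(messages);
    }
}
```

Summarization, as in `summarizeOldMessages` above, is the natural complement to plain eviction: instead of losing the oldest turns outright, they are compressed into a single system message before the window drops them.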

This documentation covers the major components, usage patterns, and integration points of the `driftkit-common` module, which serves as the foundation for building AI applications with robust chat management, document processing, and text analysis capabilities.