---
title: Generators
id: generators-api
description: Enables text generation using LLMs.
slug: "/generators-api"
---

<a id="azure"></a>

# Module azure

<a id="azure.AzureOpenAIGenerator"></a>

## AzureOpenAIGenerator

Generates text using OpenAI's large language models (LLMs).

It works with gpt-4-type models and supports streaming responses
from the OpenAI API.

You can customize how the text is generated by passing parameters to the
OpenAI API. Use the `**generation_kwargs` argument when you initialize
the component or when you run it. Any parameter that works with
`openai.ChatCompletion.create` works here too.

For details on OpenAI API parameters, see the
[OpenAI documentation](https://platform.openai.com/docs/api-reference/chat).

### Usage example

```python
from haystack.components.generators import AzureOpenAIGenerator
from haystack.utils import Secret

client = AzureOpenAIGenerator(
    azure_endpoint="<Your Azure endpoint, e.g. `https://your-company.azure.openai.com/`>",
    api_key=Secret.from_token("<your-api-key>"),
    azure_deployment="<the model name, e.g. gpt-4o-mini>")
response = client.run("What's Natural Language Processing? Be brief.")
print(response)
```

```
>> {'replies': ['Natural Language Processing (NLP) is a branch of artificial intelligence that focuses on
>> the interaction between computers and human language. It involves enabling computers to understand, interpret,
>> and respond to natural human language in a way that is both meaningful and useful.'], 'meta': [{'model':
>> 'gpt-4o-mini', 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 16,
>> 'completion_tokens': 49, 'total_tokens': 65}}]}
```

<a id="azure.AzureOpenAIGenerator.__init__"></a>

#### AzureOpenAIGenerator.\_\_init\_\_

```python
def __init__(azure_endpoint: Optional[str] = None,
             api_version: Optional[str] = "2023-05-15",
             azure_deployment: Optional[str] = "gpt-4o-mini",
             api_key: Optional[Secret] = Secret.from_env_var(
                 "AZURE_OPENAI_API_KEY", strict=False),
             azure_ad_token: Optional[Secret] = Secret.from_env_var(
                 "AZURE_OPENAI_AD_TOKEN", strict=False),
             organization: Optional[str] = None,
             streaming_callback: Optional[StreamingCallbackT] = None,
             system_prompt: Optional[str] = None,
             timeout: Optional[float] = None,
             max_retries: Optional[int] = None,
             http_client_kwargs: Optional[dict[str, Any]] = None,
             generation_kwargs: Optional[dict[str, Any]] = None,
             default_headers: Optional[dict[str, str]] = None,
             *,
             azure_ad_token_provider: Optional[AzureADTokenProvider] = None)
```

Initialize the Azure OpenAI Generator.

**Arguments**:

- `azure_endpoint`: The endpoint of the deployed model, for example `https://example-resource.azure.openai.com/`.
- `api_version`: The version of the API to use. Defaults to 2023-05-15.
- `azure_deployment`: The deployment of the model, usually the model name.
- `api_key`: The API key to use for authentication.
- `azure_ad_token`: [Azure Active Directory token](https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id).
- `organization`: Your organization ID, defaults to `None`. For help, see
[Setting up your organization](https://platform.openai.com/docs/guides/production-best-practices/setting-up-your-organization).
- `streaming_callback`: A callback function called when a new token is received from the stream.
It accepts [StreamingChunk](https://docs.haystack.deepset.ai/docs/data-classes#streamingchunk)
as an argument.
- `system_prompt`: The system prompt to use for text generation. If not provided, the system prompt is
omitted, and the default system prompt of the model is used.
- `timeout`: Timeout for the AzureOpenAI client. If not set, it is inferred from the
`OPENAI_TIMEOUT` environment variable or set to 30.
- `max_retries`: Maximum number of retries to contact AzureOpenAI after an internal error.
If not set, it is inferred from the `OPENAI_MAX_RETRIES` environment variable or set to 5.
- `http_client_kwargs`: A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`.
For more information, see the [HTTPX documentation](https://www.python-httpx.org/api/#clients).
- `generation_kwargs`: Other parameters to use for the model, sent directly to
the OpenAI endpoint. See the [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat) for
more details.
Some of the supported parameters:
  - `max_tokens`: The maximum number of tokens the output text can have.
  - `temperature`: The sampling temperature to use. Higher values mean the model takes more risks.
  Try 0.9 for more creative applications and 0 (argmax sampling) for ones with a well-defined answer.
  - `top_p`: An alternative to sampling with temperature, called nucleus sampling, where the model
  considers the results of the tokens with top_p probability mass. For example, 0.1 means only the tokens
  comprising the top 10% probability mass are considered.
  - `n`: The number of completions to generate for each prompt. For example, with 3 prompts and n=2,
  the LLM generates two completions per prompt, resulting in 6 completions in total.
  - `stop`: One or more sequences after which the LLM should stop generating tokens.
  - `presence_penalty`: The penalty applied if a token is already present.
  Higher values make the model less likely to repeat the token.
  - `frequency_penalty`: The penalty applied if a token has already been generated.
  Higher values make the model less likely to repeat the token.
  - `logit_bias`: Adds a logit bias to specific tokens. The keys of the dictionary are tokens, and the
  values are the bias to add to that token.
- `default_headers`: Default headers to use for the AzureOpenAI client.
- `azure_ad_token_provider`: A function that returns an Azure Active Directory token, invoked on
every request.

<a id="azure.AzureOpenAIGenerator.to_dict"></a>

#### AzureOpenAIGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serialize this component to a dictionary.

**Returns**:

The serialized component as a dictionary.

<a id="azure.AzureOpenAIGenerator.from_dict"></a>

#### AzureOpenAIGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "AzureOpenAIGenerator"
```

Deserialize this component from a dictionary.

**Arguments**:

- `data`: The dictionary representation of this component.

**Returns**:

The deserialized component instance.

<a id="azure.AzureOpenAIGenerator.run"></a>

#### AzureOpenAIGenerator.run

```python
@component.output_types(replies=list[str], meta=list[dict[str, Any]])
def run(prompt: str,
        system_prompt: Optional[str] = None,
        streaming_callback: Optional[StreamingCallbackT] = None,
        generation_kwargs: Optional[dict[str, Any]] = None)
```

Invoke the text generation inference based on the provided prompt and generation parameters.
**Arguments**:

- `prompt`: The string prompt to use for text generation.
- `system_prompt`: The system prompt to use for text generation. If this runtime system prompt is omitted,
the system prompt defined at initialization time, if any, is used.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters
can override the parameters passed in the `__init__` method. For more details on the parameters supported by the OpenAI API, refer to
the OpenAI [documentation](https://platform.openai.com/docs/api-reference/chat/create).

**Returns**:

A list of strings containing the generated responses and a list of dictionaries containing the metadata
for each response.

<a id="hugging_face_local"></a>

# Module hugging\_face\_local

<a id="hugging_face_local.HuggingFaceLocalGenerator"></a>

## HuggingFaceLocalGenerator

Generates text using models from Hugging Face that run locally.

LLMs running locally may need powerful hardware.
### Usage example

```python
from haystack.components.generators import HuggingFaceLocalGenerator

generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    generation_kwargs={"max_new_tokens": 100, "temperature": 0.9})

generator.warm_up()

print(generator.run("Who is the best American actor?"))
# {'replies': ['John Cusack']}
```

<a id="hugging_face_local.HuggingFaceLocalGenerator.__init__"></a>

#### HuggingFaceLocalGenerator.\_\_init\_\_

```python
def __init__(model: str = "google/flan-t5-base",
             task: Optional[Literal["text-generation",
                                    "text2text-generation"]] = None,
             device: Optional[ComponentDevice] = None,
             token: Optional[Secret] = Secret.from_env_var(
                 ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
             generation_kwargs: Optional[dict[str, Any]] = None,
             huggingface_pipeline_kwargs: Optional[dict[str, Any]] = None,
             stop_words: Optional[list[str]] = None,
             streaming_callback: Optional[StreamingCallbackT] = None)
```

Creates an instance of a HuggingFaceLocalGenerator.

**Arguments**:

- `model`: The Hugging Face text generation model name or path.
- `task`: The task for the Hugging Face pipeline. Possible options:
  - `text-generation`: Supported by decoder models, like GPT.
  - `text2text-generation`: Supported by encoder-decoder models, like T5.
If the task is specified in `huggingface_pipeline_kwargs`, this parameter is ignored.
If not specified, the component calls the Hugging Face API to infer the task from the model name.
- `device`: The device for loading the model. If `None`, automatically selects the default device.
If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
- `token`: The token to use as HTTP bearer authorization for remote files.
If the token is specified in `huggingface_pipeline_kwargs`, this parameter is ignored.
- `generation_kwargs`: A dictionary with keyword arguments to customize text generation.
Some examples: `max_length`, `max_new_tokens`, `temperature`, `top_k`, `top_p`.
See Hugging Face's documentation for more information:
  - [customize-text-generation](https://huggingface.co/docs/transformers/main/en/generation_strategies#customize-text-generation)
  - [transformers.GenerationConfig](https://huggingface.co/docs/transformers/main/en/main_classes/text_generation#transformers.GenerationConfig)
- `huggingface_pipeline_kwargs`: Dictionary with keyword arguments to initialize the
Hugging Face pipeline for text generation.
These keyword arguments provide fine-grained control over the Hugging Face pipeline.
In case of duplication, these kwargs override the `model`, `task`, `device`, and `token` init parameters.
For available kwargs, see the [Hugging Face documentation](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.pipeline.task).
In this dictionary, you can also include `model_kwargs` to specify the kwargs for model initialization:
[transformers.PreTrainedModel.from_pretrained](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained)
- `stop_words`: If the model generates a stop word, the generation stops.
If you provide this parameter, don't specify `stopping_criteria` in `generation_kwargs`.
For some chat models, the output includes both the new text and the original prompt.
In these cases, make sure your prompt has no stop words.
- `streaming_callback`: An optional callable for handling streaming responses.

<a id="hugging_face_local.HuggingFaceLocalGenerator.warm_up"></a>

#### HuggingFaceLocalGenerator.warm\_up

```python
def warm_up()
```

Initializes the component.
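A `streaming_callback` is just a callable invoked once per generated chunk; in Haystack it receives a `StreamingChunk` whose `content` attribute holds the newly generated text. The pattern can be sketched with plain strings standing in for chunk objects (a simplified illustration, not the actual streaming machinery):

```python
# Minimal sketch of the streaming-callback pattern. Plain strings stand in
# for Haystack's StreamingChunk objects (whose `content` attribute carries
# the newly generated text).
collected: list[str] = []

def print_and_collect(chunk_content: str) -> None:
    # A real callback might print tokens as they arrive; here we collect them.
    collected.append(chunk_content)

# Simulate a stream arriving one token at a time.
for token in ["Natural ", "Language ", "Processing"]:
    print_and_collect(token)

full_reply = "".join(collected)
```

In real use, you would pass such a function as `streaming_callback=...` either at initialization or to `run()`.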
<a id="hugging_face_local.HuggingFaceLocalGenerator.to_dict"></a>

#### HuggingFaceLocalGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns**:

Dictionary with serialized data.

<a id="hugging_face_local.HuggingFaceLocalGenerator.from_dict"></a>

#### HuggingFaceLocalGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "HuggingFaceLocalGenerator"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: The dictionary to deserialize from.

**Returns**:

The deserialized component.

<a id="hugging_face_local.HuggingFaceLocalGenerator.run"></a>

#### HuggingFaceLocalGenerator.run

```python
@component.output_types(replies=list[str])
def run(prompt: str,
        streaming_callback: Optional[StreamingCallbackT] = None,
        generation_kwargs: Optional[dict[str, Any]] = None)
```

Run the text generation model on the given prompt.

**Arguments**:

- `prompt`: A string representing the prompt.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
- `generation_kwargs`: Additional keyword arguments for text generation.

**Returns**:

A dictionary containing the generated replies.
- replies: A list of strings representing the generated replies.

<a id="hugging_face_api"></a>

# Module hugging\_face\_api

<a id="hugging_face_api.HuggingFaceAPIGenerator"></a>

## HuggingFaceAPIGenerator

Generates text using Hugging Face APIs.
Use it with the following Hugging Face APIs:
- [Paid Inference Endpoints](https://huggingface.co/inference-endpoints)
- [Self-hosted Text Generation Inference](https://github.com/huggingface/text-generation-inference)

**Note:** As of July 2025, the Hugging Face Inference API no longer offers generative models through the
`text_generation` endpoint. Generative models are now only available through providers supporting the
`chat_completion` endpoint. As a result, this component might no longer work with the Hugging Face Inference API.
Use the `HuggingFaceAPIChatGenerator` component, which supports the `chat_completion` endpoint, instead.

### Usage examples

#### With Hugging Face Inference Endpoints

```python
from haystack.components.generators import HuggingFaceAPIGenerator
from haystack.utils import Secret

generator = HuggingFaceAPIGenerator(api_type="inference_endpoints",
                                    api_params={"url": "<your-inference-endpoint-url>"},
                                    token=Secret.from_token("<your-api-key>"))

result = generator.run(prompt="What's Natural Language Processing?")
print(result)
```

#### With self-hosted text generation inference

```python
from haystack.components.generators import HuggingFaceAPIGenerator

generator = HuggingFaceAPIGenerator(api_type="text_generation_inference",
                                    api_params={"url": "http://localhost:8080"})

result = generator.run(prompt="What's Natural Language Processing?")
print(result)
```

#### With the free serverless inference API

Be aware that this example might not work, as the Hugging Face Inference API no longer offers models that support
the `text_generation` endpoint. Use the `HuggingFaceAPIChatGenerator` for generative models through the
`chat_completion` endpoint.

```python
from haystack.components.generators import HuggingFaceAPIGenerator
from haystack.utils import Secret

generator = HuggingFaceAPIGenerator(api_type="serverless_inference_api",
                                    api_params={"model": "HuggingFaceH4/zephyr-7b-beta"},
                                    token=Secret.from_token("<your-api-key>"))

result = generator.run(prompt="What's Natural Language Processing?")
print(result)
```

<a id="hugging_face_api.HuggingFaceAPIGenerator.__init__"></a>

#### HuggingFaceAPIGenerator.\_\_init\_\_

```python
def __init__(api_type: Union[HFGenerationAPIType, str],
             api_params: dict[str, str],
             token: Optional[Secret] = Secret.from_env_var(
                 ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
             generation_kwargs: Optional[dict[str, Any]] = None,
             stop_words: Optional[list[str]] = None,
             streaming_callback: Optional[StreamingCallbackT] = None)
```

Initialize the HuggingFaceAPIGenerator instance.

**Arguments**:

- `api_type`: The type of Hugging Face API to use. Available types:
  - `text_generation_inference`: See [TGI](https://github.com/huggingface/text-generation-inference).
  - `inference_endpoints`: See [Inference Endpoints](https://huggingface.co/inference-endpoints).
  - `serverless_inference_api`: See [Serverless Inference API](https://huggingface.co/inference-api).
  This might no longer work due to changes in the models offered in the Hugging Face Inference API.
  Use the `HuggingFaceAPIChatGenerator` component instead.
- `api_params`: A dictionary with the following keys:
  - `model`: Hugging Face model ID. Required when `api_type` is `SERVERLESS_INFERENCE_API`.
  - `url`: URL of the inference endpoint. Required when `api_type` is `INFERENCE_ENDPOINTS` or
  `TEXT_GENERATION_INFERENCE`.
  - Other parameters specific to the chosen API type, such as `timeout`, `headers`, and `provider`.
- `token`: The Hugging Face token to use as HTTP bearer authorization.
Check your HF token in your [account settings](https://huggingface.co/settings/tokens).
- `generation_kwargs`: A dictionary with keyword arguments to customize text generation. Some examples: `max_new_tokens`,
`temperature`, `top_k`, `top_p`.
For details, see the [Hugging Face documentation](https://huggingface.co/docs/huggingface_hub/en/package_reference/inference_client#huggingface_hub.InferenceClient.text_generation).
- `stop_words`: An optional list of strings representing the stop words.
- `streaming_callback`: An optional callable for handling streaming responses.

<a id="hugging_face_api.HuggingFaceAPIGenerator.to_dict"></a>

#### HuggingFaceAPIGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serialize this component to a dictionary.

**Returns**:

A dictionary containing the serialized component.
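`to_dict` and its `from_dict` counterpart implement a round-trip contract: serializing a component and deserializing the result yields an equivalently configured instance. A stand-in class illustrates the idea (the dictionary shape mirrors Haystack's `type`/`init_parameters` layout, but this is a simplified sketch, not the actual implementation):

```python
from typing import Any, Optional

class StubGenerator:
    """Stand-in component illustrating the serialization round-trip."""

    def __init__(self, model: str = "some-model",
                 generation_kwargs: Optional[dict[str, Any]] = None):
        self.model = model
        self.generation_kwargs = generation_kwargs or {}

    def to_dict(self) -> dict[str, Any]:
        # Serialized components name the class ("type") and record the
        # constructor arguments ("init_parameters").
        return {"type": "StubGenerator",
                "init_parameters": {"model": self.model,
                                    "generation_kwargs": self.generation_kwargs}}

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "StubGenerator":
        return cls(**data["init_parameters"])

original = StubGenerator(model="my-model", generation_kwargs={"max_new_tokens": 50})
restored = StubGenerator.from_dict(original.to_dict())
# restored is configured identically to original
```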
<a id="hugging_face_api.HuggingFaceAPIGenerator.from_dict"></a>

#### HuggingFaceAPIGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "HuggingFaceAPIGenerator"
```

Deserialize this component from a dictionary.

<a id="hugging_face_api.HuggingFaceAPIGenerator.run"></a>

#### HuggingFaceAPIGenerator.run

```python
@component.output_types(replies=list[str], meta=list[dict[str, Any]])
def run(prompt: str,
        streaming_callback: Optional[StreamingCallbackT] = None,
        generation_kwargs: Optional[dict[str, Any]] = None)
```

Invoke the text generation inference for the given prompt and generation parameters.

**Arguments**:

- `prompt`: A string representing the prompt.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
- `generation_kwargs`: Additional keyword arguments for text generation.

**Returns**:

A dictionary with the generated replies and metadata. Both are lists of length n.
- replies: A list of strings representing the generated replies.

<a id="openai"></a>

# Module openai

<a id="openai.OpenAIGenerator"></a>

## OpenAIGenerator

Generates text using OpenAI's large language models (LLMs).

It works with the gpt-4 and o-series models and supports streaming responses
from the OpenAI API. It uses strings as input and output.

You can customize how the text is generated by passing parameters to the
OpenAI API. Use the `**generation_kwargs` argument when you initialize
the component or when you run it. Any parameter that works with
`openai.ChatCompletion.create` works here too.

For details on OpenAI API parameters, see the
[OpenAI documentation](https://platform.openai.com/docs/api-reference/chat).
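`generation_kwargs` supplied at run time take precedence over those given at initialization. Conceptually, the precedence rule behaves like a dictionary update (a simplified sketch, not the component's actual code):

```python
# kwargs given when the component was created
init_kwargs = {"temperature": 0.7, "max_tokens": 100}
# kwargs given to run(); these win on conflict
runtime_kwargs = {"temperature": 0.2}

# Later keys override earlier ones, so runtime values take precedence
# while unspecified init values are kept.
merged = {**init_kwargs, **runtime_kwargs}
# merged == {"temperature": 0.2, "max_tokens": 100}
```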
### Usage example

```python
from haystack.components.generators import OpenAIGenerator
client = OpenAIGenerator()
response = client.run("What's Natural Language Processing? Be brief.")
print(response)

>> {'replies': ['Natural Language Processing (NLP) is a branch of artificial intelligence that focuses on
>> the interaction between computers and human language. It involves enabling computers to understand, interpret,
>> and respond to natural human language in a way that is both meaningful and useful.'], 'meta': [{'model':
>> 'gpt-4o-mini', 'index': 0, 'finish_reason': 'stop', 'usage': {'prompt_tokens': 16,
>> 'completion_tokens': 49, 'total_tokens': 65}}]}
```

<a id="openai.OpenAIGenerator.__init__"></a>

#### OpenAIGenerator.\_\_init\_\_

```python
def __init__(api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
             model: str = "gpt-4o-mini",
             streaming_callback: Optional[StreamingCallbackT] = None,
             api_base_url: Optional[str] = None,
             organization: Optional[str] = None,
             system_prompt: Optional[str] = None,
             generation_kwargs: Optional[dict[str, Any]] = None,
             timeout: Optional[float] = None,
             max_retries: Optional[int] = None,
             http_client_kwargs: Optional[dict[str, Any]] = None)
```

Creates an instance of OpenAIGenerator. Unless specified otherwise in `model`, uses OpenAI's gpt-4o-mini.

By setting the `OPENAI_TIMEOUT` and `OPENAI_MAX_RETRIES` environment variables, you can change the
timeout and max_retries parameters in the OpenAI client.

**Arguments**:

- `api_key`: The OpenAI API key to connect to OpenAI.
- `model`: The name of the model to use.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
The callback function accepts StreamingChunk as an argument.
- `api_base_url`: An optional base URL.
- `organization`: The Organization ID, defaults to `None`.
- `system_prompt`: The system prompt to use for text generation. If not provided, the system prompt is
omitted, and the default system prompt of the model is used.
- `generation_kwargs`: Other parameters to use for the model. These parameters are all sent directly to
the OpenAI endpoint. See the OpenAI [documentation](https://platform.openai.com/docs/api-reference/chat) for
more details.
Some of the supported parameters:
  - `max_tokens`: The maximum number of tokens the output text can have.
  - `temperature`: The sampling temperature to use. Higher values mean the model takes more risks.
  Try 0.9 for more creative applications and 0 (argmax sampling) for ones with a well-defined answer.
  - `top_p`: An alternative to sampling with temperature, called nucleus sampling, where the model
  considers the results of the tokens with top_p probability mass. For example, 0.1 means only the tokens
  comprising the top 10% probability mass are considered.
  - `n`: The number of completions to generate for each prompt. For example, with 3 prompts and n=2,
  the LLM generates two completions per prompt, resulting in 6 completions in total.
  - `stop`: One or more sequences after which the LLM should stop generating tokens.
  - `presence_penalty`: The penalty applied if a token is already present.
  Higher values make the model less likely to repeat the token.
  - `frequency_penalty`: The penalty applied if a token has already been generated.
  Higher values make the model less likely to repeat the token.
  - `logit_bias`: Adds a logit bias to specific tokens. The keys of the dictionary are tokens, and the
  values are the bias to add to that token.
- `timeout`: Timeout for OpenAI client calls. If not set, it is inferred from the `OPENAI_TIMEOUT` environment variable
or set to 30.
- `max_retries`: Maximum number of retries to contact OpenAI after an internal error. If not set, it is inferred
from the `OPENAI_MAX_RETRIES` environment variable or set to 5.
- `http_client_kwargs`: A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`.
For more information, see the [HTTPX documentation](https://www.python-httpx.org/api/#clients).

<a id="openai.OpenAIGenerator.to_dict"></a>

#### OpenAIGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serialize this component to a dictionary.

**Returns**:

The serialized component as a dictionary.

<a id="openai.OpenAIGenerator.from_dict"></a>

#### OpenAIGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "OpenAIGenerator"
```

Deserialize this component from a dictionary.

**Arguments**:

- `data`: The dictionary representation of this component.

**Returns**:

The deserialized component instance.

<a id="openai.OpenAIGenerator.run"></a>

#### OpenAIGenerator.run

```python
@component.output_types(replies=list[str], meta=list[dict[str, Any]])
def run(prompt: str,
        system_prompt: Optional[str] = None,
        streaming_callback: Optional[StreamingCallbackT] = None,
        generation_kwargs: Optional[dict[str, Any]] = None)
```

Invoke the text generation inference based on the provided prompt and generation parameters.

**Arguments**:

- `prompt`: The string prompt to use for text generation.
- `system_prompt`: The system prompt to use for text generation. If this runtime system prompt is omitted,
the system prompt defined at initialization time, if any, is used.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters
can override the parameters passed in the `__init__` method. For more details on the parameters supported by the OpenAI API, refer to
the OpenAI [documentation](https://platform.openai.com/docs/api-reference/chat/create).

**Returns**:

A list of strings containing the generated responses and a list of dictionaries containing the metadata
for each response.

<a id="openai_dalle"></a>

# Module openai\_dalle

<a id="openai_dalle.DALLEImageGenerator"></a>

## DALLEImageGenerator

Generates images using OpenAI's DALL-E model.

For details on OpenAI API parameters, see the
[OpenAI documentation](https://platform.openai.com/docs/api-reference/images/create).

### Usage example

```python
from haystack.components.generators import DALLEImageGenerator
image_generator = DALLEImageGenerator()
response = image_generator.run("Show me a picture of a black cat.")
print(response)
```

<a id="openai_dalle.DALLEImageGenerator.__init__"></a>

#### DALLEImageGenerator.\_\_init\_\_

```python
def __init__(model: str = "dall-e-3",
             quality: Literal["standard", "hd"] = "standard",
             size: Literal["256x256", "512x512", "1024x1024", "1792x1024",
                           "1024x1792"] = "1024x1024",
             response_format: Literal["url", "b64_json"] = "url",
             api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
             api_base_url: Optional[str] = None,
             organization: Optional[str] = None,
             timeout: Optional[float] = None,
             max_retries: Optional[int] = None,
             http_client_kwargs: Optional[dict[str, Any]] = None)
```

Creates an instance of DALLEImageGenerator. Unless specified otherwise in `model`, uses OpenAI's dall-e-3.

**Arguments**:

- `model`: The model to use for image generation. Can be "dall-e-2" or "dall-e-3".
- `quality`: The quality of the generated image. Can be "standard" or "hd".
- `size`: The size of the generated images.
Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2.
Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 models.
- `response_format`: The format of the response. Can be "url" or "b64_json".
- `api_key`: The OpenAI API key to connect to OpenAI.
- `api_base_url`: An optional base URL.
- `organization`: The Organization ID, defaults to `None`.
- `timeout`: Timeout for OpenAI client calls. If not set, it is inferred from the `OPENAI_TIMEOUT` environment variable
or set to 30.
- `max_retries`: Maximum number of retries to contact OpenAI after an internal error. If not set, it is inferred
from the `OPENAI_MAX_RETRIES` environment variable or set to 5.
- `http_client_kwargs`: A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`.
For more information, see the [HTTPX documentation](https://www.python-httpx.org/api/#clients).

<a id="openai_dalle.DALLEImageGenerator.warm_up"></a>

#### DALLEImageGenerator.warm\_up

```python
def warm_up() -> None
```

Warm up the OpenAI client.

<a id="openai_dalle.DALLEImageGenerator.run"></a>

#### DALLEImageGenerator.run

```python
@component.output_types(images=list[str], revised_prompt=str)
def run(prompt: str,
        size: Optional[Literal["256x256", "512x512", "1024x1024", "1792x1024",
                               "1024x1792"]] = None,
        quality: Optional[Literal["standard", "hd"]] = None,
        response_format: Optional[Literal["url", "b64_json"]] = None)
```

Invokes the image generation inference based on the provided prompt and generation parameters.

**Arguments**:

- `prompt`: The prompt to generate the image.
- `size`: If provided, overrides the size provided during initialization.
- `quality`: If provided, overrides the quality provided during initialization.
- `response_format`: If provided, overrides the response format provided during initialization.

**Returns**:

A dictionary containing the generated list of images and the revised prompt.
Depending on the `response_format` parameter, the list of images can be URLs or base64-encoded JSON strings.
The revised prompt is the prompt that was actually used to generate the image, if OpenAI made any
revision to the prompt.

<a id="openai_dalle.DALLEImageGenerator.to_dict"></a>

#### DALLEImageGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serialize this component to a dictionary.

**Returns**:

The serialized component as a dictionary.

<a id="openai_dalle.DALLEImageGenerator.from_dict"></a>

#### DALLEImageGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "DALLEImageGenerator"
```

Deserialize this component from a dictionary.

**Arguments**:

- `data`: The dictionary representation of this component.

**Returns**:

The deserialized component instance.

<a id="chat/azure"></a>

# Module chat/azure

<a id="chat/azure.AzureOpenAIChatGenerator"></a>

## AzureOpenAIChatGenerator

Generates text using OpenAI's models on Azure.

It works with gpt-4-type models and supports streaming responses
from the OpenAI API. It uses the [ChatMessage](https://docs.haystack.deepset.ai/docs/chatmessage)
format for input and output.

You can customize how the text is generated by passing parameters to the
OpenAI API. Use the `**generation_kwargs` argument when you initialize
the component or when you run it. Any parameter that works with
`openai.ChatCompletion.create` works here too.

For details on OpenAI API parameters, see
[OpenAI documentation](https://platform.openai.com/docs/api-reference/chat).

### Usage example

```python
from haystack.components.generators.chat import AzureOpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

messages = [ChatMessage.from_user("What's Natural Language Processing?")]

client = AzureOpenAIChatGenerator(
    azure_endpoint="<your Azure endpoint, e.g. https://your-company.azure.openai.com/>",
    api_key=Secret.from_token("<your-api-key>"),
    azure_deployment="<your model deployment name, e.g. gpt-4o-mini>")
response = client.run(messages)
print(response)
```

```
{'replies':
    [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
    "Natural Language Processing (NLP) is a branch of artificial intelligence that focuses on
    enabling computers to understand, interpret, and generate human language in a way that is useful.")],
    _name=None,
    _meta={'model': 'gpt-4o-mini', 'index': 0, 'finish_reason': 'stop',
    'usage': {'prompt_tokens': 15, 'completion_tokens': 36, 'total_tokens': 51}})]
}
```

<a id="chat/azure.AzureOpenAIChatGenerator.__init__"></a>

#### AzureOpenAIChatGenerator.\_\_init\_\_

```python
def __init__(azure_endpoint: Optional[str] = None,
             api_version: Optional[str] = "2023-05-15",
             azure_deployment: Optional[str] = "gpt-4o-mini",
             api_key: Optional[Secret] = Secret.from_env_var(
                 "AZURE_OPENAI_API_KEY", strict=False),
             azure_ad_token: Optional[Secret] = Secret.from_env_var(
                 "AZURE_OPENAI_AD_TOKEN", strict=False),
             organization: Optional[str] = None,
             streaming_callback: Optional[StreamingCallbackT] = None,
             timeout: Optional[float] = None,
             max_retries: Optional[int] = None,
             generation_kwargs: Optional[dict[str, Any]] = None,
             default_headers: Optional[dict[str, str]] = None,
             tools: Optional[Union[list[Tool], Toolset]] = None,
             tools_strict: bool = False,
             *,
             azure_ad_token_provider: Optional[Union[
                 AzureADTokenProvider, AsyncAzureADTokenProvider]] = None,
             http_client_kwargs: Optional[dict[str, Any]] = None)
```

Initialize the Azure OpenAI Chat Generator component.

**Arguments**:

- `azure_endpoint`: The endpoint of the deployed model, for example `"https://example-resource.azure.openai.com/"`.
- `api_version`: The version of the API to use. Defaults to 2023-05-15.
- `azure_deployment`: The deployment of the model, usually the model name.
- `api_key`: The API key to use for authentication.
- `azure_ad_token`: [Azure Active Directory token](https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id).
- `organization`: Your organization ID, defaults to `None`. For help, see
[Setting up your organization](https://platform.openai.com/docs/guides/production-best-practices/setting-up-your-organization).
- `streaming_callback`: A callback function called when a new token is received from the stream.
It accepts [StreamingChunk](https://docs.haystack.deepset.ai/docs/data-classes#streamingchunk)
as an argument.
- `timeout`: Timeout for OpenAI client calls. If not set, it defaults to either the
`OPENAI_TIMEOUT` environment variable or 30 seconds.
- `max_retries`: Maximum number of retries to contact OpenAI after an internal error.
If not set, it defaults to either the `OPENAI_MAX_RETRIES` environment variable or 5.
- `generation_kwargs`: Other parameters to use for the model. These parameters are sent directly to
the OpenAI endpoint. For details, see [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat).
Some of the supported parameters:
- `max_tokens`: The maximum number of tokens the output text can have.
- `temperature`: The sampling temperature to use.
Higher values mean the model takes more risks.
Try 0.9 for more creative applications and 0 (argmax sampling) for ones with a well-defined answer.
- `top_p`: Nucleus sampling is an alternative to sampling with temperature, where the model considers
tokens with a top_p probability mass. For example, 0.1 means only the tokens comprising
the top 10% probability mass are considered.
- `n`: The number of completions to generate for each prompt. For example, with 3 prompts and n=2,
the LLM will generate two completions per prompt, resulting in 6 completions total.
- `stop`: One or more sequences after which the LLM should stop generating tokens.
- `presence_penalty`: The penalty applied if a token is already present.
Higher values make the model less likely to repeat the token.
- `frequency_penalty`: The penalty applied if a token has already been generated.
Higher values make the model less likely to repeat the token.
- `logit_bias`: Adds a logit bias to specific tokens. The keys of the dictionary are tokens, and the
values are the bias to add to that token.
- `response_format`: A JSON schema or a Pydantic model that enforces the structure of the model's response.
If provided, the output will always be validated against this
format (unless the model returns a tool call).
For details, see the [OpenAI Structured Outputs documentation](https://platform.openai.com/docs/guides/structured-outputs).
Notes:
- This parameter accepts Pydantic models and JSON schemas for the latest models, starting from GPT-4o.
Older models only support a basic version of structured outputs through `{"type": "json_object"}`.
For detailed information on JSON mode, see the [OpenAI Structured Outputs documentation](https://platform.openai.com/docs/guides/structured-outputs#json-mode).
- For structured outputs with streaming,
the `response_format` must be a JSON schema and not a Pydantic model.
- `default_headers`: Default headers to use for the AzureOpenAI client.
- `tools`: A list of tools or a Toolset for which the model can prepare calls. This parameter can accept either a
list of `Tool` objects or a `Toolset` instance.
- `tools_strict`: Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
- `azure_ad_token_provider`: A function that returns an Azure Active Directory token. It is invoked on
every request.
- `http_client_kwargs`: A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`.
For more information, see the [HTTPX documentation](https://www.python-httpx.org/api/).

<a id="chat/azure.AzureOpenAIChatGenerator.to_dict"></a>

#### AzureOpenAIChatGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serialize this component to a dictionary.

**Returns**:

The serialized component as a dictionary.

<a id="chat/azure.AzureOpenAIChatGenerator.from_dict"></a>

#### AzureOpenAIChatGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "AzureOpenAIChatGenerator"
```

Deserialize this component from a dictionary.

**Arguments**:

- `data`: The dictionary representation of this component.

**Returns**:

The deserialized component instance.
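When `response_format` is given as a JSON schema rather than a Pydantic model (required for streaming, as noted in the `__init__` docs), it can be built as a plain dict. A hedged sketch following OpenAI's structured-outputs shape; the `person` schema is purely illustrative:

```python
# Illustrative JSON-schema response_format (the "person" schema is invented
# for this sketch, not taken from the API).
person_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "person",
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
            "required": ["name", "age"],
            "additionalProperties": False,
        },
    },
}

# Passed through generation_kwargs at init or run time:
generation_kwargs = {"response_format": person_schema}
```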

<a id="chat/azure.AzureOpenAIChatGenerator.run"></a>

#### AzureOpenAIChatGenerator.run

```python
@component.output_types(replies=list[ChatMessage])
def run(messages: list[ChatMessage],
        streaming_callback: Optional[StreamingCallbackT] = None,
        generation_kwargs: Optional[dict[str, Any]] = None,
        *,
        tools: Optional[Union[list[Tool], Toolset]] = None,
        tools_strict: Optional[bool] = None)
```

Invokes chat completion based on the provided messages and generation parameters.

**Arguments**:

- `messages`: A list of ChatMessage instances representing the input messages.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters will
override the parameters passed during component initialization.
For details on OpenAI API parameters, see [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat/create).
- `tools`: A list of tools or a Toolset for which the model can prepare calls. If set, it will override the
`tools` parameter set during component initialization. This parameter can accept either a list of
`Tool` objects or a `Toolset` instance.
- `tools_strict`: Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
If set, it will override the `tools_strict` parameter set during component initialization.

**Returns**:

A dictionary with the following key:
- `replies`: A list containing the generated responses as ChatMessage instances.
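To make `tools_strict` concrete: each tool's `parameters` field is a JSON schema, and with `tools_strict=True` the model's tool-call arguments must conform to it exactly. A hypothetical example of such a schema (the weather tool is invented for illustration):

```python
# Hypothetical `parameters` schema for a tool definition. With tools_strict=True,
# the generated tool-call arguments must match this schema exactly.
get_weather_parameters = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "description": "Name of the city to look up"},
    },
    "required": ["city"],
    "additionalProperties": False,
}
```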

<a id="chat/azure.AzureOpenAIChatGenerator.run_async"></a>

#### AzureOpenAIChatGenerator.run\_async

```python
@component.output_types(replies=list[ChatMessage])
async def run_async(messages: list[ChatMessage],
                    streaming_callback: Optional[StreamingCallbackT] = None,
                    generation_kwargs: Optional[dict[str, Any]] = None,
                    *,
                    tools: Optional[Union[list[Tool], Toolset]] = None,
                    tools_strict: Optional[bool] = None)
```

Asynchronously invokes chat completion based on the provided messages and generation parameters.

This is the asynchronous version of the `run` method. It has the same parameters and return values
but can be used with `await` in async code.

**Arguments**:

- `messages`: A list of ChatMessage instances representing the input messages.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
Must be a coroutine.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters will
override the parameters passed during component initialization.
For details on OpenAI API parameters, see [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat/create).
- `tools`: A list of tools or a Toolset for which the model can prepare calls. If set, it will override the
`tools` parameter set during component initialization. This parameter can accept either a list of
`Tool` objects or a `Toolset` instance.
- `tools_strict`: Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
If set, it will override the `tools_strict` parameter set during component initialization.

**Returns**:

A dictionary with the following key:
- `replies`: A list containing the generated responses as ChatMessage instances.

<a id="chat/hugging_face_local"></a>

# Module chat/hugging\_face\_local

<a id="chat/hugging_face_local.default_tool_parser"></a>

#### default\_tool\_parser

```python
def default_tool_parser(text: str) -> Optional[list[ToolCall]]
```

Default implementation for parsing tool calls from model output text.

Uses `DEFAULT_TOOL_PATTERN` to extract tool calls.

**Arguments**:

- `text`: The text to parse for tool calls.

**Returns**:

A list containing a single ToolCall if a valid tool call is found, None otherwise.

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator"></a>

## HuggingFaceLocalChatGenerator

Generates chat responses using models from Hugging Face that run locally.

Use this component with chat-based models,
such as `HuggingFaceH4/zephyr-7b-beta` or `meta-llama/Llama-2-7b-chat-hf`.
LLMs running locally may need powerful hardware.

### Usage example

```python
from haystack.components.generators.chat import HuggingFaceLocalChatGenerator
from haystack.dataclasses import ChatMessage

generator = HuggingFaceLocalChatGenerator(model="HuggingFaceH4/zephyr-7b-beta")
generator.warm_up()
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
```

```
{'replies':
    [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=
    "Natural Language Processing (NLP) is a subfield of artificial intelligence that deals
    with the interaction between computers and human language. It enables computers to understand, interpret, and
    generate human language in a valuable way.
NLP involves various techniques such as speech recognition, text
analysis, sentiment analysis, and machine translation. The ultimate goal is to make it easier for computers to
process and derive meaning from human language, improving communication between humans and machines.")],
    _name=None,
    _meta={'finish_reason': 'stop', 'index': 0, 'model':
    'HuggingFaceH4/zephyr-7b-beta',
    'usage': {'completion_tokens': 90, 'prompt_tokens': 19, 'total_tokens': 109}})
    ]
}
```

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.__init__"></a>

#### HuggingFaceLocalChatGenerator.\_\_init\_\_

```python
def __init__(model: str = "HuggingFaceH4/zephyr-7b-beta",
             task: Optional[Literal["text-generation",
                                    "text2text-generation"]] = None,
             device: Optional[ComponentDevice] = None,
             token: Optional[Secret] = Secret.from_env_var(
                 ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
             chat_template: Optional[str] = None,
             generation_kwargs: Optional[dict[str, Any]] = None,
             huggingface_pipeline_kwargs: Optional[dict[str, Any]] = None,
             stop_words: Optional[list[str]] = None,
             streaming_callback: Optional[StreamingCallbackT] = None,
             tools: Optional[Union[list[Tool], Toolset]] = None,
             tool_parsing_function: Optional[Callable[
                 [str], Optional[list[ToolCall]]]] = None,
             async_executor: Optional[ThreadPoolExecutor] = None) -> None
```

Initializes the HuggingFaceLocalChatGenerator component.

**Arguments**:

- `model`: The Hugging Face text generation model name or path,
for example, `mistralai/Mistral-7B-Instruct-v0.2` or `TheBloke/OpenHermes-2.5-Mistral-7B-16k-AWQ`.
The model must be a chat model supporting the ChatML messaging
format.
If the model is specified in `huggingface_pipeline_kwargs`, this parameter is ignored.
- `task`: The task for the Hugging Face pipeline.
Possible options:
- `text-generation`: Supported by decoder models, like GPT.
- `text2text-generation`: Supported by encoder-decoder models, like T5.
If the task is specified in `huggingface_pipeline_kwargs`, this parameter is ignored.
If not specified, the component calls the Hugging Face API to infer the task from the model name.
- `device`: The device for loading the model. If `None`, automatically selects the default device.
If a device or device map is specified in `huggingface_pipeline_kwargs`, it overrides this parameter.
- `token`: The token to use as HTTP bearer authorization for remote files.
If the token is specified in `huggingface_pipeline_kwargs`, this parameter is ignored.
- `chat_template`: Specifies an optional Jinja template for formatting chat
messages. Most high-quality chat models have their own templates, but for models without this
feature or if you prefer a custom template, use this parameter.
- `generation_kwargs`: A dictionary with keyword arguments to customize text generation.
Some examples: `max_length`, `max_new_tokens`, `temperature`, `top_k`, `top_p`.
See Hugging Face's documentation for more information:
- [customize-text-generation](https://huggingface.co/docs/transformers/main/en/generation_strategies#customize-text-generation)
- [GenerationConfig](https://huggingface.co/docs/transformers/main/en/main_classes/text_generation#transformers.GenerationConfig)
The only `generation_kwargs` set by default is `max_new_tokens`, which is set to 512 tokens.
- `huggingface_pipeline_kwargs`: Dictionary with keyword arguments to initialize the
Hugging Face pipeline for text generation.
These keyword arguments provide fine-grained control over the Hugging Face pipeline.
In case of duplication, these kwargs override `model`, `task`, `device`, and `token` init parameters.
For kwargs, see [Hugging Face documentation](https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.pipeline.task).
In this dictionary, you can also include `model_kwargs` to specify the kwargs for [model initialization](https://huggingface.co/docs/transformers/en/main_classes/model#transformers.PreTrainedModel.from_pretrained).
- `stop_words`: A list of stop words. If the model generates a stop word, the generation stops.
If you provide this parameter, don't specify the `stopping_criteria` in `generation_kwargs`.
For some chat models, the output includes both the new text and the original prompt.
In these cases, make sure your prompt has no stop words.
- `streaming_callback`: An optional callable for handling streaming responses.
- `tools`: A list of tools or a Toolset for which the model can prepare calls.
This parameter can accept either a list of `Tool` objects or a `Toolset` instance.
- `tool_parsing_function`: A callable that takes a string and returns a list of ToolCall objects or None.
If None, `default_tool_parser` is used, which extracts tool calls using a predefined pattern.
- `async_executor`: Optional ThreadPoolExecutor to use for async calls. If not provided, a single-threaded
executor is initialized and used.

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.__del__"></a>

#### HuggingFaceLocalChatGenerator.\_\_del\_\_

```python
def __del__() -> None
```

Cleanup when the instance is being destroyed.

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.shutdown"></a>

#### HuggingFaceLocalChatGenerator.shutdown

```python
def shutdown() -> None
```

Explicitly shut down the executor if this component owns it.
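The `stop_words` behavior described in the `__init__` arguments can be illustrated with a small pure-Python sketch. This only mimics the effect (generation ends at the first stop word) and is not the component's actual implementation:

```python
def truncate_at_stop_word(text: str, stop_words: list[str]) -> str:
    """Cut `text` at the earliest occurrence of any stop word (illustration only)."""
    cut = len(text)
    for word in stop_words:
        idx = text.find(word)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

print(truncate_at_stop_word("NLP is a field of AI. END ignored tail", ["END"]))
```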

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.warm_up"></a>

#### HuggingFaceLocalChatGenerator.warm\_up

```python
def warm_up() -> None
```

Initializes the component.

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.to_dict"></a>

#### HuggingFaceLocalChatGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns**:

Dictionary with serialized data.

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.from_dict"></a>

#### HuggingFaceLocalChatGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "HuggingFaceLocalChatGenerator"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: The dictionary to deserialize from.

**Returns**:

The deserialized component.

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.run"></a>

#### HuggingFaceLocalChatGenerator.run

```python
@component.output_types(replies=list[ChatMessage])
def run(
    messages: list[ChatMessage],
    generation_kwargs: Optional[dict[str, Any]] = None,
    streaming_callback: Optional[StreamingCallbackT] = None,
    tools: Optional[Union[list[Tool], Toolset]] = None
) -> dict[str, list[ChatMessage]]
```

Invoke text generation inference based on the provided messages and generation parameters.

**Arguments**:

- `messages`: A list of ChatMessage objects representing the input messages.
- `generation_kwargs`: Additional keyword arguments for text generation.
- `streaming_callback`: An optional callable for handling streaming responses.
- `tools`: A list of tools or a Toolset for which the model can prepare calls.
If set, it will override
the `tools` parameter provided during initialization. This parameter can accept either a list
of `Tool` objects or a `Toolset` instance.

**Returns**:

A dictionary with the following keys:
- `replies`: A list containing the generated responses as ChatMessage instances.

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.create_message"></a>

#### HuggingFaceLocalChatGenerator.create\_message

```python
def create_message(text: str,
                   index: int,
                   tokenizer: Union["PreTrainedTokenizer",
                                    "PreTrainedTokenizerFast"],
                   prompt: str,
                   generation_kwargs: dict[str, Any],
                   parse_tool_calls: bool = False) -> ChatMessage
```

Create a ChatMessage instance from the provided text, populated with metadata.

**Arguments**:

- `text`: The generated text.
- `index`: The index of the generated text.
- `tokenizer`: The tokenizer used for generation.
- `prompt`: The prompt used for generation.
- `generation_kwargs`: The generation parameters.
- `parse_tool_calls`: Whether to attempt parsing tool calls from the text.

**Returns**:

A ChatMessage instance.

<a id="chat/hugging_face_local.HuggingFaceLocalChatGenerator.run_async"></a>

#### HuggingFaceLocalChatGenerator.run\_async

```python
@component.output_types(replies=list[ChatMessage])
async def run_async(
    messages: list[ChatMessage],
    generation_kwargs: Optional[dict[str, Any]] = None,
    streaming_callback: Optional[StreamingCallbackT] = None,
    tools: Optional[Union[list[Tool], Toolset]] = None
) -> dict[str, list[ChatMessage]]
```

Asynchronously invokes text generation inference based on the provided messages and generation parameters.

This is the asynchronous version of the `run` method.
It has the same parameters
and return values but can be used with `await` in async code.

**Arguments**:

- `messages`: A list of ChatMessage objects representing the input messages.
- `generation_kwargs`: Additional keyword arguments for text generation.
- `streaming_callback`: An optional callable for handling streaming responses.
- `tools`: A list of tools or a Toolset for which the model can prepare calls.
This parameter can accept either a list of `Tool` objects or a `Toolset` instance.

**Returns**:

A dictionary with the following keys:
- `replies`: A list containing the generated responses as ChatMessage instances.

<a id="chat/hugging_face_api"></a>

# Module chat/hugging\_face\_api

<a id="chat/hugging_face_api.HuggingFaceAPIChatGenerator"></a>

## HuggingFaceAPIChatGenerator

Completes chats using Hugging Face APIs.

HuggingFaceAPIChatGenerator uses the [ChatMessage](https://docs.haystack.deepset.ai/docs/chatmessage)
format for input and output.
Use it to generate text with Hugging Face APIs:
- [Serverless Inference API (Inference Providers)](https://huggingface.co/docs/inference-providers)
- [Paid Inference Endpoints](https://huggingface.co/inference-endpoints)
- [Self-hosted Text Generation Inference](https://github.com/huggingface/text-generation-inference)

### Usage examples

#### With the serverless inference API (Inference Providers) - free tier available

```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
from haystack.utils.hf import HFGenerationAPIType

messages = [ChatMessage.from_system("\nYou are a helpful, respectful and honest assistant"),
            ChatMessage.from_user("What's Natural Language Processing?")]

# the api_type can be expressed using the HFGenerationAPIType enum or as a string
api_type = HFGenerationAPIType.SERVERLESS_INFERENCE_API
api_type = "serverless_inference_api"  # this is equivalent to the above

generator = HuggingFaceAPIChatGenerator(api_type=api_type,
                                        api_params={"model": "Qwen/Qwen2.5-7B-Instruct",
                                                    "provider": "together"},
                                        token=Secret.from_token("<your-api-key>"))

result = generator.run(messages)
print(result)
```

#### With the serverless inference API (Inference Providers) and text+image input

```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.dataclasses import ChatMessage, ImageContent
from haystack.utils import Secret
from haystack.utils.hf import HFGenerationAPIType

# Create an image from file path, URL, or base64
image = ImageContent.from_file_path("path/to/your/image.jpg")

# Create a multimodal message with both text and image
messages = [ChatMessage.from_user(content_parts=["Describe this image in detail", image])]

generator = HuggingFaceAPIChatGenerator(
    api_type=HFGenerationAPIType.SERVERLESS_INFERENCE_API,
    api_params={
        "model": "Qwen/Qwen2.5-VL-7B-Instruct",  # Vision Language Model
        "provider": "hyperbolic"
    },
    token=Secret.from_token("<your-api-key>")
)

result = generator.run(messages)
print(result)
```

#### With paid inference endpoints

```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

messages = [ChatMessage.from_system("\nYou are a helpful, respectful and honest assistant"),
            ChatMessage.from_user("What's Natural Language Processing?")]

generator = HuggingFaceAPIChatGenerator(api_type="inference_endpoints",
                                        api_params={"url": "<your-inference-endpoint-url>"},
                                        token=Secret.from_token("<your-api-key>"))

result = generator.run(messages)
print(result)
```

#### With self-hosted text generation inference

```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.dataclasses import ChatMessage

messages = [ChatMessage.from_system("\nYou are a helpful, respectful and honest assistant"),
            ChatMessage.from_user("What's Natural Language Processing?")]

generator = HuggingFaceAPIChatGenerator(api_type="text_generation_inference",
                                        api_params={"url": "http://localhost:8080"})

result = generator.run(messages)
print(result)
```

<a id="chat/hugging_face_api.HuggingFaceAPIChatGenerator.__init__"></a>

#### HuggingFaceAPIChatGenerator.\_\_init\_\_

```python
def __init__(api_type: Union[HFGenerationAPIType, str],
             api_params: dict[str, str],
             token: Optional[Secret] = Secret.from_env_var(
                 ["HF_API_TOKEN", "HF_TOKEN"], strict=False),
             generation_kwargs: Optional[dict[str, Any]] = None,
             stop_words: Optional[list[str]] = None,
             streaming_callback: Optional[StreamingCallbackT] = None,
             tools: Optional[Union[list[Tool], Toolset]] = None)
```

Initialize the HuggingFaceAPIChatGenerator instance.

**Arguments**:

- `api_type`: The type of Hugging Face API to use. Available types:
- `text_generation_inference`: See [TGI](https://github.com/huggingface/text-generation-inference).
- `inference_endpoints`: See [Inference Endpoints](https://huggingface.co/inference-endpoints).
- `serverless_inference_api`: See
[Serverless Inference API - Inference Providers](https://huggingface.co/docs/inference-providers).
- `api_params`: A dictionary with the following keys:
- `model`: Hugging Face model ID. Required when `api_type` is `SERVERLESS_INFERENCE_API`.
- `provider`: Provider name. Recommended when `api_type` is `SERVERLESS_INFERENCE_API`.
- `url`: URL of the inference endpoint. Required when `api_type` is `INFERENCE_ENDPOINTS` or
`TEXT_GENERATION_INFERENCE`.
- Other parameters specific to the chosen API type, such as `timeout`, `headers`, etc.
- `token`: The Hugging Face token to use as HTTP bearer authorization.
Check your HF token in your [account settings](https://huggingface.co/settings/tokens).
- `generation_kwargs`: A dictionary with keyword arguments to customize text generation.
Some examples: `max_tokens`, `temperature`, `top_p`.
For details, see [Hugging Face chat_completion documentation](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client#huggingface_hub.InferenceClient.chat_completion).
- `stop_words`: An optional list of strings representing the stop words.
- `streaming_callback`: An optional callable for handling streaming responses.
- `tools`: A list of tools or a Toolset for which the model can prepare calls.
The chosen model should support tool/function calling, according to the model card.
Support for tools in the Hugging Face API and TGI is not yet fully refined and you may experience
unexpected behavior. This parameter can accept either a list of `Tool` objects or a `Toolset` instance.

<a id="chat/hugging_face_api.HuggingFaceAPIChatGenerator.to_dict"></a>

#### HuggingFaceAPIChatGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serialize this component to a dictionary.

**Returns**:

A dictionary containing the serialized component.

<a id="chat/hugging_face_api.HuggingFaceAPIChatGenerator.from_dict"></a>

#### HuggingFaceAPIChatGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "HuggingFaceAPIChatGenerator"
```

Deserialize this component from a dictionary.

<a id="chat/hugging_face_api.HuggingFaceAPIChatGenerator.run"></a>

#### HuggingFaceAPIChatGenerator.run

```python
@component.output_types(replies=list[ChatMessage])
def run(messages: list[ChatMessage],
        generation_kwargs: Optional[dict[str, Any]] = None,
        tools: Optional[Union[list[Tool], Toolset]] = None,
        streaming_callback: Optional[StreamingCallbackT] = None)
```

Invoke the text generation inference based on the provided messages and generation parameters.

**Arguments**:

- `messages`: A list of ChatMessage objects representing the input messages.
- `generation_kwargs`: Additional keyword arguments for text generation.
- `tools`: A list of tools or a Toolset for which the model can prepare calls. If set, it will override
the `tools` parameter set during component initialization. This parameter can accept either a
list of `Tool` objects or a `Toolset` instance.
- `streaming_callback`: An optional callable for handling streaming responses. If set, it will override the `streaming_callback`
parameter set during component initialization.

**Returns**:

A dictionary with the following keys:
- `replies`: A list containing the generated responses as ChatMessage objects.

<a id="chat/hugging_face_api.HuggingFaceAPIChatGenerator.run_async"></a>

#### HuggingFaceAPIChatGenerator.run\_async

```python
@component.output_types(replies=list[ChatMessage])
async def run_async(messages: list[ChatMessage],
                    generation_kwargs: Optional[dict[str, Any]] = None,
                    tools: Optional[Union[list[Tool], Toolset]] = None,
                    streaming_callback: Optional[StreamingCallbackT] = None)
```

Asynchronously invokes the text generation inference based on the provided messages and generation parameters.

This is the asynchronous version of the `run` method. It has the same parameters
and return values but can be used with `await` in async code.

**Arguments**:

- `messages`: A list of ChatMessage objects representing the input messages.
- `generation_kwargs`: Additional keyword arguments for text generation.
- `tools`: A list of tools or a Toolset for which the model can prepare calls. If set, it will override the `tools`
parameter set during component initialization. This parameter can accept either a list of `Tool` objects
or a `Toolset` instance.
- `streaming_callback`: An optional callable for handling streaming responses. If set, it will override the `streaming_callback`
parameter set during component initialization.

**Returns**:

A dictionary with the following keys:
- `replies`: A list containing the generated responses as ChatMessage objects.
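A streaming callback used with an asynchronous run must be awaitable. A minimal sketch of such a coroutine callback; the `Chunk` class below is a stand-in for Haystack's `StreamingChunk`, and the fake stream replaces a real API call:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Chunk:
    """Stand-in for Haystack's StreamingChunk (illustrative only)."""
    content: str


collected: list[str] = []


async def on_chunk(chunk: Chunk) -> None:
    # Coroutine callback: awaited once per streamed token.
    collected.append(chunk.content)


async def fake_stream() -> None:
    # Simulates tokens arriving from the API.
    for token in ["Hello", ", ", "world"]:
        await on_chunk(Chunk(token))


asyncio.run(fake_stream())
print("".join(collected))  # Hello, world
```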
<a id="chat/openai"></a>

# Module chat/openai

<a id="chat/openai.OpenAIChatGenerator"></a>

## OpenAIChatGenerator

Completes chats using OpenAI's large language models (LLMs).

It works with the gpt-4 and o-series models and supports streaming responses
from the OpenAI API. It uses the [ChatMessage](https://docs.haystack.deepset.ai/docs/chatmessage)
format for input and output.

You can customize how the text is generated by passing parameters to the
OpenAI API. Use the `**generation_kwargs` argument when you initialize
the component or when you run it. Any parameter that works with
`openai.ChatCompletion.create` will work here too.

For details on OpenAI API parameters, see the
[OpenAI documentation](https://platform.openai.com/docs/api-reference/chat).

### Usage example

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

messages = [ChatMessage.from_user("What's Natural Language Processing?")]

client = OpenAIChatGenerator()
response = client.run(messages)
print(response)
```
Output:
```
{'replies':
  [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=
    [TextContent(text="Natural Language Processing (NLP) is a branch of artificial intelligence
    that focuses on enabling computers to understand, interpret, and generate human language in
    a way that is meaningful and useful.")],
    _name=None,
    _meta={'model': 'gpt-4o-mini', 'index': 0, 'finish_reason': 'stop',
    'usage': {'prompt_tokens': 15, 'completion_tokens': 36, 'total_tokens': 51}})
  ]
}
```

<a id="chat/openai.OpenAIChatGenerator.__init__"></a>

#### OpenAIChatGenerator.\_\_init\_\_

```python
def __init__(api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
             model: str = "gpt-4o-mini",
             streaming_callback: Optional[StreamingCallbackT] = None,
             api_base_url: Optional[str] = None,
             organization: Optional[str] = None,
             generation_kwargs: Optional[dict[str, Any]] = None,
             timeout: Optional[float] = None,
             max_retries: Optional[int] = None,
             tools: Optional[Union[list[Tool], Toolset]] = None,
             tools_strict: bool = False,
             http_client_kwargs: Optional[dict[str, Any]] = None)
```

Creates an instance of OpenAIChatGenerator. Unless specified otherwise in `model`, it uses OpenAI's gpt-4o-mini model.

Before initializing the component, you can set the 'OPENAI_TIMEOUT' and 'OPENAI_MAX_RETRIES'
environment variables to override the `timeout` and `max_retries` parameters respectively
in the OpenAI client.

**Arguments**:

- `api_key`: The OpenAI API key.
You can set it with the `OPENAI_API_KEY` environment variable or pass it with this parameter
during initialization.
- `model`: The name of the model to use.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
The callback function accepts [StreamingChunk](https://docs.haystack.deepset.ai/docs/data-classes#streamingchunk)
as an argument.
- `api_base_url`: An optional base URL.
- `organization`: Your organization ID, defaults to `None`. See
[production best practices](https://platform.openai.com/docs/guides/production-best-practices/setting-up-your-organization).
- `generation_kwargs`: Other parameters to use for the model. These parameters are sent directly to
the OpenAI endpoint. See the OpenAI [documentation](https://platform.openai.com/docs/api-reference/chat) for
more details.
Some of the supported parameters:
- `max_tokens`: The maximum number of tokens the output text can have.
- `temperature`: What sampling temperature to use. Higher values mean the model takes more risks.
Try 0.9 for more creative applications and 0 (argmax sampling) for ones with a well-defined answer.
- `top_p`: An alternative to sampling with temperature, called nucleus sampling, where the model
considers the results of the tokens with top_p probability mass. For example, 0.1 means only the tokens
comprising the top 10% probability mass are considered.
- `n`: How many completions to generate for each prompt. For example, if the LLM gets 3 prompts and n is 2,
it generates two completions for each of the three prompts, ending up with 6 completions in total.
- `stop`: One or more sequences after which the LLM should stop generating tokens.
- `presence_penalty`: The penalty to apply if a token is already present in the text. Higher values mean
the model is less likely to repeat the same token.
- `frequency_penalty`: The penalty to apply based on how often a token has already been generated in the text.
Higher values mean the model is less likely to repeat the same token.
- `logit_bias`: Adds a logit bias to specific tokens. The keys of the dictionary are tokens, and the
values are the bias to add to those tokens.
- `response_format`: A JSON schema or a Pydantic model that enforces the structure of the model's response.
If provided, the output will always be validated against this
format (unless the model returns a tool call).
For details, see the [OpenAI Structured Outputs documentation](https://platform.openai.com/docs/guides/structured-outputs).
Notes:
- This parameter accepts Pydantic models and JSON schemas for the latest models, starting from GPT-4o.
Older models only support a basic version of structured outputs through `{"type": "json_object"}`.
For detailed information on JSON mode, see the [OpenAI Structured Outputs documentation](https://platform.openai.com/docs/guides/structured-outputs#json-mode).
- For structured outputs with streaming,
the `response_format` must be a JSON schema and not a Pydantic model.
- `timeout`: Timeout for OpenAI client calls. If not set, it defaults to either the
`OPENAI_TIMEOUT` environment variable or 30 seconds.
- `max_retries`: Maximum number of retries to contact OpenAI after an internal error.
If not set, it defaults to either the `OPENAI_MAX_RETRIES` environment variable or 5.
- `tools`: A list of tools or a Toolset for which the model can prepare calls. This parameter can accept either a
list of `Tool` objects or a `Toolset` instance.
- `tools_strict`: Whether to enable strict schema adherence for tool calls. If set to `True`, the model follows exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
- `http_client_kwargs`: A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`.
For more information, see the [HTTPX documentation](https://www.python-httpx.org/api/).

<a id="chat/openai.OpenAIChatGenerator.to_dict"></a>

#### OpenAIChatGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serialize this component to a dictionary.

**Returns**:

The serialized component as a dictionary.

<a id="chat/openai.OpenAIChatGenerator.from_dict"></a>

#### OpenAIChatGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "OpenAIChatGenerator"
```

Deserialize this component from a dictionary.

**Arguments**:

- `data`: The dictionary representation of this component.

**Returns**:

The deserialized component instance.
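The `to_dict`/`from_dict` pair follows a round-trip contract: serialization records the component type and its init parameters, and deserialization rebuilds an equivalent instance from them. The sketch below illustrates that contract with a hypothetical `FakeChatGenerator`, not Haystack's actual implementation:

```python
from typing import Any, Optional

# Hypothetical stand-in illustrating the to_dict/from_dict round trip.
class FakeChatGenerator:
    def __init__(self, model: str = "gpt-4o-mini",
                 timeout: Optional[float] = None):
        self.model = model
        self.timeout = timeout

    def to_dict(self) -> dict[str, Any]:
        # Record everything needed to reconstruct the component.
        return {"type": "FakeChatGenerator",
                "init_parameters": {"model": self.model,
                                    "timeout": self.timeout}}

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "FakeChatGenerator":
        # Rebuild the component from its recorded init parameters.
        return cls(**data["init_parameters"])

original = FakeChatGenerator(model="gpt-4o-mini", timeout=30.0)
restored = FakeChatGenerator.from_dict(original.to_dict())
assert (restored.model, restored.timeout) == ("gpt-4o-mini", 30.0)
```

This dictionary shape is what makes components embeddable in serialized pipeline definitions.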
<a id="chat/openai.OpenAIChatGenerator.run"></a>

#### OpenAIChatGenerator.run

```python
@component.output_types(replies=list[ChatMessage])
def run(messages: list[ChatMessage],
        streaming_callback: Optional[StreamingCallbackT] = None,
        generation_kwargs: Optional[dict[str, Any]] = None,
        *,
        tools: Optional[Union[list[Tool], Toolset]] = None,
        tools_strict: Optional[bool] = None)
```

Invokes chat completion based on the provided messages and generation parameters.

**Arguments**:

- `messages`: A list of ChatMessage instances representing the input messages.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters will
override the parameters passed during component initialization.
For details on OpenAI API parameters, see the [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat/create).
- `tools`: A list of tools or a Toolset for which the model can prepare calls. If set, it will override the
`tools` parameter set during component initialization. This parameter can accept either a list of
`Tool` objects or a `Toolset` instance.
- `tools_strict`: Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
If set, it will override the `tools_strict` parameter set during component initialization.

**Returns**:

A dictionary with the following key:
- `replies`: A list containing the generated responses as ChatMessage instances.
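The override behavior of run-time `generation_kwargs` can be pictured as a plain dictionary merge in which run-time keys win. This is a sketch of the assumed semantics, not Haystack's internal code:

```python
# Parameters set at component initialization time.
init_kwargs = {"temperature": 0.7, "max_tokens": 256}

# Parameters passed to run(); these take precedence.
run_kwargs = {"temperature": 0.0}

# Later keys win in a dict merge, so run-time values override init-time ones.
merged = {**init_kwargs, **run_kwargs}
print(merged)  # {'temperature': 0.0, 'max_tokens': 256}
```

Keys absent from the run-time dict (here `max_tokens`) keep their init-time values.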
<a id="chat/openai.OpenAIChatGenerator.run_async"></a>

#### OpenAIChatGenerator.run\_async

```python
@component.output_types(replies=list[ChatMessage])
async def run_async(messages: list[ChatMessage],
                    streaming_callback: Optional[StreamingCallbackT] = None,
                    generation_kwargs: Optional[dict[str, Any]] = None,
                    *,
                    tools: Optional[Union[list[Tool], Toolset]] = None,
                    tools_strict: Optional[bool] = None)
```

Asynchronously invokes chat completion based on the provided messages and generation parameters.

This is the asynchronous version of the `run` method. It has the same parameters and return values
but can be used with `await` in async code.

**Arguments**:

- `messages`: A list of ChatMessage instances representing the input messages.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
Must be a coroutine.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters will
override the parameters passed during component initialization.
For details on OpenAI API parameters, see the [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat/create).
- `tools`: A list of tools or a Toolset for which the model can prepare calls. If set, it will override the
`tools` parameter set during component initialization. This parameter can accept either a list of
`Tool` objects or a `Toolset` instance.
- `tools_strict`: Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
If set, it will override the `tools_strict` parameter set during component initialization.
**Returns**:

A dictionary with the following key:
- `replies`: A list containing the generated responses as ChatMessage instances.
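Since `run_async` requires the streaming callback to be a coroutine, a hedged sketch of such a callback follows. `StreamingChunk` is a minimal stand-in with only a `content` field (not Haystack's dataclass), and the loop fakes the token stream instead of calling the OpenAI API:

```python
import asyncio
from dataclasses import dataclass

# Stand-in for Haystack's StreamingChunk; only `content` is assumed.
@dataclass
class StreamingChunk:
    content: str

async def collect_stream() -> str:
    parts: list[str] = []

    async def on_chunk(chunk: StreamingChunk) -> None:
        # In real use, pass this coroutine as `streaming_callback` to run_async.
        parts.append(chunk.content)

    # Fake stream standing in for tokens arriving from the API.
    for piece in ["Hello, ", "world!"]:
        await on_chunk(StreamingChunk(content=piece))
    return "".join(parts)

print(asyncio.run(collect_stream()))  # Hello, world!
```

A synchronous (non-coroutine) callback would work with `run` but not here.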