---
title: "Amazon Sagemaker"
id: integrations-amazon-sagemaker
description: "Amazon Sagemaker integration for Haystack"
slug: "/integrations-amazon-sagemaker"
---

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker"></a>

## Module haystack\_integrations.components.generators.amazon\_sagemaker.sagemaker

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator"></a>

### SagemakerGenerator

Enables text generation using Amazon SageMaker.

SagemakerGenerator supports Large Language Models (LLMs) hosted and deployed on a SageMaker Inference Endpoint.
For guidance on how to deploy a model to SageMaker, refer to the
[SageMaker JumpStart foundation models documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-use.html).

Usage example:
```python
# Make sure your AWS credentials are set up correctly, for example via environment
# variables or a shared credentials file. Then you can use the generator as follows:
from haystack_integrations.components.generators.amazon_sagemaker import SagemakerGenerator

generator = SagemakerGenerator(model="jumpstart-dft-hf-llm-falcon-7b-bf16")
response = generator.run("What's Natural Language Processing? Be brief.")
print(response)
>>> {'replies': ['Natural Language Processing (NLP) is a branch of artificial intelligence that focuses on
>>> the interaction between computers and human language. It involves enabling computers to understand, interpret,
>>> and respond to natural human language in a way that is both meaningful and useful.'], 'meta': [{}]}
```

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator.__init__"></a>

#### SagemakerGenerator.\_\_init\_\_

```python
def __init__(
    model: str,
    aws_access_key_id: Secret | None = Secret.from_env_var(
        ["AWS_ACCESS_KEY_ID"], strict=False),
    aws_secret_access_key: Secret | None = Secret.from_env_var(  # noqa: B008
        ["AWS_SECRET_ACCESS_KEY"], strict=False),
    aws_session_token: Secret | None = Secret.from_env_var(
        ["AWS_SESSION_TOKEN"], strict=False),
    aws_region_name: Secret | None = Secret.from_env_var(
        ["AWS_DEFAULT_REGION"], strict=False),
    aws_profile_name: Secret | None = Secret.from_env_var(
        ["AWS_PROFILE"], strict=False),
    aws_custom_attributes: dict[str, Any] | None = None,
    generation_kwargs: dict[str, Any] | None = None)
```

Instantiates the session with SageMaker.

**Arguments**:

- `model`: The name of the SageMaker model endpoint.
- `aws_access_key_id`: The `Secret` for the AWS access key ID.
- `aws_secret_access_key`: The `Secret` for the AWS secret access key.
- `aws_session_token`: The `Secret` for the AWS session token.
- `aws_region_name`: The `Secret` for the AWS region name. If not provided, the default region is used.
- `aws_profile_name`: The `Secret` for the AWS profile name. If not provided, the default profile is used.
- `aws_custom_attributes`: Custom attributes passed to SageMaker, for example `{"accept_eula": True}`
in the case of Llama-2 models.
- `generation_kwargs`: Additional keyword arguments for text generation. For a list of supported parameters,
see your model's documentation page, for example here for HuggingFace models:
https://huggingface.co/blog/sagemaker-huggingface-llm#4-run-inference-and-chat-with-our-model

Specifically, Llama-2 models support the following inference payload parameters:

- `max_new_tokens`: The model generates text until the output length (excluding the input context length)
reaches `max_new_tokens`. If specified, it must be a positive integer.
- `temperature`: Controls the randomness of the output. A higher temperature yields output sequences with
low-probability words, while a lower temperature yields output sequences with high-probability words.
`temperature=0` results in greedy decoding. If specified, it must be a positive float.
- `top_p`: In each step of text generation, sample from the smallest possible set of words whose cumulative
probability is `top_p`. If specified, it must be a float between 0 and 1.
- `return_full_text`: If `True`, the input text is included in the generated output. If specified, it must
be a boolean. Defaults to `False`.

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator.to_dict"></a>

#### SagemakerGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns**:

Dictionary with serialized data.

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator.from_dict"></a>

#### SagemakerGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "SagemakerGenerator"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: Dictionary to deserialize from.

**Returns**:

Deserialized component.
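The `generation_kwargs` described above are forwarded to the endpoint inside the inference request body. As a rough sketch (the exact schema depends on the deployed container; the payload shape below follows the HuggingFace LLM container convention linked above), a request body could look like:

```python
import json

# Minimal sketch of a SageMaker LLM inference payload. The parameter names
# mirror the generation_kwargs documented above; the exact schema depends
# on the container serving the endpoint.
payload = {
    "inputs": "What's Natural Language Processing? Be brief.",
    "parameters": {
        "max_new_tokens": 100,
        "temperature": 0.7,
        "top_p": 0.9,
        "return_full_text": False,
    },
}
body = json.dumps(payload)
print(body)
```

This body would then be sent to the endpoint over the SageMaker runtime API, with any `aws_custom_attributes` supplied separately.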
<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator.run"></a>

#### SagemakerGenerator.run

```python
@component.output_types(replies=list[str], meta=list[dict[str, Any]])
def run(
    prompt: str,
    generation_kwargs: dict[str, Any] | None = None
) -> dict[str, list[str] | list[dict[str, Any]]]
```

Invokes text generation inference based on the provided prompt and generation parameters.

**Arguments**:

- `prompt`: The string prompt to use for text generation.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters
potentially override the parameters passed in the `__init__` method.

**Raises**:

- `ValueError`: If the model response type is not a list of dictionaries or a single dictionary.
- `SagemakerNotReadyError`: If the SageMaker model is not ready to accept requests.
- `SagemakerInferenceError`: If SageMaker Inference returns an error.

**Returns**:

A dictionary with the following keys:
- `replies`: A list of strings containing the generated responses.
- `meta`: A list of dictionaries containing the metadata for each response.
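The way run-time `generation_kwargs` take precedence over the defaults given at `__init__` time can be pictured as a plain dictionary merge. This is a simplification for illustration only, not the actual internals of `SagemakerGenerator`:

```python
# Hypothetical sketch: keys passed to run() override the defaults
# supplied at __init__ time, as in a plain dict merge where later
# keys win.
init_kwargs = {"max_new_tokens": 100, "temperature": 0.7}  # set in __init__
runtime_kwargs = {"temperature": 0.0}                      # passed to run()
merged = {**(init_kwargs or {}), **(runtime_kwargs or {})}
print(merged)  # {'max_new_tokens': 100, 'temperature': 0.0}
```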