---
title: "Amazon SageMaker"
id: integrations-amazon-sagemaker
description: "Amazon SageMaker integration for Haystack"
slug: "/integrations-amazon-sagemaker"
---

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker"></a>

## Module haystack\_integrations.components.generators.amazon\_sagemaker.sagemaker

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator"></a>

### SagemakerGenerator

Enables text generation using Amazon SageMaker.

SagemakerGenerator supports Large Language Models (LLMs) hosted and deployed on a SageMaker Inference Endpoint.
For guidance on how to deploy a model to SageMaker, refer to the
[SageMaker JumpStart foundation models documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-use.html).

Usage example:
```python
# Make sure your AWS credentials are set up correctly, for example via environment
# variables or a shared credentials file. Then you can use the generator as follows:
from haystack_integrations.components.generators.amazon_sagemaker import SagemakerGenerator

generator = SagemakerGenerator(model="jumpstart-dft-hf-llm-falcon-7b-bf16")
response = generator.run("What's Natural Language Processing? Be brief.")
print(response)
>>> {'replies': ['Natural Language Processing (NLP) is a branch of artificial intelligence that focuses on
>>> the interaction between computers and human language. It involves enabling computers to understand, interpret,
>>> and respond to natural human language in a way that is both meaningful and useful.'], 'meta': [{}]}
```

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator.__init__"></a>

#### SagemakerGenerator.\_\_init\_\_

```python
def __init__(
    model: str,
    aws_access_key_id: Secret | None = Secret.from_env_var(["AWS_ACCESS_KEY_ID"], strict=False),
    aws_secret_access_key: Secret | None = Secret.from_env_var(["AWS_SECRET_ACCESS_KEY"], strict=False),
    aws_session_token: Secret | None = Secret.from_env_var(["AWS_SESSION_TOKEN"], strict=False),
    aws_region_name: Secret | None = Secret.from_env_var(["AWS_DEFAULT_REGION"], strict=False),
    aws_profile_name: Secret | None = Secret.from_env_var(["AWS_PROFILE"], strict=False),
    aws_custom_attributes: dict[str, Any] | None = None,
    generation_kwargs: dict[str, Any] | None = None,
)
```

Instantiates the session with SageMaker.

**Arguments**:

- `model`: The name of the SageMaker model endpoint.
- `aws_access_key_id`: The `Secret` for the AWS access key ID.
- `aws_secret_access_key`: The `Secret` for the AWS secret access key.
- `aws_session_token`: The `Secret` for the AWS session token.
- `aws_region_name`: The `Secret` for the AWS region name. If not provided, the default region is used.
- `aws_profile_name`: The `Secret` for the AWS profile name. If not provided, the default profile is used.
- `aws_custom_attributes`: Custom attributes to pass to SageMaker, for example `{"accept_eula": True}`
in the case of Llama-2 models.
- `generation_kwargs`: Additional keyword arguments for text generation. For a list of supported parameters,
see your model's documentation page, for example the
[Hugging Face LLM inference guide](https://huggingface.co/blog/sagemaker-huggingface-llm#4-run-inference-and-chat-with-our-model)
for Hugging Face models.

Specifically, Llama-2 models support the following inference payload parameters:

- `max_new_tokens`: The model generates text until the output length (excluding the input context length)
    reaches `max_new_tokens`. If specified, it must be a positive integer.
- `temperature`: Controls the randomness of the output. A higher temperature yields output sequences with
    more low-probability words; a lower temperature yields sequences with more high-probability words.
    `temperature=0` results in greedy decoding. If specified, it must be a non-negative float.
- `top_p`: In each step of text generation, sample from the smallest possible set of words whose cumulative
    probability is at least `top_p`. If specified, it must be a float between 0 and 1.
- `return_full_text`: If `True`, the input text is included in the generated output. If specified, it must
    be a boolean; the default is `False`.

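As an illustration, the inference payload such a container receives typically pairs the prompt with these parameters. This is a minimal sketch; the exact request schema depends on the deployed model container, so treat the field names (`inputs`, `parameters`) as assumptions rather than a guaranteed contract:

```python
import json

# Hypothetical request body for a Hugging Face LLM container on SageMaker.
# The "parameters" entries mirror the Llama-2 inference parameters listed above.
generation_kwargs = {
    "max_new_tokens": 64,
    "temperature": 0.7,
    "top_p": 0.9,
    "return_full_text": False,
}

payload = {
    "inputs": "What's Natural Language Processing? Be brief.",
    "parameters": generation_kwargs,
}

# The endpoint receives the JSON-encoded payload as the request body.
body = json.dumps(payload)
print(body)
```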
<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator.to_dict"></a>

#### SagemakerGenerator.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns**:

Dictionary with serialized data.

<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator.from_dict"></a>

#### SagemakerGenerator.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "SagemakerGenerator"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: Dictionary to deserialize from.

**Returns**:

Deserialized component.

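A `to_dict`/`from_dict` round trip operates on a plain dictionary. By Haystack's usual component-serialization convention, the dictionary carries the component's import path under `type` and its constructor arguments under `init_parameters`. The following is a sketch of that expected shape; the exact keys may vary across Haystack versions:

```python
# Hypothetical serialized form of a SagemakerGenerator, following Haystack's
# usual {"type": ..., "init_parameters": ...} component-serialization convention.
data = {
    "type": "haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator",
    "init_parameters": {
        "model": "jumpstart-dft-hf-llm-falcon-7b-bf16",
        "generation_kwargs": {"max_new_tokens": 100},
    },
}

# A round trip would then restore an equivalent component:
#   restored = SagemakerGenerator.from_dict(data)
#   assert restored.to_dict() == data
print(data["type"].rsplit(".", 1)[-1])  # -> SagemakerGenerator
```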
<a id="haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator.run"></a>

#### SagemakerGenerator.run

```python
@component.output_types(replies=list[str], meta=list[dict[str, Any]])
def run(
    prompt: str,
    generation_kwargs: dict[str, Any] | None = None
) -> dict[str, list[str] | list[dict[str, Any]]]
```

Invokes text generation inference based on the provided prompt and generation parameters.

**Arguments**:

- `prompt`: The string prompt to use for text generation.
- `generation_kwargs`: Additional keyword arguments for text generation. Any parameters provided here
override those passed in the `__init__` method.

**Raises**:

- `ValueError`: If the model response type is not a list of dictionaries or a single dictionary.
- `SagemakerNotReadyError`: If the SageMaker model is not ready to accept requests.
- `SagemakerInferenceError`: If the SageMaker Inference endpoint returns an error.

**Returns**:

A dictionary with the following keys:
- `replies`: A list of strings containing the generated responses.
- `meta`: A list of dictionaries containing the metadata for each response.
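
The precedence rule above can be sketched as a plain dictionary merge, where run-time kwargs win over those given at construction. This is a sketch of the documented behavior, not the actual implementation:

```python
# Parameters set in __init__ (illustrative values).
init_kwargs = {"max_new_tokens": 100, "temperature": 0.7}

# Parameters passed to run(); only the keys provided here are overridden.
run_kwargs = {"temperature": 0.2}

merged = {**init_kwargs, **(run_kwargs or {})}
print(merged)  # -> {'max_new_tokens': 100, 'temperature': 0.2}
```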