/ docs-website / versioned_docs / version-2.18 / pipeline-components / generators / metallamachatgenerator.mdx
metallamachatgenerator.mdx
1 --- 2 title: "MetaLlamaChatGenerator" 3 id: metallamachatgenerator 4 slug: "/metallamachatgenerator" 5 description: "This component enables chat completion with any model hosted available with Meta Llama API." 6 --- 7 8 # MetaLlamaChatGenerator 9 10 This component enables chat completion with any model hosted available with Meta Llama API. 11 12 | | | 13 | -------------------------------------- | ----------------------------------------------------------------------------------------------------------- | 14 | **Most common position in a pipeline** | After a [ChatPromptBuilder](../builders/chatpromptbuilder.mdx) | 15 | **Mandatory init variables** | “api_key”: A Meta Llama API key. Can be set with `LLAMA_API_KEY` env variable or passed to `init()` method. | 16 | **Mandatory run variables** | “messages:” A list of [ChatMessage](../../concepts/data-classes/chatmessage.mdx) objects | 17 | **Output variables** | “replies”: A list of [ChatMessage](../../concepts/data-classes/chatmessage.mdx) objects | 18 | **API reference** | [Meta Llama API](/reference/integrations-meta-llama) | 19 | **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/meta_llama | 20 21 ## Overview 22 23 The `MetaLlamaChatGenerator` enables you to use multiple Meta Llama models by making chat completion calls to the Meta [Llama API](https://llama.developer.meta.com/?utm_source=partner-haystack&utm_medium=website). The default model is `Llama-4-Scout-17B-16E-Instruct-FP8`. 24 25 Currently available models are: 26 27 | Model ID | Input context length | Output context length | Input Modalities | Output Modalities | 28 | ---------------------------------------- | -------------------- | --------------------- | ---------------- | ----------------- | 29 | `Llama-4-Scout-17B-16E-Instruct-FP8` | 128k | 4028 | Text, Image | Text | 30 | `Llama-4-Maverick-17B-128E-Instruct-FP8` | 128k | 4028 | Text, Image | Text | 31 | `Llama-3.3-70B-Instruct` | 128k | 4028 | Text | Text | 32 | `Llama-3.3-8B-Instruct` | 128k | 4028 | Text | Text | 33 34 This component uses the same `ChatMessage` format as other Haystack Chat Generators for structured input and output. For more information, see the [ChatMessage documentation](../../concepts/data-classes/chatmessage.mdx). 35 36 It is also fully compatible with Haystack [Tools](../../tools/tool.mdx) and [Toolsets](../../tools/toolset.mdx) that allow function-calling capabilities with supported models. 37 38 ### Initialization 39 40 To use this integration, you must have a Meta Llama API key. You can provide it with the `LLAMA_API_KEY` environment variable or by using a [Secret](../../concepts/secret-management.mdx). 41 42 Then, install the `meta-llama-haystack` integration: 43 44 ```shell 45 pip install meta-llama-haystack 46 ``` 47 48 ### Streaming 49 50 `MetaLlamaChatGenerator` supports [streaming](guides-to-generators/choosing-the-right-generator.mdx#streaming-support) responses from the LLM, allowing tokens to be emitted as they are generated. To enable streaming, pass a callable to the `streaming_callback` parameter during initialization. 51 52 ## Usage 53 54 ### On its own 55 56 ```python 57 from haystack.dataclasses import ChatMessage 58 from haystack_integrations.components.generators.meta_llama import ( 59 MetaLlamaChatGenerator, 60 ) 61 62 llm = MetaLlamaChatGenerator() 63 response = llm.run([ChatMessage.from_user("What are Agentic Pipelines? Be brief.")]) 64 print(response["replies"][0].text) 65 ``` 66 67 With streaming and model routing: 68 69 ```python 70 from haystack.dataclasses import ChatMessage 71 from haystack_integrations.components.generators.meta_llama import ( 72 MetaLlamaChatGenerator, 73 ) 74 75 llm = MetaLlamaChatGenerator( 76 model="Llama-3.3-8B-Instruct", 77 streaming_callback=lambda chunk: print(chunk.content, end="", flush=True), 78 ) 79 80 response = llm.run([ChatMessage.from_user("What are Agentic Pipelines? Be brief.")]) 81 82 ## check the model used for the response 83 print("\n\n Model used: ", response["replies"][0].meta["model"]) 84 ``` 85 86 ### In a pipeline 87 88 ```python 89 ## To run this example, you will need to set a `LLAMA_API_KEY` environment variable. 90 91 from haystack import Document, Pipeline 92 from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder 93 from haystack.components.generators.utils import print_streaming_chunk 94 from haystack.components.retrievers.in_memory import InMemoryBM25Retriever 95 from haystack.dataclasses import ChatMessage 96 from haystack.document_stores.in_memory import InMemoryDocumentStore 97 from haystack.utils import Secret 98 99 from haystack_integrations.components.generators.meta_llama import ( 100 MetaLlamaChatGenerator, 101 ) 102 103 ## Write documents to InMemoryDocumentStore 104 document_store = InMemoryDocumentStore() 105 document_store.write_documents( 106 [ 107 Document(content="My name is Jean and I live in Paris."), 108 Document(content="My name is Mark and I live in Berlin."), 109 Document(content="My name is Giorgio and I live in Rome."), 110 ], 111 ) 112 113 ## Build a RAG pipeline 114 prompt_template = [ 115 ChatMessage.from_user( 116 "Given these documents, answer the question.\n" 117 "Documents:\n{% for doc in documents %}{{ doc.content }}{% endfor %}\n" 118 "Question: {{question}}\n" 119 "Answer:", 120 ), 121 ] 122 123 ## Define required variables explicitly 124 prompt_builder = ChatPromptBuilder( 125 template=prompt_template, 126 required_variables={"question", "documents"}, 127 ) 128 129 retriever = InMemoryBM25Retriever(document_store=document_store) 130 llm = MetaLlamaChatGenerator( 131 api_key=Secret.from_env_var("LLAMA_API_KEY"), 132 streaming_callback=print_streaming_chunk, 133 ) 134 135 rag_pipeline = Pipeline() 136 rag_pipeline.add_component("retriever", retriever) 137 rag_pipeline.add_component("prompt_builder", prompt_builder) 138 rag_pipeline.add_component("llm", llm) 139 rag_pipeline.connect("retriever", "prompt_builder.documents") 140 rag_pipeline.connect("prompt_builder", "llm.messages") 141 142 ## Ask a question 143 question = "Who lives in Paris?" 144 rag_pipeline.run( 145 { 146 "retriever": {"query": question}, 147 "prompt_builder": {"question": question}, 148 }, 149 ) 150 ```