---
title: "MistralTextEmbedder"
id: mistraltextembedder
slug: "/mistraltextembedder"
description: "This component transforms a string into a vector using the Mistral API and models. Use it for embedding retrieval to transform your query into an embedding."
---

# MistralTextEmbedder

This component transforms a string into a vector using the Mistral API and models. Use it for embedding retrieval to transform your query into an embedding.

<div className="key-value-table">

| | |
| --- | --- |
| **Most common position in a pipeline** | Before an embedding [Retriever](../retrievers.mdx) in a query/RAG pipeline |
| **Mandatory init variables** | `api_key`: The Mistral API key. Can be set with the `MISTRAL_API_KEY` env var. |
| **Mandatory run variables** | `text`: A string |
| **Output variables** | `embedding`: A list of float numbers (a vector) <br /> <br />`meta`: A dictionary of metadata strings |
| **API reference** | [Mistral](/reference/integrations-mistral) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/mistral |

</div>

Use `MistralTextEmbedder` to embed a single string (such as a query) into a vector. To embed lists of documents, use [`MistralDocumentEmbedder`](mistraldocumentembedder.mdx) instead, which enriches each document with its computed embedding, also known as a vector.

## Overview

`MistralTextEmbedder` transforms a string into a vector that captures its semantics using a Mistral embedding model.

The component currently supports the `mistral-embed` embedding model. You can find the list of all supported models in Mistral's [embedding models documentation](https://docs.mistral.ai/platform/endpoints/#embedding-models).

To start using this integration with Haystack, install it with:

```shell
pip install mistral-haystack
```

`MistralTextEmbedder` needs a Mistral API key to work.
It uses the `MISTRAL_API_KEY` environment variable by default. Otherwise, you can pass an API key at initialization with `api_key`:

```python
from haystack.utils import Secret
from haystack_integrations.components.embedders.mistral.text_embedder import (
    MistralTextEmbedder,
)

embedder = MistralTextEmbedder(
    api_key=Secret.from_token("<your-api-key>"),
    model="mistral-embed",
)
```

## Usage

### On its own

Remember to set the `MISTRAL_API_KEY` environment variable first or pass the API key in directly.

Here is how you can use the component on its own:

```python
from haystack.utils import Secret
from haystack_integrations.components.embedders.mistral.text_embedder import (
    MistralTextEmbedder,
)

embedder = MistralTextEmbedder(
    api_key=Secret.from_token("<your-api-key>"),
    model="mistral-embed",
)

result = embedder.run(text="How can I use the Mistral embedding models with Haystack?")

print(result["embedding"])
# [-0.0015687942504882812, 0.052154541015625, 0.037109375...]
```

### In a pipeline

Below is an example of `MistralTextEmbedder` in a document search pipeline. We build this pipeline on top of an `InMemoryDocumentStore` where we index the contents of two URLs.
```python
from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.mistral.document_embedder import (
    MistralDocumentEmbedder,
)
from haystack_integrations.components.embedders.mistral.text_embedder import (
    MistralTextEmbedder,
)
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Initialize the document store
document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")

# Indexing components
fetcher = LinkContentFetcher()
converter = HTMLToDocument()
embedder = MistralDocumentEmbedder()
writer = DocumentWriter(document_store=document_store)

indexing = Pipeline()
indexing.add_component(name="fetcher", instance=fetcher)
indexing.add_component(name="converter", instance=converter)
indexing.add_component(name="embedder", instance=embedder)
indexing.add_component(name="writer", instance=writer)

indexing.connect("fetcher", "converter")
indexing.connect("converter", "embedder")
indexing.connect("embedder", "writer")

indexing.run(
    data={
        "fetcher": {
            "urls": [
                "https://docs.mistral.ai/self-deployment/cloudflare/",
                "https://docs.mistral.ai/platform/endpoints/",
            ],
        },
    },
)

# Retrieval components
text_embedder = MistralTextEmbedder()
retriever = InMemoryEmbeddingRetriever(document_store=document_store)

# Define the prompt template
prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given the retrieved documents, answer the question.\nDocuments:\n"
        "{% for document in documents %}{{ document.content }}{% endfor %}\n"
        "Question: {{ query }}\nAnswer:",
    ),
]

prompt_builder = ChatPromptBuilder(
    template=prompt_template,
    required_variables=["query", "documents"],
)
llm = OpenAIChatGenerator(
    model="gpt-4o-mini",
    api_key=Secret.from_token("<your-api-key>"),
)

doc_search = Pipeline()
doc_search.add_component("text_embedder", text_embedder)
doc_search.add_component("retriever", retriever)
doc_search.add_component("prompt_builder", prompt_builder)
doc_search.add_component("llm", llm)

doc_search.connect("text_embedder.embedding", "retriever.query_embedding")
doc_search.connect("retriever.documents", "prompt_builder.documents")
doc_search.connect("prompt_builder.messages", "llm.messages")

query = "How can I deploy Mistral models with Cloudflare?"

result = doc_search.run(
    {
        "text_embedder": {"text": query},
        "retriever": {"top_k": 1},
        "prompt_builder": {"query": query},
    },
)

print(result["llm"]["replies"])
```
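To build intuition for what the `embedding_similarity_function="cosine"` setting does in the pipeline above, here is a minimal sketch of cosine-similarity ranking in plain Python. It uses hypothetical toy 3-dimensional vectors and document names (real `mistral-embed` vectors are much longer); this is an illustration of the ranking idea, not the document store's actual implementation.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for the query and two indexed documents.
query_embedding = [0.1, 0.9, 0.2]
doc_embeddings = {
    "doc_cloudflare": [0.1, 0.8, 0.3],
    "doc_endpoints": [0.9, 0.1, 0.0],
}

# Rank documents by similarity to the query, highest first;
# top_k=1 in the pipeline keeps only the best match.
ranked = sorted(
    doc_embeddings.items(),
    key=lambda item: cosine_similarity(query_embedding, item[1]),
    reverse=True,
)
print(ranked[0][0])  # doc_cloudflare
```

Because cosine similarity measures the angle between vectors rather than their length, it favors documents whose embeddings point in the same direction as the query embedding, regardless of magnitude.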