---
title: "MistralTextEmbedder"
id: mistraltextembedder
slug: "/mistraltextembedder"
description: "This component transforms a string into a vector using the Mistral API and models. Use it for embedding retrieval to transform your query into an embedding."
---

# MistralTextEmbedder

This component transforms a string into a vector using the Mistral API and models. Use it for embedding retrieval to transform your query into an embedding.

<div className="key-value-table">

| | |
| --- | --- |
| **Most common position in a pipeline** | Before an embedding [Retriever](../retrievers.mdx) in a query/RAG pipeline |
| **Mandatory init variables** | `api_key`: The Mistral API key. Can be set with the `MISTRAL_API_KEY` env var. |
| **Mandatory run variables** | `text`: A string |
| **Output variables** | `embedding`: A list of float numbers (a vector) <br /> <br />`meta`: A dictionary of metadata strings |
| **API reference** | [Mistral](/reference/integrations-mistral) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/mistral |

</div>

Use `MistralTextEmbedder` to embed a single string (such as a query) into a vector. To embed lists of documents, use [`MistralDocumentEmbedder`](mistraldocumentembedder.mdx) instead, which enriches each document with its computed embedding, also known as a vector.

## Overview

`MistralTextEmbedder` transforms a string into a vector that captures its semantics using a Mistral embedding model.

The component currently supports the `mistral-embed` embedding model. You can find the list of all supported models in Mistral's [embedding models documentation](https://docs.mistral.ai/platform/endpoints/#embedding-models).

To start using this integration with Haystack, install it with:

```shell
pip install mistral-haystack
```

`MistralTextEmbedder` needs a Mistral API key to work.
It uses the `MISTRAL_API_KEY` environment variable by default. Otherwise, you can pass an API key at initialization with `api_key`:

```python
from haystack.utils import Secret
from haystack_integrations.components.embedders.mistral.text_embedder import (
    MistralTextEmbedder,
)

embedder = MistralTextEmbedder(
    api_key=Secret.from_token("<your-api-key>"),
    model="mistral-embed",
)
```

## Usage

### On its own

Remember to set the `MISTRAL_API_KEY` environment variable first or pass the API key in directly.

Here is how you can use the component on its own:

```python
from haystack.utils import Secret
from haystack_integrations.components.embedders.mistral.text_embedder import (
    MistralTextEmbedder,
)

embedder = MistralTextEmbedder(
    api_key=Secret.from_token("<your-api-key>"),
    model="mistral-embed",
)

result = embedder.run(text="How can I use the Mistral embedding models with Haystack?")

print(result["embedding"])
# [-0.0015687942504882812, 0.052154541015625, 0.037109375...]
```

### In a pipeline

Below is an example of `MistralTextEmbedder` in a document search pipeline. We build this pipeline on top of an `InMemoryDocumentStore` where we index the contents of two URLs.
```python
from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.mistral.document_embedder import (
    MistralDocumentEmbedder,
)
from haystack_integrations.components.embedders.mistral.text_embedder import (
    MistralTextEmbedder,
)
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Initialize the document store
document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")

# Indexing components
fetcher = LinkContentFetcher()
converter = HTMLToDocument()
embedder = MistralDocumentEmbedder()
writer = DocumentWriter(document_store=document_store)

indexing = Pipeline()
indexing.add_component(name="fetcher", instance=fetcher)
indexing.add_component(name="converter", instance=converter)
indexing.add_component(name="embedder", instance=embedder)
indexing.add_component(name="writer", instance=writer)

indexing.connect("fetcher", "converter")
indexing.connect("converter", "embedder")
indexing.connect("embedder", "writer")

indexing.run(
    data={
        "fetcher": {
            "urls": [
                "https://docs.mistral.ai/self-deployment/cloudflare/",
                "https://docs.mistral.ai/platform/endpoints/",
            ],
        },
    },
)

# Retrieval components
text_embedder = MistralTextEmbedder()
retriever = InMemoryEmbeddingRetriever(document_store=document_store)

# Define the prompt template
prompt_template = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user(
        "Given the retrieved documents, answer the question.\nDocuments:\n"
        "{% for document in documents %}{{ document.content }}{% endfor %}\n"
        "Question: {{ query }}\nAnswer:",
    ),
]

prompt_builder = ChatPromptBuilder(
    template=prompt_template,
    required_variables=["query", "documents"],
)
llm = OpenAIChatGenerator(
    model="gpt-4o-mini",
    api_key=Secret.from_token("<your-api-key>"),
)

doc_search = Pipeline()
doc_search.add_component("text_embedder", text_embedder)
doc_search.add_component("retriever", retriever)
doc_search.add_component("prompt_builder", prompt_builder)
doc_search.add_component("llm", llm)

doc_search.connect("text_embedder.embedding", "retriever.query_embedding")
doc_search.connect("retriever.documents", "prompt_builder.documents")
doc_search.connect("prompt_builder.messages", "llm.messages")

query = "How can I deploy Mistral models with Cloudflare?"

result = doc_search.run(
    {
        "text_embedder": {"text": query},
        "retriever": {"top_k": 1},
        "prompt_builder": {"query": query},
    },
)

print(result["llm"]["replies"])
```
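To build intuition for what the `embedding_similarity_function="cosine"` setting does in the pipeline above, here is a minimal sketch of cosine-similarity ranking in plain Python. It uses hypothetical toy 3-dimensional vectors and document names (real `mistral-embed` vectors are much longer); this is an illustration of the ranking idea, not the document store's actual implementation.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for the query and two indexed documents.
query_embedding = [0.1, 0.9, 0.2]
doc_embeddings = {
    "doc_cloudflare": [0.1, 0.8, 0.3],
    "doc_endpoints": [0.9, 0.1, 0.0],
}

# Rank documents by similarity to the query, highest first;
# top_k=1 in the pipeline keeps only the best match.
ranked = sorted(
    doc_embeddings.items(),
    key=lambda item: cosine_similarity(query_embedding, item[1]),
    reverse=True,
)
print(ranked[0][0])  # doc_cloudflare
```

Because cosine similarity measures the angle between vectors rather than their length, it favors documents whose embeddings point in the same direction as the query embedding, regardless of magnitude.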