---
title: "MistralDocumentEmbedder"
id: mistraldocumentembedder
slug: "/mistraldocumentembedder"
description: "This component computes the embeddings of a list of documents using the Mistral API and models."
---

# MistralDocumentEmbedder

This component computes the embeddings of a list of documents using the Mistral API and models.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | Before a [`DocumentWriter`](../writers/documentwriter.mdx) in an indexing pipeline |
| **Mandatory init variables** | `api_key`: The Mistral API key. Can be set with the `MISTRAL_API_KEY` environment variable. |
| **Mandatory run variables** | `documents`: A list of documents to be embedded |
| **Output variables** | `documents`: A list of documents (enriched with embeddings) <br /> <br />`meta`: A dictionary of metadata strings |
| **API reference** | [Mistral](/reference/integrations-mistral) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/mistral |

</div>

Use this component to embed a list of documents. To embed a string, use the [`MistralTextEmbedder`](mistraltextembedder.mdx).

## Overview

`MistralDocumentEmbedder` computes the embeddings of a list of documents and stores the resulting vectors in the `embedding` field of each document. It uses the Mistral API and its embedding models.

The component currently supports the `mistral-embed` embedding model. The list of all supported models can be found in Mistral’s [embedding models documentation](https://docs.mistral.ai/platform/endpoints/#embedding-models).

To start using this integration with Haystack, install it with:

```shell
pip install mistral-haystack
```

`MistralDocumentEmbedder` needs a Mistral API key to work. By default, it reads the key from the `MISTRAL_API_KEY` environment variable. Alternatively, you can pass an API key at initialization with `api_key`:

```python
from haystack.utils import Secret
from haystack_integrations.components.embedders.mistral.document_embedder import (
    MistralDocumentEmbedder,
)

embedder = MistralDocumentEmbedder(
    api_key=Secret.from_token("<your-api-key>"),
    model="mistral-embed",
)
```
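
If you rely on the environment variable, it can help to fail early with a clear message when the variable is missing. The following is a minimal sketch using only the Python standard library. The helper name `require_env` is hypothetical and is not part of Haystack or the Mistral integration:

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset or empty.
    `require_env` is a hypothetical helper, not a Haystack API.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it or pass `api_key` explicitly."
        )
    return value
```

Calling `require_env("MISTRAL_API_KEY")` before building the component would surface a missing key at startup instead of at the first request.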

## Usage

### On its own

Remember to first set `MISTRAL_API_KEY` as an environment variable or pass the key in directly.

Here is how you can use the component on its own:

```python
from haystack import Document
from haystack.utils import Secret
from haystack_integrations.components.embedders.mistral.document_embedder import (
    MistralDocumentEmbedder,
)

doc = Document(content="I love pizza!")

embedder = MistralDocumentEmbedder(
    api_key=Secret.from_token("<your-api-key>"),
    model="mistral-embed",
)

result = embedder.run([doc])
print(result["documents"][0].embedding)
## [-0.453125, 1.2236328, 2.0058594, 0.67871094...]
```
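
Because the component stores each vector in the document's `embedding` field, a quick sanity check after a run is to look for documents where that field is still empty. The sketch below is an illustration only: `documents_without_embedding` is a hypothetical helper, and `SimpleNamespace` objects stand in for Haystack `Document` instances so the example runs without an API call:

```python
from types import SimpleNamespace


def documents_without_embedding(documents):
    """Return the documents whose `embedding` attribute is missing or None."""
    return [doc for doc in documents if getattr(doc, "embedding", None) is None]


# Stand-ins for Haystack `Document` objects (assumption: only the
# `content` and `embedding` attributes matter for this check).
docs = [
    SimpleNamespace(content="I love pizza!", embedding=[-0.45, 1.22, 2.01]),
    SimpleNamespace(content="Not embedded yet", embedding=None),
]

missing = documents_without_embedding(docs)
print([doc.content for doc in missing])
```

With a real run, the same check would be applied to `result["documents"]`.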

### In a pipeline

Below is an example of the `MistralDocumentEmbedder` in an indexing pipeline that indexes the contents of a web page into an `InMemoryDocumentStore`.

```python
from haystack import Pipeline
from haystack.components.converters import HTMLToDocument
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.mistral.document_embedder import (
    MistralDocumentEmbedder,
)

document_store = InMemoryDocumentStore()
fetcher = LinkContentFetcher()
converter = HTMLToDocument()
chunker = DocumentSplitter()
# No api_key is passed, so the key is read from MISTRAL_API_KEY.
embedder = MistralDocumentEmbedder()
writer = DocumentWriter(document_store=document_store)

indexing = Pipeline()

indexing.add_component(name="fetcher", instance=fetcher)
indexing.add_component(name="converter", instance=converter)
indexing.add_component(name="chunker", instance=chunker)
indexing.add_component(name="embedder", instance=embedder)
indexing.add_component(name="writer", instance=writer)

# Fetch the page, convert it, split it, embed the chunks, and store them.
indexing.connect("fetcher", "converter")
indexing.connect("converter", "chunker")
indexing.connect("chunker", "embedder")
indexing.connect("embedder", "writer")

indexing.run(data={"fetcher": {"urls": ["https://mistral.ai/news/la-plateforme/"]}})
```