Maintained by deepset
Integration: Mistral
Use the Mistral API for embedding and text generation models.
Overview
Mistral AI currently provides two types of access to Large Language Models:
- An API providing pay-as-you-go access to the latest Mistral models like mistral-embed and mistral-small.
- Open-source models released under the Apache 2.0 License and available on Hugging Face, which you can use with the HuggingFaceTGIGenerator.
For more information on models available via the Mistral API, see the Mistral docs.
To follow along with this guide, you'll need a Mistral API key. Add it as an environment variable, MISTRAL_API_KEY.
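The examples in this guide expect the key to be present in the environment, so it can help to fail early with a clear message when it is missing. The following is a minimal sketch, not part of the integration: require_api_key is a hypothetical helper name, and the error message is only a suggestion.

```python
import os


def require_api_key(name: str = "MISTRAL_API_KEY") -> str:
    """Return the value of the environment variable `name`, or raise if unset.

    Hypothetical helper for illustration; it is not part of mistral-haystack.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable before running the examples."
        )
    return value
```

Calling require_api_key() at the top of a script surfaces a missing key before any component is created.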
Installation
pip install mistral-haystack
Usage
Components
This integration introduces three components:
- The MistralDocumentEmbedder: Creates embeddings for Haystack Documents using Mistral embedding models (currently only mistral-embed).
- The MistralTextEmbedder: Creates embeddings for texts (such as queries) using Mistral embedding models (currently only mistral-embed).
- The MistralChatGenerator: Uses Mistral chat completion models such as mistral-tiny (default).
Use Mistral Generative Models
import os
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.mistral import MistralChatGenerator
# The generator reads the API key from the environment
os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY"
model = "mistral-medium"
client = MistralChatGenerator(model=model)
response = client.run(
    messages=[ChatMessage.from_user("What is the best French cheese?")]
)
print(response)
{'replies': [ChatMessage(content='The "best" French cheese is subjective and depends on personal taste...', role=<ChatRole.ASSISTANT: 'assistant'>, name=None, meta={'model': 'mistral-medium', 'index': 0, 'finish_reason': 'stop', 'usage': {'completion_tokens': 231, 'prompt_tokens': 16, 'total_tokens': 247}})]}
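The printed result is a dictionary whose replies list holds chat messages, each carrying the reply text and a meta dictionary with the model name and token usage. As a rough sketch of how those fields might be read, the snippet below uses a stand-in FakeMessage class (an assumption, so that it runs without an API call) that mirrors only the content and meta fields shown in the output above. summarize_reply is a hypothetical helper, not a Haystack function.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FakeMessage:
    """Stand-in for a chat message; mirrors only the fields printed above."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def summarize_reply(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the first reply's text, model name, and total token count."""
    first = response["replies"][0]
    usage = first.meta.get("usage", {})
    return {
        "text": first.content,
        "model": first.meta.get("model"),
        "total_tokens": usage.get("total_tokens"),
    }


# Sample data copied from the printed response, with the text shortened.
sample = {
    "replies": [
        FakeMessage(
            content='The "best" French cheese is subjective and depends on personal taste...',
            meta={
                "model": "mistral-medium",
                "finish_reason": "stop",
                "usage": {"completion_tokens": 231, "prompt_tokens": 16, "total_tokens": 247},
            },
        )
    ]
}

summary = summarize_reply(sample)
print(summary["model"], summary["total_tokens"])  # prints: mistral-medium 247
```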
Mistral LLMs also support streaming responses if you pass a callback into the MistralChatGenerator, like so:
import os
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.mistral import MistralChatGenerator
os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY"
model = "mistral-medium"
# print_streaming_chunk prints each chunk as soon as it arrives
client = MistralChatGenerator(
    model=model,
    streaming_callback=print_streaming_chunk
)
response = client.run(
    messages=[ChatMessage.from_user("What is the best French cheese?")]
)
print(response)
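A callback does not have to print. As a minimal sketch, the class below collects the streamed text instead. It assumes each chunk exposes its text through a content attribute, as Haystack's StreamingChunk does; ChunkCollector is a hypothetical name, and SimpleNamespace objects stand in for real chunks so the snippet runs offline.

```python
from types import SimpleNamespace
from typing import List


class ChunkCollector:
    """Callable that accumulates streamed text (hypothetical helper)."""

    def __init__(self) -> None:
        self.parts: List[str] = []

    def __call__(self, chunk) -> None:
        # Assumes the chunk carries its text in a `content` attribute.
        self.parts.append(chunk.content)

    @property
    def text(self) -> str:
        """All text received so far, joined in arrival order."""
        return "".join(self.parts)


collector = ChunkCollector()
# Simulate three chunks arriving from the model.
for piece in ["Hello, ", "world", "!"]:
    collector(SimpleNamespace(content=piece))

print(collector.text)  # prints: Hello, world!
```

With the real component, an instance like collector could presumably be passed as streaming_callback in place of print_streaming_chunk, since both are callables that take a single chunk.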
Use Mistral Embedding Models
Use the MistralDocumentEmbedder in an indexing pipeline:
import os
from haystack import Document, Pipeline
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.mistral.document_embedder import MistralDocumentEmbedder
os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY"
document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")
documents = [Document(content="My name is Wolfgang and I live in Berlin"),
             Document(content="I saw a black horse running"),
             Document(content="Germany has many big cities")]
embedder = MistralDocumentEmbedder()
writer = DocumentWriter(document_store=document_store)
indexing_pipeline = Pipeline()
indexing_pipeline.add_component(name="embedder", instance=embedder)
indexing_pipeline.add_component(name="writer", instance=writer)
# Send the embedded documents to the writer so they are stored
indexing_pipeline.connect("embedder.documents", "writer.documents")
indexing_pipeline.run(data={"embedder": {"documents": documents}})
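The document store above is configured with embedding_similarity_function="cosine", which scores a document by the cosine of the angle between its embedding and the query embedding. The following plain-Python sketch illustrates that scoring on made-up three-dimensional vectors. Real mistral-embed vectors are much longer, and cosine_similarity and rank are hypothetical helpers, not Haystack functions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank(query: Sequence[float], docs: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings, invented purely for illustration.
toy_docs = {
    "berlin": [1.0, 0.0, 0.0],
    "horse": [0.0, 1.0, 0.0],
    "cities": [1.0, 1.0, 0.0],
}
toy_query = [1.0, 0.0, 0.0]

ranking = rank(toy_query, toy_docs)
print([doc_id for doc_id, _ in ranking])  # prints: ['berlin', 'cities', 'horse']
```

Because cosine similarity depends only on direction, not vector length, the "cities" vector still scores about 0.71 against the query even though its magnitude differs.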
Use the MistralTextEmbedder in a RAG pipeline:
import os
from haystack import Document, Pipeline
from haystack.components.builders import DynamicChatPromptBuilder
from haystack.components.generators.utils import print_streaming_chunk
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.mistral.document_embedder import MistralDocumentEmbedder
from haystack_integrations.components.embedders.mistral.text_embedder import MistralTextEmbedder
from haystack_integrations.components.generators.mistral import MistralChatGenerator
os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY"
document_store = InMemoryDocumentStore()
documents = [Document(content="My name is Wolfgang and I live in Berlin"),
             Document(content="I saw a black horse running"),
             Document(content="Germany has many big cities")]
# Embed the documents first, then store the embedded copies
document_embedder = MistralDocumentEmbedder()
documents_with_embeddings = document_embedder.run(documents)['documents']
document_store.write_documents(documents_with_embeddings)
text_embedder = MistralTextEmbedder()
retriever = InMemoryEmbeddingRetriever(document_store=document_store)
prompt_builder = DynamicChatPromptBuilder(runtime_variables=["documents"])
llm = MistralChatGenerator(streaming_callback=print_streaming_chunk)
messages = [ChatMessage.from_user("Here are some of the documents: {{documents}} \n Answer: {{query}}")]
rag_pipeline = Pipeline()
rag_pipeline.add_component("text_embedder", text_embedder)
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
# Query embedding -> retriever -> prompt -> LLM
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "llm.messages")
question = "Who lives in Berlin?"
result = rag_pipeline.run(
    {
        "text_embedder": {"text": question},
        "prompt_builder": {"template_variables": {"query": question}, "prompt_source": messages},
        "llm": {"generation_kwargs": {"max_tokens": 165}},
    }
)
License
mistral-haystack is distributed under the terms of the Apache-2.0 license.