---
title: "Valkey"
id: integrations-valkey
description: "Valkey integration for Haystack"
slug: "/integrations-valkey"
---


## haystack_integrations.components.retrievers.valkey.embedding_retriever

### ValkeyEmbeddingRetriever

A component for retrieving documents from a ValkeyDocumentStore using vector similarity search.

This retriever uses dense embeddings to find semantically similar documents. It supports
filtering by metadata fields and configurable similarity thresholds.

Key features:

- Vector similarity search using HNSW algorithm
- Metadata filtering with tag and numeric field support
- Configurable top-k results
- Filter policy management for runtime filter application

Usage example:

```python
from haystack.document_stores.types import DuplicatePolicy
from haystack import Document
from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder, SentenceTransformersDocumentEmbedder
from haystack_integrations.components.retrievers.valkey import ValkeyEmbeddingRetriever
from haystack_integrations.document_stores.valkey import ValkeyDocumentStore

document_store = ValkeyDocumentStore(index_name="my_index", embedding_dim=768)

documents = [Document(content="There are over 7,000 languages spoken around the world today."),
             Document(content="Elephants have been observed to behave in a way that indicates..."),
             Document(content="In certain places, you can witness the phenomenon of bioluminescent waves.")]

document_embedder = SentenceTransformersDocumentEmbedder()
document_embedder.warm_up()
documents_with_embeddings = document_embedder.run(documents)

document_store.write_documents(documents_with_embeddings.get("documents"), policy=DuplicatePolicy.OVERWRITE)

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
query_pipeline.add_component("retriever", ValkeyEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "How many languages are there?"

res = query_pipeline.run({"text_embedder": {"text": query}})
assert res['retriever']['documents'][0].content == "There are over 7,000 languages spoken around the world today."
```

#### __init__

```python
__init__(
    *,
    document_store: ValkeyDocumentStore,
    filters: dict[str, Any] | None = None,
    top_k: int = 10,
    filter_policy: str | FilterPolicy = FilterPolicy.REPLACE
)
```

**Parameters:**

- **document_store** (<code>ValkeyDocumentStore</code>) – The Valkey Document Store.
- **filters** (<code>dict\[str, Any\] | None</code>) – Filters applied to the retrieved Documents.
- **top_k** (<code>int</code>) – Maximum number of Documents to return.
- **filter_policy** (<code>str | FilterPolicy</code>) – Policy to determine how filters are applied.

**Raises:**

- <code>ValueError</code> – If `document_store` is not an instance of `ValkeyDocumentStore`.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ValkeyEmbeddingRetriever
```

Deserializes the component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.

**Returns:**

- <code>ValkeyEmbeddingRetriever</code> – Deserialized component.
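
The `filter_policy` parameter of `__init__` decides how filters passed at run time interact with the filters given at initialization. The sketch below illustrates the idea behind the two policies Haystack defines, `REPLACE` and `MERGE`, in plain Python. The `resolve_filters` helper is hypothetical and written only for illustration; it is not the retriever's actual code, and the merge shown here (wrapping both filters in an `AND`) is a simplification.

```python
from typing import Any, Optional

Filters = Optional[dict[str, Any]]


def resolve_filters(policy: str, init_filters: Filters, runtime_filters: Filters) -> Filters:
    """Hypothetical helper showing the idea behind the "replace" and "merge" policies."""
    if runtime_filters is None:
        # Nothing passed at run time: fall back to the filters given at init.
        return init_filters
    if policy == "replace" or init_filters is None:
        # REPLACE: run-time filters take over completely.
        return runtime_filters
    if policy == "merge":
        # MERGE (simplified assumption): require both filters to match.
        return {"operator": "AND", "conditions": [init_filters, runtime_filters]}
    raise ValueError(f"Unknown filter policy: {policy}")


init_filters = {"field": "meta.category", "operator": "==", "value": "news"}
runtime_filters = {"field": "meta.priority", "operator": ">=", "value": 5}

replaced = resolve_filters("replace", init_filters, runtime_filters)
merged = resolve_filters("merge", init_filters, runtime_filters)
```

With the default `REPLACE` policy, `replaced` is just the run-time filter, while `merged` requires both conditions to hold.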

#### run

```python
run(
    query_embedding: list[float],
    filters: dict[str, Any] | None = None,
    top_k: int | None = None,
) -> dict[str, list[Document]]
```

Retrieve documents from the `ValkeyDocumentStore`, based on their dense embeddings.

**Parameters:**

- **query_embedding** (<code>list\[float\]</code>) – Embedding of the query.
- **filters** (<code>dict\[str, Any\] | None</code>) – Filters applied to the retrieved Documents. The way runtime filters are applied depends on
  the `filter_policy` chosen at retriever initialization. See the `__init__` method for more
  details.
- **top_k** (<code>int | None</code>) – Maximum number of `Document`s to return.

**Returns:**

- <code>dict\[str, list\[Document\]\]</code> – Dictionary with a `documents` key containing the Documents most similar to `query_embedding`.

#### run_async

```python
run_async(
    query_embedding: list[float],
    filters: dict[str, Any] | None = None,
    top_k: int | None = None,
) -> dict[str, list[Document]]
```

Asynchronously retrieve documents from the `ValkeyDocumentStore`, based on their dense embeddings.

**Parameters:**

- **query_embedding** (<code>list\[float\]</code>) – Embedding of the query.
- **filters** (<code>dict\[str, Any\] | None</code>) – Filters applied to the retrieved Documents. The way runtime filters are applied depends on
  the `filter_policy` chosen at retriever initialization. See the `__init__` method for more
  details.
- **top_k** (<code>int | None</code>) – Maximum number of `Document`s to return.

**Returns:**

- <code>dict\[str, list\[Document\]\]</code> – Dictionary with a `documents` key containing the Documents most similar to `query_embedding`.

## haystack_integrations.document_stores.valkey.document_store

### ValkeyDocumentStore

Bases: <code>DocumentStore</code>

A document store implementation using Valkey with vector search capabilities.

This document store provides persistent storage for documents with embeddings and supports
vector similarity search using the Valkey Search module. It's designed for high-performance
retrieval applications requiring both semantic search and metadata filtering.

Key features:

- Vector similarity search with HNSW algorithm
- Metadata filtering on tag and numeric fields
- Configurable distance metrics (L2, cosine, inner product)
- Batch operations for efficient document management
- Both synchronous and asynchronous operations
- Cluster and standalone mode support

Supported filterable Document metadata fields:

- meta_category (TagField): exact string matches
- meta_status (TagField): status filtering
- meta_priority (NumericField): numeric comparisons
- meta_score (NumericField): score filtering
- meta_timestamp (NumericField): date/time filtering

Usage example:

```python
from haystack import Document
from haystack_integrations.document_stores.valkey import ValkeyDocumentStore

# Initialize document store
document_store = ValkeyDocumentStore(
    nodes_list=[("localhost", 6379)],
    index_name="my_documents",
    embedding_dim=768,
    distance_metric="cosine"
)

# Store documents with embeddings
documents = [
    Document(
        content="Valkey is a Redis-compatible database",
        embedding=[0.1, 0.2, ...],  # 768-dim vector
        meta={"category": "database", "priority": 1}
    )
]
document_store.write_documents(documents)

# Search with filters
results = document_store._embedding_retrival(
    embedding=[0.1, 0.15, ...],
    filters={"field": "meta.category", "operator": "==", "value": "database"},
    limit=10
)
```

#### __init__

```python
__init__(
    nodes_list: list[tuple[str, int]] | None = None,
    *,
    cluster_mode: bool = False,
    use_tls: bool = False,
    username: Secret | None = Secret.from_env_var(
        "VALKEY_USERNAME", strict=False
    ),
    password: Secret | None = Secret.from_env_var(
        "VALKEY_PASSWORD", strict=False
    ),
    request_timeout: int = 500,
    retry_attempts: int = 3,
    retry_base_delay_ms: int = 1000,
    retry_exponent_base: int = 2,
    batch_size: int = 100,
    index_name: str = "default",
    distance_metric: Literal["l2", "cosine", "ip"] = "cosine",
    embedding_dim: int = 768,
    metadata_fields: dict[str, type[str] | type[int]] | None = None
)
```

Creates a new ValkeyDocumentStore instance.

**Parameters:**

- **nodes_list** (<code>list\[tuple\[str, int\]\] | None</code>) – List of (host, port) tuples for Valkey nodes. Defaults to [("localhost", 6379)].
- **cluster_mode** (<code>bool</code>) – Whether to connect in cluster mode. Defaults to False.
- **use_tls** (<code>bool</code>) – Whether to use TLS for connections. Defaults to False.
- **username** (<code>Secret | None</code>) – Username for authentication. If not provided, reads from VALKEY_USERNAME environment variable.
  Defaults to None.
- **password** (<code>Secret | None</code>) – Password for authentication. If not provided, reads from VALKEY_PASSWORD environment variable.
  Defaults to None.
- **request_timeout** (<code>int</code>) – Request timeout in milliseconds. Defaults to 500.
- **retry_attempts** (<code>int</code>) – Number of retry attempts for failed operations. Defaults to 3.
- **retry_base_delay_ms** (<code>int</code>) – Base delay in milliseconds for exponential backoff. Defaults to 1000.
- **retry_exponent_base** (<code>int</code>) – Exponent base for exponential backoff calculation. Defaults to 2.
- **batch_size** (<code>int</code>) – Number of documents to process in a single batch for async operations. Defaults to 100.
- **index_name** (<code>str</code>) – Name of the search index. Defaults to "default".
- **distance_metric** (<code>Literal['l2', 'cosine', 'ip']</code>) – Distance metric for vector similarity. Options: "l2", "cosine", "ip" (inner product).
  Defaults to "cosine".
- **embedding_dim** (<code>int</code>) – Dimension of document embeddings. Defaults to 768.
- **metadata_fields** (<code>dict\[str, type\[str\] | type\[int\]\] | None</code>) – Dictionary mapping metadata field names to Python types for filtering.
  Supported types: str (for exact matching), int (for numeric comparisons).
  Example: `{"category": str, "priority": int}`.
  If not provided, no metadata fields will be indexed for filtering.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes this store to a dictionary.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> ValkeyDocumentStore
```

Deserializes the store from a dictionary.

#### count_documents

```python
count_documents() -> int
```

Return the number of documents stored in the document store.

This method queries the Valkey Search index to get the total count of indexed documents.
If the index doesn't exist, it returns 0.

**Returns:**

- <code>int</code> – The number of documents in the document store.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error accessing the index or counting documents.

Example:

```python
document_store = ValkeyDocumentStore()
count = document_store.count_documents()
print(f"Total documents: {count}")
```

#### count_documents_async

```python
count_documents_async() -> int
```

Asynchronously return the number of documents stored in the document store.

This method queries the Valkey Search index to get the total count of indexed documents.
If the index doesn't exist, it returns 0. This is the async version of count_documents().

**Returns:**

- <code>int</code> – The number of documents in the document store.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error accessing the index or counting documents.

Example:

```python
document_store = ValkeyDocumentStore()
count = await document_store.count_documents_async()
print(f"Total documents: {count}")
```

#### filter_documents

```python
filter_documents(filters: dict[str, Any] | None = None) -> list[Document]
```

Filter documents by metadata without vector search.

This method retrieves documents based on metadata filters without performing vector similarity search.
Since Valkey Search requires vector queries, this method uses a dummy vector internally and removes
the similarity scores from results.

**Parameters:**

- **filters** (<code>dict\[str, Any\] | None</code>) – Optional metadata filters in Haystack format. Supports filtering on:
  - meta.category (string equality)
  - meta.status (string equality)
  - meta.priority (numeric comparisons)
  - meta.score (numeric comparisons)
  - meta.timestamp (numeric comparisons)

**Returns:**

- <code>list\[Document\]</code> – List of documents matching the filters, with score set to None.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error filtering documents.

Example:

```python
# Filter by category
docs = document_store.filter_documents(
    filters={"field": "meta.category", "operator": "==", "value": "news"}
)

# Filter by numeric range
docs = document_store.filter_documents(
    filters={"field": "meta.priority", "operator": ">=", "value": 5}
)
```

#### filter_documents_async

```python
filter_documents_async(filters: dict[str, Any] | None = None) -> list[Document]
```

Asynchronously filter documents by metadata without vector search.

This is the async version of filter_documents(). It retrieves documents based on metadata filters
without performing vector similarity search. Since Valkey Search requires vector queries, this method
uses a dummy vector internally and removes the similarity scores from results.

**Parameters:**

- **filters** (<code>dict\[str, Any\] | None</code>) – Optional metadata filters in Haystack format. Supports filtering on:
  - meta.category (string equality)
  - meta.status (string equality)
  - meta.priority (numeric comparisons)
  - meta.score (numeric comparisons)
  - meta.timestamp (numeric comparisons)

**Returns:**

- <code>list\[Document\]</code> – List of documents matching the filters, with score set to None.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error filtering documents.

Example:

```python
# Filter by category
docs = await document_store.filter_documents_async(
    filters={"field": "meta.category", "operator": "==", "value": "news"}
)

# Filter by numeric range
docs = await document_store.filter_documents_async(
    filters={"field": "meta.priority", "operator": ">=", "value": 5}
)
```

#### write_documents

```python
write_documents(
    documents: list[Document], policy: DuplicatePolicy = DuplicatePolicy.NONE
) -> int
```

Write documents to the document store.

This method stores documents with their embeddings and metadata in Valkey. The search index is
automatically created if it doesn't exist. Documents without embeddings will be assigned a
dummy vector for indexing purposes.

**Parameters:**

- **documents** (<code>list\[Document\]</code>) – List of Document objects to store. Each document should have:
  - content: The document text
  - embedding: Vector representation (optional, dummy vector used if missing)
  - meta: Optional metadata dict with supported fields (category, status, priority, score, timestamp)
- **policy** (<code>DuplicatePolicy</code>) – How to handle duplicate documents. Only NONE and OVERWRITE are supported.
  Defaults to DuplicatePolicy.NONE.

**Returns:**

- <code>int</code> – Number of documents successfully written.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error writing documents.
- <code>ValueError</code> – If documents list contains invalid objects.

Example:

```python
documents = [
    Document(
        content="First document",
        embedding=[0.1, 0.2, 0.3],
        meta={"category": "news", "priority": 1}
    ),
    Document(
        content="Second document",
        embedding=[0.4, 0.5, 0.6],
        meta={"category": "blog", "priority": 2}
    )
]
count = document_store.write_documents(documents)
print(f"Wrote {count} documents")
```

#### write_documents_async

```python
write_documents_async(
    documents: list[Document], policy: DuplicatePolicy = DuplicatePolicy.NONE
) -> int
```

Asynchronously write documents to the document store.

This is the async version of write_documents(). It stores documents with their embeddings and
metadata in Valkey using batch processing for improved performance. The search index is
automatically created if it doesn't exist.

**Parameters:**

- **documents** (<code>list\[Document\]</code>) – List of Document objects to store. Each document should have:
  - content: The document text
  - embedding: Vector representation (optional, dummy vector used if missing)
  - meta: Optional metadata dict with supported fields (category, status, priority, score, timestamp)
- **policy** (<code>DuplicatePolicy</code>) – How to handle duplicate documents. Only NONE and OVERWRITE are supported.
  Defaults to DuplicatePolicy.NONE.

**Returns:**

- <code>int</code> – Number of documents successfully written.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error writing documents.
- <code>ValueError</code> – If documents list contains invalid objects.

Example:

```python
documents = [
    Document(
        content="First document",
        embedding=[0.1, 0.2, 0.3],
        meta={"category": "news", "priority": 1}
    ),
    Document(
        content="Second document",
        embedding=[0.4, 0.5, 0.6],
        meta={"category": "blog", "priority": 2}
    )
]
count = await document_store.write_documents_async(documents)
print(f"Wrote {count} documents")
```

#### delete_documents

```python
delete_documents(document_ids: list[str]) -> None
```

Delete documents from the document store by their IDs.

This method removes documents from both the Valkey database and the search index.
If some documents are not found, a warning is logged but the operation continues.

**Parameters:**

- **document_ids** (<code>list\[str\]</code>) – List of document IDs to delete. These should be the same IDs
  used when the documents were originally stored.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error deleting documents.

Example:

```python
# Delete specific documents
document_store.delete_documents(["doc1", "doc2", "doc3"])

# Delete a single document
document_store.delete_documents(["single_doc_id"])
```

#### delete_documents_async

```python
delete_documents_async(document_ids: list[str]) -> None
```

Asynchronously delete documents from the document store by their IDs.

This is the async version of delete_documents(). It removes documents from both the Valkey
database and the search index. If some documents are not found, a warning is logged but
the operation continues.

**Parameters:**

- **document_ids** (<code>list\[str\]</code>) – List of document IDs to delete. These should be the same IDs
  used when the documents were originally stored.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error deleting documents.

Example:

```python
# Delete specific documents
await document_store.delete_documents_async(["doc1", "doc2", "doc3"])

# Delete a single document
await document_store.delete_documents_async(["single_doc_id"])
```

#### delete_by_filter

```python
delete_by_filter(filters: dict[str, Any]) -> int
```

Delete all documents that match the provided filters.

**Parameters:**

- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dictionary to select documents to delete.

**Returns:**

- <code>int</code> – The number of documents deleted.

**Raises:**

- <code>FilterError</code> – If the filter structure is invalid.
- <code>ValkeyDocumentStoreError</code> – If deletion fails.

#### delete_by_filter_async

```python
delete_by_filter_async(filters: dict[str, Any]) -> int
```

Asynchronously delete all documents that match the provided filters.

**Parameters:**

- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dictionary to select documents to delete.

**Returns:**

- <code>int</code> – The number of documents deleted.

**Raises:**

- <code>FilterError</code> – If the filter structure is invalid.
- <code>ValkeyDocumentStoreError</code> – If deletion fails.

#### update_by_filter

```python
update_by_filter(filters: dict[str, Any], meta: dict[str, Any]) -> int
```

Update metadata of all documents that match the provided filters.

**Parameters:**

- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dictionary to select documents to update.
- **meta** (<code>dict\[str, Any\]</code>) – Metadata key-value pairs to set on matching documents (merged with existing meta).

**Returns:**

- <code>int</code> – The number of documents updated.

**Raises:**

- <code>FilterError</code> – If the filter structure is invalid.
- <code>ValkeyDocumentStoreError</code> – If update or write fails.

#### update_by_filter_async

```python
update_by_filter_async(filters: dict[str, Any], meta: dict[str, Any]) -> int
```

Asynchronously update metadata of all documents that match the provided filters.

**Parameters:**

- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dictionary to select documents to update.
- **meta** (<code>dict\[str, Any\]</code>) – Metadata key-value pairs to set on matching documents (merged with existing meta).

**Returns:**

- <code>int</code> – The number of documents updated.

**Raises:**

- <code>FilterError</code> – If the filter structure is invalid.
- <code>ValkeyDocumentStoreError</code> – If update or write fails.

#### count_documents_by_filter

```python
count_documents_by_filter(filters: dict[str, Any]) -> int
```

Return the number of documents that match the provided filters.

**Parameters:**

- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dictionary to apply.

**Returns:**

- <code>int</code> – The number of matching documents.

**Raises:**

- <code>FilterError</code> – If the filter structure is invalid.
- <code>ValkeyDocumentStoreError</code> – If counting fails.

#### count_documents_by_filter_async

```python
count_documents_by_filter_async(filters: dict[str, Any]) -> int
```

Asynchronously return the number of documents that match the provided filters.

**Parameters:**

- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dictionary to apply.

**Returns:**

- <code>int</code> – The number of matching documents.

**Raises:**

- <code>FilterError</code> – If the filter structure is invalid.
- <code>ValkeyDocumentStoreError</code> – If counting fails.

#### count_unique_metadata_by_filter

```python
count_unique_metadata_by_filter(
    filters: dict[str, Any], metadata_fields: list[str]
) -> dict[str, int]
```

Count unique values for each specified metadata field in documents matching the filters.

**Parameters:**

- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dictionary to select documents.
- **metadata_fields** (<code>list\[str\]</code>) – List of metadata field names (e.g. "category" or "meta.category").

**Returns:**

- <code>dict\[str, int\]</code> – Dictionary mapping each field name to the count of its unique values.

**Raises:**

- <code>FilterError</code> – If the filter structure is invalid.
- <code>ValueError</code> – If a field in metadata_fields is not configured for filtering.
- <code>ValkeyDocumentStoreError</code> – If the operation fails.

#### count_unique_metadata_by_filter_async

```python
count_unique_metadata_by_filter_async(
    filters: dict[str, Any], metadata_fields: list[str]
) -> dict[str, int]
```

Asynchronously count unique values for each specified metadata field in documents matching the filters.

**Parameters:**

- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dictionary to select documents.
- **metadata_fields** (<code>list\[str\]</code>) – List of metadata field names (e.g. "category" or "meta.category").

**Returns:**

- <code>dict\[str, int\]</code> – Dictionary mapping each field name to the count of its unique values.

**Raises:**

- <code>FilterError</code> – If the filter structure is invalid.
- <code>ValueError</code> – If a field in metadata_fields is not configured for filtering.
- <code>ValkeyDocumentStoreError</code> – If the operation fails.

#### get_metadata_fields_info

```python
get_metadata_fields_info() -> dict[str, dict[str, str]]
```

Return information about metadata fields configured for filtering.

Returns the store's configured metadata field names and their types (as used in the index).
Field names are returned without the "meta." prefix (e.g. "category", "priority").

**Returns:**

- <code>dict\[str, dict\[str, str\]\]</code> – Dictionary mapping field name to a dict with "type" key ("keyword" for tag, "long" for numeric).

#### get_metadata_field_min_max

```python
get_metadata_field_min_max(metadata_field: str) -> dict[str, Any]
```

Return the minimum and maximum values for a numeric metadata field.

**Parameters:**

- **metadata_field** (<code>str</code>) – Metadata field name (e.g. "priority" or "meta.priority"). Must be a configured
  numeric field.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with "min" and "max" keys (values are int/float or None if no values).

**Raises:**

- <code>ValueError</code> – If the field is not configured or is not numeric.
- <code>ValkeyDocumentStoreError</code> – If the operation fails.

#### get_metadata_field_min_max_async

```python
get_metadata_field_min_max_async(metadata_field: str) -> dict[str, Any]
```

Asynchronously return the minimum and maximum values for a numeric metadata field.

**Parameters:**

- **metadata_field** (<code>str</code>) – Metadata field name (e.g. "priority" or "meta.priority"). Must be a configured
  numeric field.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with "min" and "max" keys (values are int/float or None if no values).

**Raises:**

- <code>ValueError</code> – If the field is not configured or is not numeric.
- <code>ValkeyDocumentStoreError</code> – If the operation fails.

#### get_metadata_field_unique_values

```python
get_metadata_field_unique_values(
    metadata_field: str,
    search_term: str | None = None,
    from_: int = 0,
    size: int = 10,
) -> tuple[list[str], int]
```

Return unique values for a metadata field with optional search and pagination.

Values are stringified. For tag fields the distinct values are returned; for numeric fields
the string representation of each distinct value is returned.

**Parameters:**

- **metadata_field** (<code>str</code>) – Metadata field name (e.g. "category" or "meta.category").
- **search_term** (<code>str | None</code>) – Optional case-insensitive substring filter on the value.
- **from\_** (<code>int</code>) – Start index for pagination (default 0).
- **size** (<code>int</code>) – Number of values to return (default 10).

**Returns:**

- <code>tuple\[list\[str\], int\]</code> – Tuple of (list of unique values for the requested page, total count of unique values).

**Raises:**

- <code>ValueError</code> – If the field is not configured for filtering.
- <code>ValkeyDocumentStoreError</code> – If the operation fails.

#### get_metadata_field_unique_values_async

```python
get_metadata_field_unique_values_async(
    metadata_field: str,
    search_term: str | None = None,
    from_: int = 0,
    size: int = 10,
) -> tuple[list[str], int]
```

Asynchronously return unique values for a metadata field with optional search and pagination.

**Parameters:**

- **metadata_field** (<code>str</code>) – Metadata field name (e.g. "category" or "meta.category").
- **search_term** (<code>str | None</code>) – Optional case-insensitive substring filter on the value.
- **from\_** (<code>int</code>) – Start index for pagination (default 0).
- **size** (<code>int</code>) – Number of values to return (default 10).

**Returns:**

- <code>tuple\[list\[str\], int\]</code> – Tuple of (list of unique values for the requested page, total count of unique values).

**Raises:**

- <code>ValueError</code> – If the field is not configured for filtering.
- <code>ValkeyDocumentStoreError</code> – If the operation fails.

#### delete_all_documents

```python
delete_all_documents() -> None
```

Delete all documents from the document store.

This method removes all documents by dropping the entire search index. This is an efficient
way to clear all data but requires recreating the index for future operations. If the index
doesn't exist, the operation completes without error.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error dropping the index.

Warning:
This operation is irreversible and will permanently delete all documents and the search index.

Example:

```python
# Clear all documents from the store
document_store.delete_all_documents()

# The index will be automatically recreated on next write operation
document_store.write_documents(new_documents)
```

#### delete_all_documents_async

```python
delete_all_documents_async() -> None
```

Asynchronously delete all documents from the document store.

This is the async version of delete_all_documents(). It removes all documents by dropping
the entire search index. This is an efficient way to clear all data but requires recreating
the index for future operations. If the index doesn't exist, the operation completes without error.

**Raises:**

- <code>ValkeyDocumentStoreError</code> – If there's an error dropping the index.

Warning:
This operation is irreversible and will permanently delete all documents and the search index.

Example:

```python
# Clear all documents from the store
await document_store.delete_all_documents_async()

# The index will be automatically recreated on next write operation
await document_store.write_documents_async(new_documents)
```

## haystack_integrations.document_stores.valkey.filters

Valkey document store filtering utilities.

This module provides filter conversion from Haystack's filter format to Valkey Search query syntax.
It supports both tag-based exact matching and numeric range filtering with logical operators.

Supported filter operations:

- TagField filters: ==, !=, in, not in (exact string matches)
- NumericField filters: ==, !=, >, >=, \<, \<=, in, not in (numeric comparisons)
- Logical operators: AND, OR for combining conditions

Filter syntax examples:

```python
# Simple equality filter
filters = {"field": "meta.category", "operator": "==", "value": "tech"}

# Numeric range filter
filters = {"field": "meta.priority", "operator": ">=", "value": 5}

# List membership filter
filters = {"field": "meta.status", "operator": "in", "value": ["active", "pending"]}

# Complex logical filter
filters = {
    "operator": "AND",
    "conditions": [
        {"field": "meta.category", "operator": "==", "value": "tech"},
        {"field": "meta.priority", "operator": ">=", "value": 3}
    ]
}
```
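
Because these filter dictionaries are plain Python data, they can also be assembled programmatically. The sketch below uses two hypothetical helpers, `condition` and `all_of` (neither is part of the integration), to build the same structure as the complex logical filter in the examples.

```python
from typing import Any


def condition(field: str, operator: str, value: Any) -> dict[str, Any]:
    """Build a single comparison filter (hypothetical helper)."""
    return {"field": field, "operator": operator, "value": value}


def all_of(*conditions: dict[str, Any]) -> dict[str, Any]:
    """Combine comparison filters with a logical AND (hypothetical helper)."""
    return {"operator": "AND", "conditions": list(conditions)}


# Same structure as the "Complex logical filter" example.
filters = all_of(
    condition("meta.category", "==", "tech"),
    condition("meta.priority", ">=", 3),
)
```

The resulting `filters` value can be passed to any method that accepts a Haystack filter dictionary, such as `filter_documents`.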