---
title: "Oracle AI Vector Search"
id: integrations-oracle
description: "Oracle AI Vector Search integration for Haystack"
slug: "/integrations-oracle"
---

  9  ## haystack_integrations.components.retrievers.oracle.embedding_retriever
 10  
 11  ### OracleEmbeddingRetriever
 12  
 13  Retrieves documents from an OracleDocumentStore using vector similarity.
 14  
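Use it inside a Haystack pipeline after a text embedder:

```python
from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack_integrations.components.retrievers.oracle.embedding_retriever import OracleEmbeddingRetriever

pipeline = Pipeline()  # `store` is an existing OracleDocumentStore
pipeline.add_component("embedder", SentenceTransformersTextEmbedder())
pipeline.add_component("retriever", OracleEmbeddingRetriever(
    document_store=store, top_k=5
))
pipeline.connect("embedder.embedding", "retriever.query_embedding")
```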
 24  
 25  #### run
 26  
 27  ```python
 28  run(
 29      query_embedding: list[float],
 30      filters: dict[str, Any] | None = None,
 31      top_k: int | None = None,
 32  ) -> dict[str, list[Document]]
 33  ```
 34  
 35  Retrieve documents by vector similarity.
 36  
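**Parameters:**

- **query_embedding** (<code>list\[float\]</code>) – Dense float vector from an embedder component.
- **filters** (<code>dict\[str, Any\] | None</code>) – Runtime filters, merged with the constructor filters according to `filter_policy`.
- **top_k** (<code>int | None</code>) – Overrides the constructor `top_k` for this call.

**Returns:**

- <code>dict\[str, list\[Document\]\]</code> – `{"documents": [Document, ...]}`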
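
To make "vector similarity" and `top_k` concrete, the following is a minimal pure-Python sketch of ranking documents by distance to a query embedding. It is an illustration only, not the integration's implementation: the real search runs inside Oracle. The metric names mirror the `distance_metric` option of `OracleDocumentStore`; the helper names (`rank`, `METRICS`) are hypothetical, and treating `DOT` as a negated dot product is an assumption made so that a smaller value always means a closer match.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_distance(a: list[float], b: list[float]) -> float:
    """Negated dot product (assumption), so smaller still means closer."""
    return -sum(x * y for x, y in zip(a, b))


# Hypothetical lookup table keyed by the documented metric names.
METRICS = {
    "COSINE": cosine_distance,
    "EUCLIDEAN": euclidean_distance,
    "DOT": dot_distance,
}


def rank(query: list[float], docs: dict[str, list[float]],
         metric: str = "COSINE", top_k: int = 2) -> list[str]:
    """Return the ids of the top_k documents closest to the query."""
    dist = METRICS[metric]
    return sorted(docs, key=lambda doc_id: dist(query, docs[doc_id]))[:top_k]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
nearest = rank([1.0, 0.0], docs, metric="COSINE", top_k=2)
print(nearest)
```

Passing a smaller `top_k` simply truncates the ranked list, which is the effect the per-call `top_k` override has on the returned `documents`.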
 44  
 45  #### run_async
 46  
 47  ```python
 48  run_async(
 49      query_embedding: list[float],
 50      filters: dict[str, Any] | None = None,
 51      top_k: int | None = None,
 52  ) -> dict[str, list[Document]]
 53  ```
 54  
Async variant of `run`. Takes the same parameters and returns the same result.
 56  
 57  #### to_dict
 58  
 59  ```python
 60  to_dict() -> dict[str, Any]
 61  ```
 62  
 63  Serializes the component to a dictionary.
 64  
 65  **Returns:**
 66  
 67  - <code>dict\[str, Any\]</code> – Dictionary with serialized data.
 68  
 69  #### from_dict
 70  
 71  ```python
 72  from_dict(data: dict[str, Any]) -> OracleEmbeddingRetriever
 73  ```
 74  
 75  Deserializes the component from a dictionary.
 76  
 77  **Parameters:**
 78  
 79  - **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.
 80  
 81  **Returns:**
 82  
 83  - <code>OracleEmbeddingRetriever</code> – Deserialized component.
 84  
 85  ## haystack_integrations.document_stores.oracle.document_store
 86  
 87  ### OracleConnectionConfig
 88  
 89  Connection parameters for Oracle Database.
 90  
 91  Supports both thin (direct TCP) and thick (wallet / ADB-S) modes.
 92  Thin mode requires no Oracle Instant Client; thick mode is activated
 93  automatically when *wallet_location* is provided.
 94  
 95  #### to_dict
 96  
 97  ```python
 98  to_dict() -> dict[str, Any]
 99  ```
100  
101  Serializes the component to a dictionary.
102  
103  **Returns:**
104  
105  - <code>dict\[str, Any\]</code> – Dictionary with serialized data.
106  
107  #### from_dict
108  
109  ```python
110  from_dict(data: dict[str, Any]) -> OracleConnectionConfig
111  ```
112  
113  Deserializes the component from a dictionary.
114  
115  **Parameters:**
116  
117  - **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.
118  
119  **Returns:**
120  
121  - <code>OracleConnectionConfig</code> – Deserialized component.
122  
123  ### OracleDocumentStore
124  
125  Haystack DocumentStore backed by Oracle AI Vector Search.
126  
127  Requires Oracle Database 23ai or later (for VECTOR data type and
128  IF NOT EXISTS DDL support).
129  
Usage:

```python
from haystack.utils import Secret
from haystack_integrations.document_stores.oracle import (
    OracleDocumentStore, OracleConnectionConfig,
)

store = OracleDocumentStore(
    connection_config=OracleConnectionConfig(
        user=Secret.from_env_var("ORACLE_USER"),
        password=Secret.from_env_var("ORACLE_PASSWORD"),
        dsn=Secret.from_env_var("ORACLE_DSN"),
    ),
    embedding_dim=1536,
)
```
147  
148  #### __init__
149  
150  ```python
151  __init__(
152      *,
153      connection_config: OracleConnectionConfig,
154      table_name: str = "haystack_documents",
155      embedding_dim: int,
156      distance_metric: Literal["COSINE", "EUCLIDEAN", "DOT"] = "COSINE",
157      create_table_if_not_exists: bool = True,
158      create_index: bool = False,
159      hnsw_neighbors: int = 32,
160      hnsw_ef_construction: int = 200,
161      hnsw_accuracy: int = 95,
162      hnsw_parallel: int = 4
163  ) -> None
164  ```
165  
166  Initialise the document store and optionally create the backing table and indexes.
167  
168  **Parameters:**
169  
170  - **connection_config** (<code>OracleConnectionConfig</code>) – Oracle connection settings (user, password, DSN, optional wallet).
171  - **table_name** (<code>str</code>) – Name of the Oracle table used to store documents. Must be a valid Oracle
172    identifier (letters, digits, `_`, `$`, `#`; max 128 chars; cannot start with a digit).
173  - **embedding_dim** (<code>int</code>) – Dimensionality of the embedding vectors. Must match the model producing them.
174  - **distance_metric** (<code>Literal['COSINE', 'EUCLIDEAN', 'DOT']</code>) – Vector distance function used for similarity search.
175    One of `"COSINE"`, `"EUCLIDEAN"`, or `"DOT"`.
176  - **create_table_if_not_exists** (<code>bool</code>) – When `True` (default), creates the table and the DBMS_SEARCH
177    keyword index on first use if they do not already exist. Set to `False` when connecting to a
178    pre-existing table.
- **create_index** (<code>bool</code>) – When `True`, creates an HNSW vector index on initialisation. Equivalent to
  calling `create_hnsw_index` manually. Defaults to `False`.
181  - **hnsw_neighbors** (<code>int</code>) – Number of neighbours in the HNSW graph. Higher values improve recall at the
182    cost of index size and build time. Defaults to `32`.
183  - **hnsw_ef_construction** (<code>int</code>) – Size of the dynamic candidate list during HNSW index construction.
184    Higher values improve recall at the cost of build time. Defaults to `200`.
185  - **hnsw_accuracy** (<code>int</code>) – Target recall accuracy percentage for the HNSW index (0-100).
186    Defaults to `95`.
187  - **hnsw_parallel** (<code>int</code>) – Degree of parallelism used when building the HNSW index. Defaults to `4`.
188  
189  **Raises:**
190  
191  - <code>ValueError</code> – If `table_name` is not a valid Oracle identifier or `embedding_dim` is not
192    a positive integer.
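
The constructor checks can be pictured with a short sketch. It mirrors the rules as documented here (allowed characters, length limit, no leading digit, positive `embedding_dim`); the helper names are hypothetical, and the integration's actual validation code may differ in detail.

```python
import re

# Letters, digits, `_`, `$`, `#`; the first character must not be a digit;
# at most 128 characters in total (1 leading + up to 127 following).
_IDENTIFIER = re.compile(r"[A-Za-z_$#][A-Za-z0-9_$#]{0,127}")


def is_valid_table_name(name: str) -> bool:
    """Return True when `name` satisfies the documented identifier rule."""
    return _IDENTIFIER.fullmatch(name) is not None


def validate_args(table_name: str, embedding_dim: int) -> None:
    """Raise ValueError for inputs the documented rules reject."""
    if not is_valid_table_name(table_name):
        raise ValueError(f"Invalid Oracle identifier: {table_name!r}")
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(embedding_dim, bool) or not isinstance(embedding_dim, int) or embedding_dim <= 0:
        raise ValueError("embedding_dim must be a positive integer")


validate_args("haystack_documents", 1536)  # passes silently
```

Rejecting anything outside the allowed character set up front matters because a table name is interpolated into DDL and cannot be passed as a bind variable.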
193  
194  #### create_keyword_index
195  
196  ```python
197  create_keyword_index() -> None
198  ```
199  
200  Create the DBMS_SEARCH keyword index on this table.
201  
202  Safe to call multiple times — silently skips if the index already exists.
203  Required for keyword retrieval. Called automatically when
204  `create_table_if_not_exists=True`, but must be called explicitly
205  when connecting to a pre-existing table.
206  
207  #### create_hnsw_index
208  
209  ```python
210  create_hnsw_index() -> None
211  ```
212  
213  Create an HNSW vector index on the embedding column.
214  
Safe to call multiple times; uses `IF NOT EXISTS`.
216  
217  #### create_hnsw_index_async
218  
219  ```python
220  create_hnsw_index_async() -> None
221  ```
222  
223  Asynchronously creates an HNSW vector index on the embedding column.
224  
225  Safe to call multiple times — uses `IF NOT EXISTS`.
226  
227  #### write_documents
228  
229  ```python
230  write_documents(
231      documents: list[Document], policy: DuplicatePolicy = DuplicatePolicy.NONE
232  ) -> int
233  ```
234  
235  Writes documents to the document store.
236  
237  **Parameters:**
238  
239  - **documents** (<code>list\[Document\]</code>) – A list of Documents to write to the document store.
240  - **policy** (<code>DuplicatePolicy</code>) – The duplicate policy to use when writing documents.
241  
242  **Returns:**
243  
244  - <code>int</code> – The number of documents written to the document store.
245  
246  **Raises:**
247  
248  - <code>DuplicateDocumentError</code> – If a document with the same id already exists in the document store
249    and the policy is set to `DuplicatePolicy.FAIL` or `DuplicatePolicy.NONE`.
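
As a rough mental model of the duplicate policies, here is an in-memory sketch. `Policy` and `DuplicateError` are hypothetical stand-ins for Haystack's `DuplicatePolicy` and `DuplicateDocumentError`. The `NONE` and `FAIL` behaviour follows the description above; the `SKIP` and `OVERWRITE` behaviour, and the way the return count is computed, are assumptions based on the usual Haystack semantics.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical stand-in for haystack's DuplicatePolicy."""
    NONE = "none"
    SKIP = "skip"
    OVERWRITE = "overwrite"
    FAIL = "fail"


class DuplicateError(Exception):
    """Hypothetical stand-in for DuplicateDocumentError."""


def write(table: dict[str, str], docs: dict[str, str],
          policy: Policy = Policy.NONE) -> int:
    """Write id -> content pairs into `table`; return how many were written."""
    written = 0
    for doc_id, content in docs.items():
        if doc_id in table:
            if policy in (Policy.NONE, Policy.FAIL):
                raise DuplicateError(doc_id)
            if policy is Policy.SKIP:
                continue  # keep the stored document, do not count it
        table[doc_id] = content  # new id, or OVERWRITE of an existing one
        written += 1
    return written


table = {"1": "old"}
print(write(table, {"1": "new", "2": "fresh"}, Policy.SKIP))
```

This toy model does not address whether a failed batch is rolled back; check the behaviour against a real store before relying on it.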
250  
251  #### write_documents_async
252  
253  ```python
254  write_documents_async(
255      documents: list[Document], policy: DuplicatePolicy = DuplicatePolicy.NONE
256  ) -> int
257  ```
258  
259  Asynchronously writes documents to the document store.
260  
261  **Parameters:**
262  
263  - **documents** (<code>list\[Document\]</code>) – A list of Documents to write to the document store.
264  - **policy** (<code>DuplicatePolicy</code>) – The duplicate policy to use when writing documents.
265  
266  **Returns:**
267  
268  - <code>int</code> – The number of documents written to the document store.
269  
270  **Raises:**
271  
272  - <code>DuplicateDocumentError</code> – If a document with the same id already exists in the document store
273    and the policy is set to `DuplicatePolicy.FAIL` or `DuplicatePolicy.NONE`.
274  
275  #### filter_documents
276  
277  ```python
278  filter_documents(filters: dict[str, Any] | None = None) -> list[Document]
279  ```
280  
281  Returns the documents that match the filters provided.
282  
For a detailed specification of the filters,
refer to the [metadata filtering documentation](https://docs.haystack.deepset.ai/docs/metadata-filtering).
285  
286  **Parameters:**
287  
288  - **filters** (<code>dict\[str, Any\] | None</code>) – The filters to apply to the document list.
289  
290  **Returns:**
291  
292  - <code>list\[Document\]</code> – A list of Documents that match the given filters.
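
For orientation, a filter in the Haystack 2.x dictionary syntax might look like the sketch below. The field names `meta.lang` and `meta.year` are hypothetical, and the call itself is shown as a comment because it needs a live store.

```python
# Match documents where meta.lang == "en" AND meta.year >= 2020.
filters = {
    "operator": "AND",
    "conditions": [
        {"field": "meta.lang", "operator": "==", "value": "en"},
        {"field": "meta.year", "operator": ">=", "value": 2020},
    ],
}

# With an initialised store (not constructed here):
# documents = store.filter_documents(filters=filters)
print(filters["operator"], len(filters["conditions"]))
```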
293  
294  #### filter_documents_async
295  
296  ```python
297  filter_documents_async(filters: dict[str, Any] | None = None) -> list[Document]
298  ```
299  
300  Asynchronously returns the documents that match the filters provided.
301  
For a detailed specification of the filters,
refer to the [metadata filtering documentation](https://docs.haystack.deepset.ai/docs/metadata-filtering).
304  
305  **Parameters:**
306  
307  - **filters** (<code>dict\[str, Any\] | None</code>) – The filters to apply to the document list.
308  
309  **Returns:**
310  
311  - <code>list\[Document\]</code> – A list of Documents that match the given filters.
312  
313  #### delete_documents
314  
315  ```python
316  delete_documents(document_ids: list[str]) -> None
317  ```
318  
319  Deletes documents that match the provided `document_ids` from the document store.
320  
321  **Parameters:**
322  
- **document_ids** (<code>list\[str\]</code>) – The IDs of the documents to delete.
324  
325  #### delete_documents_async
326  
327  ```python
328  delete_documents_async(document_ids: list[str]) -> None
329  ```
330  
331  Asynchronously deletes documents that match the provided `document_ids` from the document store.
332  
333  **Parameters:**
334  
- **document_ids** (<code>list\[str\]</code>) – The IDs of the documents to delete.
336  
337  #### count_documents
338  
339  ```python
340  count_documents() -> int
341  ```
342  
343  Returns how many documents are present in the document store.
344  
345  **Returns:**
346  
347  - <code>int</code> – Number of documents in the document store.
348  
349  #### count_documents_async
350  
351  ```python
352  count_documents_async() -> int
353  ```
354  
355  Asynchronously returns how many documents are present in the document store.
356  
357  **Returns:**
358  
359  - <code>int</code> – Number of documents in the document store.
360  
361  #### delete_table
362  
363  ```python
364  delete_table() -> None
365  ```
366  
367  Permanently drops the document store table and its associated DBMS_SEARCH keyword index.
368  
Uses `DROP TABLE ... PURGE`, which bypasses the Oracle recycle bin, so the operation is
irreversible. The keyword index is dropped after the table; if either operation fails, a
`DocumentStoreError` is raised.
372  
373  **Raises:**
374  
375  - <code>DocumentStoreError</code> – If the table or keyword index cannot be dropped.
376  
377  #### delete_table_async
378  
379  ```python
380  delete_table_async() -> None
381  ```
382  
Asynchronously and permanently drops the document store table and its DBMS_SEARCH keyword index.

Uses `DROP TABLE ... PURGE`, which bypasses the Oracle recycle bin, so the operation is
irreversible.
387  
388  **Raises:**
389  
390  - <code>DocumentStoreError</code> – If the table or keyword index cannot be dropped.
391  
392  #### delete_all_documents
393  
394  ```python
395  delete_all_documents() -> None
396  ```
397  
398  Removes all documents from the table using `TRUNCATE`.
399  
400  `TRUNCATE` is non-recoverable — it cannot be rolled back and bypasses row-level triggers.
401  The table structure and indexes are preserved.
402  
403  #### delete_all_documents_async
404  
405  ```python
406  delete_all_documents_async() -> None
407  ```
408  
409  Asynchronously removes all documents from the table using `TRUNCATE`.
410  
411  `TRUNCATE` is non-recoverable — it cannot be rolled back and bypasses row-level triggers.
412  The table structure and indexes are preserved.
413  
414  #### count_documents_by_filter
415  
416  ```python
417  count_documents_by_filter(filters: dict[str, Any]) -> int
418  ```
419  
420  Returns the number of documents that match the provided filters.
421  
422  **Parameters:**
423  
- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dict. An empty dict matches all documents.
  See the [metadata filtering docs](https://docs.haystack.deepset.ai/docs/metadata-filtering).
426  
427  **Returns:**
428  
429  - <code>int</code> – Count of matching documents.
430  
431  #### count_documents_by_filter_async
432  
433  ```python
434  count_documents_by_filter_async(filters: dict[str, Any]) -> int
435  ```
436  
437  Asynchronously returns the number of documents that match the provided filters.
438  
439  **Parameters:**
440  
- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dict. An empty dict matches all documents.
  See the [metadata filtering docs](https://docs.haystack.deepset.ai/docs/metadata-filtering).
443  
444  **Returns:**
445  
446  - <code>int</code> – Count of matching documents.
447  
448  #### delete_by_filter
449  
450  ```python
451  delete_by_filter(filters: dict[str, Any]) -> int
452  ```
453  
454  Deletes all documents that match the provided filters.
455  
456  **Parameters:**
457  
- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dict. An empty dict is treated as a no-op and returns `0`
  without touching the table.
  See the [metadata filtering docs](https://docs.haystack.deepset.ai/docs/metadata-filtering).
461  
462  **Returns:**
463  
464  - <code>int</code> – Number of deleted documents.
465  
466  #### delete_by_filter_async
467  
468  ```python
469  delete_by_filter_async(filters: dict[str, Any]) -> int
470  ```
471  
472  Asynchronously deletes all documents that match the provided filters.
473  
474  **Parameters:**
475  
- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dict. An empty dict is treated as a no-op and returns `0`
  without touching the table.
  See the [metadata filtering docs](https://docs.haystack.deepset.ai/docs/metadata-filtering).
479  
480  **Returns:**
481  
482  - <code>int</code> – Number of deleted documents.
483  
484  #### update_by_filter
485  
486  ```python
487  update_by_filter(filters: dict[str, Any], meta: dict[str, Any]) -> int
488  ```
489  
490  Merges `meta` into the metadata of all documents that match the provided filters.
491  
492  Uses Oracle's `JSON_MERGEPATCH` — existing keys are updated, new keys are added,
493  and keys set to `null` in `meta` are removed.
494  
495  **Parameters:**
496  
- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dict that selects which documents to update.
  See the [metadata filtering docs](https://docs.haystack.deepset.ai/docs/metadata-filtering).
499  - **meta** (<code>dict\[str, Any\]</code>) – Metadata patch to apply. Must be a non-empty dictionary.
500  
501  **Returns:**
502  
503  - <code>int</code> – Number of updated documents.
504  
505  **Raises:**
506  
507  - <code>ValueError</code> – If `meta` is empty.
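
The merge semantics described above follow JSON Merge Patch (RFC 7396). The sketch below reproduces them in plain Python so the effect of a `meta` patch can be previewed without a database; `merge_patch` is a hypothetical helper, not part of the integration, and the sample metadata is made up.

```python
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply a JSON Merge Patch (RFC 7396) and return the merged value."""
    if not isinstance(patch, dict):
        return patch  # a non-object patch replaces the target wholesale
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null in the patch removes the key
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


meta = {"lang": "en", "year": 2020, "draft": True}
patched = merge_patch(meta, {"year": 2021, "draft": None, "source": "wiki"})
print(patched)  # {'lang': 'en', 'year': 2021, 'source': 'wiki'}
```

Note that a stored `null` value cannot be written through a merge patch, since `null` always means "remove this key".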
508  
509  #### update_by_filter_async
510  
511  ```python
512  update_by_filter_async(filters: dict[str, Any], meta: dict[str, Any]) -> int
513  ```
514  
515  Asynchronously merges `meta` into the metadata of all documents matching the provided filters.
516  
517  Uses Oracle's `JSON_MERGEPATCH` — existing keys are updated, new keys are added,
518  and keys set to `null` in `meta` are removed.
519  
520  **Parameters:**
521  
- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dict that selects which documents to update.
  See the [metadata filtering docs](https://docs.haystack.deepset.ai/docs/metadata-filtering).
524  - **meta** (<code>dict\[str, Any\]</code>) – Metadata patch to apply. Must be a non-empty dictionary.
525  
526  **Returns:**
527  
528  - <code>int</code> – Number of updated documents.
529  
530  **Raises:**
531  
532  - <code>ValueError</code> – If `meta` is empty.
533  
534  #### count_unique_metadata_by_filter
535  
536  ```python
537  count_unique_metadata_by_filter(
538      filters: dict[str, Any], metadata_fields: list[str]
539  ) -> dict[str, int]
540  ```
541  
542  Returns the number of distinct values for each requested metadata field among matching documents.
543  
544  **Parameters:**
545  
- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dict that scopes the document set.
  See the [metadata filtering docs](https://docs.haystack.deepset.ai/docs/metadata-filtering).
548  - **metadata_fields** (<code>list\[str\]</code>) – List of metadata field names to count distinct values for.
549    Fields may be prefixed with `"meta."` (e.g. `"meta.lang"` or `"lang"`).
550    Must be a non-empty list.
551  
552  **Returns:**
553  
554  - <code>dict\[str, int\]</code> – Dict mapping each field name to its distinct-value count.
555  
556  **Raises:**
557  
558  - <code>ValueError</code> – If `metadata_fields` is empty.
559  - <code>ValueError</code> – If any field name contains characters outside `[A-Za-z0-9_.]`.
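
The field-name handling and the distinct count can be sketched over plain dictionaries. This is a hypothetical in-memory model, not the SQL the store issues. Keying the result by the name with the `"meta."` prefix stripped is an assumption; the real method may key by the name exactly as passed.

```python
import re

# Only these characters are accepted in a metadata field name.
_FIELD = re.compile(r"[A-Za-z0-9_.]+")


def normalize_field(name: str) -> str:
    """Validate a field name and strip an optional 'meta.' prefix."""
    if not _FIELD.fullmatch(name):
        raise ValueError(f"Invalid metadata field name: {name!r}")
    return name.removeprefix("meta.")


def count_unique(metas: list[dict], metadata_fields: list[str]) -> dict[str, int]:
    """Count distinct values per field across the given metadata dicts."""
    if not metadata_fields:
        raise ValueError("metadata_fields must be a non-empty list")
    counts = {}
    for field in metadata_fields:
        key = normalize_field(field)
        counts[key] = len({m[key] for m in metas if key in m})
    return counts


metas = [{"lang": "en", "year": 2020}, {"lang": "de", "year": 2020}, {"lang": "en"}]
print(count_unique(metas, ["meta.lang", "year"]))  # {'lang': 2, 'year': 1}
```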
560  
561  #### count_unique_metadata_by_filter_async
562  
563  ```python
564  count_unique_metadata_by_filter_async(
565      filters: dict[str, Any], metadata_fields: list[str]
566  ) -> dict[str, int]
567  ```
568  
569  Asynchronously returns the number of distinct values for each metadata field among matching documents.
570  
571  **Parameters:**
572  
- **filters** (<code>dict\[str, Any\]</code>) – Haystack filter dict that scopes the document set.
  See the [metadata filtering docs](https://docs.haystack.deepset.ai/docs/metadata-filtering).
575  - **metadata_fields** (<code>list\[str\]</code>) – List of metadata field names to count distinct values for.
576    Fields may be prefixed with `"meta."` (e.g. `"meta.lang"` or `"lang"`).
577    Must be a non-empty list.
578  
579  **Returns:**
580  
581  - <code>dict\[str, int\]</code> – Dict mapping each field name to its distinct-value count.
582  
583  **Raises:**
584  
585  - <code>ValueError</code> – If `metadata_fields` is empty.
586  - <code>ValueError</code> – If any field name contains characters outside `[A-Za-z0-9_.]`.
587  
588  #### get_metadata_fields_info
589  
590  ```python
591  get_metadata_fields_info() -> dict[str, dict[str, str]]
592  ```
593  
594  Return a mapping of metadata field names to their detected types.
595  
596  Uses Oracle's `JSON_DATAGUIDE` aggregate to introspect the stored metadata column.
597  Returns an empty dict when the table has no documents.
598  
599  **Returns:**
600  
601  - <code>dict\[str, dict\[str, str\]\]</code> – Dict of the form `{"field_name": {"type": "<type>"}, ...}` where `<type>`
602    is one of `"text"`, `"number"`, or `"boolean"`.
603  
604  #### get_metadata_field_min_max
605  
606  ```python
607  get_metadata_field_min_max(metadata_field: str) -> dict[str, Any]
608  ```
609  
610  Return the minimum and maximum values of a metadata field across all documents.
611  
612  First attempts numeric comparison via `TO_NUMBER` so that `MAX(1, 5, 10)` returns `10`
613  rather than `"5"` (which would win under lexicographic ordering). Falls back to plain string
614  comparison when the field contains non-numeric values. Numeric strings are automatically
615  converted to `int` or `float` in the result.
616  
617  **Parameters:**
618  
619  - **metadata_field** (<code>str</code>) – Metadata field name. May be prefixed with `"meta."`
620    (e.g. `"meta.year"` or `"year"`).
621  
622  **Returns:**
623  
624  - <code>dict\[str, Any\]</code> – `{"min": <value>, "max": <value>}`. Both values are `None` when the table is
625    empty or the field does not exist.
626  
627  **Raises:**
628  
629  - <code>ValueError</code> – If `metadata_field` contains characters outside `[A-Za-z0-9_.]`.
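
The difference between numeric and lexicographic ordering, and the fallback between them, can be reproduced in a few lines. This is a pure-Python sketch of the described behaviour, with hypothetical helper names; the store performs the comparison in SQL.

```python
def to_number(value: str):
    """Convert a numeric string to int or float; raise ValueError otherwise."""
    try:
        return int(value)
    except ValueError:
        return float(value)


def min_max(values: list[str]) -> dict:
    """Numeric min/max when every value parses as a number, else string min/max."""
    if not values:
        return {"min": None, "max": None}
    try:
        numbers = [to_number(v) for v in values]
        return {"min": min(numbers), "max": max(numbers)}
    except ValueError:
        # At least one non-numeric value: fall back to string comparison.
        return {"min": min(values), "max": max(values)}


print(max(["1", "5", "10"]))           # lexicographic comparison picks "5"
print(min_max(["1", "5", "10"]))       # {'min': 1, 'max': 10}
print(min_max(["beta", "alpha", "10"]))
```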
630  
631  #### get_metadata_field_unique_values
632  
633  ```python
634  get_metadata_field_unique_values(
635      metadata_field: str,
636      search_term: str | None = None,
637      from_: int = 0,
638      size: int | None = None,
639  ) -> tuple[list[str], int]
640  ```
641  
642  Return a paginated list of distinct values for a metadata field, plus the total distinct count.
643  
644  **Parameters:**
645  
646  - **metadata_field** (<code>str</code>) – Metadata field name. May be prefixed with `"meta."`
647    (e.g. `"meta.lang"` or `"lang"`).
648  - **search_term** (<code>str | None</code>) – Optional substring filter applied to both the document text and the field value.
649  - **from\_** (<code>int</code>) – Zero-based offset for pagination. Defaults to `0`.
650  - **size** (<code>int | None</code>) – Maximum number of values to return. When `None` all values from `from_` onward
651    are returned.
652  
653  **Returns:**
654  
655  - <code>tuple\[list\[str\], int\]</code> – A tuple `(values, total)` where `values` is the paginated list of distinct field
656    values as strings and `total` is the overall distinct count (before pagination).
657  
658  **Raises:**
659  
660  - <code>ValueError</code> – If `metadata_field` contains characters outside `[A-Za-z0-9_.]`.
661  
662  #### get_metadata_fields_info_async
663  
664  ```python
665  get_metadata_fields_info_async() -> dict[str, dict[str, str]]
666  ```
667  
668  Asynchronously returns a mapping of metadata field names to their detected types.
669  
670  Uses Oracle's `JSON_DATAGUIDE` aggregate to introspect the stored metadata column.
671  Returns an empty dict when the table has no documents.
672  
673  **Returns:**
674  
675  - <code>dict\[str, dict\[str, str\]\]</code> – Dict of the form `{"field_name": {"type": "<type>"}, ...}` where `<type>`
676    is one of `"text"`, `"number"`, or `"boolean"`.
677  
678  #### get_metadata_field_min_max_async
679  
680  ```python
681  get_metadata_field_min_max_async(metadata_field: str) -> dict[str, Any]
682  ```
683  
684  Asynchronously returns the minimum and maximum values of a metadata field across all documents.
685  
686  First attempts numeric comparison via `TO_NUMBER`, falling back to string comparison for
687  non-numeric fields. Numeric strings are automatically converted to `int` or `float`.
688  
689  **Parameters:**
690  
691  - **metadata_field** (<code>str</code>) – Metadata field name. May be prefixed with `"meta."`
692    (e.g. `"meta.year"` or `"year"`).
693  
694  **Returns:**
695  
696  - <code>dict\[str, Any\]</code> – `{"min": <value>, "max": <value>}`. Both values are `None` when the table is
697    empty or the field does not exist.
698  
699  **Raises:**
700  
701  - <code>ValueError</code> – If `metadata_field` contains characters outside `[A-Za-z0-9_.]`.
702  
703  #### get_metadata_field_unique_values_async
704  
705  ```python
706  get_metadata_field_unique_values_async(
707      metadata_field: str,
708      search_term: str | None = None,
709      from_: int = 0,
710      size: int | None = None,
711  ) -> tuple[list[str], int]
712  ```
713  
714  Asynchronously returns a paginated list of distinct values for a metadata field, plus the total count.
715  
716  **Parameters:**
717  
718  - **metadata_field** (<code>str</code>) – Metadata field name. May be prefixed with `"meta."`
719    (e.g. `"meta.lang"` or `"lang"`).
720  - **search_term** (<code>str | None</code>) – Optional substring filter applied to both the document text and the field value.
721  - **from\_** (<code>int</code>) – Zero-based offset for pagination. Defaults to `0`.
722  - **size** (<code>int | None</code>) – Maximum number of values to return. When `None` all values from `from_` onward
723    are returned.
724  
725  **Returns:**
726  
727  - <code>tuple\[list\[str\], int\]</code> – A tuple `(values, total)` where `values` is the paginated list of distinct field
728    values as strings and `total` is the overall distinct count (before pagination).
729  
730  **Raises:**
731  
732  - <code>ValueError</code> – If `metadata_field` contains characters outside `[A-Za-z0-9_.]`.
733  
734  #### to_dict
735  
736  ```python
737  to_dict() -> dict[str, Any]
738  ```
739  
740  Serializes the component to a dictionary.
741  
742  **Returns:**
743  
744  - <code>dict\[str, Any\]</code> – Dictionary with serialized data.
745  
746  #### from_dict
747  
748  ```python
749  from_dict(data: dict[str, Any]) -> OracleDocumentStore
750  ```
751  
752  Deserializes the component from a dictionary.
753  
754  **Parameters:**
755  
756  - **data** (<code>dict\[str, Any\]</code>) – Dictionary to deserialize from.
757  
758  **Returns:**
759  
760  - <code>OracleDocumentStore</code> – Deserialized component.