# FAQ

Below is a list of frequently asked questions and common issues encountered.

## Questions

----------

__Question__

What models are recommended?

__Answer__

See the [model guide](../models).

----------

__Question__

What is the best way to track the progress of an `embeddings.index` call?

__Answer__

Wrap the list or generator passed to the index call with tqdm. See [#478](https://github.com/neuml/txtai/issues/478) for more.

----------

__Question__

What is the best way to analyze and debug a txtai process?

__Answer__

See the [observability](../observability) section for more on how this can be enabled in txtai processes.

txtai also has a console application. [This article](https://medium.com/neuml/insights-from-the-txtai-console-d307c28e149e) has more details.

----------

__Question__

How can models be externally loaded and passed to embeddings and pipelines?

__Answer__

Embeddings example.

```python
from transformers import AutoModel, AutoTokenizer
from txtai import Embeddings

# Load model externally
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# Pass to embeddings instance
embeddings = Embeddings(path=model, tokenizer=tokenizer)
```

LLM pipeline example.
```python
import torch

from transformers import AutoModelForCausalLM, AutoTokenizer
from txtai import LLM

# Load Qwen3 0.6B
path = "Qwen/Qwen3-0.6B"
model = AutoModelForCausalLM.from_pretrained(
    path,
    dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(path)

llm = LLM((model, tokenizer))
```

## Common issues

----------

__Issue__

Embeddings query errors like this:

```
SQLError: no such function: json_extract
```

__Solution__

Upgrade the Python version, as its bundled SQLite build doesn't support `json_extract`.

----------

__Issue__

Segmentation faults and similar errors on macOS

__Solution__

Set the following environment parameters.

- Disable OpenMP multithreading via `export OMP_NUM_THREADS=1`
- Workaround `OMP: Error #15` errors via `export KMP_DUPLICATE_LIB_OK=TRUE`
- Disable PyTorch MPS device via `export PYTORCH_MPS_DISABLE=1`
- Disable llama.cpp Metal via `export LLAMA_NO_METAL=1`

For more details, refer to [this issue on GitHub](https://github.com/kyamagu/faiss-wheels/issues/73).

----------

__Issue__

Error running SQLite ANN on macOS

```
AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension'
```

__Solution__

See [this note](https://alexgarcia.xyz/sqlite-vec/python.html#macos-blocks-sqlite-extensions-by-default) for options on how to fix this.

----------

__Issue__

`ContextualVersionConflict` and/or package METADATA exception while running one of the [examples](../examples) notebooks on Google Colab

__Solution__

Restart the kernel. See issue [#409](https://github.com/neuml/txtai/issues/409) for more on this issue.
----------

__Issue__

Error installing optional/extra dependencies such as `pipeline`

__Solution__

The default macOS shell (zsh) and Windows PowerShell require quoting square brackets.

```
pip install 'txtai[pipeline]'
```