# FAQ

Below is a list of frequently asked questions and common issues encountered.

## Questions

----------

__Question__

What models are recommended?

__Answer__

See the [model guide](../models).

----------

__Question__

What is the best way to track the progress of an `embeddings.index` call?

__Answer__

Wrap the list or generator passed to the index call with tqdm. See [#478](https://github.com/neuml/txtai/issues/478) for more.

----------

__Question__

What is the best way to analyze and debug a txtai process?

__Answer__

See the [observability](../observability) section for more on how this can be enabled in txtai processes.

txtai also has a console application. [This article](https://medium.com/neuml/insights-from-the-txtai-console-d307c28e149e) has more details.

----------

__Question__

How can models be externally loaded and passed to embeddings and pipelines?

__Answer__

Embeddings example.

```python
from transformers import AutoModel, AutoTokenizer
from txtai import Embeddings

# Load model externally
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# Pass to embeddings instance
embeddings = Embeddings(path=model, tokenizer=tokenizer)
```

LLM pipeline example.
```python
import torch

from transformers import AutoModelForCausalLM, AutoTokenizer
from txtai import LLM

# Load Qwen3 0.6B
path = "Qwen/Qwen3-0.6B"
model = AutoModelForCausalLM.from_pretrained(
    path,
    dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(path)

llm = LLM((model, tokenizer))
```

## Common issues

----------

__Issue__

Embeddings query errors like this:

```
SQLError: no such function: json_extract
```

__Solution__

Upgrade the Python version, as its bundled SQLite build doesn't support `json_extract`.

----------

__Issue__

Segmentation faults and similar errors on macOS

__Solution__

Set the following environment parameters.

- Disable OpenMP multithreading via `export OMP_NUM_THREADS=1`
- Workaround `OMP: Error #15` errors via `export KMP_DUPLICATE_LIB_OK=TRUE`
- Disable PyTorch MPS device via `export PYTORCH_MPS_DISABLE=1`
- Disable llama.cpp Metal via `export LLAMA_NO_METAL=1`

For more details, refer to [this issue on GitHub](https://github.com/kyamagu/faiss-wheels/issues/73).

----------

__Issue__

Error running SQLite ANN on macOS

```
AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension'
```

__Solution__

See [this note](https://alexgarcia.xyz/sqlite-vec/python.html#macos-blocks-sqlite-extensions-by-default) for options on how to fix this.

----------

__Issue__

`ContextualVersionConflict` and/or package METADATA exception while running one of the [examples](../examples) notebooks on Google Colab

__Solution__

Restart the kernel. See issue [#409](https://github.com/neuml/txtai/issues/409) for more on this issue.
----------

__Issue__

Error installing optional/extra dependencies such as `pipeline`

__Solution__

The default macOS shell (zsh) and Windows PowerShell require quoting square brackets.

```
pip install 'txtai[pipeline]'
```