# LLM

![pipeline](../../images/pipeline.png#only-light)
![pipeline](../../images/pipeline-dark.png#only-dark)

The LLM pipeline runs prompts through a large language model (LLM). This pipeline autodetects the LLM framework based on the model path.

## Example

The following shows a simple example using this pipeline.

```python
from txtai import LLM

# Create LLM pipeline
llm = LLM()

# Run prompt
llm(
  """
  Answer the following question using the provided context.

  Question:
  What are the applications of txtai?

  Context:
  txtai is an open-source platform for semantic search and
  workflows powered by language models.
  """
)

# Prompts with chat templating can be directly passed
# The template format varies by model
llm(
  """
  <|im_start|>system
  You are a friendly assistant.<|im_end|>
  <|im_start|>user
  Answer the following question...<|im_end|>
  <|im_start|>assistant
  """
)

# Chat messages automatically handle templating
llm([
  {"role": "system", "content": "You are a friendly assistant."},
  {"role": "user", "content": "Answer the following question..."}
])

# With instruction-tuned models and no system prompt, the default
# role is inferred (defaultrole="auto")
llm("Answer the following question...")

# To always generate chat messages for string inputs
llm("Answer the following question...", defaultrole="user")

# To never generate chat messages for string inputs
llm("Answer the following question...", defaultrole="prompt")
```
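To make the relationship between the chat message list and the templated string above concrete, here is a small illustrative helper that renders messages into the ChatML-style format shown earlier. Note `render_chatml` is a hypothetical function for illustration only; instruction-tuned models ship their own chat templates, which the LLM pipeline applies automatically for message inputs.

```python
def render_chatml(messages):
    """Render chat messages into a ChatML-style prompt string.

    Illustrative helper, not part of txtai - real chat templates
    vary by model and are applied automatically by the pipeline.
    """
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>")

    # Leave the assistant turn open so the model generates the reply
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are a friendly assistant."},
    {"role": "user", "content": "Answer the following question..."}
])
print(prompt)
```

Passing messages instead of a pre-templated string keeps prompts portable across models with different template formats.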

The LLM pipeline automatically detects the underlying LLM framework. This can also be manually set. The following methods are supported.

- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [llama.cpp](https://github.com/abetlen/llama-cpp-python)
- [LLM APIs via LiteLLM](https://github.com/BerriAI/litellm)
- [OpenCode server](https://github.com/anomalyco/opencode)

`llama.cpp` models support both local and remote GGUF paths on the HF Hub. See the [LiteLLM documentation](https://litellm.vercel.app/docs/providers) for the options available with LiteLLM models. See the [OpenCode documentation](https://opencode.ai/docs/server/) for more on how to integrate the LLM pipeline with a running OpenCode instance.
 69  
```python
from txtai import LLM

# Transformers
llm = LLM("openai/gpt-oss-20b")
llm = LLM("openai/gpt-oss-20b", method="transformers")

# llama.cpp
llm = LLM("unsloth/gpt-oss-20b-GGUF/gpt-oss-20b-Q4_K_M.gguf")
llm = LLM("unsloth/gpt-oss-20b-GGUF/gpt-oss-20b-Q4_K_M.gguf",
          method="llama.cpp")

# LiteLLM
llm = LLM("ollama/gpt-oss")
llm = LLM("ollama/gpt-oss", method="litellm")

# Custom Ollama endpoint
llm = LLM("ollama/gpt-oss", api_base="http://localhost:11434")

# Custom OpenAI-compatible endpoint
llm = LLM("openai/gpt-oss", api_base="http://localhost:4000")

# LLM APIs - must also set API key via environment variable
llm = LLM("gpt-5.2")
llm = LLM("claude-opus-4-5-20251101")
llm = LLM("gemini/gemini-3-pro-preview")

# Local OpenCode server started via `opencode serve`
llm = LLM("opencode")
llm = LLM("opencode/big-pickle", url="http://localhost:4000")
```
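The routing above can be pictured as a path-based decision. The sketch below is illustrative only, assuming a simplified set of rules inferred from the examples; txtai's actual autodetection logic is more involved and can always be overridden with the `method` parameter.

```python
def detect_method(path):
    """Guess the LLM framework from a model path.

    Simplified illustration of the autodetection described above -
    not txtai's actual implementation.
    """
    # GGUF files route to llama.cpp
    if path.lower().endswith(".gguf"):
        return "llama.cpp"

    # An opencode path targets a running OpenCode server
    if path == "opencode" or path.startswith("opencode/"):
        return "opencode"

    # Provider-prefixed paths such as ollama/... or gemini/... use LiteLLM
    if path.split("/", 1)[0] in ("ollama", "gemini"):
        return "litellm"

    # Default to a local Transformers model
    return "transformers"

print(detect_method("unsloth/gpt-oss-20b-GGUF/gpt-oss-20b-Q4_K_M.gguf"))  # llama.cpp
print(detect_method("openai/gpt-oss-20b"))                                # transformers
```

Because a Hugging Face Hub path like `openai/gpt-oss-20b` looks similar to a provider-prefixed LiteLLM path, setting `method` explicitly removes any ambiguity.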

Models can be externally loaded and passed to pipelines. This is useful for models that are not yet supported by Transformers or that need special initialization.

```python
import torch

from transformers import AutoModelForCausalLM, AutoTokenizer
from txtai import LLM

# Load Qwen3 0.6B
path = "Qwen/Qwen3-0.6B"
model = AutoModelForCausalLM.from_pretrained(
  path,
  dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(path)

llm = LLM((model, tokenizer))
```

See the links below for more detailed examples.

| Notebook  | Description  |       |
|:----------|:-------------|------:|
| [Prompt-driven search with LLMs](https://github.com/neuml/txtai/blob/master/examples/42_Prompt_driven_search_with_LLMs.ipynb) | Embeddings-guided and Prompt-driven search with Large Language Models (LLMs) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/42_Prompt_driven_search_with_LLMs.ipynb) |
| [Prompt templates and task chains](https://github.com/neuml/txtai/blob/master/examples/44_Prompt_templates_and_task_chains.ipynb) | Build model prompts and connect tasks together with workflows | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/44_Prompt_templates_and_task_chains.ipynb) |
| [Build RAG pipelines with txtai](https://github.com/neuml/txtai/blob/master/examples/52_Build_RAG_pipelines_with_txtai.ipynb) [▶️](https://www.youtube.com/watch?v=t_OeAc8NVfQ) | Guide on retrieval augmented generation including how to create citations | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/52_Build_RAG_pipelines_with_txtai.ipynb) |
| [Integrate LLM frameworks](https://github.com/neuml/txtai/blob/master/examples/53_Integrate_LLM_Frameworks.ipynb) | Integrate llama.cpp, LiteLLM and custom generation frameworks | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/53_Integrate_LLM_Frameworks.ipynb) |
| [Generate knowledge with Semantic Graphs and RAG](https://github.com/neuml/txtai/blob/master/examples/55_Generate_knowledge_with_Semantic_Graphs_and_RAG.ipynb) | Knowledge exploration and discovery with Semantic Graphs and RAG | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/55_Generate_knowledge_with_Semantic_Graphs_and_RAG.ipynb) |
| [Build knowledge graphs with LLMs](https://github.com/neuml/txtai/blob/master/examples/57_Build_knowledge_graphs_with_LLM_driven_entity_extraction.ipynb) | Build knowledge graphs with LLM-driven entity extraction | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/57_Build_knowledge_graphs_with_LLM_driven_entity_extraction.ipynb) |
| [Advanced RAG with graph path traversal](https://github.com/neuml/txtai/blob/master/examples/58_Advanced_RAG_with_graph_path_traversal.ipynb) | Graph path traversal to collect complex sets of data for advanced RAG | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/58_Advanced_RAG_with_graph_path_traversal.ipynb) |
| [Advanced RAG with guided generation](https://github.com/neuml/txtai/blob/master/examples/60_Advanced_RAG_with_guided_generation.ipynb) | Retrieval Augmented and Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/60_Advanced_RAG_with_guided_generation.ipynb) |
| [RAG with llama.cpp and external API services](https://github.com/neuml/txtai/blob/master/examples/62_RAG_with_llama_cpp_and_external_API_services.ipynb) | RAG with additional vector and LLM frameworks | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/62_RAG_with_llama_cpp_and_external_API_services.ipynb) |
| [How RAG with txtai works](https://github.com/neuml/txtai/blob/master/examples/63_How_RAG_with_txtai_works.ipynb) | Create RAG processes, API services and Docker instances | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/63_How_RAG_with_txtai_works.ipynb) |
| [Speech to Speech RAG](https://github.com/neuml/txtai/blob/master/examples/65_Speech_to_Speech_RAG.ipynb) [▶️](https://www.youtube.com/watch?v=tH8QWwkVMKA) | Full cycle speech to speech workflow with RAG | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/65_Speech_to_Speech_RAG.ipynb) |
| [Analyzing Hugging Face Posts with Graphs and Agents](https://github.com/neuml/txtai/blob/master/examples/68_Analyzing_Hugging_Face_Posts_with_Graphs_and_Agents.ipynb) | Explore a rich dataset with Graph Analysis and Agents | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/68_Analyzing_Hugging_Face_Posts_with_Graphs_and_Agents.ipynb) |
| [Granting autonomy to agents](https://github.com/neuml/txtai/blob/master/examples/69_Granting_autonomy_to_agents.ipynb) | Agents that iteratively solve problems as they see fit | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/69_Granting_autonomy_to_agents.ipynb) |
| [Getting started with LLM APIs](https://github.com/neuml/txtai/blob/master/examples/70_Getting_started_with_LLM_APIs.ipynb) | Generate embeddings and run LLMs with OpenAI, Claude, Gemini, Bedrock and more | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/70_Getting_started_with_LLM_APIs.ipynb) |
| [Analyzing LinkedIn Company Posts with Graphs and Agents](https://github.com/neuml/txtai/blob/master/examples/71_Analyzing_LinkedIn_Company_Posts_with_Graphs_and_Agents.ipynb) | Exploring how to improve social media engagement with AI | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/71_Analyzing_LinkedIn_Company_Posts_with_Graphs_and_Agents.ipynb) |
| [Parsing the stars with txtai](https://github.com/neuml/txtai/blob/master/examples/72_Parsing_the_stars_with_txtai.ipynb) | Explore an astronomical knowledge graph of known stars, planets, galaxies | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/72_Parsing_the_stars_with_txtai.ipynb) |
| [Chunking your data for RAG](https://github.com/neuml/txtai/blob/master/examples/73_Chunking_your_data_for_RAG.ipynb) | Extract, chunk and index content for effective retrieval | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/73_Chunking_your_data_for_RAG.ipynb) |
| [Medical RAG Research with txtai](https://github.com/neuml/txtai/blob/master/examples/75_Medical_RAG_Research_with_txtai.ipynb) | Analyze PubMed article metadata with RAG | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/75_Medical_RAG_Research_with_txtai.ipynb) |
| [GraphRAG with Wikipedia and GPT OSS](https://github.com/neuml/txtai/blob/master/examples/77_GraphRAG_with_Wikipedia_and_GPT_OSS.ipynb) | Deep graph search powered RAG | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/77_GraphRAG_with_Wikipedia_and_GPT_OSS.ipynb) |
| [RAG is more than Vector Search](https://github.com/neuml/txtai/blob/master/examples/79_RAG_is_more_than_Vector_Search.ipynb) | Context retrieval via Web, SQL and other sources | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/79_RAG_is_more_than_Vector_Search.ipynb) |
| [OpenCode as a txtai LLM](https://github.com/neuml/txtai/blob/master/examples/81_OpenCode_as_a_txtai_LLM.ipynb) | Integrate OpenCode with the txtai ecosystem | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/81_OpenCode_as_a_txtai_LLM.ipynb) |
| [Agentic College Search](https://github.com/neuml/txtai/blob/master/examples/82_Agentic_College_Search.ipynb) | Identify a list of strong engineering colleges | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/82_Agentic_College_Search.ipynb) |
| [TxtAI got skills](https://github.com/neuml/txtai/blob/master/examples/83_TxtAI_got_skills.ipynb) | Integrate skill.md files with your agent | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/83_TxtAI_got_skills.ipynb) |
| [Agent Tools](https://github.com/neuml/txtai/blob/master/examples/84_Agent_Tools.ipynb) [▶️](https://www.youtube.com/watch?v=RDNaFXQy3GQ) | Learn about the txtai agent toolkit | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/84_Agent_Tools.ipynb) |

## Configuration-driven example

Pipelines are run with Python or configuration. Pipelines can be instantiated in [configuration](../../../api/configuration/#pipeline) using the lower case name of the pipeline. Configuration-driven pipelines are run with [workflows](../../../workflow/#configuration-driven-example) or the [API](../../../api#local-instance).
153  
154  ### config.yml
155  ```yaml
156  # Create pipeline using lower case class name
157  llm:
158  
159  # Run pipeline with workflow
160  workflow:
161    llm:
162      tasks:
163        - action: llm
164  ```

Similar to the Python example above, the underlying [Hugging Face pipeline parameters](https://huggingface.co/docs/transformers/main/main_classes/pipelines#transformers.pipeline.model) and [model parameters](https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoModel.from_pretrained) can be set in pipeline configuration.

```yaml
llm:
  path: Qwen/Qwen3-0.6B
  dtype: torch.bfloat16
```

### Run with Workflows

```python
from txtai import Application

# Create and run pipeline with workflow
app = Application("config.yml")
list(app.workflow("llm", [
  """
  Answer the following question using the provided context.

  Question:
  What are the applications of txtai?

  Context:
  txtai is an open-source platform for semantic search and
  workflows powered by language models.
  """
]))
```

### Run with API

```bash
CONFIG=config.yml uvicorn "txtai.api:app" &

curl \
  -X POST "http://localhost:8000/workflow" \
  -H "Content-Type: application/json" \
  -d '{"name":"llm", "elements": ["Answer the following question..."]}'
```
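The same workflow call can be issued from Python with only the standard library. This is a minimal sketch that mirrors the `curl` request above; `workflow_request` is an illustrative helper, not part of txtai, and it assumes the API instance from the previous step is running on `localhost:8000`.

```python
import json

from urllib import request

def workflow_request(url, name, elements):
    """Build a POST request for the /workflow API endpoint"""
    payload = json.dumps({"name": name, "elements": elements}).encode("utf-8")
    return request.Request(
        f"{url}/workflow",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = workflow_request("http://localhost:8000", "llm",
                       ["Answer the following question..."])

# Uncomment with a running API instance
# with request.urlopen(req) as response:
#     print(json.loads(response.read()))
```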

## Methods

Python documentation for the pipeline.

### ::: txtai.pipeline.LLM.__init__
### ::: txtai.pipeline.LLM.__call__