---
title: Self Hosting Overview
description: Deploy and manage your own MLflow instance. Open-source, vendor-neutral MLOps and LLMOps platform for experiment tracking, model registry, and LLM observability.
sidebar_position: 1
---

# Self-Hosting MLflow

> #### **_The most vendor-neutral MLOps/LLMOps platform in the world._**

MLflow is fully open-source. Thousands of users and organizations run their own MLflow instances to meet their specific needs. Because it is open-source and trusted by the popular cloud providers, MLflow is a strong choice for teams and organizations that worry about vendor lock-in.

:::warning Default Storage Backend Change

As of MLflow 3.7.0, the default tracking backend has changed from file-based storage (`./mlruns`) to a SQLite database (`sqlite:///mlflow.db`) for better performance and reliability.

**Existing users:** If you have existing data in `./mlruns`, MLflow will automatically detect and continue using it. No action is required.

**New users:** New MLflow servers will use SQLite by default. To use file-based storage instead, set `MLFLOW_TRACKING_URI=./mlruns` or specify `--backend-store-uri ./mlruns` when starting the server.

For more details and migration guidance, see [GitHub Issue #18534](https://github.com/mlflow/mlflow/issues/18534).

:::

## The Quickest Path: Run `mlflow` Command

The easiest way to start an MLflow server is to run the `mlflow` CLI command in your terminal. This is suitable for personal use or small teams.

First, install MLflow with:

```bash
pip install mlflow
```

:::info
See [Secure Installs](/self-hosting/security/secure-installs) to learn how to pin dependencies to known good versions using hash checking and upload-time filtering.
:::

Then, start the server with:

```bash
mlflow server --port 5000
```

This will start the server and UI at `http://localhost:5000` using SQLite as the backend store (the default). You can connect the client to the server by setting the tracking URI:

```python
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

# Start tracking!
# Open http://localhost:5000 in your browser to view the UI.
```

You are now ready to start your first experiment!

- [Tracing QuickStart](/genai/tracing/quickstart/)
- [LLM Evaluation Quickstart](/genai/eval-monitor/quickstart/)
- [Prompt Management Quickstart](/genai/prompt-registry/#getting-started)
- [Model Training Quickstart](/ml/tracking/quickstart/)

:::tip

For production deployments or custom backend configurations, see the <ins>[Backend Store](./architecture/backend-store)</ins> documentation.

:::

## Other Deployment Options

### Docker Compose

The MLflow repository includes a ready-to-run Compose project under `docker-compose/` that provisions MLflow, PostgreSQL, and [RustFS](https://github.com/rustfs/rustfs).

```bash
git clone https://github.com/mlflow/mlflow.git
cd mlflow/docker-compose
cp .env.dev.example .env
docker compose up -d
# Open http://localhost:5000 in your browser to view the UI.
```

See the [instructions in the repository](https://github.com/mlflow/mlflow/tree/master/docker-compose) for more details and configuration options for the Docker Compose bundle.

### Kubernetes

To deploy on Kubernetes, use the MLflow Helm chart provided by [Bitnami](https://artifacthub.io/packages/helm/bitnami/mlflow) or [Community Helm Charts](https://artifacthub.io/packages/helm/community-charts/mlflow).

### Cloud Services

If you are looking for production-scale deployments without maintenance costs, MLflow is also available as a managed service from popular cloud providers.

- [Databricks](https://www.databricks.com/product/managed-mlflow)
- [Amazon SageMaker](https://aws.amazon.com/sagemaker/ai/experiments/)
- [Azure Machine Learning](https://learn.microsoft.com/en-us/azure/machine-learning/concept-mlflow?view=azureml-api-2)
- [Nebius](https://nebius.com/services/managed-mlflow)
- [GCP (GKE)](https://gke-ai-labs.dev/docs/tutorials/frameworks-and-pipelines/mlflow/)

## Architecture

MLflow, at a high level, consists of the following components:

1. **Tracking Server**: The lightweight FastAPI server that serves the MLflow UI and API.
2. **Backend Store**: A relational database (or file system) that stores the metadata of experiments, runs, traces, and so on.
3. **Artifact Store**: The storage layer for large artifacts such as model weights and images.

Each component is designed to be pluggable, so you can customize it to meet your needs. For example, you can start on a single host with a SQLite backend and the local file system for storing artifacts. To scale up, you can switch the backend store to a PostgreSQL cluster and point the artifact store to cloud storage such as S3, GCS, or Azure Blob Storage.

To learn more about the architecture and available backend options, see [Architecture](./architecture/overview).

## Workspaces

MLflow supports [workspaces](/self-hosting/workspaces) to organize experiments, registered models, prompts, and artifacts on a shared MLflow instance. Workspaces add logical separation and workspace-level permissions so teams can collaborate without running separate servers. Workspaces are opt-in and require a SQL database backend.

## Access Control & Security

MLflow supports [username/password login](./security/basic-http-auth) via basic HTTP authentication, [SSO (Single Sign-On)](./security/sso), and [custom authentication plugins](./security/custom).
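
As a rough illustration of the client side of username/password login, the sketch below prepares credentials for a server protected by basic HTTP authentication. `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` are the environment variables the MLflow client reads for this purpose; the helper names, the sample credentials, and the server URL are hypothetical and only serve the example.

```python
import base64
import os


def basic_auth_env(tracking_uri: str, username: str, password: str) -> dict[str, str]:
    """Return environment variables that an MLflow client reads for basic auth."""
    return {
        "MLFLOW_TRACKING_URI": tracking_uri,
        "MLFLOW_TRACKING_USERNAME": username,
        "MLFLOW_TRACKING_PASSWORD": password,
    }


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build a standard HTTP Basic Authorization header for direct REST calls."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Hypothetical values; in practice, load secrets from a secret manager.
    env = basic_auth_env("http://localhost:5000", "alice", "example-password")
    os.environ.update(env)  # Clients started after this point pick up the values.
    print(sorted(env))
```

Setting the variables in the process environment keeps credentials out of source code; the header helper is only needed when calling the REST API without the MLflow client.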

MLflow also provides built-in [network protection](./security/network) middleware to protect your tracking server from network exposure.

:::tip Try Managed MLflow

Need a highly secure MLflow server? Check out <ins>[Databricks Managed MLflow](https://www.databricks.com/product/managed-mlflow)</ins> to get fully managed MLflow servers with unified governance and security.

:::

## FAQs

See [Troubleshooting & FAQs](./troubleshooting) for more information.

:::info[ACCESS DENIED?]

When using a remote tracking server, you may hit an access denied error when accessing the MLflow UI
from a browser.

> Invalid Host header - possible DNS rebinding attack detected

This error typically indicates that the tracking server's network security settings need to be configured.
The most common causes are:

- **Host validation**: The `--allowed-hosts` flag restricts which Host headers are accepted
- **CORS restrictions**: The `--cors-allowed-origins` flag controls which origins can make API requests

To resolve this, configure your tracking server with the appropriate flags. For example:

```bash
mlflow server --allowed-hosts "mlflow.company.com,localhost:*" \
  --cors-allowed-origins "https://app.company.com"
```

**Note**: These security options are only available with the default FastAPI-based server (uvicorn). They are
not supported when using Flask directly or with `--gunicorn-opts` or `--waitress-opts`.

Refer to the <ins>[Network Security Guide](/self-hosting/security/network)</ins> for detailed configuration options.

:::
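
The network flags from the note above can also be assembled programmatically, which may help when the allowed hosts differ per environment. The following is a minimal sketch that only builds the command line and does not start a server. It uses the `--port`, `--allowed-hosts`, and `--cors-allowed-origins` flags shown on this page; the function name and sample hostnames are hypothetical, and joining multiple CORS origins with commas is an assumption modeled on the host list format.

```python
import shlex
from typing import Optional, Sequence


def build_server_command(
    port: int = 5000,
    allowed_hosts: Optional[Sequence[str]] = None,
    cors_allowed_origins: Optional[Sequence[str]] = None,
) -> list[str]:
    """Assemble an `mlflow server` argument list with optional network flags."""
    command = ["mlflow", "server", "--port", str(port)]
    if allowed_hosts:
        # The flag takes a single comma-separated value, as in the example above.
        command += ["--allowed-hosts", ",".join(allowed_hosts)]
    if cors_allowed_origins:
        # Assumption: multiple origins are comma-separated like the host list.
        command += ["--cors-allowed-origins", ",".join(cors_allowed_origins)]
    return command


if __name__ == "__main__":
    command = build_server_command(
        allowed_hosts=["mlflow.company.com", "localhost:*"],
        cors_allowed_origins=["https://app.company.com"],
    )
    # shlex.join quotes values such as "localhost:*" for safe pasting into a shell.
    print(shlex.join(command))
```

Returning a list instead of a single string lets the result be passed directly to `subprocess.run` without shell quoting concerns.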