Dockerized Model Training with MLflow
-------------------------------------
This directory contains an MLflow project that trains a linear regression model on the UC Irvine
Wine Quality Dataset. The project uses a Docker image to capture the dependencies needed to run
training code. Running a project in a Docker environment (as opposed to Conda) allows for capturing
non-Python dependencies, e.g., Java libraries. In the future, we also hope to add tools to MLflow
for running Dockerized projects at scale, e.g., on a Kubernetes cluster.

Structure of this MLflow Project
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This MLflow project contains a ``train.py`` file that trains a scikit-learn model and uses
MLflow Tracking APIs to log the model and its metadata (e.g., hyperparameters and metrics)
for later use and reference. ``train.py`` operates on the Wine Quality Dataset, which is included
in ``wine-quality.csv``.

Most importantly, the project also includes an ``MLproject`` file, which specifies the Docker
container environment in which to run the project using the ``docker_env`` field:

.. code-block:: yaml

  docker_env:
    image: mlflow-docker-example

Here, ``image`` can be any valid argument to ``docker run``, such as the tag, ID, or URL of a Docker
image (see `Docker docs <https://docs.docker.com/engine/reference/run/#general-form>`_). The above
example references a locally stored image (``mlflow-docker-example``) by tag.

Finally, the project includes a ``Dockerfile`` that is used to build the image referenced by the
``MLproject`` file. The ``Dockerfile`` specifies library dependencies required by the project, such
as ``mlflow`` and ``scikit-learn``.

Running this Example
^^^^^^^^^^^^^^^^^^^^

First, install MLflow (via ``pip install mlflow``) and install
`Docker <https://www.docker.com/get-started>`_.

Then, build the image for the project's Docker container environment. You must use the same image
name that is given by the ``docker_env.image`` field of the ``MLproject`` file. In this example, the
image name is ``mlflow-docker-example``. Issue the following command to build an image with this
name:

.. code-block:: bash

  docker build -t mlflow-docker-example -f Dockerfile .

Note that the name of the image used in the ``docker build`` command, ``mlflow-docker-example``,
matches the name of the image referenced in the ``MLproject`` file.

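
As a rough illustration of the naming requirement above, the following Python sketch reads the
image name from the ``docker_env`` section of an ``MLproject`` file and assembles the matching
``docker build`` command. The helper names ``read_docker_image`` and ``docker_build_command`` are
hypothetical and not part of MLflow, and the line-based parsing is an assumption that only handles
the simple layout shown in this example, not arbitrary YAML:

.. code-block:: python

  import shlex


  def read_docker_image(mlproject_text):
      """Return the value of docker_env.image from MLproject text, or None.

      Hypothetical helper: it handles only the simple layout used in this
      example and is not a general YAML parser.
      """
      in_docker_env = False
      for line in mlproject_text.splitlines():
          stripped = line.strip()
          if not stripped or stripped.startswith("#"):
              continue
          if not line[0].isspace():
              # A top-level key either opens or closes the docker_env section.
              in_docker_env = stripped == "docker_env:"
          elif in_docker_env and stripped.startswith("image:"):
              return stripped.split(":", 1)[1].strip()
      return None


  def docker_build_command(image, dockerfile="Dockerfile", context="."):
      """Assemble the docker build invocation as an argument list."""
      return ["docker", "build", "-t", image, "-f", dockerfile, context]


  if __name__ == "__main__":
      with open("MLproject") as f:
          image = read_docker_image(f.read())
      if image is None:
          raise SystemExit("No docker_env.image entry found in MLproject")
      # Print the command rather than running it, so it can be reviewed first.
      print(shlex.join(docker_build_command(image)))

When run from the ``examples/docker`` directory, the script should print a ``docker build`` command
whose ``-t`` argument is taken from the ``MLproject`` file, which keeps the two names in sync.
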
Finally, from the root of the MLflow repository, run the example project using
``mlflow run examples/docker -P alpha=0.5``.

.. note::
    If running this example on a Mac with Apple silicon, ensure that Docker Desktop is running and
    that you are logged in to the Docker Desktop service.
    If you are modifying the example ``Dockerfile`` to specify older versions of ``scikit-learn``,
    you should enable `Rosetta compatibility <https://docs.docker.com/desktop/settings/mac/#features-in-development>`_
    in the Docker Desktop configuration settings to ensure that the appropriate ``cython`` compiler is used.

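
To try several values of ``alpha``, the ``mlflow run`` command from this section can be scripted.
The following is a minimal sketch that only wraps that CLI invocation; the helper name
``mlflow_run_command`` and the list of ``alpha`` values are illustrative assumptions:

.. code-block:: python

  import subprocess


  def mlflow_run_command(project_uri, params):
      """Build an mlflow run argument list with one -P key=value per parameter."""
      command = ["mlflow", "run", project_uri]
      for key, value in params.items():
          command += ["-P", f"{key}={value}"]
      return command


  if __name__ == "__main__":
      # Illustrative values; each iteration launches one Dockerized run
      # and stops at the first run that fails.
      for alpha in (0.1, 0.5, 1.0):
          subprocess.run(mlflow_run_command("examples/docker", {"alpha": alpha}), check=True)

Each invocation is recorded as a separate MLflow run, so the logged metrics can be compared
afterwards.
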
What happens when the project is run?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Running ``mlflow run examples/docker`` builds a new Docker image based on ``mlflow-docker-example``
that also contains our project code. The resulting image is tagged as
``mlflow-docker-example-<git-version>`` where ``<git-version>`` is the git commit ID. After the image is
built, MLflow executes the default (main) project entry point within the container using ``docker run``.

Environment variables, such as ``MLFLOW_TRACKING_URI``, are propagated into the container during
project execution. When running against a local tracking URI, MLflow mounts the host system's
tracking directory (e.g., a local ``mlruns`` directory) inside the container so that metrics and
parameters logged during project execution are accessible afterwards.
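
Conceptually, this corresponds to passing ``-e`` (environment variable) and ``-v`` (volume mount)
options to ``docker run``. The sketch below illustrates the idea only and is not MLflow's
implementation; the helper name ``docker_run_command``, the container path ``/mlflow/tmp/mlruns``,
and the training command are assumptions made for illustration:

.. code-block:: python

  import os


  def docker_run_command(image, command, env=None, mounts=None):
      """Assemble a docker run argument list.

      env maps variable names to values (passed with -e) and mounts maps
      host directories to container directories (passed with -v).
      """
      args = ["docker", "run", "--rm"]
      for name, value in (env or {}).items():
          args += ["-e", f"{name}={value}"]
      for host_dir, container_dir in (mounts or {}).items():
          args += ["-v", f"{host_dir}:{container_dir}"]
      return args + [image] + list(command)


  if __name__ == "__main__":
      # Assumed container-side location of the tracking directory.
      container_mlruns = "/mlflow/tmp/mlruns"
      print(docker_run_command(
          "mlflow-docker-example",
          ["python", "train.py", "--alpha", "0.5"],  # assumed entry point command
          env={"MLFLOW_TRACKING_URI": f"file://{container_mlruns}"},
          mounts={os.path.abspath("mlruns"): container_mlruns},
      ))

Because the host directory is mounted into the container, anything the training code writes to the
tracking directory remains on the host after the container exits.
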