
MLflow: A Comprehensive Guide for Machine Learning Deployment

One of the most widely used open-source tools for managing the machine learning model lifecycle

Adrià Serra · Published in Towards Dev · 6 min read · Feb 10, 2023

Introduction

Machine learning is rapidly becoming an essential tool for businesses to stay ahead of their competitors. However, building, testing, and deploying ML models can be challenging, especially when it comes to tracking experiments, reproducing results, and managing models and dependencies. MLflow is a platform for the end-to-end management of ML projects, which helps to mitigate these challenges and simplify the ML workflow.

In this article, we’ll explore the features of MLflow and how to use it to manage your ML projects. We’ll cover the following topics:

  1. Introduction to MLflow
  2. Setting up MLflow
  3. Tracking experiments
  4. Packaging code into reproducible runs
  5. Sharing and deploying models

1. Introduction to MLflow

MLflow is an open-source platform for the complete machine learning lifecycle, developed by Databricks. It provides a set of APIs and tools to manage the entire ML workflow, from experimentation and tracking to packaging and deployment.

MLflow’s modular design enables it to integrate with many tools, such as TensorFlow, PyTorch, and scikit-learn, to provide a unified interface for ML projects. MLflow provides a simple API for logging parameters, code versions, and results, making it easy to track and compare experiments.

2. Setting up MLflow with Docker

Without artifact store

To get started with MLflow, you’ll need to install it on your machine. The installation process is straightforward and can be completed using pip:

pip install mlflow

Once you’ve installed MLflow, you can start a local tracking server by running the following command:

mlflow server

This starts a tracking server with a web-based UI (by default at http://127.0.0.1:5000), which you can use to view and manage your ML projects. Examples of model management will be shown in the following sections.
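By default the server keeps run metadata in local files. The flags below are standard MLflow CLI options; the SQLite path and port are arbitrary choices for this sketch:

```shell
# Track run metadata in a local SQLite file and store artifacts under ./mlruns
mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root ./mlruns \
  --host 127.0.0.1 --port 5000
```

Using a database-backed store (SQLite here, PostgreSQL later in this article) is also what enables the model registry features of MLflow.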

There are several problems that you may encounter if you don’t deploy MLflow with an artifact server:

  1. Data loss: Without an artifact server, the artifacts generated during a run (such as serialized models, plots, and data files) are stored on the local file system of the machine where the run was executed. This increases the risk of data loss if that machine fails or is terminated.
  2. Limited accessibility: The artifacts are only accessible from the machine where the run was executed. This makes it difficult to collaborate with other team members or to move artifacts between machines for deployment.
  3. Lack of versioning: Without an artifact server, there’s no way to easily track different versions of artifacts and compare them. This makes it difficult to determine which version of an artifact is the most recent or the best performing.
  4. Inefficient storage: The local file system may not be optimized for storing large files, making it inefficient for storing large artifacts such as models and data.

By deploying MLflow with an artifact server, you can avoid these problems and enjoy the benefits of centralized artifact storage, versioning, and accessibility.

With an artifact store, deployed using docker

Here’s a Docker Compose file that starts the MLflow tracking server, a MinIO instance used as the artifact store, and a PostgreSQL database that stores experiment and run data:

version: "3"

networks:
  experiment-tracking:
    external: false

services:
  mlflow:
    container_name: mlflow
    build:
      context: .
    networks:
      - experiment-tracking
    command: mlflow server --backend-store-uri postgresql+psycopg2://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db:5432/${POSTGRES_DB} --host 0.0.0.0 --default-artifact-root s3://${BUCKET_NAME}/
    environment:
      # Variables in .env stop at docker-compose.yml; they must be passed to containers explicitly
      - AWS_SECRET_ACCESS_KEY=${MINIO_SECRET_KEY}
      - AWS_ACCESS_KEY_ID=${MINIO_ACCESS_KEY}
      - AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}
      - MLFLOW_S3_ENDPOINT_URL=${MLFLOW_S3_ENDPOINT_URL}
      - MLFLOW_S3_IGNORE_TLS=${MLFLOW_S3_IGNORE_TLS}
    ports:
      - "5000:5000" # map MLflow's default port 5000 to the desired host port
    expose:
      - 5000

  minio:
    image: minio/minio
    container_name: minio
    networks:
      - experiment-tracking
    volumes:
      - ./minio-data:/data
    environment:
      - MINIO_ACCESS_KEY=${MINIO_ACCESS_KEY}
      - MINIO_SECRET_KEY=${MINIO_SECRET_KEY}
      - MINIO_ROOT_USER=${MINIO_ROOT_USER}
      - MINIO_ROOT_PASSWORD=${MINIO_ROOT_PASSWORD}
      - AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}
    ports:
      - "5001:5001" # MinIO console (dashboard)
      - "5002:5002" # MinIO S3 API, used to interact with the artifact store
    command: minio server /data --console-address ":5001" --address ":5002"
    expose:
      - 5002

  db:
    image: postgres:10-alpine
    container_name: db
    networks:
      - experiment-tracking
    volumes:
      - ./mlflow-data:/var/lib/postgresql/data
    # Running as a non-root user avoids the data folder being owned by root,
    # but it takes extra configuration, so it is left commented out here
    # user: ${DB_USER}
    environment:
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}

The variables are read from a .env file located in the same folder as the docker-compose.yml file.
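For illustration, a hypothetical .env could look like the following; every value is a placeholder you should replace with your own:

```shell
# Hypothetical .env next to docker-compose.yml -- all values are placeholders
POSTGRES_DB=mlflow
POSTGRES_USER=mlflow
POSTGRES_PASSWORD=change-me
BUCKET_NAME=mlflow-artifacts
MINIO_ACCESS_KEY=minio-access-key
MINIO_SECRET_KEY=minio-secret-key
MINIO_ROOT_USER=minio-root
MINIO_ROOT_PASSWORD=change-me-too
AWS_DEFAULT_REGION=us-east-1
MLFLOW_S3_ENDPOINT_URL=http://minio:5002
MLFLOW_S3_IGNORE_TLS=true
```

Note that the s3://${BUCKET_NAME} bucket must be created in MinIO before the first run tries to log artifacts.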

This is a Docker Compose file in version 3. It defines three services: mlflow, minio, and db.

  • The mlflow service is the MLflow server, which is built from the current directory. The server will listen on port 5000 and will store data in a PostgreSQL database and artifacts in an S3-compatible bucket. The credentials for accessing the S3 bucket and the PostgreSQL database are passed as environment variables.
  • The minio service is a MinIO server, an S3-compatible object storage server. It listens on ports 5001 (console) and 5002 (S3 API); the access and secret keys, as well as the default region and root user and password, are passed as environment variables. Data is stored in the local volume ./minio-data.
  • The db service is a PostgreSQL database that stores MLflow's backend metadata: experiments, runs, parameters, and metrics.

3. Tracking experiments

One of the key features of MLflow is the ability to track experiments. With MLflow, you can log parameters, code versions, and results, making it easy to compare and reproduce experiments.

Here’s a simple example of how to log parameters and results using MLflow:

import mlflow

def run_experiment(alpha, l1_ratio):
    with mlflow.start_run():
        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        # Your ML code goes here; it should produce a metric such as:
        mean_absolute_error = 0.27  # placeholder for a value computed by your model
        mlflow.log_metric("mean_absolute_error", mean_absolute_error)

run_experiment(0.1, 0.5)

In this example, we start a run using the mlflow.start_run() context manager and log the parameters alpha and l1_ratio using mlflow.log_param(). We then log the mean absolute error using mlflow.log_metric().

You can view the logged parameters and results in the MLflow UI.

4. Packaging code into reproducible runs

This code is a script for a simple machine-learning experiment that uses MLflow for experiment tracking.

import os

# Credentials for the MinIO artifact store (placeholders left as-is)
os.environ["AWS_SECRET_ACCESS_KEY"] = "${}"
os.environ["AWS_ACCESS_KEY_ID"] = "${}"
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://${IP}:5002"
os.environ["MLFLOW_S3_IGNORE_TLS"] = "true"

# Wrangling
import pandas as pd
import numpy as np

# Modelling
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Tracking
import mlflow
from mlflow.tracking import MlflowClient
mlflow.set_tracking_uri("http://${IP}:5000")

# Import data

# Data wrangling and splitting

# Set the experiment name
mlflow.set_experiment('{EXPERIMENT_NAME}')
# As we use a library that MLflow already integrates with, autolog gathers basic information about the experiment
mlflow.sklearn.autolog()
with mlflow.start_run(run_name='RUN NAME SAVED IN THE EXPERIMENT'):
    pipe = Pipeline([('scaler', StandardScaler()), ('lr', LinearRegression())])
    pipe.fit(X_train, y_train)
    mlflow.log_metric('test_mse', mean_squared_error(pipe.predict(X_test), y_test))
    mlflow.log_metric('val_mse', mean_squared_error(pipe.predict(X_val), y_val))
    mlflow.log_metric('test_mae', mean_absolute_error(pipe.predict(X_test), y_test))
    mlflow.log_metric('val_mae', mean_absolute_error(pipe.predict(X_val), y_val))

mlflow.end_run()

The code first sets the environment variables for the AWS secret access key, AWS access key ID, MLflow S3 endpoint URL, and MLflow S3 ignore TLS, which are used to connect to the Minio artifact store.

Next, the code imports the necessary libraries for data wrangling, modeling, and tracking. The code then sets the tracking URI to the URL of the MLflow tracking server, which is http://${IP}:5000 in this case.

After setting the tracking URI, the code sets the experiment name using mlflow.set_experiment() and logs basic information about the experiment using mlflow.sklearn.autolog().

Next, the code starts a run with mlflow.start_run() and logs the mean squared error and mean absolute error of the model's predictions on the test and validation datasets using mlflow.log_metric().

Finally, the code ends the run with mlflow.end_run(). This information will be logged and stored in the Minio artifact store, allowing you to view and compare the results of different runs in the MLflow UI.
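The script above assumes that X_train, X_val, X_test and the matching targets already exist. One common way to produce such a three-way split is two chained train_test_split calls; this is a sketch whose 60/20/20 ratio and toy data are assumptions, not part of the original script:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the real dataset
X = np.random.rand(100, 3)
y = np.random.rand(100)

# First split off the test set (20% of the total), then carve a validation set
# out of the remaining training data (25% of the remainder = 20% of the total)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=42)
```

Fixing random_state keeps the split reproducible across runs, which matters when you compare metrics between experiments.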

5. Sharing and deploying models

MLflow makes it easy to share and deploy models. You can share models using the MLflow UI, or by providing a URL to the model’s artifact store.

To deploy a model, you can use the mlflow.sklearn.load_model() function to load the model from the artifact store. You can then use the loaded model for predictions in your application.

Here’s a simple example of how to deploy a scikit-learn model:

import mlflow.sklearn

model = mlflow.sklearn.load_model("runs:/<run_id>/model")

# Use the model for predictions in your application
predictions = model.predict(X_test)
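Beyond loading the model in-process, MLflow can also serve it behind a REST endpoint with the standard mlflow models serve command. In this sketch the run ID, port, and feature names are placeholders, and the request payload uses the dataframe_split format expected by recent MLflow versions (older versions use a slightly different JSON layout):

```shell
# Serve the model from the artifact store as a REST API on port 1234
mlflow models serve -m "runs:/<run_id>/model" -p 1234

# Query the endpoint with a JSON payload
curl -X POST http://127.0.0.1:1234/invocations \
  -H "Content-Type: application/json" \
  -d '{"dataframe_split": {"columns": ["f1", "f2"], "data": [[1.0, 2.0]]}}'
```

The response is a JSON document containing the model's predictions for each input row.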

Conclusion

In this article, we covered the basics of MLflow and how it can be used to manage ML projects. MLflow provides a simple and unified interface for the complete ML workflow, making it easier to experiment, track, and deploy ML models. With its modular design and integration with popular ML frameworks, MLflow is a powerful tool for managing ML projects.

If you liked this post: I usually write about maths and machine learning, and I'm starting to publish about data engineering. Do not hesitate to follow my profile to get notified about new posts.

https://medium.com/@crunchyml
