Proto Model
⚠️ This project is deprecated and will no longer be maintained. For a maintained alternative, please refer to Proto Model Turbo.
The Proto Model project contains a simple implementation of the TANGO Interfaces for creating a TANGO Model, ready to be deployed in an MLflow environment. It is a prototype of a TANGO explainable model whose explanations are produced with the LORE library.
This prototype serves as an example implementation for other developers and project partners, demonstrating how to integrate the TANGO interfaces into a machine learning model.
For this example, we took a standard and well-understood ML model (RandomForest from sklearn)
and wrapped it as an ExplainableTangoModel by integrating it with LORE via a set_explainer() method.
The RandomForestWrapperModel acts as the core implementation, supported by the RandomForestExplainer
for interpretability and leveraging the LORE method for explanations.
The structured use of common interfaces ensures extensibility and consistent integration across different
models and frameworks.
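As an illustration of this wiring, a minimal sketch follows. The interface and explainer signatures here are placeholders (the real ones are defined in tango-interfaces and the LORE library); only the sklearn usage is standard API.
# Illustrative sketch only: the real interfaces live in tango-interfaces and the
# real explanation logic in the LORE library; only the sklearn parts are standard API.
from sklearn.ensemble import RandomForestClassifier

class RandomForestWrapperModel:  # stands in for the ExplainableTangoModel implementation
    def __init__(self, **hyperparams):
        self._model = RandomForestClassifier(**hyperparams)
        self._explainer = None

    def fit(self, X, y):
        self._model.fit(X, y)
        return self

    def predict(self, X):
        return self._model.predict(X)

    def set_explainer(self, explainer):
        # Attach a LORE-based explainer (e.g. a RandomForestExplainer instance).
        self._explainer = explainer

    def explain(self, x):
        # Delegate the explanation of a single instance to the attached explainer.
        return self._explainer.explain(x)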
Init project
Install requirements.
pip install -r requirements.txt
Set up a development environment
Install requirements.
pip install -r requirements-dev.txt
If needed, install a local version of tango-interfaces.
pip uninstall tango-interfaces
pip install -e <path>/tango-interfaces/
Run tracking server and model registry
Run a local tracking server and model registry (by default, mlflow ui listens on http://127.0.0.1:5000).
mlflow ui
Prepare the environment for training/serving
Export the following environment variables.
export MLFLOW_TRACKING_URI="http://127.0.0.1:5000"
export MLFLOW_EXPERIMENT_NAME="test"
Use these if the server requires authentication.
export MLFLOW_TRACKING_USERNAME=<username>
export MLFLOW_TRACKING_PASSWORD=<password>
Make a training run
To make a run,
- ensure that you have correctly exported the environment variables as described above;
- execute the following command (this operation takes some time, as it creates the environment).
mlflow run --env-manager local ./src
The general syntax of the mlflow run command is:
mlflow run . -P <param_name>=<param_value>
The MLproject file defines the entry point for the project, which is main.py, and the parameters that can be passed to it.
A run of the project performs the following steps:
- Load the dataset from the data source.
- Preprocess the raw training data.
- For each model type:
- Start a new run on MLflow.
- Perform hyperparameter tuning (using Optuna) with cross-validation on the training set to find the best hyperparameters. Each trial is logged as a child run on MLflow and does not register a model in the MLflow Model Registry (see the sketch after this list).
- Train the model on the whole training set, with the best hyperparameters found.
- Evaluate the model by computing balanced_accuracy_score, along with other metrics.
- Store the trained model in MLflow as a new version of the registered model proto_model.
- Compare the performance of each version and choose the best one according to balanced_accuracy_score. The best version of the model is assigned to the alias challenger on MLflow.
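A rough sketch of this flow is shown below. It is illustrative only: the real logic lives in main.py, and load_and_preprocess is a placeholder for the actual data pipeline.
# Illustrative sketch of the tuning/training flow described above (not the actual main.py).
import mlflow
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import cross_val_score

X_train, X_test, y_train, y_test = load_and_preprocess()  # placeholder for the real data pipeline

def objective(trial):
    params = {"n_estimators": trial.suggest_int("n_estimators", 50, 300),
              "max_depth": trial.suggest_int("max_depth", 2, 16)}
    with mlflow.start_run(nested=True):  # each trial is a child run; no model is registered here
        mlflow.log_params(params)
        score = cross_val_score(RandomForestClassifier(**params), X_train, y_train,
                                scoring="balanced_accuracy", cv=5).mean()
        mlflow.log_metric("cv_balanced_accuracy", score)
    return score

with mlflow.start_run(run_name="random_forest"):
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=20)
    # Retrain on the whole training set with the best hyperparameters and evaluate.
    model = RandomForestClassifier(**study.best_params).fit(X_train, y_train)
    mlflow.log_metric("balanced_accuracy_score",
                      balanced_accuracy_score(y_test, model.predict(X_test)))
    # Register the trained model as a new version of proto_model.
    mlflow.sklearn.log_model(model, "model", registered_model_name="proto_model")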
Running local model server
Install requirements.
pip install -r requirements-dev-modelserver.txt
Run a model server to serve the model
Execute the model server.
# Export the MLFLOW_RUN_ID variable obtained from the training run
export MLFLOW_RUN_ID="<mlflow_run_id>"
mlflow models serve -m runs:/$MLFLOW_RUN_ID/model --enable-mlserver -p 5001
Invoke the model
To invoke the model, you need to make a POST request to the model server with the input data in JSON format.
An example of such a request for the proto-model can be found in the run_proto_model_invocation.py file.
# After serving the `proto-model` model server on port 5001,
# you can trigger a model invocation by running:
python run_proto_model_invocation.py
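A minimal sketch of such a request against the MLflow /invocations endpoint (the feature names below are placeholders for the real dataset columns):
# Minimal invocation sketch against the local model server started above (port 5001).
import requests

payload = {"dataframe_split": {"columns": ["feature_1", "feature_2"],  # placeholder feature names
                               "data": [[0.1, 0.2]]}}
response = requests.post("http://127.0.0.1:5001/invocations", json=payload, timeout=10)
print(response.json())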
Introduction
The purpose of this project is to train and serve a machine learning model for the proto_model task. This is achieved by trying different model types, pipelines and hyperparameters. MLflow is used to manage the models' lifecycle; see the MLflow documentation for details.
Models on MLflow are organized in a Model Registry: this project will produce different versions for the model proto_model, one for each model type.
The version that is currently served in production is called the Champion model version, while the version competing with it is called the Challenger model version. These roles are managed in the MLflow Model Registry through the aliases champion and challenger. Each model can have at most one Challenger and one Champion at the same time.
Each run of the project produces n model versions stored on MLflow Model Registry, where n is the number of model types. One of these versions is chosen as the Challenger model version, which is the version that competes with the current Champion model version.
The comparison metric used to choose the best model version is balanced_accuracy_score.
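For example, the aliased versions can be resolved and loaded directly from the Model Registry with the standard MLflow alias APIs (assuming the registered model name proto_model used by this project):
# Resolve and load the aliased versions of proto_model from the MLflow Model Registry.
import mlflow
from mlflow import MlflowClient

client = MlflowClient()
champion = client.get_model_version_by_alias("proto_model", "champion")
challenger = client.get_model_version_by_alias("proto_model", "challenger")
print(champion.version, challenger.version)

# A model can also be loaded directly by alias URI.
champion_model = mlflow.pyfunc.load_model("models:/proto_model@champion")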
Define the Champion Model Version
To perform a comparison between the current Champion model version and the new Challenger model version, execute the following command:
python compare.py
This script will load the Champion and the Challenger model versions from MLflow, compare them and decide if the Challenger model version should be promoted to Champion. The comparison is based on the metric balanced_accuracy_score.
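Conceptually, the promotion decision boils down to something like the following sketch (illustrative only; the evaluation dataset and the actual decision logic are defined in compare.py, and load_evaluation_data is a placeholder):
# Illustrative promotion logic: promote the Challenger if it scores better on a held-out set.
import mlflow
from mlflow import MlflowClient
from sklearn.metrics import balanced_accuracy_score

client = MlflowClient()
X_eval, y_eval = load_evaluation_data()  # placeholder for the real evaluation dataset

def score(alias):
    model = mlflow.pyfunc.load_model(f"models:/proto_model@{alias}")
    return balanced_accuracy_score(y_eval, model.predict(X_eval))

if score("challenger") > score("champion"):
    version = client.get_model_version_by_alias("proto_model", "challenger").version
    client.set_registered_model_alias("proto_model", "champion", version)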
Testing
TODO
Creating a Docker image
As Proto Model does not have a dedicated GitLab CI/CD pipeline, you cannot rely on automatic Docker image updates. To create an updated image of the model, follow these instructions.
Train a model
Train the updated model on the preferred target MLOps platform.
export MLFLOW_TRACKING_URI="https://mlflow.u-hopper.com"
export MLFLOW_EXPERIMENT_NAME="tango-proto-model"
export MLFLOW_TRACKING_USERNAME="datascience"
export MLFLOW_TRACKING_PASSWORD="**************"
mlflow run --env-manager local ./src
Set the champion alias on the most recently created model version (from the tracking server UI).
Create Docker image for latest created model
Access GitLab and open the proto-model-deployment project.
Execute a pipeline, specifying the target environment:
DEPLOYMENT_ENV = production
or
DEPLOYMENT_ENV = staging
Test
To test config generation, execute the mlserver configs generation script.
./generate_mlserver_configs.sh
Install the generated requirements.
pip install -r ./build/mlserver_configs/proto_model/artifact/model/requirements.txt
Test that the model server starts.
mlserver start build
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.