Proto Model
⚠️ This project is deprecated and will no longer be maintained. For a maintained alternative, please refer to Proto Model Turbo.
The Proto Model project contains a simple implementation of the TANGO Interfaces for creating a TANGO Model, ready to be deployed in an MLflow environment. It is a prototype of a TANGO explainable model whose explanations are produced with the LORE library.
This prototype serves as an example implementation for other developers and project partners, demonstrating how to integrate the TANGO interfaces into a machine learning model.
For this example, we took a standard and well-understood ML model (RandomForest from sklearn)
and wrapped it as an ExplainableTangoModel by integrating it with LORE via a set_explainer() method.
The RandomForestWrapperModel acts as the core implementation, supported by the RandomForestExplainer
for interpretability and leveraging the LORE method for explanations.
The structured use of common interfaces ensures extensibility and consistent integration across different
models and frameworks.
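As an illustration of this wiring, a minimal sketch follows. The interface and explainer signatures here are placeholders (the real ones are defined in tango-interfaces and the LORE library); only the sklearn usage is standard API.
# Illustrative sketch only: the real interfaces live in tango-interfaces and the
# real explanation logic in the LORE library; only the sklearn parts are standard API.
from sklearn.ensemble import RandomForestClassifier

class RandomForestWrapperModel:  # stands in for the ExplainableTangoModel implementation
    def __init__(self, **hyperparams):
        self._model = RandomForestClassifier(**hyperparams)
        self._explainer = None

    def fit(self, X, y):
        self._model.fit(X, y)
        return self

    def predict(self, X):
        return self._model.predict(X)

    def set_explainer(self, explainer):
        # Attach a LORE-based explainer (e.g. a RandomForestExplainer instance).
        self._explainer = explainer

    def explain(self, x):
        # Delegate the explanation of a single instance to the attached explainer.
        return self._explainer.explain(x)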
Init project
Install requirements.
pip install -r requirements.txt
Set up a development environment
Install requirements.
pip install -r requirements-dev.txt
If needed, install a local version of tango-interfaces.
pip uninstall tango-interfaces
pip install -e <path>/tango-interfaces/
Run tracking server and model registry
Run a local tracking server and model registry (by default, mlflow ui listens on http://127.0.0.1:5000).
mlflow ui
Prepare the environment for training/serving
Export the following environment variables.
export MLFLOW_TRACKING_URI="http://127.0.0.1:5000"
export MLFLOW_EXPERIMENT_NAME="test"
Use these if the server requires authentication.
export MLFLOW_TRACKING_USERNAME=<username>
export MLFLOW_TRACKING_PASSWORD=<password>
Make a training run
To make a run,
- ensure that you have correctly exported the environment variables as described above;
- execute the following command (this operation takes some time, as it creates the environment).
mlflow run --env-manager local ./src
The general syntax of the mlflow run command is:
mlflow run . -P <param_name>=<param_value>
The MLproject file defines the entry point for the project, which is main.py, and the parameters that can be passed to it.
A run of the project performs the following steps:
- Load the dataset from the data source.
- Preprocess the raw training data.
- For each model type:
- Start a new run on MLflow.
- Perform hyperparameter tuning (using Optuna) with cross-validation on the training set to find the best hyperparameters. Each trial is logged as a child run on MLflow and does not register a model in the MLflow Model Registry (see the sketch after this list).
- Train the model on the whole training set, with the best hyperparameters found.
- Evaluate the model by computing balanced_accuracy_score, along with other metrics.
- Store the trained model in MLflow as a new version of the registered model proto_model.
- Compare the performance of each version and choose the best one according to balanced_accuracy_score. The best version of the model is assigned to the alias challenger on MLflow.
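A rough sketch of this flow is shown below. It is illustrative only: the real logic lives in main.py, and load_and_preprocess is a placeholder for the actual data pipeline.
# Illustrative sketch of the tuning/training flow described above (not the actual main.py).
import mlflow
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import cross_val_score

X_train, X_test, y_train, y_test = load_and_preprocess()  # placeholder for the real data pipeline

def objective(trial):
    params = {"n_estimators": trial.suggest_int("n_estimators", 50, 300),
              "max_depth": trial.suggest_int("max_depth", 2, 16)}
    with mlflow.start_run(nested=True):  # each trial is a child run; no model is registered here
        mlflow.log_params(params)
        score = cross_val_score(RandomForestClassifier(**params), X_train, y_train,
                                scoring="balanced_accuracy", cv=5).mean()
        mlflow.log_metric("cv_balanced_accuracy", score)
    return score

with mlflow.start_run(run_name="random_forest"):
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=20)
    # Retrain on the whole training set with the best hyperparameters and evaluate.
    model = RandomForestClassifier(**study.best_params).fit(X_train, y_train)
    mlflow.log_metric("balanced_accuracy_score",
                      balanced_accuracy_score(y_test, model.predict(X_test)))
    # Register the trained model as a new version of proto_model.
    mlflow.sklearn.log_model(model, "model", registered_model_name="proto_model")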
Running local model server
Install requirements.
pip install -r requirements-dev-modelserver.txt
Run a model server to serve the model
Execute the model server.
# Export the MLFLOW_RUN_ID variable obtained from the training run
export MLFLOW_RUN_ID="<mlflow_run_id>"
mlflow models serve -m runs:/$MLFLOW_RUN_ID/model --enable-mlserver -p 5001
Invoke the model
To invoke the model, you need to make a POST request to the model server with the input data in JSON format.
An example of such a request for the proto-model can be found in the run_proto_model_invocation.py file.
# After serving the `proto-model` model server on port 5001,
# you can trigger a model invocation by running:
python run_proto_model_invocation.py
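A minimal sketch of such a request against the MLflow /invocations endpoint (the feature names below are placeholders for the real dataset columns):
# Minimal invocation sketch against the local model server started above (port 5001).
import requests

payload = {"dataframe_split": {"columns": ["feature_1", "feature_2"],  # placeholder feature names
                               "data": [[0.1, 0.2]]}}
response = requests.post("http://127.0.0.1:5001/invocations", json=payload, timeout=10)
print(response.json())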
Introduction
The purpose of this project is to train and serve a machine learning model for the proto_model task. This is achieved by trying different model types, pipelines and hyperparameters. MLflow is used to manage the models' lifecycle; see the MLflow documentation for details.
Models on MLflow are organized in a Model Registry: this project will produce different versions for the model proto_model, one for each model type.
The version that is currently served in production is called the Champion model version, while the version competing with it is called the Challenger model version. These roles are managed in the MLflow Model Registry through the aliases champion and challenger. Each model can have at most one Challenger and one Champion at the same time.
Each run of the project produces n model versions stored on MLflow Model Registry, where n is the number of model types. One of these versions is chosen as the Challenger model version, which is the version that competes with the current Champion model version.
The comparison metric used to choose the best model version is balanced_accuracy_score.
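For example, the aliased versions can be resolved and loaded directly from the Model Registry with the standard MLflow alias APIs (assuming the registered model name proto_model used by this project):
# Resolve and load the aliased versions of proto_model from the MLflow Model Registry.
import mlflow
from mlflow import MlflowClient

client = MlflowClient()
champion = client.get_model_version_by_alias("proto_model", "champion")
challenger = client.get_model_version_by_alias("proto_model", "challenger")
print(champion.version, challenger.version)

# A model can also be loaded directly by alias URI.
champion_model = mlflow.pyfunc.load_model("models:/proto_model@champion")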
Define the Champion Model Version
To perform a comparison between the current Champion model version and the new Challenger model version, execute the following command:
python compare.py
This script will load the Champion and the Challenger model versions from MLflow, compare them and decide if the Challenger model version should be promoted to Champion. The comparison is based on the metric balanced_accuracy_score.
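Conceptually, the promotion decision boils down to something like the following sketch (illustrative only; the evaluation dataset and the actual decision logic are defined in compare.py, and load_evaluation_data is a placeholder):
# Illustrative promotion logic: promote the Challenger if it scores better on a held-out set.
import mlflow
from mlflow import MlflowClient
from sklearn.metrics import balanced_accuracy_score

client = MlflowClient()
X_eval, y_eval = load_evaluation_data()  # placeholder for the real evaluation dataset

def score(alias):
    model = mlflow.pyfunc.load_model(f"models:/proto_model@{alias}")
    return balanced_accuracy_score(y_eval, model.predict(X_eval))

if score("challenger") > score("champion"):
    version = client.get_model_version_by_alias("proto_model", "challenger").version
    client.set_registered_model_alias("proto_model", "champion", version)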
Testing
TODO
Creating a Docker image
As Proto Model does not have a dedicated GitLab CI/CD pipeline, you cannot rely on automatic Docker image updates. To create an updated image of the model, follow these instructions.
Train a model
Train the updated model on the preferred target MLOps platform.
export MLFLOW_TRACKING_URI="https://mlflow.u-hopper.com"
export MLFLOW_EXPERIMENT_NAME="tango-proto-model"
export MLFLOW_TRACKING_USERNAME="datascience"
export MLFLOW_TRACKING_PASSWORD="**************"
mlflow run --env-manager local ./src
Set the champion alias on the most recently created model version (from the tracking server UI).
Create Docker image for latest created model
Access GitLab and open the proto-model-deployment project.
Execute a pipeline, specifying the target environment:
DEPLOYMENT_ENV = production
or
DEPLOYMENT_ENV = staging
Test
To test config generation, execute the mlserver configs generation script.
./generate_mlserver_configs.sh
Install the generated requirements.
pip install -r ./build/mlserver_configs/proto_model/artifact/model/requirements.txt
Test that the model server starts.
mlserver start build
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.