
Proto Model Turbo

The Proto Model Turbo project is a fork of the Proto Model project.

🚀 Key Features in Turbo​

Proto Model Turbo extends the original project by solving two critical challenges in model deployment: decoupling heavy data processing and handling large file sizes.

| Feature | Problem Solved |
| --- | --- |
| 1. Data Mapper | Separates lightweight data processing from the core model, avoiding the need to load the entire model for simple transformations. |
| 2. File I/O Support | Enables predictions and explanations for datasets that exceed standard JSON payload limits. |

Inherited Functionality (The Proto Model Foundation)​

The original Proto Model is deprecated and will no longer be maintained; development continues in Proto Model Turbo.

The Proto Model offers a straightforward implementation of the TANGO Interfaces for creating a TANGO Model, ready for deployment in an MLflow environment. This original project serves as a prototype for an explainable TANGO model, including generating its explanations using the LORE library.

This prototype is intended as an example implementation for developers and project partners, demonstrating how to properly integrate the TANGO interfaces into a machine learning model.

For this example, we took a RandomForest model from sklearn (a standard and well-understood ML model) and wrapped it as an ExplainableTangoModel. We achieved this by integrating it with LORE via a set_explainer() method. The RandomForestWrapperModel forms the core implementation, which is supported by the RandomForestExplainer for interpretability, leveraging the LORE method for generating explanations. The use of structured, common interfaces ensures extensibility and consistent integration across different models and frameworks.
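The wiring looks roughly like the sketch below. This is a simplified illustration, not the repository code: only the class names (RandomForestWrapperModel, RandomForestExplainer) and the set_explainer() hook come from the description above, while the method signatures and the explainer internals are assumptions.

# Simplified sketch of the wrapper/explainer wiring; signatures are illustrative.
from sklearn.ensemble import RandomForestClassifier

class RandomForestExplainer:
    """Generates explanations for individual predictions (LORE in the real project)."""

    def __init__(self, model, reference_data):
        self.model = model
        self.reference_data = reference_data  # LORE builds local neighborhoods around instances

    def explain(self, instance):
        # Placeholder: the real implementation delegates to the LORE library here.
        raise NotImplementedError

class RandomForestWrapperModel:
    """Core model; in the real project this implements the ExplainableTangoModel interface."""

    def __init__(self):
        self.model = RandomForestClassifier()
        self.explainer = None

    def set_explainer(self, explainer):
        self.explainer = explainer

    def fit(self, X, y):
        self.model.fit(X, y)

    def predict(self, X):
        return self.model.predict(X)

    def explain(self, instance):
        return self.explainer.explain(instance)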

The Model Mapper and Architectural Separation​

The Proto Model Turbo improves on the original by separating the data processing logic from the core model logic through the implementation of a dedicated mapper.

The TANGO Request Flow​

The TANGO pipeline uses two distinct Python classes to handle a request, promoting a lighter, faster architecture:

| Component | Class | Purpose |
| --- | --- | --- |
| Mapper (TANGO Pipeline) | Extends TangoModel | Handles data transformation (map_request and map_response). |
| Core Model (MLflow Host) | Extends MLPythonModel | Performs the prediction (predict) after data is preprocessed. |

How the Mapper Works​

The mapper must implement the map_request and map_response methods.

  • map_request: pre-processes the incoming request before the model is invoked.
  • map_response: post-processes the raw model output after it is received, before it is returned to the client.

Critically, this transformation occurs without loading the entire model and all its dependencies. The TANGO architecture executes the mapper, then calls the MLflow REST API's /invocations endpoint, which invokes the model's predict method; the model itself is hosted by MLflow and is not part of the TANGO pipeline. This division results in cleaner, more maintainable code and lightens the overall workload on the TANGO pipeline.
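A rough sketch of this separation is shown below. It assumes the core model corresponds to an mlflow.pyfunc.PythonModel (the "MLPythonModel" mentioned above); the mapper class name and payload fields are illustrative, and in the real project the mapper extends TangoModel from the tango-interfaces package rather than being a plain class.

# Mapper: lives in the TANGO pipeline, stays lightweight, never loads the model.
import mlflow.pyfunc

class ProtoModelMapper:  # in the real project this extends TangoModel
    def map_request(self, request: dict) -> dict:
        # Reshape the client payload into the format expected by the
        # MLflow /invocations endpoint (field names are illustrative).
        return {"dataframe_split": {"columns": request["columns"],
                                    "data": request["rows"]}}

    def map_response(self, raw_output: dict) -> dict:
        # Post-process the raw model output for the client.
        return {"predictions": raw_output}

# Core model: hosted by MLflow and reached through /invocations, outside the pipeline.
class ProtoModelCore(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Heavy artifacts and dependencies are loaded here, on the MLflow side only.
        ...

    def predict(self, context, model_input):
        # Prediction on already-preprocessed input (trivial pass-through in this sketch).
        return model_input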

Support for File-based Input/Output​

The Proto Model Turbo can predict and explain data in both JSON and file formats. The JSON format is inherited directly from the original Proto Model project, while the file format is a new feature that allows the handling of large input/output data that exceeds the capacity of JSON transfer.

The Proto Model Turbo can be trained and served with two different signatures: one accepting the JSON format (inherited from the Proto Model) and the other accepting the file format.

When using the file format, the model is served with a different signature.

  • Input: A presigned URL to download the input data file and a presigned URL to upload the resulting output data file.
  • Output: The model may or may not return a response body, since the output data is already stored in the file at the provided upload URL. If a body is returned, the workflow manager includes it in the invocation response together with the file ID.

File Flow Logic​

  1. map_request: This method is simplified, merely preparing the request by passing the upload and download presigned URLs for model invocation.
  2. predict: The model implementation is solely responsible for downloading the input file, performing the prediction, and uploading the output file to the provided URLs (see the sketch after this list).
  3. map_response: This method can be simplified or omitted, as the output file is already uploaded. If it returns a response body, it is included in the invocation response with the file ID.
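The predict step boils down to a download/predict/upload sequence. The sketch below is illustrative only: it assumes CSV files and plain HTTP transfers to the presigned URLs, and the function name and column layout are not part of the project.

# Minimal sketch of a file-based predict (assumes CSV input/output; names are illustrative).
import io

import pandas as pd
import requests

def predict_from_file(download_url: str, upload_url: str, model) -> None:
    # 1. Download the input data file from the presigned download URL.
    resp = requests.get(download_url)
    resp.raise_for_status()
    data = pd.read_csv(io.BytesIO(resp.content))

    # 2. Run the prediction on the full dataset.
    predictions = model.predict(data)

    # 3. Upload the resulting output file to the presigned upload URL.
    out = pd.DataFrame({"prediction": predictions}).to_csv(index=False)
    requests.put(upload_url, data=out.encode("utf-8")).raise_for_status()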

MLOps Concept & Model Lifecycle​

The purpose of this project is to train and serve a machine learning model to solve the task of proto_model. This is achieved by trying different model types, pipelines, and hyperparameters.

MLflow is used to manage the model's lifecycle, organizing versions in a Model Registry. This project produces different versions for the proto_model model, one for each model type tried.

  • Champion: The model version currently served in production (set via the champion alias).
  • Challenger: The new model version competing with the Champion (set via the challenger alias).

The comparison metric used to choose the best model version is balanced_accuracy_score.
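For example, whichever registered version currently holds the champion alias can be resolved directly through the MLflow registry URI syntax (available in MLflow 2.3 and later); the snippet below is a minimal sketch, not project code.

import mlflow

# Load the registered version currently pointed to by the "champion" alias.
champion_model = mlflow.pyfunc.load_model("models:/proto_model@champion")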

Setup & Dependencies​

Prerequisites​

Start the local MLflow tracking server and model registry.

mlflow ui

Environment Variables​

Export these variables to link your local training/serving commands to the MLflow server.

export MLFLOW_TRACKING_URI="[http://127.0.0.1:5000](http://127.0.0.1:5000)"
export MLFLOW_EXPERIMENT_NAME="test"
# Optional: Use these if your server requires authentication
# export MLFLOW_TRACKING_USERNAME=<username>
# export MLFLOW_TRACKING_PASSWORD=<password>

Install Requirements​

Install the necessary dependencies for development and the core project.

# Core project requirements
pip install -r requirements.txt

# Development environment requirements
pip install -r requirements-dev.txt

# If needed, install a local version of tango-interfaces
# pip uninstall tango-interfaces
# pip install -e <path>/tango-interfaces/

Training the Model (mlflow run)​

A run of the project performs training, hyperparameter tuning, evaluation, and registers the model in MLflow. This operation takes time as it creates the runtime environment.

Training Process Overview​

  1. Load the dataset and preprocess the raw training data.
  2. For each model type, start a new run on MLflow.
  3. Train the model.
  4. Evaluate the model and log metrics like balanced_accuracy_score.
  5. Store the trained model in MLflow as a new version (see the sketch after this list).
  6. The best-performing version is assigned the challenger alias.
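A rough sketch of how steps 2–5 map onto the MLflow tracking API is shown below. The candidate models and the synthetic dataset are illustrative, not the project's own pipeline; only the metric name and the registered model name proto_model come from this documentation.

import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative data and candidate model types.
X, y = make_classification(n_samples=200, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
candidates = {"random_forest": RandomForestClassifier(),
              "logistic_regression": LogisticRegression()}

for model_type, model in candidates.items():
    with mlflow.start_run(run_name=model_type):              # step 2: one run per model type
        model.fit(X_train, y_train)                           # step 3: train
        score = balanced_accuracy_score(y_val, model.predict(X_val))
        mlflow.log_metric("balanced_accuracy_score", score)   # step 4: log the comparison metric
        mlflow.sklearn.log_model(model, artifact_path="model",
                                 registered_model_name="proto_model")  # step 5: new registered version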

Execute Training Run​

Choose the run command based on the data format you need to support:

| Format | Command | Description |
| --- | --- | --- |
| JSON (Default) | mlflow run --env-manager local ./src | Trains the model for standard JSON input/output. |
| File I/O | mlflow run --env-manager local ./src -P file_management=true | Trains the model with support for presigned URL file transfer. |

Serving and Invocation

Serve Model Requirements​

Install the dependencies needed to run the local model server.

pip install -r requirements-dev-modelserver.txt

Run Local Model Server​

Serve the model using the MLFLOW_RUN_ID generated by your training run (found in the console output or MLflow UI).

# Export the MLFLOW_RUN_ID obtained from the training run
export MLFLOW_RUN_ID="<mlflow_run_id>"

# Start the server on port 5001
mlflow models serve -m runs:/$MLFLOW_RUN_ID/model --enable-mlserver -p 5001

Invoke the Model​

Invoke the locally served model using a helper script that sends a POST request with JSON input data.

python run_proto_model_invocation.py
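Under the hood this is a plain HTTP call to the MLflow scoring server. A minimal equivalent is sketched below; the column names and values are illustrative, not the model's actual schema.

import requests

# MLflow scoring servers accept JSON payloads on the /invocations endpoint.
payload = {"dataframe_split": {"columns": ["feature_1", "feature_2"],
                               "data": [[0.1, 0.2]]}}
response = requests.post("http://127.0.0.1:5001/invocations", json=payload)
print(response.json())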

Model Promotion: Champion vs. Challenger​

To compare a new Challenger model against the current Champion and potentially promote it, execute the following script:

python compare.py

This script loads both versions, compares their performance based on balanced_accuracy_score, and updates the MLflow aliases if the Challenger is superior.
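A rough sketch of this comparison logic is given below. It is not the actual contents of compare.py; it assumes the metric is read from each version's training run and uses the standard MLflow Model Registry alias API.

from mlflow import MlflowClient

client = MlflowClient()
name = "proto_model"

champion = client.get_model_version_by_alias(name, "champion")
challenger = client.get_model_version_by_alias(name, "challenger")

def score_of(version):
    # Read the logged comparison metric from the version's training run.
    return client.get_run(version.run_id).data.metrics["balanced_accuracy_score"]

if score_of(challenger) > score_of(champion):
    # Promote: point the champion alias at the challenger's version number.
    client.set_registered_model_alias(name, "champion", challenger.version)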

Advanced: Deployment & Docker​

As Proto Model does not have a dedicated GitLab CI/CD pipeline, you cannot rely on automatic Docker image updates. Follow these instructions to manually create an updated image.

1. Train and Promote​

Train the updated model on the preferred target MLOps platform.

export MLFLOW_TRACKING_URI="[https://mlflow.u-hopper.com](https://mlflow.u-hopper.com)"
export MLFLOW_EXPERIMENT_NAME="tango-proto-model"
# ... export authentication variables ...

mlflow run --env-manager local ./src

Set the champion alias on the desired model version (usually via the tracking server UI).

2. Create Docker Image​

Access the pipelines on the proto-model-deployment project and execute a pipeline, indicating the target environment.

DEPLOYMENT_ENV = production
# OR
# DEPLOYMENT_ENV = staging

Testing MLServer Configuration​

To test configuration generation, execute the MLServer configs generation script and verify the server starts correctly.

# Generate configs
./generate_mlserver_configs.sh

# Install generated requirements
pip install -r ./build/mlserver_configs/proto_model/artifact/model/requirements.txt

# Test model server start
mlserver start build

License​

This project is licensed under the Apache License 2.0. See the LICENSE file for details.