MLflow integration for DataRobot

Availability information

The MLflow integration for DataRobot is a preview feature. Contact your DataRobot representative or administrator for information on using this feature.

The MLflow integration for DataRobot allows you to export a model from MLflow and import it into the DataRobot Model Registry, creating key values from the training parameters, metrics, tags, and artifacts in the MLflow model.

Prerequisites for the MLflow integration

The MLflow integration for DataRobot requires the following:

  • Python >= 3.9
  • DataRobot >= 9.0

This integration library uses a preview API endpoint; the DataRobot user associated with your API token must have Owner or User permissions for the DataRobot model package.
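The Python requirement above can be checked before installing. The following is a minimal sketch, not part of the integration: version_at_least is a hypothetical helper, and the interpreter name python3 is an assumption about your environment. It compares versions numerically so that 3.10 is correctly treated as newer than 3.9.

```shell
# Hypothetical helper (not part of datarobot-mlflow).
# Succeeds when version $1 is at least version $2, comparing major.minor
# numerically. Both arguments are expected in major.minor[.patch] form.
version_at_least() (
  have_major=${1%%.*}
  want_major=${2%%.*}
  have_minor=${1#*.}; have_minor=${have_minor%%.*}
  want_minor=${2#*.}; want_minor=${want_minor%%.*}
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
  else
    [ "$have_minor" -ge "$want_minor" ]
  fi
)

# Usage: compare the interpreter on PATH (assumed to be python3) against
# the documented minimum of 3.9.
if command -v python3 >/dev/null 2>&1; then
  py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if version_at_least "$py_version" 3.9; then
    echo "Python $py_version meets the 3.9 minimum"
  else
    echo "Python $py_version is older than 3.9" >&2
  fi
fi
```

The DataRobot version requirement cannot be checked this way; confirm it with your administrator.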

Install the MLflow integration for DataRobot

You can install the datarobot-mlflow integration with pip:

pip installation
pip install datarobot-mlflow

If you are running the integration on Azure, use the following command:

Azure pip installation
pip install "datarobot-mlflow[azure]"

Configure command line options

The following command line options are available for the drflow_cli:

Option Description
--mlflow-url Defines the MLflow tracking URL; for example:
  • Local MLflow: "file:///Users/me/mlflow/examples/mlruns"
  • Azure Databricks MLflow: "azureml://region.api.azureml.ms/mlflow/v1.0/subscriptions/subscription-id/resourceGroups/resource-group-name/providers/Microsoft.MachineLearningServices/workspaces/azure-ml-workspace-name"
--mlflow-model Defines the MLflow model name; for example, "cost-model".
--mlflow-model-version Defines the MLflow model version; for example, "2".
--dr-url Provides the main URL of the DataRobot instance; for example, https://app.datarobot.com.
--dr-model Defines the ID of the registered model for key value upload; for example, 64227b4bf82db411c90c3209.
--prefix Provides a string to prepend to the names of all key values imported to DataRobot. The default value is empty.
--debug Sets the Python logging level to logging.DEBUG. The default level is logging.WARNING.
--verbose Prints information to stdout during the following processes:
  • Retrieving model from MLflow: prints model information.
  • Setting model data in DataRobot: prints each key value added to DataRobot.
--with-artifacts Downloads MLflow model artifacts to /tmp/model.
--service-provider-type Defines the service provider for validate-auth. The supported value is azure-databricks for Databricks MLflow in Azure.
--auth-type Defines the authentication type for validate-auth. The supported value is azure-service-principal for Azure Service Principal.
--action Defines the operation you want the MLflow integration for DataRobot to perform.

The following command line operations are available for the --action option:

Action Description
sync Imports parameters, tags, metrics, and artifacts from an MLflow model into a DataRobot model package as key values. This action requires --mlflow-url, --mlflow-model, --mlflow-model-version, --dr-url, and --dr-model.
list-mlflow-keys Lists parameters, tags, metrics, and artifacts in an MLflow model. This action requires --mlflow-url, --mlflow-model, and --mlflow-model-version.
validate-auth Validates the Azure AD Service Principal credentials for troubleshooting purposes. This action requires --auth-type and --service-provider-type.
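As an illustration of the list-mlflow-keys action, the sketch below passes only the three options that the table lists as required. It assumes the same source-checkout invocation (env PYTHONPATH=./ python datarobot_mlflow/drflow_cli.py) used elsewhere on this page. The run and list_mlflow_keys functions and the DRY_RUN variable are hypothetical conveniences, not part of the integration. The CLI may also expect MLOPS_API_TOKEN to be defined, as it does for validate-auth.

```shell
# Hypothetical wrapper: prints the command when DRY_RUN is set,
# otherwise executes it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Lists the parameters, tags, metrics, and artifacts of one MLflow model
# version. Arguments: tracking URL, model name, model version.
list_mlflow_keys() {
  run env PYTHONPATH=./ python datarobot_mlflow/drflow_cli.py \
    --mlflow-url "$1" \
    --mlflow-model "$2" \
    --mlflow-model-version "$3" \
    --action list-mlflow-keys
}
```

With the functions defined, DRY_RUN=1 list_mlflow_keys http://localhost:8080 cost-model 2 prints the command without running it; omit DRY_RUN to run it. Listing keys first is a low-risk way to preview what a later sync would import.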

Set environment variables

In addition to the command line options above, you should also provide any environment variables required for your use case:

Environment variable Description
MLOPS_API_TOKEN A DataRobot API key, found in the DataRobot Developer Tools.
AZURE_TENANT_ID The Azure Tenant ID for your Azure Databricks MLflow instance, found in the Azure portal.
AZURE_CLIENT_ID The Azure Client ID for your Azure Databricks MLflow instance, found in the Azure portal.
AZURE_CLIENT_SECRET The Azure Client Secret for your Azure Databricks MLflow instance, found in the Azure portal.

You can use export to define these environment variables with the information required for your use case:

export MLOPS_API_TOKEN="<dr-api-key>"
export AZURE_TENANT_ID="<tenant-id>"
export AZURE_CLIENT_ID="<client-id>"
export AZURE_CLIENT_SECRET="<secret>"
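Before you run the CLI, it can help to confirm that the variables you need are actually defined. The following is a minimal sketch: require_env is a hypothetical helper, not something the integration provides, and its message text is illustrative.

```shell
# Hypothetical helper (not part of datarobot-mlflow).
# Reports every named environment variable that is unset or empty and
# succeeds only when all of them are defined.
require_env() (
  status=0
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      status=1
    fi
  done
  exit "$status"
)
```

For example, require_env MLOPS_API_TOKEN checks only the DataRobot token, while require_env AZURE_TENANT_ID AZURE_CLIENT_ID AZURE_CLIENT_SECRET checks the variables described above for Azure Databricks MLflow.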

Run the sync action to import a model from MLflow into DataRobot

You can use the command line options and actions defined above to export MLflow model information from MLflow and import it into the DataRobot Model Registry:

Import from MLflow
DR_MODEL_ID="<MODEL_PACKAGE_ID>"

env PYTHONPATH=./ \
python datarobot_mlflow/drflow_cli.py \
  --mlflow-url http://localhost:8080 \
  --mlflow-model cost-model \
  --mlflow-model-version 2 \
  --dr-model "$DR_MODEL_ID" \
  --dr-url https://app.datarobot.com \
  --with-artifacts \
  --verbose \
  --action sync

After you run this command successfully, the imported MLflow information appears on the Key Values tab of a Registered Model version.

In addition, the Activity log of the Key Values tab keeps a record of the key value creation events.

Troubleshoot Azure AD Service Principal credentials

To validate Azure AD Service Principal credentials for troubleshooting purposes, you can use the following command line example:

Validate Azure AD Service Principal credentials
export MLOPS_API_TOKEN="n/a"  # not used for Azure auth check, but the environment variable must be present

env PYTHONPATH=./ \
python datarobot_mlflow/drflow_cli.py \
  --verbose \
  --auth-type azure-service-principal \
  --service-provider-type azure-databricks \
  --action validate-auth

This command should produce the following output if you haven't configured the required environment variables:

Example output: missing environment variables
Required environment variable is not defined: AZURE_TENANT_ID
Required environment variable is not defined: AZURE_CLIENT_ID
Required environment variable is not defined: AZURE_CLIENT_SECRET
Azure AD Service Principal credentials are not valid; check environment variables

If you see this error, provide the required Azure AD Service Principal credentials as environment variables:

Provide Azure AD Service Principal credentials
export AZURE_TENANT_ID="<tenant-id>"
export AZURE_CLIENT_ID="<client-id>"
export AZURE_CLIENT_SECRET="<secret>"

When the environment variables for the Azure AD Service Principal credentials are defined, you should see the following output:

Example output: successful authentication
Azure AD Service Principal credentials are valid for obtaining access token
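If you want a script to act on the result, one option is to look for the success line in the command output. The following is a minimal sketch: auth_output_ok is a hypothetical helper, and it relies only on the two example messages shown above. This page does not document the exit code of validate-auth or whether it writes to stdout or stderr, so the sketch makes no assumption about either.

```shell
# Hypothetical helper (not part of datarobot-mlflow).
# Reads validate-auth output on stdin and succeeds only if it contains the
# success message from the example output. The failure message says
# "are not valid", so it does not match this pattern.
auth_output_ok() {
  grep -q "Azure AD Service Principal credentials are valid"
}

# Usage (combine stderr with stdout, since the output stream is not documented):
#   env PYTHONPATH=./ python datarobot_mlflow/drflow_cli.py \
#     --auth-type azure-service-principal \
#     --service-provider-type azure-databricks \
#     --action validate-auth 2>&1 | auth_output_ok && echo "credentials OK"
```

Matching on message text is brittle if the wording changes in a later release, so treat this as a troubleshooting aid only.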

Updated August 6, 2024