Test custom models locally¶
To access the DataRobot Model Runner tool, contact your DataRobot representative.
The DataRobot Model Runner (DRUM) lets you test Python, R, and Java custom models locally. The test verifies that a custom model can run and make predictions before you upload it to DataRobot. This testing is for development purposes only: DataRobot recommends that you also test any custom model you plan to deploy in the Custom Model Workshop after uploading it.
Before proceeding, reference the guidelines for setting up a custom model or environment folder.
Reference the DRUM readme for details about additional functionality, including:
- Custom hooks
- Performance tests
- Running models with a prediction server
- Running models inside a Docker container
Install the DataRobot Model Runner¶
The following describes the DRUM installation workflow. Check the prerequisites for your model's language before proceeding.
|Language|Prerequisites|Installation command|
|---|---|---|
|Python|Python 3 recommended|pip install datarobot-drum|
|Java|JRE ≥ 11|pip install datarobot-drum|
|R|Python ≥ 3.6 and the R framework installed. Note that drum uses the rpy2 package to run R (the latest version is installed by default). You may need to adjust the rpy2 and pandas versions for compatibility.|pip install datarobot-drum[R]|
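The prerequisites in the table above can be checked before installing. The following is a minimal sketch, not part of DRUM: the function name `check_prerequisites` and its messages are hypothetical, and it only looks for `java` and `R` executables on the PATH. It does not verify that the JRE version is 11 or later.

```python
import shutil
import sys


def check_prerequisites(language, python_version=None):
    """Return a list of problems found; an empty list means none were detected.

    `language` is one of "python", "r", or "java" (hypothetical labels).
    `python_version` defaults to the running interpreter's (major, minor).
    """
    version = tuple(python_version or sys.version_info[:2])
    problems = []
    if version < (3, 0):
        problems.append("Python 3 is recommended")
    if language == "r":
        # The table lists Python >= 3.6 and an installed R framework for R models.
        if version < (3, 6):
            problems.append("R models need Python >= 3.6")
        if shutil.which("R") is None:
            problems.append("no R executable found on PATH")
    elif language == "java":
        # Only checks that java exists; the JRE >= 11 requirement is not verified here.
        if shutil.which("java") is None:
            problems.append("no java executable found on PATH (JRE >= 11 required)")
    return problems


if __name__ == "__main__":
    for lang in ("python", "r", "java"):
        print(lang, check_prerequisites(lang) or "ok")
```

An empty list only means that this sketch found nothing wrong; it is not a guarantee that DRUM will install cleanly.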
To install DRUM with support for Python and Java models, use the following command:
pip install datarobot-drum
To install DRUM with support for R models:
pip install datarobot-drum[R]
If you are using a Conda environment, install the wheels with the --no-deps flag. If any dependencies are required for a Conda environment, install them with Conda tools.
Custom model folder contents¶
The model folder must contain the model artifacts and any other code needed for drum to run the model.
drum has built-in support for the following libraries; if your model is based on one of these libraries, drum expects your model artifact to have a matching file extension.
DRUM supports models with DataRobot-generated Scoring Code and models that implement either the IClassificationPredictor or IRegressionPredictor interface from the DataRobot-prediction library. The model artifact must have a
For additional parameters, define the DRUM_JAVA_XMX environment variable to set the JVM maximum heap memory size (the -Xmx java parameter).
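As an illustration of setting this variable for a single run, the sketch below copies the current environment and adds DRUM_JAVA_XMX to the copy. The helper name `java_env` is hypothetical, and the value `512m` is only an example written in the JVM's usual -Xmx notation; choose a size that suits your model.

```python
import os


def java_env(max_heap, base_env=None):
    """Return a copy of the environment with DRUM_JAVA_XMX set.

    `max_heap` uses JVM -Xmx notation, for example "512m" (illustrative value).
    `base_env` defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DRUM_JAVA_XMX"] = max_heap
    return env


# Build the environment; pass it as env=... to subprocess.run when launching drum.
env = java_env("512m")
print(env["DRUM_JAVA_XMX"])
```

Copying the environment keeps the setting local to the launched process instead of changing it for the whole shell session.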
In addition to the required folder contents, DRUM requires the following for your serialized model:
- Regression models must return a single floating point value per row of prediction data.
- Binary classification models must return two floating point values that sum to 1.0 per row of prediction data.
  - The first value must be the positive class probability, and the second the negative class probability.
- The model folder must contain a single pkl/pth/h5 file.
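The output requirements above can be expressed as simple checks on a model's predictions. This is a minimal sketch and not DRUM's own validation: the function names and the tolerance `tol` are assumptions, and predictions are modeled as plain Python lists.

```python
import math


def check_regression(predictions):
    """Each row of a regression output must be a single float."""
    for i, value in enumerate(predictions):
        if not isinstance(value, float):
            raise ValueError(f"row {i}: expected one float, got {value!r}")


def check_binary(predictions, tol=1e-6):
    """Each row must hold (positive, negative) probabilities that sum to 1.0.

    `tol` is an assumed absolute tolerance for floating point rounding.
    """
    for i, row in enumerate(predictions):
        if len(row) != 2:
            raise ValueError(f"row {i}: expected 2 values, got {len(row)}")
        positive, negative = row  # positive class probability comes first
        if not math.isclose(positive + negative, 1.0, abs_tol=tol):
            raise ValueError(f"row {i}: probabilities sum to {positive + negative}")


# Both calls pass silently; a malformed row raises ValueError.
check_regression([12.5, 3.0])
check_binary([(0.9, 0.1), (0.25, 0.75)])
```

Running checks like these on a few rows before calling drum can catch a model that returns only the positive class probability or swaps the column order by shape, although column order itself cannot be detected from the numbers alone.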
Run tests with the DataRobot Model Runner¶
Use the following commands to execute local tests for your custom model.
List all possible arguments¶
Test a custom binary classification model¶
Make batch predictions with a custom binary classification model. Optionally, specify an output file; otherwise, predictions are printed to the command line:
# Use --verbose for more detailed output
drum score -m ~/custom_model/ --input <input-dataset-filename.csv> [--positive-class-label <labelname>] [--negative-class-label <labelname>] [--output <output-filename.csv>] [--verbose]
drum score -m ~/custom_model/ --input 10k.csv --positive-class-label yes --negative-class-label no --output 10k-results.csv --verbose
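When the test is launched from a script rather than typed by hand, the same command can be assembled programmatically. The sketch below uses only the flags shown above; the helper name `build_score_command` is hypothetical, and actually running the result assumes that drum is installed and on the PATH.

```python
import os
import shlex


def build_score_command(model_dir, input_csv, positive_label=None,
                        negative_label=None, output_csv=None, verbose=False):
    """Assemble a `drum score` argument list from the flags shown above."""
    # Expand "~" here because no shell does it when a list is passed to subprocess.
    cmd = ["drum", "score", "-m", os.path.expanduser(model_dir), "--input", input_csv]
    if positive_label is not None:
        cmd += ["--positive-class-label", positive_label]
    if negative_label is not None:
        cmd += ["--negative-class-label", negative_label]
    if output_csv is not None:
        cmd += ["--output", output_csv]
    if verbose:
        cmd.append("--verbose")
    return cmd


cmd = build_score_command("~/custom_model/", "10k.csv", "yes", "no",
                          "10k-results.csv", verbose=True)
print(shlex.join(cmd))  # shell-quoted form of the command, for inspection
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Omitting the two class labels produces the regression form of the command.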
Test a custom regression model¶
Make batch predictions with a custom regression model.
drum score -m ~/custom_model/ --input <input-dataset-filename.csv> [--output <output-filename.csv>] [--verbose]
# This example omits the --output option, so the prediction results are printed to the command line.
drum score -m ~/custom_model/ --input fast-iron.csv --verbose