DRUM CLI tool¶
DataRobot User Models (DRUM) is a CLI tool for working with Python, R, and Java custom models and for quickly testing custom tasks, custom models, and custom environments locally before uploading them to DataRobot. Because DataRobot itself uses DRUM to run custom tasks and models, anything that passes local tests with DRUM is compatible with DataRobot. You can download DRUM from PyPI and access DRUM's GitHub repo.
DRUM can also:
- Run performance and memory usage testing for models.
- Perform model validation tests (for example, checking model functionality on corner cases, like null value imputation).
- Run models in a Docker container.
You can install DRUM on Ubuntu, Windows, or macOS.
Note
DRUM is not regularly tested on Windows or Mac. These steps may differ depending on the configuration of your machine.
DRUM on Ubuntu¶
The following describes the DRUM installation workflow. Consider the language prerequisites before proceeding.
Language | Prerequisites | Installation command |
---|---|---|
Python | Python 3 required | pip install datarobot-drum |
Java | JRE ≥ 11 | pip install datarobot-drum |
R | | pip install datarobot-drum[R] |
To install DRUM with support for Python and Java models, use the following command:
pip install datarobot-drum
To install DRUM with support for R models:
pip install datarobot-drum[R]
Note
If you are using a Conda environment, install the wheels with the --no-deps flag. If the Conda environment requires any dependencies, install them with Conda tools.
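Before or after running the installation commands above, it can help to confirm which of the tools involved are on your PATH. The following is a minimal sketch, not part of DRUM: check_tool is a hypothetical helper name, and the list of tools checked (python3, java, drum) is an assumption based on the prerequisites table.

```shell
# check_tool is a hypothetical helper (not part of DRUM).
# It reports whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# python3 is needed for every DRUM install; java only for Java models.
# drum itself should appear once pip install datarobot-drum has succeeded.
for tool in python3 java drum; do
    check_tool "$tool" || true
done
```

A "missing: drum" line after installation usually means that pip installed into a different Python environment than the one currently active.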
DRUM on Mac¶
The following instructions describe installing DRUM with conda (although you can use other tools if you prefer) and then using DRUM to test a task locally. Before you begin, DRUM requires:
- An installation of conda.
- A Python 3.7+ environment (also required for R).
Install DRUM on Mac¶
- Create and activate a virtual environment with Python 3.7+. For example, for Python 3.8, run the following in the terminal:
conda create -n DR-custom-tasks python=3.8 -y
conda activate DR-custom-tasks
- Install DRUM:
conda install -c conda-forge uwsgi -y
pip install datarobot-drum
- To set up the environment, install Docker Desktop and download the DataRobot drop-in environments, where your tasks will run, from GitHub. This recommended procedure ensures that your tasks run in the same environment both locally and inside DataRobot.
Alternatively, if you plan to run your tasks in a local Python environment, install the packages used by your custom task into the same environment as DRUM.
Use DRUM on Mac¶
To test a task locally, run the drum fit command. For example, in a binary classification project:
- Ensure that the conda environment DR-custom-tasks is activated.
- Run the drum fit command (replacing the placeholder names in < > brackets with actual folder and file names):
drum fit --code-dir <folder_with_task_content> --input <test_data.csv> --target-type binary --target <target_column_name> --docker <folder_with_dockerfile> --verbose
For example:
drum fit --code-dir datarobot-user-models/custom_tasks/examples/python3_sklearn_binary --input datarobot-user-models/tests/testdata/iris_binary_training.csv --target-type binary --target Species --docker datarobot-user-models/public_dropin_environments/python3_sklearn/ --verbose
Tip
To learn more, you can view available parameters by typing drum fit --help on the command line.
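Because the drum fit command takes several paths, it can be convenient to check that they exist before invoking DRUM. The sketch below wraps the binary classification command from this section. The names drum_fit_binary and DRY_RUN are hypothetical and introduced here for illustration; they are not DRUM features. The drum fit flags are the ones shown above.

```shell
# drum_fit_binary is a hypothetical wrapper (not part of DRUM).
# Arguments: <code_dir> <input_csv> <target_column> <docker_dir>
# Set DRY_RUN=1 to print the drum command instead of running it.
drum_fit_binary() {
    local code_dir="$1" input_csv="$2" target="$3" docker_dir="$4"

    # Fail early with a clear message if any path is wrong.
    [ -d "$code_dir" ]   || { echo "code dir not found: $code_dir" >&2; return 1; }
    [ -f "$input_csv" ]  || { echo "input file not found: $input_csv" >&2; return 1; }
    [ -d "$docker_dir" ] || { echo "docker dir not found: $docker_dir" >&2; return 1; }

    # Reuse the positional parameters to hold the full command line.
    set -- drum fit --code-dir "$code_dir" --input "$input_csv" \
        --target-type binary --target "$target" \
        --docker "$docker_dir" --verbose

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

For example, running `DRY_RUN=1 drum_fit_binary <folder_with_task_content> <test_data.csv> <target_column_name> <folder_with_dockerfile>` prints the command that would be executed, which is a quick way to review it before a longer run.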
DRUM on Windows with WSL2¶
DRUM can run on Windows 10 or 11 with WSL2 (Windows Subsystem for Linux), a native extension supported by the latest versions of Windows that lets you install and run a Linux OS on a Windows machine. With WSL, you can develop custom tasks and custom models locally in an IDE on Windows, and then immediately test and run them on the same machine using DRUM via the Linux command line.
Tip
You can use this YouTube video for instructions on installing WSL into Windows 11 and updating Ubuntu.
The following phases are required to complete the Windows DRUM installation:
Enable Linux (WSL)¶
- From Control Panel > Turn Windows features on or off, check the option Windows Subsystem for Linux. After making changes, you will be prompted to restart.
- Open the Microsoft Store and get Ubuntu.
- Install Ubuntu and launch it from the start prompt. Provide a Unix username and password to complete installation. You can use any credentials, but be sure to record them because they will be required in the future.
You can access Ubuntu at any time from the Windows start menu. Access files on the C drive under /mnt/c/.
Install pyenv¶
Because Ubuntu in WSL comes without Python or virtual environments installed, you must install pyenv, a Python version management program used on macOS and Linux. (Learn about managing multiple Python environments here.)
In the Ubuntu terminal, run the following commands line by line (you can skip the comment lines):
cd $HOME
sudo apt update --yes
sudo apt upgrade --yes
sudo apt-get install --yes git
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
#add pyenv to bashrc
echo '# Pyenv environment variables' >> ~/.bashrc
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo '# Pyenv initialization' >> ~/.bashrc
echo 'if command -v pyenv 1>/dev/null 2>&1; then' >> ~/.bashrc
echo ' eval "$(pyenv init -)"' >> ~/.bashrc
echo 'fi' >> ~/.bashrc
#restart shell
exec $SHELL
# install pyenv build dependencies (copy as a single line)
sudo apt-get install --yes libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libgdbm-dev lzma lzma-dev tcl-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev wget curl make build-essential python3-openssl
# install Python 3.7 (this can take a while)
pyenv install 3.7.10
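The echo commands above append to ~/.bashrc every time they run, so repeating the setup leaves duplicate lines in that file. One way to avoid this, shown here as a minimal sketch, is to append a line only when the file does not already contain it. The append_once helper is hypothetical and is not part of pyenv or DRUM.

```shell
# append_once is a hypothetical helper (not part of pyenv or DRUM).
# It appends LINE to FILE only if FILE has no identical line yet.
append_once() {
    local line="$1" file="$2"
    # -x: match whole lines, -F: treat LINE as a fixed string, -q: no output
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example usage (uncomment to apply to your own ~/.bashrc):
# append_once 'export PYENV_ROOT="$HOME/.pyenv"' ~/.bashrc
# append_once 'export PATH="$PYENV_ROOT/bin:$PATH"' ~/.bashrc
```

This approach suits self-contained lines such as the two export statements. It is not a good fit for the multi-line if ... fi block, because a line like fi is likely to exist elsewhere in the file already and would be skipped.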
Install DRUM on Windows¶
To install DRUM, first set up a Python environment where DRUM will run, and then install DRUM in that environment.
- Create and activate a pyenv environment:
cd $HOME
pyenv local 3.7.10
.pyenv/shims/python3.7 -m venv DR-custom-tasks-pyenv
source DR-custom-tasks-pyenv/bin/activate
- Install DRUM and its dependencies into that environment:
pip install datarobot-drum
exec $SHELL
- Download the container environments, where DRUM will run, from GitHub:
git clone https://github.com/datarobot/datarobot-user-models
Install Docker Desktop¶
While you can run DRUM directly in the pyenv environment, it is preferable to run it in a Docker container. This recommended procedure ensures that your tasks run in the same environment both locally and inside DataRobot, and it simplifies installation.
- Download and install Docker Desktop, following the default installation steps.
- Enable WSL2 for Ubuntu by opening Windows PowerShell and running:
wsl.exe --set-version Ubuntu 2
wsl --set-default-version 2
Note
You may need to download and install an update. Follow the instructions in PowerShell until you see the Conversion complete message.
- Enable access to Docker Desktop from Ubuntu:
  - From the Windows taskbar, open Docker Dashboard, then access Settings (the gear icon).
  - Under Resources > WSL integration > Enable integration with additional distros, toggle on Ubuntu.
  - Apply changes and restart.
Use DRUM on Windows¶
- From the command line, open an Ubuntu terminal.
- Use the following commands to activate the environment:
cd $HOME
source DR-custom-tasks-pyenv/bin/activate
- Run the drum fit command in an Ubuntu terminal window (replacing the placeholder names in < > brackets with actual folder and file names):
drum fit --code-dir <folder_with_task_content> --input <test_data.csv> --target-type binary --target <target_column_name> --docker <folder_with_dockerfile> --verbose
For example:
drum fit --code-dir datarobot-user-models/custom_tasks/examples/python3_sklearn_binary --input datarobot-user-models/tests/testdata/iris_binary_training.csv --target-type binary --target Species --docker datarobot-user-models/public_dropin_environments/python3_sklearn/ --verbose
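If the drum command is not found in a new Ubuntu terminal, one likely cause is that the virtual environment has not been activated in that session. As a sketch, the check below relies on the VIRTUAL_ENV variable that a venv's activate script sets. The require_venv helper is a hypothetical name introduced here, and DR-custom-tasks-pyenv is the environment name used in the steps above.

```shell
# require_venv is a hypothetical helper (not part of DRUM).
# It succeeds only if the active virtual environment's directory name matches $1.
require_venv() {
    local expected="$1"
    local venv="${VIRTUAL_ENV:-}"
    # ${venv##*/} strips everything up to the last slash, leaving the directory name.
    if [ -n "$venv" ] && [ "${venv##*/}" = "$expected" ]; then
        echo "active: $venv"
    else
        echo "not active: run 'source $expected/bin/activate' from the directory that contains it" >&2
        return 1
    fi
}

require_venv DR-custom-tasks-pyenv || true
```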