Create a custom environment

Availability information

Custom execution environment management in NextGen is on by default.

Feature flag: Enable Custom Execution Environments NextGen UI

Once uploaded into DataRobot, custom models, jobs, applications, and notebooks run inside of environments—Docker containers running in Kubernetes. In other words, DataRobot copies the uploaded files defining the custom task into the image container. In most cases, adding a custom environment is not required because there are a variety of built-in environments available in DataRobot. For more information on creating a custom environment for custom models, review the guidelines below.

Custom model environment guidelines

Python and/or R packages can be easily added to these environments by uploading a requirements.txt file with the code. A custom environment is only required when a custom model:

  • Requires additional Linux packages.
  • Requires a different operating system.
  • Uses a language other than Python, R, or Java.

This document describes how to build a custom environment for these cases. To assemble and test a custom environment locally, install both Docker Desktop and the DataRobot user model (DRUM) CLI tool on your machine.

Note

DataRobot recommends using an environment template and not building your own environment except for specific use cases. (For example, you don't want to use DRUM but you want to implement your own prediction server.)

If you'd like to use a tool, language, or framework that is not supported by our template environments, you can make your own. DataRobot recommends modifying the provided environments to suit your needs; however, to make an easy-to-use, re-usable environment, you should adhere to the following guidelines:

  • Your environment must include a Dockerfile that installs any requirements you may want.

  • Custom models require a simple webserver to make predictions. DataRobot recommends putting this in your environment so you can reuse it with multiple models. The webserver must be listening on port 8080 and implement the following routes:

    URL prefix environment variable

    URL_PREFIX is an environment variable that is available at runtime. It must be added to the routes below.

    | Mandatory endpoints | Description |
    |---|---|
    | GET /URL_PREFIX/ | This route is used to check if your model's server is running. |
    | POST /URL_PREFIX/predict/ | This route is used to make predictions. |

    | Optional extension endpoints | Description |
    |---|---|
    | GET /URL_PREFIX/stats/ | This route is used to fetch memory usage data for DataRobot Custom Model Testing. |
    | GET /URL_PREFIX/health/ | This route is used to check if the model is loaded and functioning properly. If model loading fails, the route should return an error with a 513 response code. Failing to handle this case may cause the backend Kubernetes container to crash and enter a restart loop for several minutes. |
  • An executable start_server.sh file is required to start the model server.

  • Any code and start_server.sh should be copied to /opt/code/ by your Dockerfile.
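The routing requirements above can be sketched with Python's standard library. This is a minimal illustration, not DRUM's implementation: `load_model`, `route`, `make_handler`, and `serve` are hypothetical names, the JSON payloads and the CSV request body are placeholder assumptions (the DRUM server API yaml file defines the real formats), and the optional /stats/ route is omitted.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def load_model():
    """Hypothetical loader; replace with real model-loading code."""
    return lambda rows: [0.0 for _ in rows]


def route(method, path, url_prefix, model=None, load_error=None, body=""):
    """Map one request to (status_code, payload). Payload shapes are assumptions."""
    prefix = url_prefix.strip("/")
    base = "/" + prefix if prefix else ""
    if method == "GET" and path == base + "/":
        return 200, {"message": "OK"}  # mandatory: the server is running
    if method == "GET" and path == base + "/health/":
        if load_error is not None:
            return 513, {"message": str(load_error)}  # model failed to load
        return 200, {"message": "OK"}
    if method == "POST" and path == base + "/predict/":
        if load_error is not None:
            return 513, {"message": str(load_error)}
        # Assumes a CSV body whose first line is a header row.
        rows = [line for line in body.splitlines()[1:] if line.strip()]
        return 200, {"predictions": model(rows)}
    return 404, {"message": "not found"}


def make_handler(url_prefix, model, load_error=None):
    """Build a request handler class bound to one prefix and one model."""

    class Handler(BaseHTTPRequestHandler):
        def _reply(self, method):
            length = int(self.headers.get("Content-Length") or 0)
            body = self.rfile.read(length).decode("utf-8") if length else ""
            status, payload = route(
                method, self.path, url_prefix, model, load_error, body
            )
            data = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def do_GET(self):
            self._reply("GET")

        def do_POST(self):
            self._reply("POST")

    return Handler


def serve(port=8080):
    """Start the server on port 8080, reading URL_PREFIX at runtime."""
    try:
        model, error = load_model(), None
    except Exception as exc:  # keep serving so /health/ can report 513
        model, error = None, exc
    handler = make_handler(os.environ.get("URL_PREFIX", ""), model, error)
    ThreadingHTTPServer(("0.0.0.0", port), handler).serve_forever()
```

In this sketch, the script launched by start_server.sh would call `serve()`. Catching the load failure instead of letting the process exit is what allows the health route to answer with 513 rather than leaving the container to crash.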

Note

To learn more about the complete API specification, you can review the DRUM server API yaml file.

Custom model environment variables

When you build a custom environment with DRUM, your custom model code can reference several environment variables injected to facilitate access to the DataRobot Client and MLOps Connected Client:

| Environment Variable | Description |
|---|---|
| MLOPS_DEPLOYMENT_ID | If a custom model is running in deployment mode (i.e., the custom model is deployed), the deployment ID is available. |
| DATAROBOT_ENDPOINT | If a custom model has public network access, the DataRobot endpoint URL is available. |
| DATAROBOT_API_TOKEN | If a custom model has public network access, your DataRobot API token is available. |
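Because each variable is only present under the conditions listed, custom model code should not assume any of them is set. The following is a minimal sketch of reading them defensively; the helper names (`read_datarobot_env`, `is_deployed`, `can_call_api`) are hypothetical and not part of DRUM or the DataRobot client.

```python
import os

# Variable names taken from the table above.
DATAROBOT_VARS = (
    "MLOPS_DEPLOYMENT_ID",
    "DATAROBOT_ENDPOINT",
    "DATAROBOT_API_TOKEN",
)


def read_datarobot_env(environ=None):
    """Return each variable's value, or None when it was not injected."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in DATAROBOT_VARS}


def is_deployed(settings):
    """True when the model runs in deployment mode."""
    return settings["MLOPS_DEPLOYMENT_ID"] is not None


def can_call_api(settings):
    """True when both the endpoint URL and the API token are available."""
    return bool(settings["DATAROBOT_ENDPOINT"] and settings["DATAROBOT_API_TOKEN"])
```

Code that reports to a deployment or calls the DataRobot API can check `is_deployed` or `can_call_api` first and skip that step when running locally.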

Create the environment

Once DRUM is installed, begin your environment creation by copying one of the examples from GitHub. Log in to GitHub before clicking this link. Make sure:

  • The environment code stays in a single folder.

  • You remove the env_info.json file.

Add Linux packages

To add Linux packages to an environment, add code at the beginning of the Dockerfile, immediately after the FROM datarobot… line. Use Dockerfile syntax for an Ubuntu base. For example, the following lines tell DataRobot which base image to use and then to install the packages foo, boo, and moo inside the Docker image:

FROM datarobot/python3-dropin-env-base
RUN apt-get update --fix-missing && apt-get install -y foo boo moo

Add Python/R packages

In some cases, you might want to include Python/R packages in the environment. To do so, note the following:

  • List packages to install in requirements.txt. For R packages, do not include versions in the list.

  • Do not mix Python and R packages in the same requirements.txt file. Instead, create multiple files and adjust the Dockerfile so DataRobot can find and use them.

Test the environment locally

The following example illustrates how to quickly test your environment using Docker tools and DRUM. To test a custom task with a custom environment, navigate to the local folder where the task content is stored. Then, run the following, replacing placeholder names in < > brackets with actual names:

``` sh
drum fit --code-dir <path_to_task_content> --docker <path_to_a_folder_with_environment_code> --input <path_to_test_data.csv> --target-type <target_type> --target <target_column_name> --verbose
```
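When the same test is run repeatedly, for example from a script, it can help to assemble the command programmatically. The sketch below only rebuilds the command shown above from its arguments; `build_drum_fit_command` is a hypothetical helper, and the paths and values in the usage note are placeholders.

```python
import shlex


def build_drum_fit_command(code_dir, docker_dir, input_csv, target_type, target):
    """Return the `drum fit` invocation shown above as an argument list."""
    return [
        "drum", "fit",
        "--code-dir", code_dir,
        "--docker", docker_dir,
        "--input", input_csv,
        "--target-type", target_type,
        "--target", target,
        "--verbose",
    ]


def as_shell_string(command):
    """Quote the argument list so it can be pasted into a shell."""
    return shlex.join(command)
```

Passing the list to `subprocess.run(command, check=True)` runs the test and raises an error if DRUM exits with a non-zero status. For instance, `build_drum_fit_command("./task", "./env", "./data.csv", "regression", "y")` produces the command for a regression task whose target column is named `y`.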

Updated June 18, 2024