Skip to content

Click in-app to access the full platform documentation for your version of DataRobot.

API quickstart

The DataRobot API provides a programmatic alternative to the web interface for creating and managing DataRobot projects. The API can be used via REST or with DataRobot's Python or R clients in Windows, UNIX, and OS X environments. This guide walks you through setting up your environment and then you can follow a sample problem that outlines an end-to-end workflow for the API.

Prerequisites

Before proceeding, access and install the DataRobot client package for Python or R.

  • Python 2.7 or or Python >=3.4,<3.9
  • A registered DataRobot account
  • pip
  • R >= 3.2
  • httr (≥ 1.2.0)
  • jsonlite (≥ 1.0)
  • yaml (≥ 2.1.19)
  • A registered DataRobot account

Install the client

Note

On-premise users may want to install a previous version of the client in order to match their installed version of the DataRobot application. Reference the available versions to map your installation to the correct version of the API client.

pip install datarobot

Optionally, if you would like to build custom blueprints programmatically, install two additional packages: graphviz and blueprint-workshop.

For Windows users:

Download the graphviz installer

For Ubuntu users:

sudo apt-get install graphviz

For Mac users:

brew install graphviz

Once graphviz is installed, install the workshop:

pip install datarobot-bp-workshop

install.packages(“datarobot”)

Configure your environment

Using DataRobot APIs, you will execute a complete modeling workflow, from uploading a dataset to making predictions on a model deployed in a production environment.

Create a DataRobot API key

  1. From the DataRobot UI, click your user icon in the top right corner and select Developer Tools.

  2. Click Create new key.

  3. Name the new key, and click Save. This activates your new key, making it ready for use.

Once created, each individual key has three pieces of information:

Label Element Description
Key The name of the key, which you can edit.
Date The date the key was last used. Newly created and not yet used keys display “Wasn’t used.”
Key value The key itself.

Retrieve the API endpoint

DataRobot provides several deployment options to meet your business requirements. Each deployment type has its own set of endpoints. Choose from the tabs below:

The default endpoint root for the DataRobot AI Platform Trial and Self-Service users is https://app2.datarobot.com.

API endpoint root: https://app2.datarobot.com/api/v2

The Managed AI Cloud (US) offering is primarily accessed by US and Japanese users. It can be accessed at https://app.datarobot.com.

API endpoint root: https://app.datarobot.com/api/v2

The Managed AI Cloud (EU) offering is primarily accessed by EMEA users. It can be accessed at https://app.eu.datarobot.com.

API endpoint root: https://app.eu.datarobot.com/api/v2

For on-premise users, the API root will be the same as your DataRobot UI root. Replace {datarobot.example.com} with your deployment endpoint.

API endpoint root: https://{datarobot.example.com}/api/v2

Configure API authentication

To authenticate with DataRobot's API, your code needs to have access to an endpoint and token from the previous steps. This can be done in four ways:

drconfig.yaml is a file that the DataRobot Python and R clients automatically look for. This is DataRobot's recommended authentication method. You can instruct the API clients to look for the file in a specific location, ~/.config/datarobot/drconfig.yaml by default, or under a unique name. Therefore, you can leverage this to have multiple config files. The example below demonstrates the format of the .yaml:

endpoint: 'https://app.datarobot.com/api/v2'
token: 'NjE3ZjA3Mzk0MmY0MDFmZGFiYjQ0MztergsgsQwOk9G'

Once created, you can test your access to the API.

For Python:

dr.Client(config_path = "~/.config/datarobot/drconfig.yaml")

For R:

If the config file is located at ~/.config/datarobot/drconfig.yaml, then all you need to do is load the library:

library(datarobot)

Otherwise, use the following command:

ConnectToDataRobot(configPath =
"~/.config/datarobot/drconfig.yaml")

Set up an endpoint by setting environment variables in the UNIX shell:

export DATAROBOT_ENDPOINT=https://app.datarobot.com/api/v2
export DATAROBOT_API_TOKEN=your_api_token

Once set, authenticate to connect to DataRobot.

For Python:

import datarobot as dr
dr.Client()
dr.Project.list()

For R:

library(datarobot)

For cURL:

curl --location -X GET "${DATAROBOT_API_ENDPOINT}/projects" --header "Authorization: Bearer ${DATAROBOT_API_TOKEN}"

Optional. Be cautious to never commit your credentials to Git.

For Python:

import datarobot as dr
dr.Client(endpoint='https://app.datarobot.com/api/v2', token='NjE3ZjA3Mzk0MmY0MDFmZGFiYjQ0MztergsgsQwOk9G')

For R:

ConnectToDataRobot(endpoint =
"https://app.datarobot.com/api/v2",
token =
'NjE3ZjA3Mzk0MmY0MDFmZGFiYjQ0MztergsgsQwOk9G')

For cURL:

GET https://app.datarobot.com/api/v2/ HTTP/1.1
Authorization: Bearer DnwzBUNTOtKBO6Sp1hoUByG4YgZwCCw4

Zepl users can use a secret store data source to access the DataRobot API via Zepl.

Precedence order

The following list outlines the order in which DataRobot will search for your credentials across various methods. Note that the first option found is the set of credentials that DataRobot will use:

  1. The endpoint and token set in datarobot.Client
  2. A config file specified using datarobot.Client
  3. A config file specified by the environment variable DATAROBOT_CONFIG_FILE
  4. The environment variables DATAROBOT_ENDPOINT and DATAROBOT_API_TOKEN
  5. A config file in the home directory at ~/.config/datarobot/drconfig.yaml

Use the API: Predicting fuel economy

Once you have configured your API credentials, endpoints, and environment, you can use the DataRobot API to follow this example. The example uses the Python client and the REST API (using cURL), so a basic understanding of Python3 or cURL is required. It guides you through a simple problem: predicting the miles-per-gallon fuel economy from known automobile data (e.g., vehicle weight, number of cylinders, etc.).

The following sections provide sample code, for Python and cURL, that will:

  1. Upload a dataset.
  2. Train a model to learn from the dataset.
  3. Test prediction outcomes on the model with new data.
  4. Deploy the model.
  5. Predict outcomes on the deployed model using new data.

Upload a dataset

The first step to create a project is uploading a dataset. This example uses the dataset auto-mpg.csv, which you can download here.

import datarobot as dr
dr.Client()

# Set to the location of your auto-mpg.csv and auto-mpg-test.csv data files
# Example: dataset_file_path = '/Users/myuser/Downloads/auto-mpg.csv'
training_dataset_file_path = ''
test_dataset_file_path = ''

# Load dataset
training_dataset = dr.Dataset.create_from_file(training_dataset_file_path)

# Create a new project based on dataset
project = dr.Project.create_from_dataset(training_dataset.id, project_name='Auto MPG DR-Client')
# Set to the location of your auto-mpg.csv and auto-mpg-test.csv data files
# Example: dataset_file_path = '/Users/myuser/Downloads/auto-mpg.csv'
training_dataset_file_path <- ""
test_dataset_file_path <- ""
training_dataset <- utils::read.csv(training_dataset_file_path)
test_dataset <- utils::read.csv(test_dataset_file_path)
head(training_dataset)
project <- SetupProject(dataSource = training_dataset, projectName = "Auto MPG DR-Client", maxWait = 60 * 60)
DATAROBOT_API_TOKEN=${DATAROBOT_API_TOKEN}
DATAROBOT_ENDPOINT=${DATAROBOT_ENDPOINT}
location=$(curl -Lsi \
  -X POST \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -F 'projectName="Auto MPG"' \
  -F "file=@${DATASET_FILE_PATH}" \
  # if you are not using app2.datarobot.com endpoints
  # change `location` to `Location`
  "${DATAROBOT_ENDPOINT}"/projects/ | grep -i 'location: .*$' | \
  cut -d " " -f2 | tr -d '\r')
echo "Uploaded dataset. Checking status of project at: ${location}"
while true; do
  project_id=$(curl -Ls \
    -X GET \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${location}" \
    | grep -Eo 'id":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
  if [ "${project_id}" = "" ]
  then
    echo "Setting up project..."
    sleep 10
  else
    echo "Project setup complete."
    echo "Project ID: ${project_id}"
    break
  fi
done

Train models

Now that DataRobot has data, it can use the data to train and build models with Autopilot. Autopilot is DataRobot's "survival of the fittest" modeling mode that automatically selects the best predictive models for the specified target feature and runs them at increasing sample sizes. The outcome of Autopilot is not only a selection of best-suited models, but also identification of a recommended model—the model that best understands how to predict the target feature "mpg." Choosing the best model is a balance of accuracy, metric performance, and model simplicity. You can read more about the model recommendation process in the UI documentation.

# Use training data to build models.
project.wait_for_autopilot()
model = dr.ModelRecommendation.get(project.id).get_model()
# Set the project target and initiate Autopilot
SetTarget(project,
  target = "mpg")
# Block execution until Autopilot is complete
WaitForAutopilot(project)
model <- GetRecommendedModel(project, type = RecommendedModelType$RecommendedForDeployment)
response=$(curl -Lsi \
  -X PATCH \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  --data '{"target": "mpg", "mode": "quick"}' \
  "${DATAROBOT_ENDPOINT}/projects/${project_id}/aim" | grep 'location: .*$' \
  | cut -d " " | tr -d '\r')
echo "AI training initiated. Checking status of training at: ${response}"
while true; do
  initial_project_status=$(curl -Ls \
  -X GET \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${response}" \
  | grep -Eo 'stage":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
  if [ "${initial_project_status}" = "" ]
  then
    echo "Setting up AI training..."
    sleep 10
  else
    echo "Training AI."
    echo "Grab a coffee or catch up on email."
    break
  fi
done

project_status=$(curl -Lsi \
  -X GET \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  "${DATAROBOT_ENDPOINT}/projects/${project_id}/status" \
  | grep -Eo 'autopilotDone":\strue'
  )
if [ "${project_status}" = "" ]
then
  echo "Autopilot training in progress..."
  sleep 60
else
  echo "Autopilot training complete. Model ready to deploy."
  break
fi
done

Make predictions against the model

After building models and identifying the top performers, you can further test a model by making predictions on new data. Typically, you would test predictions with a smaller dataset to ensure the model is behaving as expected before deploying the model to production. DataRobot offers several methods for making predictions on new data. You can read more about prediction methods in the UI documentation.

This code makes predictions on the recommended model using the test set you identified in the first step (test_dataset_file_path), when you uploaded data.

# Test predictions on new data
prediction_data = project.upload_dataset(test_dataset_file_path)
predict_job = model.request_predictions(prediction_data.id)
predictions = predict_job.get_result_when_complete()
predictions.head()

This code makes predictions on the recommended model using the test set you identified in the first step (test_dataset_file_path), when you uploaded data.

# Uploading the testing dataset
scoring <- UploadPredictionDataset(project, dataSource = test_dataset)

# Requesting prediction
predict_job_id <- RequestPredictions(project, modelId = model$modelId, datasetId = scoring$id)

# Grabbing predictions
predictions_prob <- GetPredictions(project, predictId = predict_job_id, type = "probability")
head(predictions_prob)

This code makes predictions on the recommended model using the test set you identified in the first step (test_dataset_file_path), when you uploaded data.

# Test predictions on new data
# shellcheck disable=SC2089
prediction_location=$(curl -Lsi\
-X POST \
-H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
-F "file=@${TEST_DATASET_FILE_PATH}" \
"${DATAROBOT_ENDPOINT}/projects/${project_id}/predictionDatasets/fileUploads/"\
| grep -i 'location: .*$' | cut -d " " -f2 | tr -d '\r')
echo "Uploaded prediction dataset. Checking status of upload at: ${prediction_location}"
while true; do
  prediction_dataset_id=$(curl -Ls \
  -X GET \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${prediction_location}" \
  | grep -Eo 'id":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
  if [ "${prediction_dataset_id}" = "" ]
  then
  echo "Uploading predictions..."
  sleep 10
  else
  echo "Predictions upload complete."
  echo "Predictions dataset ID: ${prediction_dataset_id}"
  break
  fi
  done
  prediction_request_data="{\
      \"modelId\":\"${recommended_model_id}\",\
      \"datasetId\":\"${prediction_dataset_id}\"\
  }"
  predict_job=$(curl -Lsi \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  --data "${prediction_request_data}" \
  "${DATAROBOT_ENDPOINT}/projects/${project_id}/predictions/"\
  | grep -i 'location: .*$' | cut -d " " -f2 | tr -d '\r')
while true; do
  initial_job_response=$(curl -Ls \
    -X GET \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${predict_job}" \
    | grep -Eo 'status":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
  if [ "${initial_job_status}" = "inprogress" ]
  then
    echo "Generating predictions..."
    sleep 10
  else
    echo "Predictions complete."
    break
  fi
done

curl -Ls \
  -X GET \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${predict_job}"

Deploy the model

Deployment is the method by which you integrate a machine learning model into an existing production environment to make predictions with live data and generate insights. See the machine learning model deployment overview for more information.

# Deploy model
deployment = dr.Deployment.create_from_learning_model(
  model_id=model.id, label="MPG Prediction Server",
  description="Deployed with DataRobot client")

# View deployment stats
service_stats = deployment.get_service_stats()
print(service_stats.metrics)
# Deploy model
prediction_server = dr.PredictionServer.list()[0]
deployment = dr.Deployment.create_from_learning_model(
  model_id=model.id, label="MPG Prediction Server",
  description="Deployed with DataRobot client",
  default_prediction_server_id=prediction_server.id
)

# View deployment stats
service_stats = deployment.get_service_stats()
print(service_stats.metrics)
predictionServer <- ListPredictionServers()[[1]]
deployment <- CreateDeployment(model,
                               label = "MPG Prediction Server",
                               description = "Deployed with DataRobot client",
                               defaultPredictionServerId = predictionServer)
recommended_model_id=$(curl -s \
-X GET \
-H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
"${DATAROBOT_ENDPOINT}/projects/${project_id}/recommendedModels"\
"/recommendedModel/" \
| grep -Eo 'modelId":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
server_data=$(curl -s -X GET \
-H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
"${DATAROBOT_ENDPOINT}/predictionServers/")
default_server_id=$(echo $server_data \
| grep -Eo 'id":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
server_url=$(echo $server_data | grep -Eo 'url":\s".*?"' \
| cut -d '"' -f3 | tr -d '\r')
server_key=$(echo $server_data | grep -Eo 'datarobot-key":\s".*?"' \
| cut -d '"' -f3 | tr -d '\r')
request_data="{\
    \"defaultPredictionServerId\":\"${default_server_id}\",\
    \"modelId\":\"${recommended_model_id}\",\
    \"description\":\"Deployed with cURL\",\
    \"label\":\"MPG Prediction Server\"\
}"
deployment_response=$(curl -Lsi -X POST \
-H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
-H "Content-Type: application/json" \
--data "${request_data}" \
"${DATAROBOT_ENDPOINT}/deployments/fromLearningModel/")
deploy_response_code_202=$(echo $deployment_response | grep -Eo 'HTTP/2 202')
if [ "${deploy_response_code_202}" = "" ]
then
  deployment_id=$(echo "$deployment_response" | grep -Eo 'id":\s"\w+' \
  | cut -d '"' -f3 | tr -d '\r')
  echo "Prediction server ready."
else
  deployment_status=$(echo "$deployment_response" | grep -Eo 'location: .*$' \
  | cut -d " " | tr -d '\r')
  while true; do
    deployment_ready=$(curl -Ls \
      -X GET \
      -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${deployment_status}" \
      | grep -Eo 'id":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
    if [ "${deployment_ready}" = "" ]
    then
      echo "Waiting for deployment..."
      sleep 10
    else
      deployment_id=$deployment_ready
      echo "Prediction server ready."
      break
    fi
  done
fi

Make predictions against the deployed model

When you have successfully deployed a model, you can use the DataRobot Prediction API to make predictions. This allows you to access advanced model management features such as data drift, accuracy, and service health statistics.

You can also reference a Python prediction snippet from the UI. Navigate to the Deployments page, select your deployment, and go to Predictions > Prediction API to reference the snippet for making predictions.

import requests
from pprint import pprint
import json
import os
# JSON records for example autos for which to predict mpg
autos = [
    {
        "cylinders": 4,
        "displacement": 119.0,
        "horsepower": 82.00,
        "weight": 2720.0,
        "acceleration": 19.4,
        "model year": 82,
        "origin": 1,
    },
    {
        "cylinders": 8,
        "displacement": 120.0,
        "horsepower": 79.00,
        "weight": 2625.0,
        "acceleration": 18.6,
        "model year": 82,
        "origin": 1,
    },
]
# Create REST request for prediction API
prediction_server = deployment.default_prediction_server
prediction_headers = {
    "Authorization": "Bearer {}".format(os.getenv("DATAROBOT_API_TOKEN")),
    "Content-Type": "application/json",
    "datarobot-key": prediction_server['datarobot-key']
}

predictions = requests.post(
    f"{prediction_server.url}/predApi/v1.0/deployments"
    f"/{deployment.id}/predictions",
    headers=prediction_headers,
    data=json.dumps(autos),
)
pprint(predictions.json())
# Prepare to connect to the prediction server
URL <- paste0(deployment$defaultPredictionServer$url,
        "/predApi/v1.0/deployments/",
        deployment$id,
        "/predictions")
  USERNAME <- "deployment$owners$preview$email" # This should be your DR email account
  API_TOKEN <- Sys.getenv("DATAROBOT_API_TOKEN") # This is configured implicitly when you first run `library(datarobot)`

  # Invoke Predictions API with the test_dataset
  response <- httr::POST(URL,
                   body = jsonlite::toJSON(test_dataset),
                   httr::add_headers("datarobot-key" = deployment$defaultPredictionServer$dataRobotKey),
                   httr::content_type_json(),
                   authenticate(USERNAME, API_TOKEN, "basic"))

  # Parse the results from the prediction server
  predictionResults <- fromJSON(httr::content(response, as = "text"),
                   simplifyDataFrame = TRUE,
                   flatten = TRUE)$data
  print(predictionResults)
autos='[{
  "cylinders": 4,
  "displacement": 119.0,
  "horsepower": 82.00,
  "weight": 2720.0,
  "acceleration": 19.4,
  "model year": 82,
  "origin":1
},{
  "cylinders": 8,
  "displacement": 120.0,
  "horsepower": 79.00,
  "weight": 2625.0,
  "acceleration": 18.6,
  "model year": 82,
  "origin":1
}]'
curl -X POST \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "datarobot-key: ${server_key}" \
  --data "${autos}" \
  "${server_url}/predApi/v1.0/deployments/${deployment_id}/predictions"

Learn more

After getting started with DataRobot's APIs, navigate to the user guide for overviews, Jupyter notebooks, and task-based tutorials that help you find complete examples of common data science and machine learning workflows. You can also read the reference documentation available for DataRobot's programmatic tools.

Note

Log in to GitHub before accessing these GitHub resources.


Updated June 9, 2022
Back to top