Create a multiseries project¶
This notebook outlines how to create a multiseries time series project and begin modeling with DataRobot's REST API.
Requirements¶
- DataRobot recommends Python version 3.7 or later. However, this workflow is compatible with earlier versions.
- DataRobot API version 2.28.0
Small adjustments may be required depending on the Python version and DataRobot API version you are using.
This notebook does not include a dataset; the column names used below (Date, Store, and Sales) come from a sample store sales dataset. Substitute your own datetime partition column, series identifier, and target.
You can also reference documentation for the DataRobot REST API.
Import libraries¶
import datetime
import json
import time
from pandas import json_normalize  # pandas.io.json.json_normalize is deprecated since pandas 1.0
import requests
import yaml
Set credentials¶
FILE_CREDENTIALS = "path-to-drconfig.yaml"

# Read the endpoint and API token from the credentials file.
with open(FILE_CREDENTIALS) as f:
    parsed_file = yaml.safe_load(f)

DR_ENDPOINT = parsed_file["endpoint"]
API_TOKEN = parsed_file["token"]
AUTH_HEADERS = {"Authorization": "token %s" % API_TOKEN}
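For reference, the credentials file only needs the two keys read above. A minimal drconfig.yaml might look like the following (both values are placeholders; use your own endpoint and API token):

endpoint: https://app.datarobot.com/api/v2
token: YOUR_API_TOKEN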
Define functions¶
The functions below handle API responses, polling the returned status URL until asynchronous jobs resolve.
def wait_for_async_resolution(status_url):
    """Poll a status URL until the asynchronous job is no longer running."""
    while True:
        resp = requests.get(status_url, headers=AUTH_HEADERS)
        r = json.loads(resp.content)
        try:
            statusjob = r["status"].upper()
        except (KeyError, TypeError, AttributeError):
            # The final response may not include a "status" field.
            statusjob = ""
        if resp.status_code == 200 and statusjob not in ("RUNNING", "INITIALIZED"):
            print("Finished: " + str(datetime.datetime.now()))
            return resp
        print("Waiting: " + str(datetime.datetime.now()))
        time.sleep(10)  # Poll every 10 seconds.
def wait_for_result(response):
    """Return the JSON payload of a request, resolving async jobs if needed."""
    assert response.status_code in (200, 201, 202), response.content
    if response.status_code == 200:
        # Synchronous request: the payload is returned directly.
        data = response.json()
    elif response.status_code == 201:
        # Resource created: fetch it from the Location header.
        status_url = response.headers["Location"]
        resp = requests.get(status_url, headers=AUTH_HEADERS)
        assert resp.status_code == 200, resp.content
        data = resp.json()
    elif response.status_code == 202:
        # Asynchronous job: poll the status URL until it resolves.
        status_url = response.headers["Location"]
        resp = wait_for_async_resolution(status_url)
        data = resp.json()
    return data
Create the project¶
Endpoint: POST /api/v2/projects/
FILE_DATASET = "/Volumes/GoogleDrive/My Drive/Datasets/Store Sales/STORE_SALES-TRAIN-2022-04-25.csv"

payload = {
    # The first tuple element sets the file name DataRobot records for the upload.
    # Open the dataset in binary mode for the multipart upload.
    "file": ("Test_REST_TimeSeries_12", open(FILE_DATASET, "rb"))
}
response = requests.post(
"%s/projects/" % (DR_ENDPOINT), headers=AUTH_HEADERS, files=payload, timeout=180
)
response
<Response [202]>
# Wait for the async task to complete
print("Uploading dataset and creating project...")
project_creation_response = wait_for_result(response)
project_id = project_creation_response["id"]
print("\nProject ID: " + project_id)
Uploading dataset and creating project...
Waiting: 2022-07-29 17:55:32.507696
Waiting: 2022-07-29 17:55:43.092965
Waiting: 2022-07-29 17:55:53.670669
Waiting: 2022-07-29 17:56:04.252294
Waiting: 2022-07-29 17:56:14.841809
Finished: 2022-07-29 17:56:25.650896

Project ID: 62e402f1ce8ba47b224fcea3
Update the project¶
Endpoint: PATCH /api/v2/projects/(projectId)/
payload = {"workerCount": 16}
response = requests.patch(
"%s/projects/%s/" % (DR_ENDPOINT, project_id), headers=AUTH_HEADERS, json=payload, timeout=180
)
response
<Response [200]>
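To confirm that the update took effect, you can fetch the project record with GET /api/v2/projects/(projectId)/. A minimal check, reusing the helper defined above and assuming the project record includes a workerCount field:

# Retrieve the project record and confirm the new worker count.
response = requests.get(
    "%s/projects/%s/" % (DR_ENDPOINT, project_id), headers=AUTH_HEADERS, timeout=180
)
print(wait_for_result(response)["workerCount"])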
Run a detection job¶
For a multiseries project, you must run a detection job to analyze the relationship between the partition and multiseries ID columns.
Endpoint: POST /api/v2/projects/(projectId)/multiseriesProperties/
payload = {"datetimePartitionColumn": "Date", "multiseriesIdColumns": ["Store"]}
response = requests.post(
"%s/projects/%s/multiseriesProperties/" % (DR_ENDPOINT, project_id),
headers=AUTH_HEADERS,
json=payload,
timeout=180,
)
response
<Response [202]>
print("Analyzing multiseries partitions...")
multiseries_response = wait_for_result(response)
Analyzing multiseries partitions...
Waiting: 2022-07-29 17:56:27.571064
Waiting: 2022-07-29 17:56:38.156104
Finished: 2022-07-29 17:56:48.932686
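Once detection finishes, you can review what DataRobot detected for the partition column. The sketch below assumes the GET /api/v2/projects/(projectId)/features/(featureName)/multiseriesProperties/ endpoint, which reports the detected series properties (such as time step and unit):

# Inspect the detected multiseries properties for the "Date" partition column.
response = requests.get(
    "%s/projects/%s/features/Date/multiseriesProperties/" % (DR_ENDPOINT, project_id),
    headers=AUTH_HEADERS,
    timeout=180,
)
print(response.json())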
Initiate modeling¶
Endpoint: PATCH /api/v2/projects/(projectId)/aim/
payload = {
    "target": "Sales",  # Column to predict.
    "mode": "quick",  # Quick Autopilot mode.
    "datetimePartitionColumn": "Date",
    "featureDerivationWindowStart": -25,  # Derive features from up to 25 time units before each forecast point.
    "featureDerivationWindowEnd": 0,
    "forecastWindowStart": 1,  # Forecast 1 through 12 time units past each forecast point.
    "forecastWindowEnd": 12,
    "numberOfBacktests": 2,
    "useTimeSeries": True,
    "cvMethod": "datetime",
    "multiseriesIdColumns": ["Store"],
    "blendBestModels": False,
}
response = requests.patch(
"%s/projects/%s/aim/" % (DR_ENDPOINT, project_id),
headers=AUTH_HEADERS,
json=payload,
timeout=180,
)
response
<Response [202]>
print("Waiting for tasks previous to training to complete...")
autopilot_response = wait_for_result(response)
Waiting for pre-training tasks to complete...
Waiting: 2022-07-29 17:56:51.024036
Waiting: 2022-07-29 17:57:01.746376
Waiting: 2022-07-29 17:57:12.329879
Waiting: 2022-07-29 17:57:22.904449
Waiting: 2022-07-29 17:57:33.679282
Waiting: 2022-07-29 17:57:44.262096
Waiting: 2022-07-29 17:57:54.845494
Waiting: 2022-07-29 17:58:05.427372
Waiting: 2022-07-29 17:58:15.995107
Waiting: 2022-07-29 17:58:26.605621
Waiting: 2022-07-29 17:58:37.188681
Waiting: 2022-07-29 17:58:47.762809
Waiting: 2022-07-29 17:58:58.348806
Waiting: 2022-07-29 17:59:08.925445
Waiting: 2022-07-29 17:59:19.505174
Waiting: 2022-07-29 17:59:30.093026
Waiting: 2022-07-29 17:59:40.670835
Waiting: 2022-07-29 17:59:51.239278
Waiting: 2022-07-29 18:00:01.818356
Waiting: 2022-07-29 18:00:12.395658
Waiting: 2022-07-29 18:00:22.993393
Waiting: 2022-07-29 18:00:33.576738
Waiting: 2022-07-29 18:00:44.166028
Waiting: 2022-07-29 18:00:54.768693
Waiting: 2022-07-29 18:01:05.372862
Waiting: 2022-07-29 18:01:15.981022
Waiting: 2022-07-29 18:01:26.571205
Waiting: 2022-07-29 18:01:37.160074
Waiting: 2022-07-29 18:01:47.741388
Waiting: 2022-07-29 18:01:58.326862
Waiting: 2022-07-29 18:02:08.912622
Finished: 2022-07-29 18:02:19.739789
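With Autopilot underway, you can list the models built so far with GET /api/v2/projects/(projectId)/models/, which returns one record per model. A minimal sketch that also puts the json_normalize import to use (assumes at least one model has finished training):

# List trained models and flatten the records into a DataFrame.
response = requests.get(
    "%s/projects/%s/models/" % (DR_ENDPOINT, project_id), headers=AUTH_HEADERS, timeout=180
)
models = json_normalize(response.json())
print(models[["id", "modelType"]].head())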