DataRobot API resources > API reference documentation > DataRobot REST API > Data preparation > Data wrangling

Data Wrangling¶

This page outlines the operations, endpoints, parameters, and example requests and responses for the Data Wrangling.

GET /api/v2/recipes/¶

Get a list of the recipes available for given user.

Code samples¶


curl -X GET https://app.datarobot.com/api/v2/recipes/ \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"

Parameters

Name	In	Type	Required	Description
offset	query	integer	false	Number of results to skip.
limit	query	integer	false	At most this many results are returned. The default may change without notice.
orderBy	query	string	false	The attribute sort order applied to the returned recipes list: 'recipe_id', 'name', 'description', 'dialect', 'status', 'recipe_type', 'created_at', 'created_by', 'updated_at', 'updated_by'. Prefix the attribute name with a dash to sort in descending order. e.g., orderBy='-name'. Defaults to '-created'.
search	query	string	false	Only return recipes with names that contain the specified string.
dialect	query	any	false	SQL dialect for Query Generator.
status	query	any	false	Status used for filtering recipes.
recipeType	query	any	false	Type of the recipe workflow.
creatorUserId	query	any	false	Filter results to display only those created by user(s) associated with the specified ID.
creatorUsername	query	any	false	Filter results to display only those created by user(s) associated with the specified username.

Enumerated Values¶

Parameter	Value
orderBy	[`recipeId`, `-recipeId`, `name`, `-name`, `description`, `-description`, `dialect`, `-dialect`, `status`, `-status`, `recipeType`, `-recipeType`, `createdAt`, `-createdAt`, `createdBy`, `-createdBy`, `updatedAt`, `-updatedAt`, `updatedBy`, `-updatedBy`]

Example responses¶


{
  "count": 0,
  "data": [
    {
      "createdAt": "2019-08-24T14:15:22Z",
      "createdBy": {
        "email": "string",
        "fullName": "string",
        "id": "string",
        "userhash": "string",
        "username": "string"
      },
      "description": "string",
      "dialect": "snowflake",
      "downsampling": {
        "arguments": {
          "rows": 0,
          "seed": null
        },
        "directive": "random-sample"
      },
      "errorMessage": null,
      "failedOperationsIndex": null,
      "inputs": [
        {
          "alias": "string",
          "dataSourceId": "string",
          "dataStoreId": "string",
          "datasetId": "string",
          "inputType": "datasource",
          "sampling": {
            "arguments": {
              "rows": 10000,
              "seed": 0
            },
            "directive": "random-sample"
          }
        }
      ],
      "name": "string",
      "operations": [
        {
          "arguments": {
            "conditions": [
              {
                "column": "string",
                "function": "between",
                "functionArguments": []
              }
            ],
            "keepRows": true,
            "operator": "and"
          },
          "directive": "filter"
        }
      ],
      "recipeId": "string",
      "recipeType": "sql",
      "settings": {
        "featureDiscoveryProjectId": "string",
        "featureDiscoverySupervisedFeatureReduction": null,
        "predictionPoint": "string",
        "relationshipsConfigurationId": "string",
        "target": "string",
        "weightsFeature": "string"
      },
      "sql": "string",
      "status": "draft",
      "updatedAt": "2019-08-24T14:15:22Z",
      "updatedBy": {
        "email": "string",
        "fullName": "string",
        "id": "string",
        "userhash": "string",
        "username": "string"
      }
    }
  ],
  "next": "http://example.com",
  "previous": "http://example.com",
  "totalCount": 0
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipesListResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

POST /api/v2/recipes/fromDataStore/¶

Create a recipe which could be used for wrangling from a created fully reconfigured source of data. A data source specifies, via SQL query or selected table and schema data, which data to extract from the data connection (the location of data within a given endpoint) to use for modeling or predictions. A data source has one data connection and one connector but can have many datasets.

Code samples¶


curl -X POST https://app.datarobot.com/api/v2/recipes/fromDataStore/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{RecipeFromDataSourceCreate}'

Body parameter¶


{
  "dataSourceType": "dr-database-v1",
  "dataStoreId": "string",
  "dialect": "snowflake",
  "experimentContainerId": "string",
  "inputs": [
    {
      "canonicalName": "string",
      "catalog": "string",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      },
      "schema": "string",
      "table": "string"
    }
  ],
  "recipeType": "sql",
  "useCaseId": "string"
}

Parameters

Name	In	Type	Required	Description
body	body	RecipeFromDataSourceCreate	false	none

Example responses¶


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Responses¶

Status	Meaning	Description	Schema
201	Created	Data source and recipe created successfully.	RecipeResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

POST /api/v2/recipes/fromDataset/¶

Create a recipe which could be used for wrangling from given dataset. Deepcopy the dataset's recipe if available.Otherwise create a new recipe reusing the dataset's data source.A data source specifies, via SQL query or selected table and schema data, which data to extract from the data connection (the location of data within a given endpoint) to use for modeling or predictions. A data source has one data connection and one connector but can have many datasets.

Code samples¶


curl -X POST https://app.datarobot.com/api/v2/recipes/fromDataset/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{GenericRecipeFromDataset}'

Body parameter¶


{
  "datasetId": "string",
  "dialect": "snowflake",
  "status": "preview"
}

Parameters

Name	In	Type	Required	Description
body	body	GenericRecipeFromDataset	false	none

Example responses¶


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Responses¶

Status	Meaning	Description	Schema
201	Created	Recipe created successfully.	RecipeResponse
422	Unprocessable Entity	You can't specify `dialect` or `inputs` when source Recipe is available.	None

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

POST /api/v2/recipes/fromRecipe/¶

Shallow copy the given recipe, reusing existing data sources. Implicitly creates duplicate of wrangling session.

Code samples¶


curl -X POST https://app.datarobot.com/api/v2/recipes/fromRecipe/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{RecipeFromRecipeCreate}'

Body parameter¶


{
  "name": "string",
  "recipeId": "string"
}

Parameters

Name	In	Type	Required	Description
body	body	RecipeFromRecipeCreate	false	none

Example responses¶


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Responses¶

Status	Meaning	Description	Schema
201	Created	Recipe created successfully.	RecipeResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

DELETE /api/v2/recipes/{recipeId}/¶

Marks the wrangling recipe with a given ID as deleted.

Code samples¶


curl -X DELETE https://app.datarobot.com/api/v2/recipes/{recipeId}/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "featureDiscoverySupervisedFeatureReduction": true,
  "predictionPoint": "string",
  "relationshipsConfigurationId": "string",
  "target": "string",
  "weightsFeature": "string"
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	RecipeSettingsUpdate	false	none

Example responses¶


{
  "featureDiscoveryProjectId": "string",
  "featureDiscoverySupervisedFeatureReduction": null,
  "predictionPoint": "string",
  "relationshipsConfigurationId": "string",
  "target": "string",
  "weightsFeature": "string"
}

Responses¶

Status	Meaning	Description	Schema
204	No Content	Successfully deleted.	RecipeSettingsResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

GET /api/v2/recipes/{recipeId}/¶

Retrieve a wrangling recipe given ID.

Code samples¶


curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/ \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.

Example responses¶


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipeResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

PATCH /api/v2/recipes/{recipeId}/¶

Patch a wrangling recipe name and description

Code samples¶


curl -X PATCH https://app.datarobot.com/api/v2/recipes/{recipeId}/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "description": "string",
  "name": "string",
  "recipeType": "sql",
  "sql": "string"
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	PatchRecipe	false	none

Example responses¶


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipeResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

PUT /api/v2/recipes/{recipeId}/downsampling/¶

Updates the downsampling directive in the recipe.Downsampling will be applied on top of the recipe during publishing.

Code samples¶


curl -X PUT https://app.datarobot.com/api/v2/recipes/{recipeId}/downsampling/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  }
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	RecipeDownsamplingUpdate	false	none

Example responses¶


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipeResponse
422	Unprocessable Entity	Cannot modify published recipe.	None

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

GET /api/v2/recipes/{recipeId}/inputs/¶

Gets inputs of the given recipe.

Code samples¶


curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/inputs/ \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.

Example responses¶


{
  "inputs": [
    {
      "alias": "string",
      "columnCount": 0,
      "connectionName": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "datasetVersionId": "string",
      "inputType": "datasource",
      "name": "string",
      "rowCount": 0,
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      },
      "snapshotPolicy": "fixed",
      "status": "ABORTED"
    }
  ]
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipeInputsResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

PUT /api/v2/recipes/{recipeId}/inputs/¶

Set the inputs on a recipe to change the configuration. Implicitly restart the initial sampling job which calculates:1) Column names;2) resulting size of the sample in bytes;3) resulting size of the sample in rows.

Code samples¶


curl -X PUT https://app.datarobot.com/api/v2/recipes/{recipeId}/inputs/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ]
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	RecipeInputUpdate	false	none

Example responses¶


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipeResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

GET /api/v2/recipes/{recipeId}/insights/¶

Retrieve recipe insights.

Code samples¶


curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/insights/?limit=100&offset=0 \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"

Parameters

Name	In	Type	Required	Description
limit	query	integer	true	At most this many results are returned. The default may change and a maximum limit may be imposed without notice.
offset	query	integer	true	This many results will be skipped.
numberOfOperationsToUse	query	integer	false	The number indicating how many operations from the beginning to return insights for.
recipeId	path	string	true	The ID of the recipe.

Example responses¶


{
  "count": 0,
  "data": [
    {
      "datasetId": "string",
      "datasetVersionId": "string",
      "dateFormat": "string",
      "featureType": "Boolean",
      "id": 0,
      "isZeroInflated": true,
      "keySummary": {
        "key": "string",
        "summary": {
          "dataQualities": "ISSUES_FOUND",
          "max": 0,
          "mean": 0,
          "median": 0,
          "min": 0,
          "pctRows": 0,
          "stdDev": 0
        }
      },
      "language": "string",
      "lowInformation": true,
      "lowerQuartile": "string",
      "majorityClassCount": 0,
      "max": "string",
      "mean": "string",
      "median": "string",
      "min": "string",
      "minorityClassCount": 0,
      "naCount": 0,
      "name": "string",
      "plot": [
        {
          "count": 0,
          "label": "string"
        }
      ],
      "sampleRows": 0,
      "stdDev": "string",
      "timeSeriesEligibilityReason": "string",
      "timeSeriesEligibilityReasonAggregation": "string",
      "timeSeriesEligible": true,
      "timeSeriesEligibleAggregation": true,
      "timeStep": 0,
      "timeStepAggregation": 0,
      "timeUnit": "string",
      "timeUnitAggregation": "string",
      "uniqueCount": 0,
      "upperQuartile": "string"
    }
  ],
  "message": "string",
  "next": "http://example.com",
  "previous": "http://example.com",
  "status": "ABORTED",
  "totalCount": 0
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RefinexInsightsResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

PUT /api/v2/recipes/{recipeId}/operations/¶

Updates the operations in a recipe by saving new directives. To validate the new operations, run preview validation and SQL generation. To apply them to the new recipe, you must run a request to preview the results must be run.

Code samples¶


curl -X PUT https://app.datarobot.com/api/v2/recipes/{recipeId}/operations/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "force": false,
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ]
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	RecipeOperationsUpdate	false	none

Example responses¶


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipeResponse
409	Conflict	Operations can't be applied due to a wrangling session state.	None
422	Unprocessable Entity	Cannot modify published recipe.	None

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

GET /api/v2/recipes/{recipeId}/operations/{operationIndex}/¶

Returns an operation configuration with an additional inputColumns field to show the list of columns available at that stage.

Code samples¶


curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/operations/{operationIndex}/ \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
operationIndex	path	integer	true	The zero-based index of the operation.

Example responses¶

{}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	OperationDetails

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

GET /api/v2/recipes/{recipeId}/preview/¶

Retrieve a wrangling preview given ID.

Code samples¶


curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/preview/ \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"

Parameters

Name	In	Type	Required	Description
offset	query	integer	false	Number of results to skip.
limit	query	integer	false	At most this many results are returned. The default may change without notice.
numberOfOperationsToUse	query	integer	false	The number indicating how many operations from the beginning to retrieve a preview for.
recipeId	path	string	true	The ID of the recipe.

Example responses¶


{
  "byteSize": 0,
  "columns": [
    "string"
  ],
  "count": 0,
  "data": [
    [
      "string"
    ]
  ],
  "estimatedSizeExceedsLimit": true,
  "next": "http://example.com",
  "previous": "http://example.com",
  "resultSchema": [
    {
      "columnDefaultValue": "string",
      "dataType": "string",
      "dataTypeInt": 0,
      "isInPrimaryKey": true,
      "isNullable": "NO",
      "name": "string",
      "precision": 0,
      "scale": 0
    }
  ],
  "totalCount": 0
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipePreviewResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

POST /api/v2/recipes/{recipeId}/preview/¶

Starts the preview process for the recipe. Since this is an asynchronous process this endpoint returns a status ID to use with the status endpoint and a location header with the URL that can be polled for status.Launch WranglingJob, which includes: 1. InitialSamplingJob if it hasn’t been launched before 2. Preview query itself 3. Launch recipe eda job

Insights computation is launched implicitly if there was sampling specified and no operations specified.

Code samples¶


curl -X POST https://app.datarobot.com/api/v2/recipes/{recipeId}/preview/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "credentialId": "string",
  "numberOfOperationsToUse": 0
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	RecipeRunPreviewAsync	false	none

Example responses¶


{
  "code": 0,
  "created": "2019-08-24T14:15:22Z",
  "description": "",
  "message": "",
  "status": "INITIALIZED",
  "statusId": "e900225c-0629-4e96-be6e-86a17a309645",
  "statusType": ""
}

Responses¶

Status	Meaning	Description	Schema
202	Accepted	none	StatusResponse
422	Unprocessable Entity	Credentials were not provided and default credentials were not found.	None

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

POST /api/v2/recipes/{recipeId}/relationshipQualityAssessments/¶

Submit a job to assess the quality of the relationship configuration within a Feature Discovery session in Workbench.

Code samples¶


curl -X POST https://app.datarobot.com/api/v2/recipes/{recipeId}/relationshipQualityAssessments/ \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "credentials": [
    {
      "catalogVersionId": "string",
      "credentialId": "string",
      "url": "string"
    }
  ],
  "datetimePartitionColumn": "string",
  "featureEngineeringPredictionPoint": "string",
  "relationshipsConfiguration": {
    "datasetDefinitions": [
      {
        "catalogId": "string",
        "catalogVersionId": "string",
        "featureListId": "string",
        "identifier": "string",
        "primaryTemporalKey": "string",
        "snapshotPolicy": "specified"
      }
    ],
    "featureDiscoveryMode": "default",
    "featureDiscoverySettings": [
      {
        "description": "string",
        "family": "string",
        "name": "string",
        "settingType": "string",
        "value": true,
        "verboseName": "string"
      }
    ],
    "id": "string",
    "relationships": [
      {
        "dataset1Identifier": "string",
        "dataset1Keys": [
          "string"
        ],
        "dataset2Identifier": "string",
        "dataset2Keys": [
          "string"
        ],
        "featureDerivationWindowEnd": 0,
        "featureDerivationWindowStart": 0,
        "featureDerivationWindowTimeUnit": "MILLISECOND",
        "featureDerivationWindows": [
          {
            "end": 0,
            "start": 0,
            "unit": "MILLISECOND"
          }
        ],
        "predictionPointRounding": 30,
        "predictionPointRoundingTimeUnit": "MILLISECOND"
      }
    ],
    "snowflakePushDownCompatible": true
  },
  "userId": "string"
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	RelationshipQualityAssessmentsCreate	false	none

Responses¶

Status	Meaning	Description	Schema
202	Accepted	Relationship quality assessment has successfully started. See the Location header.	None
422	Unprocessable Entity	Unable to process the request	None

Response Headers¶

Status	Header	Type	Format	Description
202	Location	string		A url that can be polled to check the status.

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

PATCH /api/v2/recipes/{recipeId}/settings/¶

Updates some recipe settings applicable in the modeling stage.

Code samples¶


curl -X PATCH https://app.datarobot.com/api/v2/recipes/{recipeId}/settings/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "featureDiscoverySupervisedFeatureReduction": true,
  "predictionPoint": "string",
  "relationshipsConfigurationId": "string",
  "target": "string",
  "weightsFeature": "string"
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	RecipeSettingsUpdate	false	none

Example responses¶


{
  "featureDiscoveryProjectId": "string",
  "featureDiscoverySupervisedFeatureReduction": null,
  "predictionPoint": "string",
  "relationshipsConfigurationId": "string",
  "target": "string",
  "weightsFeature": "string"
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	RecipeSettingsResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

POST /api/v2/recipes/{recipeId}/sql/¶

Builds a SQL query for the recipe. Overrides operations to get the adjusted query without changing the recipe.

Code samples¶


curl -X POST https://app.datarobot.com/api/v2/recipes/{recipeId}/sql/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "inputsAsAliases": false,
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ]
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	BuildRecipeSql	false	none

Example responses¶


{
  "sql": "string"
}

Responses¶

Status	Meaning	Description	Schema
201	Created	none	BuildRecipeSqlResponse
409	Conflict	Input source data is not ready yet.	None
422	Unprocessable Entity	Failed to build SQL.	None

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

POST /api/v2/recipes/{recipeId}/timeseriesTransformationPlans/¶

Generate a list of recipe operations, which serve as the plan to transform a regular dataset into a time series dataset.

Code samples¶


curl -X POST https://app.datarobot.com/api/v2/recipes/{recipeId}/timeseriesTransformationPlans/ \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'

Body parameter¶


{
  "baselinePeriods": [
    1
  ],
  "datetimePartitionColumn": "string",
  "doNotDeriveColumns": [],
  "excludeLowInfoColumns": true,
  "featureDerivationWindows": [
    300
  ],
  "featureReductionThreshold": 0.9,
  "forecastDistances": [
    0
  ],
  "knownInAdvanceColumns": [],
  "maxLagOrder": 100,
  "multiseriesIdColumn": "string",
  "numberOfOperationsToUse": 0,
  "targetColumn": "string"
}

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
body	body	GenerateTransformationPlan	false	none

Example responses¶


{
  "code": 0,
  "created": "2019-08-24T14:15:22Z",
  "description": "",
  "message": "",
  "status": "INITIALIZED",
  "statusId": "e900225c-0629-4e96-be6e-86a17a309645",
  "statusType": ""
}

Responses¶

Status	Meaning	Description	Schema
202	Accepted	none	StatusResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

GET /api/v2/recipes/{recipeId}/timeseriesTransformationPlans/{id}/¶

Returns a list of recipe operations, which serve as the plan to transform a regular dataset into a time series dataset.

Code samples¶


curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/timeseriesTransformationPlans/{id}/ \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"

Parameters

Name	In	Type	Required	Description
recipeId	path	string	true	The ID of the recipe.
id	path	string	true	The ID of the transformation plan.

Example responses¶


{
  "id": "string",
  "inputParameters": {
    "baselinePeriods": [
      1
    ],
    "datetimePartitionColumn": "string",
    "doNotDeriveColumns": [],
    "excludeLowInfoColumns": true,
    "featureDerivationWindows": [
      300
    ],
    "featureReductionThreshold": 0.9,
    "forecastDistances": [
      0
    ],
    "knownInAdvanceColumns": [],
    "maxLagOrder": 100,
    "multiseriesIdColumn": "string",
    "numberOfOperationsToUse": 0,
    "targetColumn": "string"
  },
  "status": "INITIALIZED",
  "suggestedOperations": [
    {
      "arguments": {
        "baselinePeriods": [
          1
        ],
        "datetimePartitionColumn": "string",
        "forecastDistances": [
          0
        ],
        "forecastPoint": "2019-08-24T14:15:22Z",
        "knownInAdvanceColumns": [],
        "multiseriesIdColumn": null,
        "rollingMedianUserDefinedFunction": null,
        "rollingMostFrequentUserDefinedFunction": null,
        "targetColumn": "string",
        "taskPlan": [
          {
            "column": "string",
            "taskList": [
              {
                "arguments": {
                  "methods": [
                    "avg"
                  ],
                  "windowSize": 300
                },
                "name": "numeric-stats"
              }
            ]
          }
        ]
      },
      "directive": "time-series"
    }
  ]
}

Responses¶

Status	Meaning	Description	Schema
200	OK	none	TransformationPlanResponse

To perform this operation, you must be authenticated by means of one of the following methods:

BearerAuth

Schemas¶

AggregateDirectiveArguments


{
  "aggregations": [
    {
      "feature": null,
      "functions": [
        "sum"
      ]
    }
  ],
  "groupBy": [
    "string"
  ]
}

The aggregation description.

Properties¶

Name	Type	Required	Restrictions	Description
aggregations	[Aggregation]	true	minItems: 1	The aggregations.
groupBy	[string]	true	minItems: 1	The column(s) to group by.

Aggregation


{
  "feature": null,
  "functions": [
    "sum"
  ]
}

Properties¶

Name	Type	Required	Restrictions	Description
feature	string,null	false		The feature.
functions	[string]	true	minItems: 1	The functions.

BaseCategoricalStatsArguments


{
  "methods": [
    "most-frequent"
  ],
  "windowSize": 300
}

Task arguments.

Properties¶

Name	Type	Required	Restrictions	Description
methods	[string]	true	maxItems: 10 minItems: 1	Window method: most-frequent
windowSize	integer	true	maximum: 300	Rolling window size, defined in terms of rows. Left end is exclusive, right end is inclusive.

BaseLagsArguments


{
  "orders": [
    300
  ]
}

Task arguments.

Properties¶

Name	Type	Required	Restrictions	Description
orders	[integer]	true	maxItems: 100 minItems: 1	Lag orders.

BaseNumericStatsArguments


{
  "methods": [
    "avg"
  ],
  "windowSize": 300
}

Task arguments.

Properties¶

Name	Type	Required	Restrictions	Description
methods	[string]	true	maxItems: 10 minItems: 1	Methods to apply in a rolling window.
windowSize	integer	true	maximum: 300	Rolling window size, defined in terms of rows. Left end is exclusive, right end is inclusive.

BuildRecipeSql


{
  "inputsAsAliases": false,
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ]
}

Properties¶

Name	Type	Required	Restrictions	Description
inputsAsAliases	boolean	false		Produce the SQL that uses the input aliases instead of the real table names.
operations	[OneOfDirective]	false	maxItems: 1000	List of operations to override the recipe operations when building SQL with default null. It doesn't modify the recipe itself. Missing operations field or null give original recipe SQL. Empty operations list produces basic query of a format: `SELECT <list of columns> FROM <table name>`

BuildRecipeSqlResponse


{
  "sql": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
sql	string	true		Generated sql.

CatalogPasswordCredentials


{
  "catalogVersionId": "string",
  "password": "string",
  "url": "string",
  "user": "string"
}

Properties¶

Name	Type	Required	Description
catalogVersionId	string	false	Identifier of the catalog version
password	string	true	The password (in cleartext) for database authentication. The password will be encrypted on the server side as part of the HTTP request and never saved or stored.
url	string	false	URL that is subject to credentials.
user	string	true	The username for database authentication.

ComputeNewDirectiveArguments


{
  "expression": "string",
  "newFeatureName": "string"
}

The transformation description.

Properties¶

Name	Type	Required	Restrictions	Description
expression	string	true		The expression for new feature computation.
newFeatureName	string	true		The new feature name which will hold results of expression evaluation.

DataStoreExtendedColumnNoKeysResponse


{
  "columnDefaultValue": "string",
  "dataType": "string",
  "dataTypeInt": 0,
  "isInPrimaryKey": true,
  "isNullable": "NO",
  "name": "string",
  "precision": 0,
  "scale": 0
}

JDBC result column description

Properties¶

Name	Type	Required	Description
columnDefaultValue	string,null	false	Default value of the column.
dataType	string	true	DataType of the column.
dataTypeInt	integer	false	Integer value of the column data type.
isInPrimaryKey	boolean	false	True if the column is in the primary key .
isNullable	string,null	false	If the column values can be null.
name	string	true	Name of the column.
precision	integer	false	Precision of the column.
scale	integer	false	Scale of the column.

Enumerated Values¶

Property	Value
isNullable	[`NO`, `UNKNOWN`, `YES`]

DatasetDefinition


{
  "catalogId": "string",
  "catalogVersionId": "string",
  "featureListId": "string",
  "identifier": "string",
  "primaryTemporalKey": "string",
  "snapshotPolicy": "specified"
}

Properties¶

Name	Type	Required	Restrictions	Description
catalogId	string	true		ID of the catalog item.
catalogVersionId	string	true		ID of the catalog item version.
featureListId	string,null	false		ID of the feature list. This decides which columns in the dataset are used for feature generation.
identifier	string	true	maxLength: 20 minLength: 1 minLength: 1	Short name of the dataset (used directly as part of the generated feature names).
primaryTemporalKey	string,null	false		Name of the column indicating time of record creation.
snapshotPolicy	string	false		Policy for using dataset snapshots when creating a project or making predictions. Must be one of the following values: 'specified': Use specific snapshot specified by catalogVersionId. 'latest': Use latest snapshot from the same catalog item. 'dynamic': Get data from the source (only applicable for JDBC datasets).

Enumerated Values¶

Property	Value
snapshotPolicy	[`specified`, `latest`, `dynamic`]

DatasetFeaturePlotDataResponse


{
  "count": 0,
  "label": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
count	number	true		Number of values in the bin.
label	string	true		Bin start for numerical/uncapped, or string value for categorical. The bin `==Missing==` is created for rows that did not have the feature.

DatasetInputCreate


{
  "sampling": {
    "arguments": {
      "rows": 10000,
      "seed": 0
    },
    "directive": "random-sample"
  }
}

Dataset configuration.

Properties¶

Name	Type	Required	Restrictions	Description
sampling	DatasetInputSampling	false		Sampling data transformation.

DatasetInputSampling


{
  "arguments": {
    "rows": 10000,
    "seed": 0
  },
  "directive": "random-sample"
}

Sampling data transformation.

Properties¶

oneOf

Name	Type	Required	Description
anonymous	object	false	none
» arguments	RandomSampleArgumentsCreate	false	The interactive sampling config.
» directive	string	true	The directive name.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	LimitDirectiveArguments	true	The interactive sampling config.
» directive	string	true	The directive name.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	DatetimeSampleArgumentsCreate	false	The interactive sampling config.
» directive	string	true	The directive name.

Enumerated Values¶

Property	Value
directive	`random-sample`
directive	`limit`
directive	`datetime-sample`

DatetimeSampleArgumentsCreate


{
  "datetimePartitionColumn": "string",
  "multiseriesIdColumn": null,
  "rows": 10000,
  "selectedSeries": [
    "string"
  ],
  "strategy": "earliest"
}

The interactive sampling config.

Properties¶

Name	Type	Required	Restrictions	Description
datetimePartitionColumn	string	true		The datetime partition column to order by.
multiseriesIdColumn	string,null	false		The series ID column, if present.
rows	integer	false	maximum: 10000 minimum: 1	The number of rows to be sampled.
selectedSeries	[string]	false	maxItems: 1000 minItems: 1	The selected series to be sampled. Requires "multiseriesIdColumn".
strategy	string	true		Sets whether to take the latest or earliest rows relative to the datetime partition column.

Enumerated Values¶

Property	Value
strategy	[`earliest`, `latest`]

DownsamplingRandomDirectiveArguments


{
  "rows": 0,
  "seed": null
}

The downsampling configuration.

Properties¶

Name	Type	Required	Restrictions	Description
rows	integer	true		The number of sampled rows.
seed	integer,null	false		The start number of the random number generator

DropColumnsArguments


{
  "columns": [
    "string"
  ]
}

The transformation description.

Properties¶

Name	Type	Required	Restrictions	Description
columns	[string]	true	maxItems: 1000 minItems: 1	The list of columns.

ExperimentContainerUserResponse


{
  "email": "string",
  "fullName": "string",
  "id": "string",
  "userhash": "string",
  "username": "string"
}

A users associated with a Use Case.

Properties¶

Name	Type	Required	Description
email	string,null	true	The email address of the user.
fullName	string,null	false	The full name of the user.
id	string	true	The id of the user.
userhash	string,null	false	User's gravatar hash.
username	string,null	false	The username of the user.

FeatureDerivationWindow


{
  "end": 0,
  "start": 0,
  "unit": "MILLISECOND"
}

Properties¶

Name	Type	Required	Restrictions	Description
end	integer	true	maximum: 0	How many featureDerivationWindowUnits of each dataset's primary temporal key into the past relative to the datetimePartitionColumn the feature derivation window should end. Will be a non-positive integer, if present. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided.
start	integer	true		How many featureDerivationWindowUnits of each dataset's primary temporal key into the past relative to the datetimePartitionColumn the feature derivation window should begin. Will be a negative integer, if present. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided.
unit	string	true		Time unit of the feature derivation window. Supported values are MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided.

Enumerated Values¶

Property	Value
unit	[`MILLISECOND`, `SECOND`, `MINUTE`, `HOUR`, `DAY`, `WEEK`, `MONTH`, `QUARTER`, `YEAR`]

FeatureDiscoverySettingResponse


{
  "description": "string",
  "family": "string",
  "name": "string",
  "settingType": "string",
  "value": true,
  "verboseName": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
description	string	true		Description of this feature discovery setting
family	string	true		Family of this feature discovery setting
name	string	true	maxLength: 100	Name of this feature discovery setting
settingType	string	true		Type of this feature discovery setting
value	boolean	true		Value of this feature discovery setting
verboseName	string	true		Human readable name of this feature discovery setting

FeatureKeySummaryDetailsResponseValidatorMultilabel


{
  "max": 0,
  "mean": 0,
  "median": 0,
  "min": 0,
  "pctRows": 0,
  "stdDev": 0
}

Statistics of the key.

Properties¶

Name	Type	Required	Description
max	number	true	Maximum value of the key.
mean	number	true	Mean value of the key.
median	number	true	Median value of the key.
min	number	true	Minimum value of the key.
pctRows	number	true	Percentage occurrence of key in the EDA sample of the feature.
stdDev	number	true	Standard deviation of the key.

FeatureKeySummaryDetailsResponseValidatorSummarizedCategorical


{
  "dataQualities": "ISSUES_FOUND",
  "max": 0,
  "mean": 0,
  "median": 0,
  "min": 0,
  "pctRows": 0,
  "stdDev": 0
}

Statistics of the key.

Properties¶

Name	Type	Required	Description
dataQualities	string	true	The indicator of data quality assessment of the feature.
max	number	true	Maximum value of the key.
mean	number	true	Mean value of the key.
median	number	true	Median value of the key.
min	number	true	Minimum value of the key.
pctRows	number	true	Percentage occurrence of key in the EDA sample of the feature.
stdDev	number	true	Standard deviation of the key.

Enumerated Values¶

Property	Value
dataQualities	[`ISSUES_FOUND`, `NOT_ANALYZED`, `NO_ISSUES_FOUND`]

FeatureKeySummaryResponseValidatorMultilabel


{
  "key": "string",
  "summary": {
    "max": 0,
    "mean": 0,
    "median": 0,
    "min": 0,
    "pctRows": 0,
    "stdDev": 0
  }
}

Properties¶

Name	Type	Required	Restrictions	Description
key	string	true		Name of the key.
summary	FeatureKeySummaryDetailsResponseValidatorMultilabel	true		Statistics of the key.

FeatureKeySummaryResponseValidatorSummarizedCategorical


{
  "key": "string",
  "summary": {
    "dataQualities": "ISSUES_FOUND",
    "max": 0,
    "mean": 0,
    "median": 0,
    "min": 0,
    "pctRows": 0,
    "stdDev": 0
  }
}

For a Summarized Categorical columns, this will contain statistics for the top 50 keys (truncated to 103 characters)

Properties¶

Name	Type	Required	Restrictions	Description
key	string	true		Name of the key.
summary	FeatureKeySummaryDetailsResponseValidatorSummarizedCategorical	true		Statistics of the key.

FilterCondition


{
  "column": "string",
  "function": "between",
  "functionArguments": []
}

Properties¶

Name	Type	Required	Restrictions	Description
column	string	true		The column name.
function	string	true		The function used to evaluate each value.
functionArguments	[anyOf]	false	maxItems: 2	The arguments to use with the function.

anyOf

Name	Type	Required	Restrictions	Description
» anonymous	string	false		none

or

Name	Type	Required	Restrictions	Description
» anonymous	integer	false		none

or

Name	Type	Required	Restrictions	Description
» anonymous	number	false		none

Enumerated Values¶

Property	Value
function	[`between`, `contains`, `eq`, `gt`, `gte`, `lt`, `lte`, `neq`, `notnull`, `null`]

FilterDirectiveArguments


{
  "conditions": [
    {
      "column": "string",
      "function": "between",
      "functionArguments": []
    }
  ],
  "keepRows": true,
  "operator": "and"
}

The transformation description.

Properties¶

Name	Type	Required	Restrictions	Description
conditions	[FilterCondition]	true	maxItems: 1000	The list of conditions.
keepRows	boolean	true		Determines whether matching rows should be kept or dropped.
operator	string	true		The operator to apply on multiple conditions.

Enumerated Values¶

Property	Value
operator	[`and`, `or`]

GenerateTransformationPlan


{
  "baselinePeriods": [
    1
  ],
  "datetimePartitionColumn": "string",
  "doNotDeriveColumns": [],
  "excludeLowInfoColumns": true,
  "featureDerivationWindows": [
    300
  ],
  "featureReductionThreshold": 0.9,
  "forecastDistances": [
    0
  ],
  "knownInAdvanceColumns": [],
  "maxLagOrder": 100,
  "multiseriesIdColumn": "string",
  "numberOfOperationsToUse": 0,
  "targetColumn": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
baselinePeriods	[integer]	false	maxItems: 10 minItems: 1	A list of periodicities used to calculate naive target features.
datetimePartitionColumn	string	true		The column that is used to order the data.
doNotDeriveColumns	[string]	false	maxItems: 200	Columns to exclude from derivation; for them only the first lag is suggested.
excludeLowInfoColumns	boolean	false		Whether to ignore columns with low signal (only include features that pass a "reasonableness" check that determines whether they contain information useful for building a generalizable model).
featureDerivationWindows	[integer]	true	maxItems: 5 minItems: 1	A list of rolling windows of past values, defined in terms of rows, that are used to derive features for the modeling dataset.
featureReductionThreshold	number,null	false	maximum: 1	Threshold for feature reduction. For example, 0.9 means that features which cumulatively reach 90 % of importance are returned. Additionally, no more than 200 features are returned.
forecastDistances	[integer]	true	maxItems: 20 minItems: 1	A list of forecast distances, which defines the number of rows into the future to predict.
knownInAdvanceColumns	[string]	false	maxItems: 200	Columns that are known in advance (future values are known). Values for these known columns must be specified at prediction time.
maxLagOrder	integer,null	false	maximum: 100	The maximum lag order. This value cannot be greater than the largest feature derivation window.
multiseriesIdColumn	string,null	false		The series ID column, if present. This column partitions data to create a multiseries modeling project.
numberOfOperationsToUse	integer	false		If set, a transformation plan is suggested after the specified number of operations.
targetColumn	string	true		The column intended to be used as the target for modeling. This parameter is required for generating naive features.

GenericRecipeFromDataset


{
  "datasetId": "string",
  "dialect": "snowflake",
  "status": "preview"
}

Properties¶

oneOf

Name	Type	Required	Description
anonymous	object	false	none
» datasetId	string	true	Dataset ID to create a Recipe from.
» dialect	string	true	Source type data was retrieved from. Should be omitted for dataset rewrangling.
» status	string	true	Preview recipe

xor

Name	Type	Required	Restrictions	Description
anonymous	object	false		none
» datasetId	string	true		Dataset ID to create a Recipe from.
» datasetVersionId	string,null	false		Dataset version ID to create a Recipe from.
» dialect	string	false		Source type data was retrieved from. Should be omitted for dataset rewrangling and feature discovery recipes.
» experimentContainerId	string	false		[DEPRECATED - replaced with use_case_id] ID assigned to the Use Case, which is an experimental container for the recipe.
» inputs	[DatasetInputCreate]	false	maxItems: 1 minItems: 1	List of recipe inputs. Should be omitted on dataset wrangling when dataset is created from recipe.
» recipeType	string	true		Type of the recipe workflow.
» snapshotPolicy	string	false		Snapshot policy to use the created recipe.
» status	string	false		Wrangling recipe
» useCaseId	string	false		ID of the Use Case associated with the recipe.

Enumerated Values¶

Property	Value
dialect	[`snowflake`, `bigquery`, `databricks`, `spark`, `postgres`]
status	`preview`
dialect	[`snowflake`, `bigquery`, `databricks`, `spark`, `postgres`]
recipeType	[`sql`, `Sql`, `SQL`, `wrangling`, `Wrangling`, `WRANGLING`, `featureDiscovery`, `FeatureDiscovery`, `FEATURE_DISCOVERY`, `featureDiscoveryPrivatePreview`, `FeatureDiscoveryPrivatePreview`, `FEATURE_DISCOVERY_PRIVATE_PREVIEW`]
snapshotPolicy	[`fixed`, `latest`]
status	`draft`

InputParametersResponse


{
  "baselinePeriods": [
    1
  ],
  "datetimePartitionColumn": "string",
  "doNotDeriveColumns": [],
  "excludeLowInfoColumns": true,
  "featureDerivationWindows": [
    300
  ],
  "featureReductionThreshold": 0.9,
  "forecastDistances": [
    0
  ],
  "knownInAdvanceColumns": [],
  "maxLagOrder": 100,
  "multiseriesIdColumn": "string",
  "numberOfOperationsToUse": 0,
  "targetColumn": "string"
}

The input parameters corresponding to the suggested operations.

Properties¶

Name	Type	Required	Restrictions	Description
baselinePeriods	[integer]	false	maxItems: 10 minItems: 1	A list of periodicities used to calculate naive target features.
datetimePartitionColumn	string	true		The column that is used to order the data.
doNotDeriveColumns	[string]	false	maxItems: 200	Columns to exclude from derivation; for them only the first lag is suggested.
excludeLowInfoColumns	boolean	false		Whether to ignore columns with low signal (only include features that pass a "reasonableness" check that determines whether they contain information useful for building a generalizable model).
featureDerivationWindows	[integer]	true	maxItems: 5 minItems: 1	A list of rolling windows of past values, defined in terms of rows, that are used to derive features for the modeling dataset.
featureReductionThreshold	number,null	false	maximum: 1	Threshold for feature reduction. For example, 0.9 means that features which cumulatively reach 90 % of importance are returned. Additionally, no more than 200 features are returned.
forecastDistances	[integer]	true	maxItems: 20 minItems: 1	A list of forecast distances, which defines the number of rows into the future to predict.
knownInAdvanceColumns	[string]	false	maxItems: 200	Columns that are known in advance (future values are known). Values for these known columns must be specified at prediction time.
maxLagOrder	integer,null	false	maximum: 100	The maximum lag order. This value cannot be greater than the largest feature derivation window.
multiseriesIdColumn	string,null	false		The series ID column, if present. This column partitions data to create a multiseries modeling project.
numberOfOperationsToUse	integer	false		If set, a transformation plan is suggested after the specified number of operations.
targetColumn	string	true		The column intended to be used as the target for modeling. This parameter is required for generating naive features.

JDBCTableDataSourceInputCreate


{
  "canonicalName": "string",
  "catalog": "string",
  "sampling": {
    "arguments": {
      "rows": 10000,
      "seed": 0
    },
    "directive": "random-sample"
  },
  "schema": "string",
  "table": "string"
}

Data source configuration.

Properties¶

Name	Type	Required	Restrictions	Description
canonicalName	string	true		Data source canonical name.
catalog	string	false	maxLength: 256	Catalog name in the database if supported.
sampling	SampleDirectiveCreate	false		The input data transformation steps.
schema	string	false	maxLength: 256	Schema associated with the table or view in the database if the data source is not query based.
table	string	true	maxLength: 256	Table or view name in the database if the data source is not query based.

JobErrorCode

Properties¶

Name	Type	Required	Restrictions	Description
JobErrorCode	integer	false		Possible job error codes. This enum exists for consistency with the DataRobot Status API.

Enumerated Values¶

Property	Value
JobErrorCode	[`0`, `1`]

JobExecutionState


"INITIALIZED"

JobExecutionState

Properties¶

Name	Type	Required	Restrictions	Description
JobExecutionState	string	false		Possible job states. Values match the DataRobot Status API.

Enumerated Values¶

Property	Value
JobExecutionState	[`INITIALIZED`, `RUNNING`, `COMPLETED`, `ERROR`, `ABORTED`, `EXPIRED`]

JoinArguments


{
  "joinType": "inner",
  "leftKeys": [
    "string"
  ],
  "rightDataSourceId": "string",
  "rightKeys": [
    "string"
  ],
  "source": "table"
}

The transformation description.

Properties¶

Name	Type	Required	Restrictions	Description
joinType	string	true		The join type between primary and secondary data sources.
leftKeys	[string]	true	maxItems: 10000 minItems: 1	The list of columns to be used in the "ON" clause from the primary data source.
rightDataSourceId	string	true		The ID of the input data source.
rightKeys	[string]	true	maxItems: 10000 minItems: 1	The list of columns to be used in the "ON" clause from a secondary data source.
source	string	true		The source type.

Enumerated Values¶

Property	Value
joinType	[`inner`, `left`]
source	`table`

LimitDirectiveArguments


{
  "rows": 1000
}

The interactive sampling config.

Properties¶

Name	Type	Required	Restrictions	Description
rows	integer	true	maximum: 10000 minimum: 1	The number of rows to be selected.

OneOfDirective


{
  "arguments": {
    "conditions": [
      {
        "column": "string",
        "function": "between",
        "functionArguments": []
      }
    ],
    "keepRows": true,
    "operator": "and"
  },
  "directive": "filter"
}

Properties¶

oneOf

Name	Type	Required	Description
anonymous	object	false	none
» arguments	FilterDirectiveArguments	true	The transformation description.
» directive	string	true	The single data transformation step.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	ReplaceDirectiveArguments	true	The transformation description.
» directive	string	true	The single data transformation step.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	ComputeNewDirectiveArguments	true	The transformation description.
» directive	string	true	The single data transformation step.

xor

Name	Type	Required	Restrictions	Description
anonymous	object	false		none
» directive	string	true		The single data transformation step.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	DropColumnsArguments	true	The transformation description.
» directive	string	true	The single data transformation step.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	RenameColumnsArguments	true	The transformation description.
» directive	string	true	The single data transformation step.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	JoinArguments	true	The transformation description.
» directive	string	true	The single data transformation step.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	AggregateDirectiveArguments	true	The aggregation description.
» directive	string	true	The single data transformation step.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	TimeSeriesDirectiveArguments	true	Time series directive arguments.
» directive	string	true	Time series processing directive, which prepares the dataset for time series modeling. All windows are row-based.

Enumerated Values¶

Property	Value
directive	`filter`
directive	`replace`
directive	`compute-new`
directive	`dedupe-rows`
directive	`drop-columns`
directive	`rename-columns`
directive	`join`
directive	`aggregate`
directive	`time-series`

OneOfDownsamplingDirective


{
  "arguments": {
    "rows": 0,
    "seed": null
  },
  "directive": "random-sample"
}

Data transformation step.

Properties¶

oneOf

Name	Type	Required	Description
anonymous	object	false	none
» arguments	DownsamplingRandomDirectiveArguments	true	The downsampling configuration.
» directive	string	true	The downsampling method.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	SmartDownsamplingArguments	true	The downsampling configuration.
» directive	string	true	The downsampling method.

Enumerated Values¶

Property	Value
directive	`random-sample`
directive	`smart-downsampling`

OneOfTransforms


{
  "arguments": {
    "methods": [
      "avg"
    ],
    "windowSize": 300
  },
  "name": "numeric-stats"
}

Properties¶

oneOf

Name	Type	Required	Description
anonymous	object	false	none
» arguments	BaseNumericStatsArguments	true	Task arguments.
» name	string	true	Task name.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	BaseCategoricalStatsArguments	true	Task arguments.
» name	string	true	Task name.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	BaseLagsArguments	true	Task arguments.
» name	string	true	Task name.

Enumerated Values¶

Property	Value
name	`numeric-stats`
name	`categorical-stats`
name	`lags`

OperationDetails

{}

Properties¶

None

PatchRecipe


{
  "description": "string",
  "name": "string",
  "recipeType": "sql",
  "sql": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
description	string	false	maxLength: 1000	New recipe description.
name	string	false	maxLength: 255	New recipe name.
recipeType	string	false		The recipe workflow type.
sql	string	false	maxLength: 64000	Recipe SQL query.

Enumerated Values¶

Property	Value
recipeType	[`sql`, `Sql`, `SQL`, `wrangling`, `Wrangling`, `WRANGLING`, `featureDiscovery`, `FeatureDiscovery`, `FEATURE_DISCOVERY`, `featureDiscoveryPrivatePreview`, `FeatureDiscoveryPrivatePreview`, `FEATURE_DISCOVERY_PRIVATE_PREVIEW`]

RandomSampleArgumentsCreate


{
  "rows": 10000,
  "seed": 0
}

The interactive sampling config.

Properties¶

Name	Type	Required	Restrictions	Description
rows	integer	false	maximum: 10000 minimum: 1	The number of rows to be sampled.
seed	integer	false	minimum: 0	The starting number of the random number generator.

RecipeDownsamplingUpdate


{
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  }
}

Properties¶

Name	Type	Required	Restrictions	Description
downsampling	OneOfDownsamplingDirective	true		Data transformation step.

RecipeFromDataSourceCreate


{
  "dataSourceType": "dr-database-v1",
  "dataStoreId": "string",
  "dialect": "snowflake",
  "experimentContainerId": "string",
  "inputs": [
    {
      "canonicalName": "string",
      "catalog": "string",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      },
      "schema": "string",
      "table": "string"
    }
  ],
  "recipeType": "sql",
  "useCaseId": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
dataSourceType	string	true		Data source type.
dataStoreId	string	true		Data store ID for this data source.
dialect	string	true		Source type data was retrieved from.
experimentContainerId	string	false		[DEPRECATED - replaced with use_case_id] ID assigned to the Use Case, which is an experimental container for the recipe.
inputs	[JDBCTableDataSourceInputCreate]	true	maxItems: 1000 minItems: 1	List of recipe inputs
recipeType	string	true		Type of the recipe workflow.
useCaseId	string	false		ID of the Use Case associated with the recipe.

Enumerated Values¶

Property	Value
dataSourceType	[`dr-database-v1`, `jdbc`]
dialect	[`snowflake`, `bigquery`, `databricks`, `spark`, `postgres`]
recipeType	[`sql`, `wrangling`]

RecipeFromRecipeCreate


{
  "name": "string",
  "recipeId": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
name	string	false	maxLength: 255	The recipe name.
recipeId	string	true		Recipe ID to create a Recipe from.

RecipeInput


{
  "alias": "string",
  "dataSourceId": "string",
  "dataStoreId": "string",
  "datasetId": "string",
  "inputType": "datasource",
  "sampling": {
    "arguments": {
      "rows": 10000,
      "seed": 0
    },
    "directive": "random-sample"
  }
}

Properties¶

oneOf

Name	Type	Required	Restrictions	Description
anonymous	object	false		none
» alias	string,null	false	maxLength: 256	The alias for the data source table.
» dataSourceId	string,null	false		The ID of the input data source.
» dataStoreId	string,null	false		The ID of the input data store.
» datasetId	string,null	false		The ID of the input dataset.
» inputType	string	true		The data that comes from a database connection.
» sampling	SampleDirectiveCreate	false		The input data transformation steps.

xor

Name	Type	Required	Restrictions	Description
anonymous	object	false		none
» alias	string,null	false	maxLength: 256	The alias for the data source table.
» datasetId	string	true		The ID of the input dataset.
» datasetVersionId	string,null	false		The version ID of the input dataset.
» inputType	string	true		The data that comes from the Data Registry.
» sampling	SampleDirectiveCreate	false		The input data transformation steps.
» snapshotPolicy	string	false		Snapshot policy to use for this input.

Enumerated Values¶

Property	Value
inputType	`datasource`
inputType	`dataset`
snapshotPolicy	[`fixed`, `latest`]

RecipeInputResponse


{
  "alias": "string",
  "dataSourceId": "string",
  "dataStoreId": "string",
  "datasetId": "string",
  "inputType": "datasource",
  "sampling": {
    "arguments": {
      "rows": 10000,
      "seed": 0
    },
    "directive": "random-sample"
  }
}

Properties¶

oneOf

Name	Type	Required	Restrictions	Description
anonymous	object	false		none
» alias	string,null	true	maxLength: 256	The alias for the data source table.
» dataSourceId	string,null	true		The ID of the input data source.
» dataStoreId	string,null	true		The ID of the input data store.
» datasetId	string,null	true		The ID of the input dataset.
» inputType	string	true		The data that comes from a database connection.
» sampling	SampleDirectiveCreate	false		The input data transformation steps.

xor

Name	Type	Required	Restrictions	Description
anonymous	object	false		none
» alias	string,null	true	maxLength: 256	The alias for the data source table.
» datasetId	string	true		The ID of the input dataset.
» datasetVersionId	string,null	true		The version ID of the input dataset.
» inputType	string	true		The data that comes from the Data Registry.
» sampling	SampleDirectiveCreate	false		The input data transformation steps.
» snapshotPolicy	string	true		Snapshot policy to use for this input.

Enumerated Values¶

Property	Value
inputType	`datasource`
inputType	`dataset`
snapshotPolicy	[`fixed`, `latest`]

RecipeInputStatsResponse


{
  "alias": "string",
  "columnCount": 0,
  "connectionName": "string",
  "dataSourceId": "string",
  "dataStoreId": "string",
  "datasetId": "string",
  "datasetVersionId": "string",
  "inputType": "datasource",
  "name": "string",
  "rowCount": 0,
  "sampling": {
    "arguments": {
      "rows": 10000,
      "seed": 0
    },
    "directive": "random-sample"
  },
  "snapshotPolicy": "fixed",
  "status": "ABORTED"
}

Properties¶

Name	Type	Required	Restrictions	Description
alias	string,null	true	maxLength: 256	The alias for the data source table.
columnCount	integer,null	true		Number of features in original (not sampled) data source
connectionName	string,null	true		The user-friendly name of the data store.
dataSourceId	string,null	true		The ID of the input data source.
dataStoreId	string,null	true		The ID of the input data store.
datasetId	string,null	false		The ID of the input data source.
datasetVersionId	string,null	false		The ID of the input data source,
inputType	string	true		Source type data came from
name	string,null	true		Combination of "catalog", "schema" and "table" from data source
rowCount	integer,null	true		Number of rows in original (not sampled) data source
sampling	SampleDirectiveCreate	false		The input data transformation steps.
snapshotPolicy	string	false		Snapshot policy to use for this input.
status	string	true		Input preparation status

Enumerated Values¶

Property	Value
inputType	[`datasource`, `dataset`]
snapshotPolicy	[`fixed`, `latest`]
status	[`ABORTED`, `COMPLETED`, `ERROR`, `EXPIRED`, `INITIALIZED`, `RUNNING`]

RecipeInputUpdate


{
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ]
}

Properties¶

Name	Type	Required	Restrictions	Description
inputs	[RecipeInput]	true	maxItems: 1000 minItems: 1	List of data sources and their sampling configurations.

RecipeInputsResponse


{
  "inputs": [
    {
      "alias": "string",
      "columnCount": 0,
      "connectionName": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "datasetVersionId": "string",
      "inputType": "datasource",
      "name": "string",
      "rowCount": 0,
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      },
      "snapshotPolicy": "fixed",
      "status": "ABORTED"
    }
  ]
}

Properties¶

Name	Type	Required	Restrictions	Description
inputs	[RecipeInputStatsResponse]	true	maxItems: 1000 minItems: 1	List of recipe inputs

RecipeOperationsUpdate


{
  "force": false,
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ]
}

Properties¶

Name	Type	Required	Restrictions	Description
force	boolean	false		If `true` then operations are stored even if they contain errors
operations	[OneOfDirective]	true	maxItems: 1000	List of directives to run for the recipe.

RecipePreviewResponse


{
  "byteSize": 0,
  "columns": [
    "string"
  ],
  "count": 0,
  "data": [
    [
      "string"
    ]
  ],
  "estimatedSizeExceedsLimit": true,
  "next": "http://example.com",
  "previous": "http://example.com",
  "resultSchema": [
    {
      "columnDefaultValue": "string",
      "dataType": "string",
      "dataTypeInt": 0,
      "isInPrimaryKey": true,
      "isNullable": "NO",
      "name": "string",
      "precision": 0,
      "scale": 0
    }
  ],
  "totalCount": 0
}

Properties¶

Name	Type	Required	Restrictions	Description
byteSize	integer	true	minimum: 0	Data memory usage
columns	[string]	true	maxItems: 10000	List of columns in data preview
count	integer	false		Number of items returned on this page.
data	[array]	true	maxItems: 1000	List of records output by the query.
estimatedSizeExceedsLimit	boolean	true		Defines if downsampling should be done based on sample size
next	string,null(uri)	true		URL pointing to the next page (if null, there is no next page).
previous	string,null(uri)	true		URL pointing to the previous page (if null, there is no previous page).
resultSchema	[DataStoreExtendedColumnNoKeysResponse]	true	maxItems: 10000	JDBC result schema
totalCount	integer	true		The total number of items across all pages.

RecipeResponse


{
  "createdAt": "2019-08-24T14:15:22Z",
  "createdBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  },
  "description": "string",
  "dialect": "snowflake",
  "downsampling": {
    "arguments": {
      "rows": 0,
      "seed": null
    },
    "directive": "random-sample"
  },
  "errorMessage": null,
  "failedOperationsIndex": null,
  "inputs": [
    {
      "alias": "string",
      "dataSourceId": "string",
      "dataStoreId": "string",
      "datasetId": "string",
      "inputType": "datasource",
      "sampling": {
        "arguments": {
          "rows": 10000,
          "seed": 0
        },
        "directive": "random-sample"
      }
    }
  ],
  "name": "string",
  "operations": [
    {
      "arguments": {
        "conditions": [
          {
            "column": "string",
            "function": "between",
            "functionArguments": []
          }
        ],
        "keepRows": true,
        "operator": "and"
      },
      "directive": "filter"
    }
  ],
  "recipeId": "string",
  "recipeType": "sql",
  "settings": {
    "featureDiscoveryProjectId": "string",
    "featureDiscoverySupervisedFeatureReduction": null,
    "predictionPoint": "string",
    "relationshipsConfigurationId": "string",
    "target": "string",
    "weightsFeature": "string"
  },
  "sql": "string",
  "status": "draft",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedBy": {
    "email": "string",
    "fullName": "string",
    "id": "string",
    "userhash": "string",
    "username": "string"
  }
}

Properties¶

Name	Type	Required	Restrictions	Description
createdAt	string,null(date-time)	true		ISO 8601-formatted date/time when the recipe was created.
createdBy	ExperimentContainerUserResponse	true		A users associated with a Use Case.
description	string	true	maxLength: 1000	The recipe description.
dialect	string	true		Source type data was retrieved from.
downsampling	OneOfDownsamplingDirective	true		Data transformation step.
errorMessage	string,null	true		Error message related to the specific operation
failedOperationsIndex	integer,null	true		Index of the first operation where error appears.
inputs	[RecipeInputResponse]	true	maxItems: 1000	List of data sources.
name	string,null	true	maxLength: 255	The recipe name.
operations	[OneOfDirective]	true	maxItems: 1000	List of transformations
recipeId	string	true		The ID of the recipe.
recipeType	string	true		Type of the recipe workflow.
settings	RecipeSettingsResponse	true		Recipe settings reusable at a modeling stage.
sql	string,null	true	maxLength: 64000	Recipe SQL query.
status	string	true		Recipe publication status.
updatedAt	string,null(date-time)	true		ISO 8601-formatted date/time when the recipe was last updated.
updatedBy	ExperimentContainerUserResponse	true		A users associated with a Use Case.

Enumerated Values¶

Property	Value
dialect	[`snowflake`, `bigquery`, `spark-feature-discovery`, `databricks`, `spark`, `postgres`]
recipeType	[`sql`, `Sql`, `SQL`, `wrangling`, `Wrangling`, `WRANGLING`, `featureDiscovery`, `FeatureDiscovery`, `FEATURE_DISCOVERY`, `featureDiscoveryPrivatePreview`, `FeatureDiscoveryPrivatePreview`, `FEATURE_DISCOVERY_PRIVATE_PREVIEW`]
status	[`draft`, `preview`, `published`]

RecipeRunPreviewAsync


{
  "credentialId": "string",
  "numberOfOperationsToUse": 0
}

Properties¶

Name	Type	Required	Restrictions	Description
credentialId	string	false		The ID of the credentials to use for the connection. If not given, the default credentials for the connection will be used.
numberOfOperationsToUse	integer	false	minimum: 0	The number indicating how many operations from the beginning to compute a preview for.

RecipeSettingsResponse


{
  "featureDiscoveryProjectId": "string",
  "featureDiscoverySupervisedFeatureReduction": null,
  "predictionPoint": "string",
  "relationshipsConfigurationId": "string",
  "target": "string",
  "weightsFeature": "string"
}

Recipe settings reusable at a modeling stage.

Properties¶

Name	Type	Required	Restrictions	Description
featureDiscoveryProjectId	string,null	true		Associated feature discovery project ID.
featureDiscoverySupervisedFeatureReduction	boolean,null	true		Run supervised feature reduction for Feature Discovery.
predictionPoint	string,null	true	maxLength: 255	The date column to be used as the prediction point for time-based feature engineering.
relationshipsConfigurationId	string,null	true		Associated relationships configuration ID.
target	string,null	false		The feature to use as the target at the modeling stage.
weightsFeature	string,null	false		The weights feature.

RecipeSettingsUpdate


{
  "featureDiscoverySupervisedFeatureReduction": true,
  "predictionPoint": "string",
  "relationshipsConfigurationId": "string",
  "target": "string",
  "weightsFeature": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
featureDiscoverySupervisedFeatureReduction	boolean,null	false		Run supervised feature reduction for Feature Discovery.
predictionPoint	string,null	false	maxLength: 255	The date column to be used as the prediction point for time-based feature engineering.
relationshipsConfigurationId	string,null	false		[Deprecated] No effect. The relationships configuration ID field is immutable.
target	string,null	false		The feature to use as the target at the modeling stage.
weightsFeature	string,null	false		The weights feature.

RecipesListResponse


{
  "count": 0,
  "data": [
    {
      "createdAt": "2019-08-24T14:15:22Z",
      "createdBy": {
        "email": "string",
        "fullName": "string",
        "id": "string",
        "userhash": "string",
        "username": "string"
      },
      "description": "string",
      "dialect": "snowflake",
      "downsampling": {
        "arguments": {
          "rows": 0,
          "seed": null
        },
        "directive": "random-sample"
      },
      "errorMessage": null,
      "failedOperationsIndex": null,
      "inputs": [
        {
          "alias": "string",
          "dataSourceId": "string",
          "dataStoreId": "string",
          "datasetId": "string",
          "inputType": "datasource",
          "sampling": {
            "arguments": {
              "rows": 10000,
              "seed": 0
            },
            "directive": "random-sample"
          }
        }
      ],
      "name": "string",
      "operations": [
        {
          "arguments": {
            "conditions": [
              {
                "column": "string",
                "function": "between",
                "functionArguments": []
              }
            ],
            "keepRows": true,
            "operator": "and"
          },
          "directive": "filter"
        }
      ],
      "recipeId": "string",
      "recipeType": "sql",
      "settings": {
        "featureDiscoveryProjectId": "string",
        "featureDiscoverySupervisedFeatureReduction": null,
        "predictionPoint": "string",
        "relationshipsConfigurationId": "string",
        "target": "string",
        "weightsFeature": "string"
      },
      "sql": "string",
      "status": "draft",
      "updatedAt": "2019-08-24T14:15:22Z",
      "updatedBy": {
        "email": "string",
        "fullName": "string",
        "id": "string",
        "userhash": "string",
        "username": "string"
      }
    }
  ],
  "next": "http://example.com",
  "previous": "http://example.com",
  "totalCount": 0
}

Properties¶

Name	Type	Required	Description
count	integer	false	Number of items returned on this page.
data	[RecipeResponse]	true	A list of the datasets in this Use Case.
next	string,null(uri)	true	URL pointing to the next page (if null, there is no next page).
previous	string,null(uri)	true	URL pointing to the previous page (if null, there is no previous page).
totalCount	integer	true	The total number of items across all pages.

RefinexInsightsResponse


{
  "count": 0,
  "data": [
    {
      "datasetId": "string",
      "datasetVersionId": "string",
      "dateFormat": "string",
      "featureType": "Boolean",
      "id": 0,
      "isZeroInflated": true,
      "keySummary": {
        "key": "string",
        "summary": {
          "dataQualities": "ISSUES_FOUND",
          "max": 0,
          "mean": 0,
          "median": 0,
          "min": 0,
          "pctRows": 0,
          "stdDev": 0
        }
      },
      "language": "string",
      "lowInformation": true,
      "lowerQuartile": "string",
      "majorityClassCount": 0,
      "max": "string",
      "mean": "string",
      "median": "string",
      "min": "string",
      "minorityClassCount": 0,
      "naCount": 0,
      "name": "string",
      "plot": [
        {
          "count": 0,
          "label": "string"
        }
      ],
      "sampleRows": 0,
      "stdDev": "string",
      "timeSeriesEligibilityReason": "string",
      "timeSeriesEligibilityReasonAggregation": "string",
      "timeSeriesEligible": true,
      "timeSeriesEligibleAggregation": true,
      "timeStep": 0,
      "timeStepAggregation": 0,
      "timeUnit": "string",
      "timeUnitAggregation": "string",
      "uniqueCount": 0,
      "upperQuartile": "string"
    }
  ],
  "message": "string",
  "next": "http://example.com",
  "previous": "http://example.com",
  "status": "ABORTED",
  "totalCount": 0
}

Properties¶

Name	Type	Required	Restrictions	Description
count	integer	false		Number of items returned on this page.
data	[WranglingFeatureResponse]	true	maxItems: 100	The list of features related to the requested dataset.
message	string	false		Status message.
next	string,null(uri)	true		URL pointing to the next page (if null, there is no next page).
previous	string,null(uri)	true		URL pointing to the previous page (if null, there is no previous page).
status	string	false		Job status.
totalCount	integer	true		The total number of items across all pages.

Enumerated Values¶

Property	Value
status	[`ABORTED`, `COMPLETED`, `ERROR`, `EXPIRED`, `INITIALIZED`, `RUNNING`]

Relationship


{
  "dataset1Identifier": "string",
  "dataset1Keys": [
    "string"
  ],
  "dataset2Identifier": "string",
  "dataset2Keys": [
    "string"
  ],
  "featureDerivationWindowEnd": 0,
  "featureDerivationWindowStart": 0,
  "featureDerivationWindowTimeUnit": "MILLISECOND",
  "featureDerivationWindows": [
    {
      "end": 0,
      "start": 0,
      "unit": "MILLISECOND"
    }
  ],
  "predictionPointRounding": 30,
  "predictionPointRoundingTimeUnit": "MILLISECOND"
}

Properties¶

Name	Type	Required	Restrictions	Description
dataset1Identifier	string,null	false	maxLength: 20 minLength: 1 minLength: 1	Identifier of the first dataset in the relationship. If this is not provided, it represents the primary dataset.
dataset1Keys	[string]	true	maxItems: 10 minItems: 1	column(s) in the first dataset that are used to join to the second dataset.
dataset2Identifier	string	true	maxLength: 20 minLength: 1 minLength: 1	Identifier of the second dataset in the relationship.
dataset2Keys	[string]	true	maxItems: 10 minItems: 1	column(s) in the second dataset that are used to join to the first dataset.
featureDerivationWindowEnd	integer	false	maximum: 0	How many featureDerivationWindowUnits of each dataset's primary temporal key into the past relative to the datetimePartitionColumn the feature derivation window should end. Will be a non-positive integer, if present. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided.
featureDerivationWindowStart	integer	false		How many featureDerivationWindowUnits of each dataset's primary temporal key into the past relative to the datetimePartitionColumn the feature derivation window should begin. Will be a negative integer, if present. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided.
featureDerivationWindowTimeUnit	string	false		Time unit of the feature derivation window. Supported values are MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided.
featureDerivationWindows	[FeatureDerivationWindow]	false	maxItems: 3	List of feature derivation window definitions that will be used.
predictionPointRounding	integer	false	maximum: 30	Closest value of predictionPointRoundingTimeUnit to round the prediction point into the past when applying the feature derivation window. Will be a positive integer, if present. Only applicable when table1Identifier is not provided.
predictionPointRoundingTimeUnit	string	false		Time unit of the prediction point rounding. Supported values are MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR. Only applicable when table1Identifier is not provided.

Enumerated Values¶

Property	Value
featureDerivationWindowTimeUnit	[`MILLISECOND`, `SECOND`, `MINUTE`, `HOUR`, `DAY`, `WEEK`, `MONTH`, `QUARTER`, `YEAR`]
predictionPointRoundingTimeUnit	[`MILLISECOND`, `SECOND`, `MINUTE`, `HOUR`, `DAY`, `WEEK`, `MONTH`, `QUARTER`, `YEAR`]

RelationshipQualityAssessmentsCreate


{
  "credentials": [
    {
      "catalogVersionId": "string",
      "credentialId": "string",
      "url": "string"
    }
  ],
  "datetimePartitionColumn": "string",
  "featureEngineeringPredictionPoint": "string",
  "relationshipsConfiguration": {
    "datasetDefinitions": [
      {
        "catalogId": "string",
        "catalogVersionId": "string",
        "featureListId": "string",
        "identifier": "string",
        "primaryTemporalKey": "string",
        "snapshotPolicy": "specified"
      }
    ],
    "featureDiscoveryMode": "default",
    "featureDiscoverySettings": [
      {
        "description": "string",
        "family": "string",
        "name": "string",
        "settingType": "string",
        "value": true,
        "verboseName": "string"
      }
    ],
    "id": "string",
    "relationships": [
      {
        "dataset1Identifier": "string",
        "dataset1Keys": [
          "string"
        ],
        "dataset2Identifier": "string",
        "dataset2Keys": [
          "string"
        ],
        "featureDerivationWindowEnd": 0,
        "featureDerivationWindowStart": 0,
        "featureDerivationWindowTimeUnit": "MILLISECOND",
        "featureDerivationWindows": [
          {
            "end": 0,
            "start": 0,
            "unit": "MILLISECOND"
          }
        ],
        "predictionPointRounding": 30,
        "predictionPointRoundingTimeUnit": "MILLISECOND"
      }
    ],
    "snowflakePushDownCompatible": true
  },
  "userId": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
credentials	[oneOf]	false	maxItems: 30	Credentials for dynamic policy secondary datasets.

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	StoredCredentials	false		none

xor

Name	Type	Required	Restrictions	Description
» anonymous	CatalogPasswordCredentials	false		none

continued

Name	Type	Required	Description
datetimePartitionColumn	string,null	false	If a datetime partition column was used, the name of the column.
featureEngineeringPredictionPoint	string,null	false	The date column to be used as the prediction point for time-based feature engineering.
relationshipsConfiguration	RelationshipsConfigPayload	true	Object describing how secondary datasets are related to the primary dataset
userId	string	false	Mongo Id of the User who created the request

RelationshipsConfigPayload


{
  "datasetDefinitions": [
    {
      "catalogId": "string",
      "catalogVersionId": "string",
      "featureListId": "string",
      "identifier": "string",
      "primaryTemporalKey": "string",
      "snapshotPolicy": "specified"
    }
  ],
  "featureDiscoveryMode": "default",
  "featureDiscoverySettings": [
    {
      "description": "string",
      "family": "string",
      "name": "string",
      "settingType": "string",
      "value": true,
      "verboseName": "string"
    }
  ],
  "id": "string",
  "relationships": [
    {
      "dataset1Identifier": "string",
      "dataset1Keys": [
        "string"
      ],
      "dataset2Identifier": "string",
      "dataset2Keys": [
        "string"
      ],
      "featureDerivationWindowEnd": 0,
      "featureDerivationWindowStart": 0,
      "featureDerivationWindowTimeUnit": "MILLISECOND",
      "featureDerivationWindows": [
        {
          "end": 0,
          "start": 0,
          "unit": "MILLISECOND"
        }
      ],
      "predictionPointRounding": 30,
      "predictionPointRoundingTimeUnit": "MILLISECOND"
    }
  ],
  "snowflakePushDownCompatible": true
}

Object describing how secondary datasets are related to the primary dataset

Properties¶

Name	Type	Required	Restrictions	Description
datasetDefinitions	[DatasetDefinition]	true	maxItems: 30 minItems: 1	A list of datasets
featureDiscoveryMode	string,null	false		Mode of feature discovery. Supported values are 'default' and 'manual'.
featureDiscoverySettings	[FeatureDiscoverySettingResponse]	false	maxItems: 100	List of feature discovery settings used to customize the feature discovery process.
id	string	true		Id of the relationship configuration
relationships	[Relationship]	true	maxItems: 70 minItems: 1	A list of relationships
snowflakePushDownCompatible	boolean,null	false		Flag indicating if the relationships configuration is compatible with Snowflake push down processing.

Enumerated Values¶

Property	Value
featureDiscoveryMode	[`default`, `manual`]

RenameColumn


{
  "newName": "string",
  "originalName": "string"
}

Properties¶

Name	Type	Required	Restrictions	Description
newName	string	true		The new column name.
originalName	string	true		The original column name.

RenameColumnsArguments


{
  "columnMappings": [
    {
      "newName": "string",
      "originalName": "string"
    }
  ]
}

The transformation description.

Properties¶

Name	Type	Required	Restrictions	Description
columnMappings	[RenameColumn]	true	maxItems: 1000 minItems: 1	The list of name mappings.

ReplaceDirectiveArguments


{
  "isCaseSensitive": true,
  "matchMode": "partial",
  "origin": "string",
  "replacement": "",
  "searchFor": "string"
}

The transformation description.

Properties¶

Name	Type	Required	Description
isCaseSensitive	boolean	false	The flag indicating if the "search_for" value is case-sensitive.
matchMode	string	true	The match mode to use when detecting "search_for" values.
origin	string	true	The place name to look for in values.
replacement	string	false	The replacement value.
searchFor	string	true	Indicates what needs to be replaced.

Enumerated Values¶

Property	Value
matchMode	[`partial`, `exact`, `regex`]

SampleDirectiveCreate


{
  "arguments": {
    "rows": 10000,
    "seed": 0
  },
  "directive": "random-sample"
}

The input data transformation steps.

Properties¶

oneOf

Name	Type	Required	Description
anonymous	object	false	none
» arguments	RandomSampleArgumentsCreate	false	The interactive sampling config.
» directive	string	true	The directive name.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	DatetimeSampleArgumentsCreate	false	The interactive sampling config.
» directive	string	true	The directive name.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	LimitDirectiveArguments	true	The interactive sampling config.
» directive	string	true	The directive name.

xor

Name	Type	Required	Description
anonymous	object	false	none
» arguments	TableSampleArgumentsCreate	false	The sampling config.
» directive	string	true	The directive name.

Enumerated Values¶

Property	Value
directive	`random-sample`
directive	`datetime-sample`
directive	`limit`
directive	`tablesample`

SmartDownsamplingArguments


{
  "method": "binary",
  "rows": 2,
  "seed": null
}

The downsampling configuration.

Properties¶

Name	Type	Required	Restrictions	Description
method	string	true		The smart downsampling method.
rows	integer	true	minimum: 2	The number of sampled rows.
seed	integer,null	false		The starting number for the random number generator

Enumerated Values¶

Property	Value
method	[`binary`, `zero-inflated`]

StatusResponse


{
  "code": 0,
  "created": "2019-08-24T14:15:22Z",
  "description": "",
  "message": "",
  "status": "INITIALIZED",
  "statusId": "e900225c-0629-4e96-be6e-86a17a309645",
  "statusType": ""
}

StatusResponse

Properties¶

Name	Type	Required	Description
code	JobErrorCode	true	The error code associated with the job.
created	string(date-time)	true	The creation date of the job (ISO 8601 formatted).
description	string	false	The description associated with the job.
message	string	false	The error message associated with the job.
status	JobExecutionState	true	The execution status of the job.
statusId	string(uuid)	true	ID that can be used with GET /api/v2/status/{statusId}/ to poll for the testing job's status.
statusType	string	false	The type of the status object.

StoredCredentials


{
  "catalogVersionId": "string",
  "credentialId": "string",
  "url": "string"
}

Properties¶

Name	Type	Required	Description
catalogVersionId	string	false	Identifier of the catalog version
credentialId	string	true	ID of the credentials object in credential store.Can only be used along with catalogVersionId.
url	string,null	false	URL that is subject to credentials.

TableSampleArgumentsCreate


{
  "percent": 100,
  "seed": 0
}

The sampling config.

Properties¶

Name	Type	Required	Restrictions	Description
percent	number	true	maximum: 100 minimum: 0	The percent of the table to be sampled.
seed	integer	false	minimum: 0	The starting number of the random number generator.

TaskPlanItem


{
  "column": "string",
  "taskList": [
    {
      "arguments": {
        "methods": [
          "avg"
        ],
        "windowSize": 300
      },
      "name": "numeric-stats"
    }
  ]
}

Properties¶

Name	Type	Required	Restrictions	Description
column	string	true		Column to apply transformations to.
taskList	[OneOfTransforms]	true	maxItems: 15 minItems: 1	Tasks to apply to the specific column.

TimeSeriesDirective


{
  "arguments": {
    "baselinePeriods": [
      1
    ],
    "datetimePartitionColumn": "string",
    "forecastDistances": [
      0
    ],
    "forecastPoint": "2019-08-24T14:15:22Z",
    "knownInAdvanceColumns": [],
    "multiseriesIdColumn": null,
    "rollingMedianUserDefinedFunction": null,
    "rollingMostFrequentUserDefinedFunction": null,
    "targetColumn": "string",
    "taskPlan": [
      {
        "column": "string",
        "taskList": [
          {
            "arguments": {
              "methods": [
                "avg"
              ],
              "windowSize": 300
            },
            "name": "numeric-stats"
          }
        ]
      }
    ]
  },
  "directive": "time-series"
}

Properties¶

Name	Type	Required	Restrictions	Description
arguments	TimeSeriesDirectiveArguments	true		Time series directive arguments.
directive	string	true		Time series processing directive, which prepares the dataset for time series modeling. All windows are row-based.

Enumerated Values¶

Property	Value
directive	`time-series`

TimeSeriesDirectiveArguments


{
  "baselinePeriods": [
    1
  ],
  "datetimePartitionColumn": "string",
  "forecastDistances": [
    0
  ],
  "forecastPoint": "2019-08-24T14:15:22Z",
  "knownInAdvanceColumns": [],
  "multiseriesIdColumn": null,
  "rollingMedianUserDefinedFunction": null,
  "rollingMostFrequentUserDefinedFunction": null,
  "targetColumn": "string",
  "taskPlan": [
    {
      "column": "string",
      "taskList": [
        {
          "arguments": {
            "methods": [
              "avg"
            ],
            "windowSize": 300
          },
          "name": "numeric-stats"
        }
      ]
    }
  ]
}

Time series directive arguments.

Properties¶

Name	Type	Required	Restrictions	Description
baselinePeriods	[integer]	false	maxItems: 10 minItems: 1	A list of periodicities used to calculate naive target features.
datetimePartitionColumn	string	true		The column that is used to order the data.
forecastDistances	[integer]	true	maxItems: 20 minItems: 1	A list of forecast distances, which defines the number of rows into the future to predict.
forecastPoint	string(date-time)	false		Filter output by given forecast point. Can be applied only at the prediction time.
knownInAdvanceColumns	[string]	false	maxItems: 200	Columns that are known in advance (future values are known). Values for these known columns must be specified at prediction time.
multiseriesIdColumn	string,null	false		The series ID column, if present. This column partitions data to create a multiseries modeling project.
rollingMedianUserDefinedFunction	string,null	false		To optimize rolling median calculation with relational database sources, pass qualified path to the UDF, as follows: "DB_NAME.SCHEMA_NAME.FUNCTION_NAME". Contact DataRobot Support to fetch a suggested SQL function.
rollingMostFrequentUserDefinedFunction	string,null	false		To optimize rolling most frequent calculation with relational database sources, pass qualified path to the UDF, as follows: "DB_NAME.SCHEMA_NAME.FUNCTION_NAME". Contact DataRobot Support to fetch a suggested SQL function.
targetColumn	string	true		The column intended to be used as the target for modeling. This parameter is required for generating naive features.
taskPlan	[TaskPlanItem]	true	maxItems: 200 minItems: 1	Task plan to describe time series specific transformations.

TransformationPlanResponse


{
  "id": "string",
  "inputParameters": {
    "baselinePeriods": [
      1
    ],
    "datetimePartitionColumn": "string",
    "doNotDeriveColumns": [],
    "excludeLowInfoColumns": true,
    "featureDerivationWindows": [
      300
    ],
    "featureReductionThreshold": 0.9,
    "forecastDistances": [
      0
    ],
    "knownInAdvanceColumns": [],
    "maxLagOrder": 100,
    "multiseriesIdColumn": "string",
    "numberOfOperationsToUse": 0,
    "targetColumn": "string"
  },
  "status": "INITIALIZED",
  "suggestedOperations": [
    {
      "arguments": {
        "baselinePeriods": [
          1
        ],
        "datetimePartitionColumn": "string",
        "forecastDistances": [
          0
        ],
        "forecastPoint": "2019-08-24T14:15:22Z",
        "knownInAdvanceColumns": [],
        "multiseriesIdColumn": null,
        "rollingMedianUserDefinedFunction": null,
        "rollingMostFrequentUserDefinedFunction": null,
        "targetColumn": "string",
        "taskPlan": [
          {
            "column": "string",
            "taskList": [
              {
                "arguments": {
                  "methods": [
                    "avg"
                  ],
                  "windowSize": 300
                },
                "name": "numeric-stats"
              }
            ]
          }
        ]
      },
      "directive": "time-series"
    }
  ]
}

Properties¶

Name	Type	Required	Restrictions	Description
id	string	true		The identifier of the transformation plan.
inputParameters	InputParametersResponse	true		The input parameters corresponding to the suggested operations.
status	string	true		Transformation preparation status
suggestedOperations	[TimeSeriesDirective]	true	maxItems: 10	The list of operations to apply to a recipe to get the dataset ready for time series modeling.

Enumerated Values¶

Property	Value
status	[`INITIALIZED`, `COMPLETED`, `ERROR`]

WranglingFeatureResponse


{
  "datasetId": "string",
  "datasetVersionId": "string",
  "dateFormat": "string",
  "featureType": "Boolean",
  "id": 0,
  "isZeroInflated": true,
  "keySummary": {
    "key": "string",
    "summary": {
      "dataQualities": "ISSUES_FOUND",
      "max": 0,
      "mean": 0,
      "median": 0,
      "min": 0,
      "pctRows": 0,
      "stdDev": 0
    }
  },
  "language": "string",
  "lowInformation": true,
  "lowerQuartile": "string",
  "majorityClassCount": 0,
  "max": "string",
  "mean": "string",
  "median": "string",
  "min": "string",
  "minorityClassCount": 0,
  "naCount": 0,
  "name": "string",
  "plot": [
    {
      "count": 0,
      "label": "string"
    }
  ],
  "sampleRows": 0,
  "stdDev": "string",
  "timeSeriesEligibilityReason": "string",
  "timeSeriesEligibilityReasonAggregation": "string",
  "timeSeriesEligible": true,
  "timeSeriesEligibleAggregation": true,
  "timeStep": 0,
  "timeStepAggregation": 0,
  "timeUnit": "string",
  "timeUnitAggregation": "string",
  "uniqueCount": 0,
  "upperQuartile": "string"
}

Properties¶

Name	Type	Required	Description
datasetId	string	true	The ID of the dataset the feature belongs to
datasetVersionId	string	true	The ID of the dataset version the feature belongs to.
dateFormat	string,null	true	The date format string for how this feature was interpreted (or null if not a date feature). If not null, it will be compatible with https://docs.python.org/2/library/time.html#time.strftime .
featureType	string	true	Feature type.
id	integer	true	The number of the column in the dataset.
isZeroInflated	boolean,null	false	whether feature has an excessive number of zeros
keySummary	any	false	Per key summaries for Summarized Categorical or Multicategorical columns

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	FeatureKeySummaryResponseValidatorSummarizedCategorical	false		For a Summarized Categorical columns, this will contain statistics for the top 50 keys (truncated to 103 characters)

xor

Name	Type	Required	Restrictions	Description
» anonymous	[FeatureKeySummaryResponseValidatorMultilabel]	false		For a Multicategorical columns, this will contain statistics for the top classes

continued

Name	Type	Required	Description
language	string	false	Detected language of the feature.
lowInformation	boolean	false	Whether feature has too few values to be informative.
lowerQuartile	any	false	Lower quartile point of EDA sample of the feature.

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	string	false		Lower quartile point of EDA sample of the feature.

xor

Name	Type	Required	Restrictions	Description
» anonymous	number	false		Lower quartile point of EDA sample of the feature.

continued

Name	Type	Required	Restrictions	Description
majorityClassCount	integer,null	true		The number of rows with a majority class value if smart downsampling is applicable to this feature.
max	any	false		Maximum value of the EDA sample of the feature.

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	string	false		Maximum value of the EDA sample of the feature.

xor

Name	Type	Required	Restrictions	Description
» anonymous	number	false		Maximum value of the EDA sample of the feature.

continued

Name	Type	Required	Restrictions	Description
mean	any	false		Arithmetic mean of the EDA sample of the feature.

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	string	false		Arithmetic mean of the EDA sample of the feature.

xor

Name	Type	Required	Restrictions	Description
» anonymous	number	false		Arithmetic mean of the EDA sample of the feature.

continued

Name	Type	Required	Restrictions	Description
median	any	false		Median of the EDA sample of the feature.

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	string	false		Median of the EDA sample of the feature.

xor

Name	Type	Required	Restrictions	Description
» anonymous	number	false		Median of the EDA sample of the feature.

continued

Name	Type	Required	Restrictions	Description
min	any	false		Minimum value of the EDA sample of the feature.

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	string	false		Minimum value of the EDA sample of the feature.

xor

Name	Type	Required	Restrictions	Description
» anonymous	number	false		Minimum value of the EDA sample of the feature.

continued

Name	Type	Required	Description
minorityClassCount	integer,null	true	The number of rows with neither null nor majority class value if smart downsampling is applicable to this feature.
naCount	integer,null	false	Number of missing values.
name	string	true	Feature name
plot	[DatasetFeaturePlotDataResponse]	false	Plot data based on feature values.
sampleRows	integer	true	The number of rows in the sample used to calculate the statistics.
stdDev	any	false	Standard deviation of EDA sample of the feature.

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	string	false		Standard deviation of EDA sample of the feature.

xor

Name	Type	Required	Restrictions	Description
» anonymous	number	false		Standard deviation of EDA sample of the feature.

continued

Name	Type	Required	Description
timeSeriesEligibilityReason	string,null	false	why the feature is ineligible for time series projects, or 'suitable' if it is eligible.
timeSeriesEligibilityReasonAggregation	string,null	false	why the feature is ineligible for aggregation, or 'suitable' if it is eligible.
timeSeriesEligible	boolean	false	whether this feature can be used as a datetime partitioning feature for time series projects. Only sufficiently regular date features can be selected as the datetime feature for time series projects. Always false for non-date features. Date features that cannot be used in datetime partitioning for a time series project may be eligible for an OTV project, which has less stringent requirements.
timeSeriesEligibleAggregation	boolean	false	whether this feature can be used as a datetime feature for aggregationfor time series data prep. Always false for non-date features.
timeStep	integer,null	false	The minimum time step that can be used to specify time series windows. The units for this value are the `timeUnit`. When specifying windows for time series projects, all windows must have durations that are integer multiples of this number. Only present for date features that are eligible for time series projects and null otherwise.
timeStepAggregation	integer,null	false	The minimum time step that can be used to aggregate using this feature for time series data prep. The units for this value are the `timeUnit`. Only present for date features that are eligible for aggregation in time series data prep and null otherwise.
timeUnit	string,null	false	The unit for the interval between values of this feature, e.g. DAY, MONTH, HOUR. When specifying windows for time series projects, the windows are expressed in terms of this unit. Only present for date features eligible for time series projects, and null otherwise.
timeUnitAggregation	string,null	false	The unit for the interval between values of this feature, e.g. DAY, MONTH, HOUR. Only present for date features eligible for aggregation, and null otherwise.
uniqueCount	integer,null	false	Number of unique values.
upperQuartile	any	false	Upper quartile point of EDA sample of the feature.

oneOf

Name	Type	Required	Restrictions	Description
» anonymous	string	false		Upper quartile point of EDA sample of the feature.

xor

Name	Type	Required	Restrictions	Description
» anonymous	number	false		Upper quartile point of EDA sample of the feature.

Enumerated Values¶

Property	Value
featureType	[`Boolean`, `Categorical`, `Currency`, `Date`, `Date Duration`, `Document`, `Image`, `Interaction`, `Length`, `Location`, `Multicategorical`, `Numeric`, `Percentage`, `Summarized Categorical`, `Text`, `Time`]