```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/catalogItems/?offset=0&limit=0&initialCacheSize=500" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| offset | query | integer | true | Specifies the number of results to skip for pagination. |
| limit | query | integer | true | Sets the maximum number of results returned. Enter 0 to specify no limit. |
| initialCacheSize | query | integer | true | The initial cache size, for Mongo search only. |
| useCache | query | string | false | Sets whether to use the cache, for Mongo search only. |
| orderBy | query | string | false | The attribute sort order applied to the returned catalog list: 'catalogName', 'originalName', 'description', 'created', or 'relevance'. For all options other than 'relevance', prefix the attribute name with a dash to sort in descending order, e.g., orderBy='-catalogName'. Defaults to '-created'. |
| searchFor | query | string | false | A value to search for in the dataset's name, description, tags, column names, categories, and latest errors. The search is case insensitive. If no value is provided, or if the empty string is used, or if the string contains only whitespace, no filtering occurs. Partial matching is performed on the dataset name and description fields; all other fields require an exact match. |
| tag | query | any | false | Filter results to display only items with the specified catalog item tags, in lower case, with no spaces. |
| accessType | query | string | false | Access type used to filter returned results. Valid options are 'owner', 'shared', 'created', and 'any' (the default): 'owner' returns items owned by the requester, 'shared' returns items shared with the requester, 'created' returns items created by the requester, and 'any' matches all items. |
| datasourceType | query | any | false | Data source types used for filtering. |
| category | query | any | false | Category type(s) used for filtering. Searches are case sensitive and support '&' and 'OR' operators. |
| filterFailed | query | string | false | Sets whether to exclude from the search results all catalog items that failed during import. If true, invalid catalog items are excluded; defaults to false. |
| ownerUserId | query | any | false | Filter results to display only those owned by the user(s) identified by the specified UID. |
| ownerUsername | query | any | false | Filter results to display only those owned by the user(s) identified by the specified username. |
| type | query | string | false | Filter results by catalog type. The 'dataset' option matches both 'snapshot_dataset' and 'remote_dataset'. |
| isUxrPreviewable | query | boolean | false | Filter results to items with catalogType 'snapshot_dataset' or 'remote_dataset' and data_origin in ['snowflake', 'bigquery-v1']. |
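As a sketch, the query parameters above can be assembled with Python's standard library before issuing the request. The helper name and its defaults here are illustrative, not part of the API; sending the request (with the Bearer token header) is omitted.

```python
from urllib.parse import urlencode

# Adjust for your installation; app.datarobot.com is the managed cloud host.
API_BASE = "https://app.datarobot.com/api/v2"

def catalog_items_url(offset=0, limit=100, order_by="-created", search_for=None):
    """Build a GET /catalogItems/ URL from the query parameters documented above."""
    params = {"offset": offset, "limit": limit, "orderBy": order_by}
    if search_for:
        # Omitted entirely when empty: blank/whitespace values disable filtering anyway.
        params["searchFor"] = search_for
    return f"{API_BASE}/catalogItems/?{urlencode(params)}"

url = catalog_items_url(limit=10, search_for="churn")
```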
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/catalogItems/{catalogId}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/catalogItems/{catalogId}/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/dataEngineQueryGenerators/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{CreateDataEngineQueryGenerator}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/dataEngineQueryGenerators/{dataEngineQueryGeneratorId}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/dataEngineWorkspaceStates/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{CreateWorkspaceState}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/dataEngineWorkspaceStates/fromDataEngineQueryGenerator/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{CreateWorkspaceStateFromQueryGenerator}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/dataEngineWorkspaceStates/{workspaceStateId}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/?limit=100&offset=0" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| category | query | string | false | If specified, only dataset versions that have the specified category will be included in the results. Categories identify the intended use of the dataset. |
| orderBy | query | string | false | The sort order applied to the returned catalog list. |
| limit | query | integer | true | At most this many results are returned. |
| offset | query | integer | true | This many results will be skipped. |
| filterFailed | query | string | false | Whether datasets that failed during import should be excluded from the results. If true, invalid datasets will be excluded. |
| datasetVersionIds | query | any | false | If specified, only returns datasets that are associated with the specified dataset versions. Cannot be used at the same time as experiment_container_ids. |
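The limit/offset pair drives pagination for the list endpoints in this section. A minimal sketch of walking every page follows; the 'data' and 'totalCount' envelope keys are an assumption and should be checked against the actual response.

```python
def iter_all(fetch_page, limit=100):
    """Yield every item across offset/limit pages.

    `fetch_page(offset, limit)` is any callable that performs the GET and
    returns the decoded JSON page as a dict.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page["data"]
        offset += limit
        if offset >= page["totalCount"]:
            break

# Usage with a stubbed fetcher (a real one would call the API over HTTP):
fake = lambda off, lim: {"data": list(range(off, min(off + lim, 7))), "totalCount": 7}
items = list(iter_all(fake, limit=3))
# items == [0, 1, 2, 3, 4, 5, 6]
```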
```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{BulkDatasetAction}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/fromDataEngineWorkspaceState/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{DatasetCreateFromWorkspaceState}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/fromDataSource/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{Datasource}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/fromFile/" \
  -H "Content-Type: multipart/form-data" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{DatasetFromFile}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/fromHDFS/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{Hdfs}'
```
Create a dataset item and version from a recipe. During publishing, an immutable copy of the recipe is created, as well as a copy of the recipe's data source.
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/fromRecipe/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{CreateFromRecipe}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/fromStage/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{DatasetFromStage}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/fromURL/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{Url}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/{datasetId}/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/accessControl/?offset=0&limit=100" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| userId | query | string | false | Only return the access control information for a user with this user ID. |
| username | query | string | false | Only return the access control information for a user with this username. |
```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/{datasetId}/accessControl/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```
Duplicate entry for a user in the permission list, or the request would leave the dataset without an owner.
None
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/datasets/{datasetId}/allFeaturesDetails/
Return detailed information on all the features and transforms for this dataset. If the Dataset Item has attribute snapshot = True, all optional fields also appear.
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/allFeaturesDetails/?limit=100&offset=0&orderBy=featureType" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| limit | query | integer | true | At most this many results are returned. The default may change and a maximum limit may be imposed without notice. |
| offset | query | integer | true | This many results will be skipped. |
| orderBy | query | string | true | How the features should be ordered. |
| includePlot | query | string | false | Include histogram plot data in the response. |
| searchFor | query | string | false | A value to search for in the feature name. The search is case insensitive. If no value is provided, the string is empty, or the string contains only whitespace, no filtering occurs. |
| featurelistId | query | string | false | ID of a featurelist. If specified, only returns features that are present in the specified featurelist. |
| includeDataQuality | query | string | false | Include detected data quality issue types in the response. |
```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/{datasetId}/deleted/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/featureHistograms/{featureName}/?binLimit=60" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| binLimit | query | integer | true | Maximum number of bins in the returned plot. |
| key | query | string | false | Required only for summarized categorical features. The name of the top-50 key for which to retrieve the plot. |
| usePlot2 | query | string | false | Use frequent-values plot data instead of a histogram for supported feature types. |
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/featureTransforms/?limit=100&offset=0" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| limit | query | integer | true | At most this many results are returned. The default may change and a maximum limit may be imposed without notice. |
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/featureTransforms/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/featureTransforms/{featureName}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| datasetId | path | string | true | The dataset to select the feature from. |
| featureName | path | string | true | The name of the feature. Note that DataRobot renames some features, so the feature name may not be the one from your original data. Non-ASCII feature names should be UTF-8 encoded (before URL-quoting). |
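Per the note above, a non-ASCII feature name must be UTF-8 encoded and URL-quoted before it is placed in the path. In Python this is a single standard-library call (the feature name below is just an example):

```python
from urllib.parse import quote

feature_name = "населення"  # example non-ASCII feature name
encoded = quote(feature_name)  # quote() percent-encodes the UTF-8 bytes
path = f"/api/v2/datasets/{{datasetId}}/featureTransforms/{encoded}/"
# encoded == "%D0%BD%D0%B0%D1%81%D0%B5%D0%BB%D0%B5%D0%BD%D0%BD%D1%8F"
```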
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/featurelists/?limit=100&offset=0&orderBy=name" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| limit | query | integer | true | At most this many results are returned. The default may change and a maximum limit may be imposed without notice. |
| offset | query | integer | true | This many results will be skipped. |
| orderBy | query | string | true | How the feature lists should be ordered. |
| searchFor | query | string | false | A value to search for in the featurelist name. The search is case insensitive. If no value is provided, the string is empty, or the string contains only whitespace, no filtering occurs. |
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/featurelists/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X DELETE "https://app.datarobot.com/api/v2/datasets/{datasetId}/featurelists/{featurelistId}/" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/featurelists/{featurelistId}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/{datasetId}/featurelists/{featurelistId}/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/file/" \
  -H "Accept: application/vnd.apache.parquet" \
  -H "Authorization: Bearer {access-token}"
```
Dataset cannot be downloaded. Possible reasons include "dataPersisted" being false for the dataset, the dataset not being a snapshot, or the dataset being too large to download (the maximum download size depends on the configuration of your installation).
None
To perform this operation, you must be authenticated by means of one of the following methods:
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/permissions/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/projects/?limit=100&offset=0" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/refreshJobs/?limit=100&offset=0" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Create a dataset refresh job that will automatically create dataset snapshots on a schedule.
Optionally, if the limit of enabled jobs per user is reached, the following metadata will be added to the default error response payload:

- datasetsWithJob (array): The list of dataset IDs that have at least one enabled job.
- errorType (string): (New in version v2.21) The type of error that happened; possible values include (but are not limited to): Generic Limit Reached, Max Job Limit Reached for Dataset, and Max Job Limit Reached for User.
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/refreshJobs/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```
Refresh job could not be created. Possible reasons include: the job does not belong to the given dataset, a credential ID is required when Kerberos authentication is enabled, or the schedule is not valid or cannot be understood.
None
To perform this operation, you must be authenticated by means of one of the following methods:
```shell
# You can also use wget
curl -X DELETE "https://app.datarobot.com/api/v2/datasets/{datasetId}/refreshJobs/{jobId}/" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| datasetId | path | string | true | The dataset associated with the scheduled refresh job. |
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/refreshJobs/{jobId}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| datasetId | path | string | true | The dataset associated with the scheduled refresh job. |
Optionally, if the limit of enabled jobs per user is reached, the following metadata will be added to the default error response payload:

- datasetsWithJob (array): The list of dataset IDs that have at least one enabled job.
- errorType (string): (New in version v2.21) The type of error that happened; possible values include (but are not limited to): Generic Limit Reached, Max Job Limit Reached for Dataset, and Max Job Limit Reached for User.
```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/{datasetId}/refreshJobs/{jobId}/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```
Refresh job could not be updated. Possible reasons include: the job does not belong to the given dataset, a credential ID is required when Kerberos authentication is enabled, or the schedule is not valid or cannot be understood.
None
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/datasets/{datasetId}/refreshJobs/{jobId}/executionResults/
Returns a paginated list of execution results for the refresh job and dataset with the given IDs, sorted from newest to oldest.
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/refreshJobs/{jobId}/executionResults/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| limit | query | integer | false | Maximum number of results returned. The default may change and a maximum limit may be imposed without notice. |
| offset | query | integer | false | Number of results that will be skipped. |
| datasetId | path | string | true | The dataset associated with the scheduled refresh job. |
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/relationships/?limit=100&offset=0" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| limit | query | integer | true | At most this many results are returned. |
| offset | query | integer | true | This many results will be skipped. |
| linkedDatasetId | query | string | false | If provided, only relationships between datasetId (from the path) and linkedDatasetId will be returned. |
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/relationships/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X DELETE "https://app.datarobot.com/api/v2/datasets/{datasetId}/relationships/{datasetRelationshipId}/" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/{datasetId}/relationships/{datasetRelationshipId}/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/sharedRoles/?offset=0&limit=100" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| id | query | string | false | Only return the access control information for an organization, group, or user with this ID. |
| name | query | string | false | Only return the access control information for an organization, group, or user with this name. |
Grant access, remove access, or update roles for organizations, groups, or users on this dataset. Up to 100 roles may be set per array in a single request.
```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/{datasetId}/sharedRoles/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/?limit=100&offset=0" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| category | query | string | false | If specified, only dataset versions that have the specified category will be included in the results. Categories identify the intended use of the dataset. |
| orderBy | query | string | false | The sort order applied to the returned catalog list. |
| limit | query | integer | true | At most this many results are returned. |
| offset | query | integer | true | This many results will be skipped. |
| filterFailed | query | string | false | Whether datasets that failed during import should be excluded from the results. If true, invalid datasets will be excluded. |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
POST /api/v2/datasets/{datasetId}/versions/fromDataEngineWorkspaceState/
Create a new dataset version for a specified dataset from a Data Engine workspace state. The new dataset version should have the same schema as the specified dataset.
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/fromDataEngineWorkspaceState/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
POST /api/v2/datasets/{datasetId}/versions/fromDataSource/
Create a new version for the specified dataset from the specified Data Source. The dataset must have been created from a compatible data source originally.
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/fromDataSource/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/fromFile/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/fromHDFS/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
POST /api/v2/datasets/{datasetId}/versions/fromLatestVersion/
Create a new version of the specified dataset from the latest dataset version. This will reuse the same source of the data that was previously used. Not supported for datasets that were previously loaded from an uploaded file. If the dataset is currently a remote dataset, it will be converted to a snapshot dataset.
NOTE: If the current version uses a Data Source, the user and password must be specified so the data can be accessed.
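A sketch of a request body for the Data Source case described in the note. The user and password field names mirror the note's wording but are assumptions; check them against the endpoint's request schema, which may also accept credential-based variants.

```python
import json

# Illustrative only: the note above says user and password must be supplied
# when the current version reads from a Data Source. Field names assumed.
body = json.dumps({"user": "db_user", "password": "db_password"})
```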
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/fromLatestVersion/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/fromStage/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```
The request cannot be processed. Possible reasons include: the request did not contain a data stage, or the dataset was previously created from a non-data-stage source.
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/fromURL/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X DELETE "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/" \
  -H "Authorization: Bearer {access-token}"
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/datasets/{datasetId}/versions/{datasetVersionId}/allFeaturesDetails/
Return detailed information on all the features and transforms for this dataset. If the Dataset Item has attribute snapshot = True, all optional fields also appear.
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/allFeaturesDetails/?limit=100&offset=0&orderBy=featureType" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| limit | query | integer | true | At most this many results are returned. The default may change and a maximum limit may be imposed without notice. |
| offset | query | integer | true | This many results will be skipped. |
| orderBy | query | string | true | How the features should be ordered. |
| includePlot | query | string | false | Include histogram plot data in the response. |
| searchFor | query | string | false | A value to search for in the feature name. The search is case insensitive. If no value is provided, the string is empty, or the string contains only whitespace, no filtering occurs. |
| featurelistId | query | string | false | ID of a featurelist. If specified, only returns features that are present in the specified featurelist. |
| includeDataQuality | query | string | false | Include detected data quality issue types in the response. |
```shell
# You can also use wget
curl -X PATCH "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/deleted/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/featureHistograms/{featureName}/?binLimit=60" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| binLimit | query | integer | true | Maximum number of bins in the returned plot. |
| key | query | string | false | Required only for summarized categorical features. The name of the top-50 key for which to retrieve the plot. |
| usePlot2 | query | string | false | Use frequent-values plot data instead of a histogram for supported feature types. |
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/featurelists/?limit=100&offset=0&orderBy=name" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| limit | query | integer | true | At most this many results are returned. The default may change and a maximum limit may be imposed without notice. |
| offset | query | integer | true | This many results will be skipped. |
| orderBy | query | string | true | How the feature lists should be ordered. |
| searchFor | query | string | false | A value to search for in the featurelist name. The search is case insensitive. If no value is provided, the string is empty, or the string contains only whitespace, no filtering occurs. |
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/featurelists/{featurelistId}/" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
Parameters

| Name | In | Type | Required | Description |
|------|----|------|----------|-------------|
| datasetId | path | string | true | The ID of the dataset to retrieve the featurelist for. |
| datasetVersionId | path | string | true | The ID of the dataset version to retrieve featurelists for. |
```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/file/" \
  -H "Accept: application/vnd.apache.parquet" \
  -H "Authorization: Bearer {access-token}"
```
Dataset version cannot be downloaded. Possible reasons include dataPersisted being false for the dataset, the dataset not being a snapshot, or the dataset version being too large to download (the maximum download size depends on the configuration of your installation).
None
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
POST /api/v2/datasets/{datasetId}/versions/{datasetVersionId}/fromVersion/
Create a new version of the specified dataset from the specified dataset version. This will reuse the same source of the data that was previously used. Not supported for datasets that were previously loaded from an uploaded file. If the dataset is currently a remote dataset, it will be converted to a snapshot dataset.
NOTE: If the specified version uses a Data Source, the user and password must be specified so the data can be accessed.
```shell
# You can also use wget
curl -X POST "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/fromVersion/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}" \
  -d '{undefined}'
```

```shell
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/datasets/{datasetId}/versions/{datasetVersionId}/projects/?limit=100&offset=0" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer {access-token}"
```
An array of strings describing the intended use of the dataset.

| Name | Type | Required | Description |
|------|------|----------|-------------|
| createdBy | string¦null | true | Username of the user who created the dataset. |
| creationDate | string(date-time) | true | The date when the dataset was created. |
| dataPersisted | boolean | true | If true, the user is allowed to view the extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.) and download data. If false, download is not allowed and only the data schema (feature names and types) will be available. |
| datasetId | string | true | The ID of this dataset. |
| isDataEngineEligible | boolean | true | Whether this dataset can be a data source of a data engine query. |
| isLatestVersion | boolean | true | Whether this dataset version is the latest version of this dataset. |
| isSnapshot | boolean | true | Whether the dataset is an immutable snapshot of data which has previously been retrieved and saved to DataRobot. |
An array of strings describing the intended use of the dataset.

| Name | Type | Required | Description |
|------|------|----------|-------------|
| columnCount | integer | true | The number of columns in the dataset. |
| createdBy | string¦null | true | Username of the user who created the dataset. |
| creationDate | string(date-time) | true | The date when the dataset was created. |
| dataPersisted | boolean | true | If true, the user is allowed to view the extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.) and download data. If false, download is not allowed and only the data schema (feature names and types) will be available. |
| datasetId | string | true | The ID of this dataset. |
| datasetSize | integer | true | The size of the dataset as a CSV in bytes. |
| isDataEngineEligible | boolean | true | Whether this dataset can be a data source of a data engine query. |
| isLatestVersion | boolean | true | Whether this dataset version is the latest version of this dataset. |
| isSnapshot | boolean | true | Whether the dataset is an immutable snapshot of data which has previously been retrieved and saved to DataRobot. |
The action to execute on the datasets. Has to be 'updateRoles' for this payload.

| Name | Type | Required | Description |
|------|------|----------|-------------|
| applyGrantToLinkedObjects | boolean | false | If true for any users being granted access to the dataset, grant the user read access to any linked objects such as DataSources and DataStores that may be used by this dataset. Ignored if no such objects are relevant for the dataset. Will not result in access being lowered for a user if the user already has higher access to linked objects than read access. However, if the target user does not have sharing permissions to the linked object, they will be given sharing access without lowering existing permissions. May result in an error if the user making the call does not have sufficient permissions to complete the grant. Default value is false. |
| roles | [oneOf] | true | maxItems: 100, minItems: 1. An array of RoleRequest objects. May contain at most 100 such objects. |
The ID of the set of credentials to authenticate with the database.
doSnapshot
boolean
false
If true, create a snapshot dataset; if false, create a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
Destination table information to create and materialize the recipe to. If None, the recipe will be materialized in DataRobot.
name
string
false
Name to be assigned to new Dataset.
persistDataAfterIngestion
boolean
false
If true, will enforce saving all data (for download and sampling) and will allow a user to view extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.). If false, will not enforce saving data. The data schema (feature names and types) still will be available. Specifying this parameter to false and doSnapshot to true will result in an error.
recipeId
string
true
The identifier for the Wrangling Recipe to use as the source of data.
skipDuplicateDatesValidation
boolean
false
By default, if a recipe contains time series or a time series resampling operation, publishing fails if there are date duplicates to prevent data quality issues and ambiguous transformations. If set to True, then validation will be skipped.
This property is gated behind the TIME_SERIES_DATA_WRANGLING feature flag. To enable this feature, contact your DataRobot representative or administrator.
useKerberos
boolean
false
If true, use Kerberos for database authentication.
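Putting the recipe-publishing fields together, a minimal request body might look like the following sketch. The recipe ID and dataset name are hypothetical placeholders; only recipeId is required by the schema above.

```python
# Sketch of a Wrangling Recipe publish request body. The ID and name are
# made-up placeholders, not real values.
payload = {
    "recipeId": "64f1c0ffee0000000000abcd",  # hypothetical recipe ID (required)
    "name": "wrangled-sales-data",           # optional name for the new dataset
    "doSnapshot": True,                      # materialize a snapshot dataset
    "persistDataAfterIngestion": True,       # must not be False while doSnapshot is True
    "useKerberos": False,
}

# The doc notes that persistDataAfterIngestion=False with doSnapshot=True
# results in an error, so this combination is ruled out here.
assert not (payload["doSnapshot"] and payload["persistDataAfterIngestion"] is False)
```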
Whether the user should be able to share with other users. If true, the user will be able to grant any role up to and including their own to other users. If role is empty, canShare is ignored.
canUseData
boolean
false
Whether the user should be able to view, download, and process data (use to create projects, predictions, etc.). For OWNER, canUseData is always True. If role is empty, canUseData is ignored.
role
string
true
The role to grant to the user, or "" (empty string) to remove the user's access.
username
string
true
Username of the user to update the access role for.
If true, for any users being granted access to the dataset, grant the user read access to any linked objects, such as DataSources and DataStores, that may be used by this dataset. Ignored if no such objects are relevant for the dataset. Will not lower a user's access if the user already has higher than read access to the linked objects. However, if the target user does not have sharing permissions on the linked object, they will be given sharing access without lowering existing permissions. May result in an error if the user making the call does not have sufficient permissions to complete the grant. Default value is false.
If true, create a snapshot dataset; if false, create a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
workspaceStateId
string
true
The ID of the workspace state to use as the source of data.
If true, create a snapshot dataset; if false, create a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
workspaceStateId
string
true
ID of the workspace state to use as the source of data.
The ID of the dataset version the feature belongs to.
dateFormat
string¦null
true
The date format string for how this feature was interpreted (or null if not a date feature). If not null, it will be compatible with https://docs.python.org/2/library/time.html#time.strftime.
featureType
string
true
Feature type.
id
integer
true
The number of the column in the dataset.
isZeroInflated
boolean¦null
false
Whether the feature has an excessive number of zeros.
keySummary
any
false
Per-key summaries for Summarized Categorical or Multicategorical columns.
The number of rows in the sample used to calculate the statistics.
stdDev
any
false
Standard deviation of EDA sample of the feature.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
Standard deviation of EDA sample of the feature.
xor
Name
Type
Required
Restrictions
Description
» anonymous
number
false
Standard deviation of EDA sample of the feature.
continued
Name
Type
Required
Restrictions
Description
timeSeriesEligibilityReason
string¦null
false
Why the feature is ineligible for time series projects, or 'suitable' if it is eligible.
timeSeriesEligibilityReasonAggregation
string¦null
false
Why the feature is ineligible for aggregation, or 'suitable' if it is eligible.
timeSeriesEligible
boolean
false
Whether this feature can be used as a datetime partitioning feature for time series projects. Only sufficiently regular date features can be selected as the datetime feature for time series projects. Always false for non-date features. Date features that cannot be used in datetime partitioning for a time series project may be eligible for an OTV project, which has less stringent requirements.
timeSeriesEligibleAggregation
boolean
false
Whether this feature can be used as a datetime feature for aggregation for time series data prep. Always false for non-date features.
timeStep
integer¦null
false
The minimum time step that can be used to specify time series windows. The units for this value are the timeUnit. When specifying windows for time series projects, all windows must have durations that are integer multiples of this number. Only present for date features that are eligible for time series projects and null otherwise.
timeStepAggregation
integer¦null
false
The minimum time step that can be used to aggregate using this feature for time series data prep. The units for this value are the timeUnit. Only present for date features that are eligible for aggregation in time series data prep and null otherwise.
timeUnit
string¦null
false
The unit for the interval between values of this feature, e.g. DAY, MONTH, HOUR. When specifying windows for time series projects, the windows are expressed in terms of this unit. Only present for date features eligible for time series projects, and null otherwise.
timeUnitAggregation
string¦null
false
The unit for the interval between values of this feature, e.g. DAY, MONTH, HOUR. Only present for date features eligible for aggregation, and null otherwise.
uniqueCount
integer¦null
false
Number of unique values.
upperQuartile
any
false
Upper quartile point of EDA sample of the feature.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
Upper quartile point of EDA sample of the feature.
xor
Name
Type
Required
Restrictions
Description
» anonymous
number
false
Upper quartile point of EDA sample of the feature.
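Because stdDev and upperQuartile are documented above as either a string or a number, client code has to normalize them before doing arithmetic. A small hedged helper (the function name is mine, not part of the API):

```python
# Normalize an EDA statistic that may arrive as a string, a number, or null.
def as_float(value):
    """Return the statistic as a float, or None if it cannot be parsed."""
    if value is None:
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

# Usage on the three shapes the schema allows:
assert as_float("3.5") == 3.5   # string form
assert as_float(2) == 2.0       # numeric form
assert as_float(None) is None   # null form
```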
The date(s) of the month that the job will run. Allowed values are either [1 ... 31] or ["*"] for all days of the month. This field is additive with dayOfWeek, meaning the job will run both on the date(s) defined in this field and the day specified by dayOfWeek (for example, dates 1st, 2nd, 3rd, plus every Tuesday). If dayOfMonth is set to ["*"] and dayOfWeek is defined, the scheduler will trigger on every day of the month that matches dayOfWeek (for example, Tuesday the 2nd, 9th, 16th, 23rd, 30th). Invalid dates such as February 31st are ignored.
anyOf
Name
Type
Required
Restrictions
Description
» anonymous
number
false
none
or
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
dayOfWeek
[anyOf]
false
maxItems: 7
The day(s) of the week that the job will run. Allowed values are [0 .. 6], where (Sunday=0), or ["*"], for all days of the week. Strings, either 3-letter abbreviations or the full name of the day, can be used interchangeably (e.g., "sunday", "Sunday", "sun", or "Sun" all map to [0]). This field is additive with dayOfMonth, meaning the job will run both on the date specified by dayOfMonth and the day defined in this field.
anyOf
Name
Type
Required
Restrictions
Description
» anonymous
number
false
none
or
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
continued
Name
Type
Required
Restrictions
Description
hour
any
false
The hour(s) of the day that the job will run. Allowed values are [0 ... 23].
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
[string]
false
none
continued
Name
Type
Required
Restrictions
Description
minute
any
false
The minute(s) of the day that the job will run. Allowed values are [0 ... 59].
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
[string]
false
none
continued
Name
Type
Required
Restrictions
Description
month
[anyOf]
false
maxItems: 12
The month(s) of the year that the job will run. Allowed values are either [1 ... 12] or ["*"] for all months of the year. Strings, either 3-letter abbreviations or the full name of the month, can be used interchangeably (e.g., "jan" or "october"). Months that are not compatible with dayOfMonth are ignored, for example {"dayOfMonth": [31], "month":["feb"]}.
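The schedule fields above can be sketched as one object. This example runs a job at 02:30 on every Tuesday and additionally on the 1st of each month (dayOfMonth and dayOfWeek are additive). Whether values are sent as numbers or strings may vary per field, so this exact shape is an assumption.

```python
# Hedged sketch of a schedule object per the fields documented above.
schedule = {
    "minute": [30],      # allowed values 0..59
    "hour": [2],         # allowed values 0..23
    "dayOfMonth": [1],   # 1..31, or ["*"] for every day; additive with dayOfWeek
    "dayOfWeek": [2],    # 0..6 with Sunday=0, so 2 = Tuesday
    "month": ["*"],      # every month
}

# Sanity checks against the documented ranges.
assert all(0 <= m <= 59 for m in schedule["minute"])
assert all(0 <= h <= 23 for h in schedule["hour"])
assert len(schedule["dayOfWeek"]) <= 7 and len(schedule["month"]) <= 12
```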
An array of strings describing the intended use of the dataset. The supported options are TRAINING and PREDICTION.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
[string]
false
none
continued
Name
Type
Required
Restrictions
Description
credentialId
string¦null
false
The ID of the set of credentials to use to run the scheduled job when the Kerberos authentication service is utilized. Required when useKerberos is true.
credentials
string
false
A JSON string describing the data engine queries credentials to use when refreshing.
enabled
boolean
false
Boolean for whether the scheduled job is active (true) or inactive (false).
Schedule describing when to refresh the dataset. The smallest schedule allowed is daily.
scheduleReferenceDate
string(date-time)
false
The UTC reference date, in RFC 3339 format, from which the schedule starts. This value is returned in /api/v2/datasets/(datasetId)/refreshJobs/(jobId)/ to help build a more intuitive schedule picker. The default is the current time.
useKerberos
boolean
false
If true, the Kerberos authentication system is used in conjunction with a credential ID.
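A scheduled refresh job body combining the fields above might look like the following sketch. The credential ID and timestamp are placeholders, and the nested schedule shape is assumed rather than confirmed by this document.

```python
# Hedged sketch of a scheduled dataset refresh job request body.
refresh_job = {
    "enabled": True,
    "useKerberos": True,
    "credentialId": "64f1c0ffee0000000000cafe",  # required when useKerberos is true
    "schedule": {                                # daily at 03:00 (smallest allowed: daily)
        "minute": [0], "hour": [3],
        "dayOfMonth": ["*"], "dayOfWeek": ["*"], "month": ["*"],
    },
    "scheduleReferenceDate": "2024-01-01T00:00:00Z",  # RFC 3339, UTC
}

# credentialId must be present whenever useKerberos is true.
assert (not refresh_job["useKerberos"]) or refresh_job.get("credentialId")
```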
An array of strings describing the intended use of the dataset. The supported options are TRAINING and PREDICTION.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
[string]
false
none
continued
Name
Type
Required
Restrictions
Description
credentialId
string¦null
false
The ID of the set of credentials to use to run the scheduled job when the Kerberos authentication service is utilized. Required when useKerberos is true.
credentials
string
false
A JSON string describing the data engine queries credentials to use when refreshing.
enabled
boolean
false
Boolean for whether the scheduled job is active (true) or inactive (false).
Schedule describing when to refresh the dataset. The smallest schedule allowed is daily.
scheduleReferenceDate
string(date-time)
false
The UTC reference date, in RFC 3339 format, from which the schedule starts. This value is returned in /api/v2/datasets/(datasetId)/refreshJobs/(jobId)/ to help build a more intuitive schedule picker. Required when schedule is being updated. The default is the current time.
useKerberos
boolean
false
If true, the Kerberos authentication system is used in conjunction with a credential ID.
An array of information about scheduled dataset refresh jobs. Results are based on updatedAt value and returned in descending order (latest returned first).
next
string(uri)¦null
true
URL pointing to the next page (if null, there is no next page).
previous
string(uri)¦null
true
URL pointing to the previous page (if null, there is no previous page).
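The next/previous fields above describe cursor-style pagination: a client follows next until it is null. A hedged sketch (the "data" key holding the item array is an assumption, and fetch stands in for a real HTTP call):

```python
# Follow the paginated listing described above until 'next' is null.
def collect_all(fetch, first_url):
    """Accumulate items from every page by following the 'next' link."""
    items, url = [], first_url
    while url is not None:
        page = fetch(url)                   # hypothetical HTTP GET helper
        items.extend(page.get("data", []))  # 'data' key is an assumption
        url = page["next"]                  # None when there is no next page
    return items

# Usage with a stubbed fetch that mimics two pages:
pages = {
    "/jobs?offset=0": {"data": [1, 2], "next": "/jobs?offset=2", "previous": None},
    "/jobs?offset=2": {"data": [3], "next": None, "previous": "/jobs?offset=0"},
}
assert collect_all(pages.get, "/jobs?offset=0") == [1, 2, 3]
```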
Whether the org/group/user should be able to share with others. If true, the org/group/user will be able to grant any role up to and including their own to other orgs/groups/users. If role is NO_ROLE, canShare is ignored.
canUseData
boolean
false
Whether the user/group/org should be able to view, download, and process data (use to create projects, predictions, etc.). For OWNER, canUseData is always True. If role is empty, canUseData is ignored.
id
string
true
The org/group/user ID.
role
string
true
The role of the org/group/user on this dataset or "NO_ROLE" for removing access when used with route to modify access.
Whether the org/group/user should be able to share with others. If true, the org/group/user will be able to grant any role up to and including their own to other orgs/groups/users. If role is NO_ROLE, canShare is ignored.
canUseData
boolean
false
Whether the user/group/org should be able to view, download, and process data (use to create projects, predictions, etc.). For OWNER, canUseData is always True. If role is empty, canUseData is ignored.
name
string
true
Name of the user/group/org to update the access role for.
role
string
true
The role of the org/group/user on this dataset or "NO_ROLE" for removing access when used with route to modify access.
If true, for any users being granted access to the dataset, grant the user read access to any linked objects, such as DataSources and DataStores, that may be used by this dataset. Ignored if no such objects are relevant for the dataset. Will not lower a user's access if the user already has higher than read access to the linked objects. However, if the target user does not have sharing permissions on the linked object, they will be given sharing access without lowering existing permissions. May result in an error if the user making the call does not have sufficient permissions to complete the grant. Default value is false.
operation
string
true
The name of the action being taken. The only operation is "updateRoles".
roles
[oneOf]
true
maxItems: 100 minItems: 1
An array of RoleRequest objects. May contain at most 100 such objects.
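For the org/group/user variant of this payload, access is revoked via the "NO_ROLE" convention described above. A hedged sketch (the ID is a made-up placeholder):

```python
# Sketch: removing a group's access to a dataset by granting NO_ROLE.
# The ID below is a hypothetical placeholder, not a real object ID.
payload = {
    "operation": "updateRoles",  # the only supported operation
    "roles": [
        {"id": "5c1a0000deadbeef00000000", "role": "NO_ROLE"},  # hypothetical group ID
    ],
}

assert 1 <= len(payload["roles"]) <= 100
assert payload["roles"][0]["role"] == "NO_ROLE"
```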
The ID of the set of credentials to authenticate with the database.
dataSourceId
string
true
The identifier for the DataSource to use as the source of data.
doSnapshot
boolean
false
If true, create a snapshot dataset; if false, create a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
password
string
false
The password (in cleartext) for database authentication. The password will be encrypted on the server side within the scope of the HTTP request and never saved or stored. DEPRECATED: please use credentialId or credentialData instead.
persistDataAfterIngestion
boolean
false
If true, enforces saving all data (for download and sampling) and allows the user to view an extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.). If false, does not enforce saving data; the data schema (feature names and types) will still be available. Setting this parameter to false while doSnapshot is true will result in an error.
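Creating a dataset from a DataSource with the fields above might look like the following sketch. The IDs are hypothetical placeholders; credentialId is used instead of the deprecated cleartext password field.

```python
# Hedged sketch of a create-dataset-from-DataSource request body.
payload = {
    "dataSourceId": "61b2c0ffee0000000000beef",  # hypothetical DataSource ID (required)
    "credentialId": "61b2c0ffee0000000000f00d",  # preferred over the deprecated password field
    "doSnapshot": True,                          # snapshot rather than remote dataset
    "persistDataAfterIngestion": True,           # must not be False while doSnapshot is True
}

# Avoid the deprecated cleartext password parameter entirely.
assert "password" not in payload
```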
List of names of features to be included in the new featurelist. All features listed must be part of the universe (all features for this dataset) for the request to succeed.
The value to extract from the date column, one of: [year|yearDay|month|monthDay|week|weekDay]. Required when transforming a date column; otherwise must not be provided.
name
string
true
The name of the new feature. Must not be the same as any existing features for this project. Must not contain '/' character.
parentName
string
true
The name of the parent feature.
replacement
any
false
The replacement in case of a failed transformation.
anyOf
Name
Type
Required
Restrictions
Description
» anonymous
string¦null
false
none
or
Name
Type
Required
Restrictions
Description
» anonymous
boolean¦null
false
none
or
Name
Type
Required
Restrictions
Description
» anonymous
number¦null
false
none
or
Name
Type
Required
Restrictions
Description
» anonymous
integer¦null
false
none
continued
Name
Type
Required
Restrictions
Description
variableType
string
true
The type of the new feature. Must be one of text, categorical (Deprecated in version v2.21), numeric, or categoricalInt. See the description of this method for more information.
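A date-part feature transform built from the fields above can be sketched as follows. The date-extraction field name is an assumption (the schema here gives only its description), and the column names are placeholders.

```python
# Hedged sketch of a feature transform request body. 'dateExtraction' is
# an assumed field name; the feature names are made-up placeholders.
payload = {
    "name": "purchase_month",         # must not contain '/' or collide with an existing feature
    "parentName": "purchase_date",    # the source (parent) feature
    "variableType": "categoricalInt",
    "dateExtraction": "month",        # one of year|yearDay|month|monthDay|week|weekDay
    "replacement": None,              # value substituted when the transformation fails
}

assert "/" not in payload["name"]
assert payload["dateExtraction"] in {"year", "yearDay", "month", "monthDay", "week", "weekDay"}
```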
The ID of the set of credentials to authenticate with the database.
credentials
string
false
A list of credentials to use if this is a Spark dataset that requires credentials.
password
string
false
The password (in cleartext) for database authentication. The password will be encrypted on the server side within the scope of the HTTP request and never saved or stored. Required only if the dataset was created from a data source. DEPRECATED: please use credentialId or credentialData instead.
useKerberos
boolean
false
If true, use Kerberos for database authentication.
useLatestSuccess
boolean
false
If true, use the latest version that was successfully ingested instead of the latest version, which might be in an errored state. If no successful version is present, the latest errored version is used and the operation fails.
user
string
false
The username for database authentication. Required only if the dataset was initially created from a data source. DEPRECATED: please use credentialId or credentialData instead.
The ID of the set of credentials to authenticate with the database.
credentials
string
false
A list of credentials to use if this is a Spark dataset that requires credentials.
password
string
false
The password (in cleartext) for database authentication. The password will be encrypted on the server side within the scope of the HTTP request and never saved or stored. Required only if the dataset was created from a data source. DEPRECATED: please use credentialId or credentialData instead.
useKerberos
boolean
false
If true, use Kerberos for database authentication.
user
string
false
The username for database authentication. Required only if the dataset was initially created from a data source. DEPRECATED: please use credentialId or credentialData instead.
An array of strings describing the intended use of the dataset.
columnCount
integer
true
The number of columns in the dataset.
createdBy
string¦null
true
Username of the user who created the dataset.
creationDate
string(date-time)
true
The date when the dataset was created.
dataEngineQueryId
string¦null
true
ID of the source data engine query.
dataPersisted
boolean
true
If true, user is allowed to view extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.) and download data. If false, download is not allowed and only the data schema (feature names and types) will be available.
dataSourceId
string¦null
true
ID of the datasource used as the source of the dataset.
dataSourceType
string
true
The type of the datasource that was used as the source of the dataset.
datasetId
string
true
The ID of this dataset.
datasetSize
integer
true
The size of the dataset as a CSV in bytes.
description
string¦null
true
The description of the dataset.
eda1ModificationDate
string(date-time)
true
The ISO 8601 formatted date and time when the EDA1 for the dataset was updated.
eda1ModifierFullName
string
true
The user who was the last to update EDA1 for the dataset.
The URI of the data source. For example, file_name.csv, jdbc:DATA_SOURCE_GIVEN_NAME/SCHEMA.TABLE_NAME, jdbc:DATA_SOURCE_GIVEN_NAME/&lt;query&gt; for query-based data sources, or https://s3.amazonaws.com/dr-pr-tst-data/kickcars-sample-200.csv.
versionId
string
true
The object ID of the catalog_version the dataset belongs to.
The Google Cloud Platform (GCP) key. Output is the downloaded JSON resulting from creating a service account User Managed Key (in the IAM & admin > Service accounts section of GCP). Required if googleConfigId/configId is not specified. Cannot include this parameter if googleConfigId/configId is specified.
The date column that will be used as a datetime partition column in time series project.
defaultCategoricalAggregationMethod
string
true
Default aggregation method used for categorical feature.
defaultNumericAggregationMethod
string
true
Default aggregation method used for numeric feature.
defaultTextAggregationMethod
string
false
Default aggregation method used for text feature.
endToSeriesMaxDatetime
boolean
false
A boolean value indicating whether the post-aggregated series is generated up to the series maximum datetime or the global maximum datetime.
multiseriesIdColumns
[string]
false
maxItems: 1 minItems: 1
An array with the names of columns identifying the series to which each row of the output dataset belongs. Currently, only one multiseries ID column is supported.
startFromSeriesMinDatetime
boolean
false
A boolean value indicating whether the post-aggregated series starts from the series minimum datetime or the global minimum datetime.
target
string
false
The name of the target for the output dataset.
timeStep
integer
true
minimum: 0 (exclusive)
Number of time steps for the output dataset.
timeUnit
string
true
The unit used as the basis for time steps of the output dataset.
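A time series data prep (aggregation) configuration combining the fields above might be sketched like this. The datetime partition column field name and the aggregation method values are assumptions, and the column names are placeholders.

```python
# Hedged sketch of a time series aggregation configuration.
config = {
    "datetimePartitionColumn": "sale_date",                 # assumed field name
    "multiseriesIdColumns": ["store_id"],                   # exactly one column supported
    "defaultNumericAggregationMethod": "sum",               # assumed method name
    "defaultCategoricalAggregationMethod": "mostFrequent",  # assumed method name
    "target": "revenue",
    "timeStep": 1,      # must be strictly greater than 0
    "timeUnit": "DAY",
}

assert config["timeStep"] > 0
assert len(config["multiseriesIdColumns"]) == 1
```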
The Google Cloud Platform (GCP) key. Output is the downloaded JSON resulting from creating a service account User Managed Key (in the IAM & admin > Service accounts section of GCP). Required if googleConfigId/configId is not specified. Cannot include this parameter if googleConfigId/configId is specified.
googleConfigId
string
false
The ID of the secure configuration shared by an admin. This is deprecated; please use configId instead. If specified, cannot include gcpKey.
An array of strings describing the intended use of the dataset.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
[string]
false
none
continued
Name
Type
Required
Restrictions
Description
doSnapshot
boolean
false
If true, create a snapshot dataset; if false, create a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
namenodeWebhdfsPort
integer
false
The port of HDFS name node.
password
string
false
The password (in cleartext) for authenticating to HDFS using Kerberos. The password will be encrypted on the server side in scope of HTTP request and never saved or stored.
persistDataAfterIngestion
boolean
false
If true, enforces saving all data (for download and sampling) and allows the user to view an extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.). If false, does not enforce saving data; the data schema (feature names and types) will still be available. Setting this parameter to false while doSnapshot is true will result in an error.
url
string(uri)
true
The HDFS url to use as the source of data for the dataset being created.
user
string
false
The username for authenticating to HDFS using Kerberos.
An array of strings describing the intended use of the dataset.
oneOf
Name
Type
Required
Restrictions
Description
» anonymous
string
false
none
xor
Name
Type
Required
Restrictions
Description
» anonymous
[string]
false
none
continued
Name
Type
Required
Restrictions
Description
doSnapshot
boolean
false
If true, create a snapshot dataset; if false, create a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
persistDataAfterIngestion
boolean
false
If true, enforces saving all data (for download and sampling) and allows the user to view an extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.). If false, does not enforce saving data; the data schema (feature names and types) will still be available. Setting this parameter to false while doSnapshot is true will result in an error.