Data Wrangling¶
This page outlines the operations, endpoints, parameters, and example requests and responses for Data Wrangling.
GET /api/v2/recipes/¶
Get a list of the recipes available to the given user.
Code samples¶
# You can also use wget
curl -X GET https://app.datarobot.com/api/v2/recipes/ \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
offset | query | integer | false | Number of results to skip. |
limit | query | integer | false | At most this many results are returned. The default may change without notice. |
orderBy | query | string | false | The attribute used to sort the returned recipes list: 'recipe_id', 'name', 'description', 'dialect', 'status', 'recipe_type', 'created_at', 'created_by', 'updated_at', 'updated_by'. Prefix the attribute name with a dash to sort in descending order, e.g., orderBy='-name'. Defaults to '-created'. |
search | query | string | false | Only return recipes with names that contain the specified string. |
dialect | query | any | false | SQL dialect for Query Generator. |
status | query | any | false | Status used for filtering recipes. |
recipeType | query | any | false | Type of the recipe workflow. |
creatorUserId | query | any | false | Filter results to display only those created by user(s) associated with the specified ID. |
creatorUsername | query | any | false | Filter results to display only those created by user(s) associated with the specified username. |
Enumerated Values¶
Parameter | Value |
---|---|
orderBy | [recipeId, -recipeId, name, -name, description, -description, dialect, -dialect, status, -status, recipeType, -recipeType, createdAt, -createdAt, createdBy, -createdBy, updatedAt, -updatedAt, updatedBy, -updatedBy] |
Example responses¶
200 Response
{
"count": 0,
"data": [
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
],
"next": "http://example.com",
"previous": "http://example.com",
"totalCount": 0
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | RecipesListResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
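The query parameters and the `next` link in the response can be combined to walk the full recipe list. The following is a minimal sketch, not an official client: `build_recipes_url` and `iter_recipes` are hypothetical helper names, the default `limit` of 100 is an arbitrary choice, and the sketch assumes `next` is null or absent on the last page. `fetch_page` stands for any callable that performs the authenticated GET and returns the parsed JSON body.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode

# Host and path copied from the curl sample; adjust for your installation.
RECIPES_URL = "https://app.datarobot.com/api/v2/recipes/"


def build_recipes_url(offset: int = 0, limit: int = 100,
                      order_by: Optional[str] = None,
                      search: Optional[str] = None) -> str:
    """Compose the list URL, leaving out filters that were not requested."""
    params: Dict[str, object] = {"offset": offset, "limit": limit}
    if order_by is not None:
        params["orderBy"] = order_by
    if search is not None:
        params["search"] = search
    return f"{RECIPES_URL}?{urlencode(params)}"


def iter_recipes(fetch_page: Callable[[str], Dict], first_url: str) -> Iterator[Dict]:
    """Yield recipes page by page, following `next` until it is empty.

    Assumption: the last page carries a null (or missing) `next` value.
    """
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("data", [])
        url = page.get("next")
```

Passing the page fetcher in as a callable keeps the paging logic independent of the HTTP library and easy to exercise with canned pages.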
POST /api/v2/recipes/fromDataStore/¶
Create a recipe that can be used for wrangling from a created, fully reconfigured source of data. A data source specifies, via SQL query or selected table and schema data, which data to extract from the data connection (the location of data within a given endpoint) to use for modeling or predictions. A data source has one data connection and one connector but can have many datasets.
Code samples¶
# You can also use wget
curl -X POST https://app.datarobot.com/api/v2/recipes/fromDataStore/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"dataSourceType": "dr-database-v1",
"dataStoreId": "string",
"dialect": "snowflake",
"experimentContainerId": "string",
"inputs": [
{
"canonicalName": "string",
"catalog": "string",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
},
"schema": "string",
"table": "string"
}
],
"useCaseId": "string"
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | RecipeFromDataSourceCreate | false | none |
Example responses¶
201 Response
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
201 | Created | Data source and recipe created successfully. | RecipeResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
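The request body for this endpoint nests a sampling directive inside each input. The sketch below assembles such a body from the field names shown in the body parameter sample. `data_store_recipe_body` is a hypothetical helper, and which fields are optional is an assumption here: the sketch omits `schema`, `catalog`, and `useCaseId` when they are not supplied.

```python
from typing import Any, Dict, Optional


def data_store_recipe_body(data_store_id: str, table: str,
                           schema: Optional[str] = None,
                           catalog: Optional[str] = None,
                           dialect: str = "snowflake",
                           sample_rows: int = 10000,
                           seed: Optional[int] = 0,
                           use_case_id: Optional[str] = None) -> Dict[str, Any]:
    """Build a POST /recipes/fromDataStore/ body with one table input."""
    source: Dict[str, Any] = {
        "table": table,
        # Sampling values mirror the sample body: random sample of 10000 rows.
        "sampling": {
            "directive": "random-sample",
            "arguments": {"rows": sample_rows, "seed": seed},
        },
    }
    # Assumption: schema and catalog may be left out when not needed.
    if schema is not None:
        source["schema"] = schema
    if catalog is not None:
        source["catalog"] = catalog

    body: Dict[str, Any] = {
        "dataSourceType": "dr-database-v1",  # value taken from the sample body
        "dataStoreId": data_store_id,
        "dialect": dialect,
        "inputs": [source],
    }
    if use_case_id is not None:
        body["useCaseId"] = use_case_id
    return body
```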
POST /api/v2/recipes/fromDataset/¶
Create a recipe that can be used for wrangling from the given dataset. Deep copy the dataset's recipe if available. Otherwise, create a new recipe reusing the dataset's data source. A data source specifies, via SQL query or selected table and schema data, which data to extract from the data connection (the location of data within a given endpoint) to use for modeling or predictions. A data source has one data connection and one connector but can have many datasets.
Code samples¶
# You can also use wget
curl -X POST https://app.datarobot.com/api/v2/recipes/fromDataset/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"datasetId": "string",
"dialect": "snowflake",
"status": "preview"
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | GenericRecipeFromDataset | false | none |
Example responses¶
201 Response
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
201 | Created | Recipe created successfully. | RecipeResponse |
422 | Unprocessable Entity | You can't specify dialect or inputs when source Recipe is available. | None |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
POST /api/v2/recipes/fromRecipe/¶
Shallow copy the given recipe, reusing existing data sources. Implicitly creates a duplicate of the wrangling session.
Code samples¶
# You can also use wget
curl -X POST https://app.datarobot.com/api/v2/recipes/fromRecipe/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"name": "string",
"recipeId": "string"
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
body | body | RecipeFromRecipeCreate | false | none |
Example responses¶
201 Response
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
201 | Created | Recipe created successfully. | RecipeResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
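For readers who prefer Python over curl, the same POST can be prepared with the standard library. This is a rough sketch under stated assumptions: `json_post` and `copy_recipe_request` are hypothetical helper names, and the token is whatever bearer token you already use with the curl samples. Nothing is sent until the request object is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request
from typing import Any, Dict

API_ROOT = "https://app.datarobot.com/api/v2"  # host taken from the curl sample


def json_post(url: str, token: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST with bearer authentication."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def copy_recipe_request(token: str, recipe_id: str, name: str) -> urllib.request.Request:
    """Prepare a POST /recipes/fromRecipe/ request that copies `recipe_id`."""
    body = {"name": name, "recipeId": recipe_id}
    return json_post(f"{API_ROOT}/recipes/fromRecipe/", token, body)
```

To send it, call `urllib.request.urlopen(copy_recipe_request(token, recipe_id, name))` and parse the 201 response body as JSON.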
DELETE /api/v2/recipes/{recipeId}/¶
Marks the wrangling recipe with a given ID as deleted.
Code samples¶
# You can also use wget
curl -X DELETE https://app.datarobot.com/api/v2/recipes/{recipeId}/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"featureDiscoverySupervisedFeatureReduction": true,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
body | body | RecipeSettingsUpdate | false | none |
Example responses¶
204 Response
{
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
204 | No Content | Successfully deleted. | RecipeSettingsResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/recipes/{recipeId}/¶
Retrieve the wrangling recipe with the given ID.
Code samples¶
# You can also use wget
curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/ \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
Example responses¶
200 Response
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | RecipeResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
PATCH /api/v2/recipes/{recipeId}/¶
Patch a wrangling recipe's name and description.
Code samples¶
# You can also use wget
curl -X PATCH https://app.datarobot.com/api/v2/recipes/{recipeId}/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"description": "string",
"name": "string"
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
body | body | PatchRecipe | false | none |
Example responses¶
200 Response
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | RecipeResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
PUT /api/v2/recipes/{recipeId}/downsampling/¶
Updates the downsampling directive in the recipe. Downsampling will be applied on top of the recipe during publishing.
Code samples¶
# You can also use wget
curl -X PUT https://app.datarobot.com/api/v2/recipes/{recipeId}/downsampling/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
}
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
body | body | RecipeDownsamplingUpdate | false | none |
Example responses¶
200 Response
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | RecipeResponse |
422 | Unprocessable Entity | Cannot modify published recipe. | None |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
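The downsampling body wraps a directive and its arguments one level deep, which is easy to get wrong by hand. A small sketch of a builder follows. `downsampling_body` is a hypothetical helper, and it only covers the `random-sample` directive that appears in the sample body; other directives, if any exist, are not modeled.

```python
import json
from typing import Any, Dict, Optional


def downsampling_body(rows: int, seed: Optional[int] = None) -> Dict[str, Any]:
    """Build the PUT /recipes/{recipeId}/downsampling/ body for a random sample."""
    return {
        "downsampling": {
            "directive": "random-sample",
            # A seed of None serializes to JSON null, as in the sample body.
            "arguments": {"rows": rows, "seed": seed},
        }
    }


# Meanings copied from the Responses table above.
DOWNSAMPLING_RESPONSES = {
    200: "OK",
    422: "Cannot modify published recipe.",
}

payload = json.dumps(downsampling_body(5000))
```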
GET /api/v2/recipes/{recipeId}/inputs/¶
Gets inputs of the given recipe.
Code samples¶
# You can also use wget
curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/inputs/ \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
Example responses¶
200 Response
{
"inputs": [
{
"columnCount": 0,
"connectionName": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"name": "string",
"rowCount": 0,
"status": "ABORTED"
}
]
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | RecipeInputsResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/recipes/{recipeId}/insights/¶
Retrieve recipe insights.
Code samples¶
# You can also use wget
curl -X GET "https://app.datarobot.com/api/v2/recipes/{recipeId}/insights/?limit=100&offset=0" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
limit | query | integer | true | At most this many results are returned. The default may change and a maximum limit may be imposed without notice. |
offset | query | integer | true | This many results will be skipped. |
recipeId | path | string | true | The ID of the recipe. |
Example responses¶
200 Response
{
"count": 0,
"data": [
{
"datasetId": "string",
"datasetVersionId": "string",
"dateFormat": "string",
"featureType": "Boolean",
"id": 0,
"isZeroInflated": true,
"keySummary": {
"key": "string",
"summary": {
"dataQualities": "ISSUES_FOUND",
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"pctRows": 0,
"stdDev": 0
}
},
"language": "string",
"lowInformation": true,
"majorityClassCount": 0,
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"minorityClassCount": 0,
"naCount": 0,
"name": "string",
"plot": [
{
"count": 0,
"label": "string"
}
],
"stdDev": 0,
"timeSeriesEligibilityReason": "string",
"timeSeriesEligibilityReasonAggregation": "string",
"timeSeriesEligible": true,
"timeSeriesEligibleAggregation": true,
"timeStep": 0,
"timeStepAggregation": 0,
"timeUnit": "string",
"timeUnitAggregation": "string",
"uniqueCount": 0
}
],
"message": "string",
"next": "http://example.com",
"previous": "http://example.com",
"status": "ABORTED",
"totalCount": 0
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | RefinexInsightsResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
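Unlike the other list endpoints on this page, `limit` and `offset` are required here, so they must always appear in the query string. The sketch below builds the URL and pulls out the features flagged as low information from a parsed response. Both function names are hypothetical, and the defaults of 100 and 0 simply mirror the curl sample.

```python
from typing import Any, Dict, List
from urllib.parse import quote, urlencode

API_ROOT = "https://app.datarobot.com/api/v2"  # host taken from the curl sample


def insights_url(recipe_id: str, limit: int = 100, offset: int = 0) -> str:
    """Compose the insights URL; limit and offset are always included."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{API_ROOT}/recipes/{quote(recipe_id, safe='')}/insights/?{query}"


def low_information_features(insights: Dict[str, Any]) -> List[str]:
    """Return the names of features whose `lowInformation` flag is true."""
    return [
        feature["name"]
        for feature in insights.get("data", [])
        if feature.get("lowInformation")
    ]
```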
GET /api/v2/recipes/{recipeId}/operations/{operationIndex}/¶
Returns an operation configuration with an additional inputColumns field to show the list of columns available at that stage.
Code samples¶
# You can also use wget
curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/operations/{operationIndex}/ \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
operationIndex | path | integer | true | The zero-based index of the operation. |
Example responses¶
200 Response
{}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | OperationDetails |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
GET /api/v2/recipes/{recipeId}/preview/¶
Retrieve the wrangling preview for the recipe with the given ID.
Code samples¶
# You can also use wget
curl -X GET https://app.datarobot.com/api/v2/recipes/{recipeId}/preview/ \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
offset | query | integer | false | Number of results to skip. |
limit | query | integer | false | At most this many results are returned. The default may change without notice. |
recipeId | path | string | true | The ID of the recipe. |
Example responses¶
200 Response
{
"byteSize": 0,
"columns": [
"string"
],
"count": 0,
"data": [
[
"string"
]
],
"estimatedSizeExceedsLimit": true,
"next": "http://example.com",
"previous": "http://example.com",
"resultSchema": [
{
"columnDefaultValue": "string",
"dataType": "string",
"dataTypeInt": 0,
"isInPrimaryKey": true,
"isNullable": "NO",
"name": "string",
"precision": 0,
"scale": 0
}
],
"totalCount": 0
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | RecipePreviewResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
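The preview response returns column names and row values separately, in `columns` and `data`. A short sketch of how the two might be zipped into per-row dictionaries is shown here. `preview_records` is a hypothetical helper, and it assumes each inner list in `data` is ordered the same way as `columns`, which the sample response suggests but does not state.

```python
from typing import Any, Dict, List


def preview_records(preview: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each row in `data` with the names in `columns`.

    Assumption: row values follow the same order as the column names.
    """
    columns = preview["columns"]
    return [dict(zip(columns, row)) for row in preview["data"]]
```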
POST /api/v2/recipes/{recipeId}/preview/¶
Starts the preview process for the recipe. Since this is an asynchronous process, this endpoint returns a status ID to use with the status endpoint and a Location header with the URL that can be polled for status. Launches a WranglingJob, which includes:
1. InitialSamplingJob, if it hasn't been launched before
2. The preview query itself
3. The recipe EDA job
Insights computation is launched implicitly if sampling was specified and no operations were specified.
Code samples¶
# You can also use wget
curl -X POST https://app.datarobot.com/api/v2/recipes/{recipeId}/preview/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"credentialId": "string"
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
body | body | RecipeRunPreviewAsync | false | none |
Example responses¶
202 Response
{
"code": 0,
"created": "2019-08-24T14:15:22Z",
"description": "",
"message": "",
"status": "INITIALIZED",
"statusId": "e900225c-0629-4e96-be6e-86a17a309645",
"statusType": ""
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
202 | Accepted | none | StatusResponse |
422 | Unprocessable Entity | Credentials were not provided and default credentials were not found. | None |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
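Because the preview runs asynchronously, a client typically polls the URL from the Location header until the job leaves its in-progress states. The loop below is a sketch only. `wait_for_job` is a hypothetical helper, `fetch_status` is any callable that GETs the polling URL and returns parsed JSON, and the set of in-progress states is an assumption: `INITIALIZED` comes from the sample response, while `RUNNING` is a guess that should be checked against the status endpoint's documentation.

```python
import time
from typing import Any, Callable, Dict, Iterable


def wait_for_job(fetch_status: Callable[[], Dict[str, Any]],
                 in_progress: Iterable[str] = ("INITIALIZED", "RUNNING"),
                 interval_s: float = 2.0,
                 max_polls: int = 30,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the reported status is no longer an in-progress state.

    Raises TimeoutError if the job is still in progress after `max_polls` polls.
    """
    pending = set(in_progress)
    for _ in range(max_polls):
        payload = fetch_status()
        if payload.get("status") not in pending:
            return payload
        sleep(interval_s)
    raise TimeoutError(f"job still in progress after {max_polls} polls")
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.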
POST /api/v2/recipes/{recipeId}/relationshipQualityAssessments/¶
Submit a job to assess the quality of the relationship configuration within a Feature Discovery session in Workbench.
Code samples¶
# You can also use wget
curl -X POST https://app.datarobot.com/api/v2/recipes/{recipeId}/relationshipQualityAssessments/ \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"credentials": [
{
"catalogVersionId": "string",
"credentialId": "string",
"url": "string"
}
],
"datetimePartitionColumn": "string",
"featureEngineeringPredictionPoint": "string",
"relationshipsConfiguration": {
"datasetDefinitions": [
{
"catalogId": "string",
"catalogVersionId": "string",
"featureListId": "string",
"identifier": "string",
"primaryTemporalKey": "string",
"snapshotPolicy": "specified"
}
],
"featureDiscoveryMode": "default",
"featureDiscoverySettings": [
{
"description": "string",
"family": "string",
"name": "string",
"settingType": "string",
"value": true,
"verboseName": "string"
}
],
"id": "string",
"relationships": [
{
"dataset1Identifier": "string",
"dataset1Keys": [
"string"
],
"dataset2Identifier": "string",
"dataset2Keys": [
"string"
],
"featureDerivationWindowEnd": 0,
"featureDerivationWindowStart": 0,
"featureDerivationWindowTimeUnit": "MILLISECOND",
"featureDerivationWindows": [
{
"end": 0,
"start": 0,
"unit": "MILLISECOND"
}
],
"predictionPointRounding": 0,
"predictionPointRoundingTimeUnit": "MILLISECOND"
}
],
"snowflakePushDownCompatible": true
},
"userId": "string"
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
body | body | RelationshipQualityAssessmentsCreate | false | none |
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
202 | Accepted | Relationship quality assessment has successfully started. See the Location header. | None |
422 | Unprocessable Entity | Unable to process the request | None |
Response Headers¶
Status | Header | Type | Format | Description |
---|---|---|---|---|
202 | Location | string |  | A URL that can be polled to check the status. |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
PATCH /api/v2/recipes/{recipeId}/settings/¶
Updates some recipe settings applicable in the modeling stage.
Code samples¶
# You can also use wget
curl -X PATCH https://app.datarobot.com/api/v2/recipes/{recipeId}/settings/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"featureDiscoverySupervisedFeatureReduction": true,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
body | body | RecipeSettingsUpdate | false | none |
Example responses¶
200 Response
{
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
200 | OK | none | RecipeSettingsResponse |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
POST /api/v2/recipes/{recipeId}/sql/¶
Builds a SQL query for the recipe. Overrides operations to get the adjusted query without changing the recipe.
Code samples¶
# You can also use wget
curl -X POST https://app.datarobot.com/api/v2/recipes/{recipeId}/sql/ \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Authorization: Bearer {access-token}"
Body parameter¶
{
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
]
}
Parameters
Name | In | Type | Required | Description |
---|---|---|---|---|
recipeId | path | string | true | The ID of the recipe. |
body | body | BuildRecipeSql | false | none |
Example responses¶
201 Response
{
"sql": "string"
}
Responses¶
Status | Meaning | Description | Schema |
---|---|---|---|
201 | Created | none | BuildRecipeSqlResponse |
409 | Conflict | Input source data is not ready yet. | None |
422 | Unprocessable Entity | Failed to build SQL. | None |
To perform this operation, you must be authenticated by means of one of the following methods:
BearerAuth
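According to the BuildRecipeSql schema on this page, the `operations` field behaves differently for null, an empty list, and a non-empty list. The sketch below makes those three cases explicit. `build_sql_body` and `filter_operation` are hypothetical helpers, and the column name and `functionArguments` values in the usage line are illustrative only, since the page does not define the argument format for `between`.

```python
from typing import Any, Dict, List, Optional

MAX_OPERATIONS = 1000  # maxItems restriction from the BuildRecipeSql schema


def filter_operation(column: str, function: str, function_arguments: List[Any],
                     keep_rows: bool = True, operator: str = "and") -> Dict[str, Any]:
    """Build one `filter` directive shaped like the sample body."""
    return {
        "directive": "filter",
        "arguments": {
            "conditions": [{
                "column": column,
                "function": function,
                "functionArguments": function_arguments,
            }],
            "keepRows": keep_rows,
            "operator": operator,
        },
    }


def build_sql_body(operations: Optional[List[Dict[str, Any]]] = None) -> Dict[str, Any]:
    """Build the POST /recipes/{recipeId}/sql/ body.

    None  -> null: SQL for the recipe as stored.
    []    -> basic SELECT <columns> FROM <table> query.
    [...] -> SQL with the recipe operations overridden (recipe unchanged).
    """
    if operations is not None and len(operations) > MAX_OPERATIONS:
        raise ValueError(f"at most {MAX_OPERATIONS} operations are allowed")
    return {"operations": operations}


# Illustrative values only: "age" and [18, 65] are not taken from the page.
override = build_sql_body([filter_operation("age", "between", [18, 65])])
```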
Schemas¶
AggregateDirectiveArguments
{
"aggregations": [
{
"feature": null,
"functions": [
"sum"
]
}
],
"groupBy": [
"string"
]
}
The aggregation description.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
aggregations | [Aggregation] | true | minItems: 1 | The aggregations. |
groupBy | [string] | true | minItems: 1 | The column(s) to group by. |
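The `minItems: 1` restrictions on `aggregations`, `groupBy`, and each aggregation's `functions` can be checked on the client before a request is sent. A minimal sketch follows. `aggregate_arguments` is a hypothetical helper, and `sum` is the only function name used because it is the only one shown in the sample.

```python
from typing import Any, Dict, List


def aggregate_arguments(group_by: List[str],
                        aggregations: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build AggregateDirectiveArguments, enforcing the minItems: 1 rules."""
    if not group_by:
        raise ValueError("groupBy needs at least one column")
    if not aggregations:
        raise ValueError("aggregations needs at least one entry")
    for aggregation in aggregations:
        if not aggregation.get("functions"):
            raise ValueError("each aggregation needs at least one function")
    return {"aggregations": list(aggregations), "groupBy": list(group_by)}
```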
Aggregation
{
"feature": null,
"functions": [
"sum"
]
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
feature | string¦null | false |  | The feature. |
functions | [string] | true | minItems: 1 | The functions. |
BuildRecipeSql
{
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
]
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
operations | [OneOfDirective]¦null | false | maxItems: 1000 | List of operations that override the recipe operations when building SQL; defaults to null. It doesn't modify the recipe itself. A missing operations field or null gives the original recipe SQL. An empty operations list produces a basic query of the form: SELECT <list of columns> FROM <table name> |
BuildRecipeSqlResponse
{
"sql": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
sql | string | true |  | Generated SQL. |
CatalogPasswordCredentials
{
"catalogVersionId": "string",
"password": "string",
"url": "string",
"user": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
catalogVersionId | string | false |  | Identifier of the catalog version. |
password | string | true |  | The password (in cleartext) for database authentication. The password will be encrypted on the server side as part of the HTTP request and never saved or stored. |
url | string | false |  | URL that is subject to credentials. |
user | string | true |  | The username for database authentication. |
ComputeNewDirectiveArguments
{
"expression": "string",
"newFeatureName": "string"
}
The transformation description.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
expression | string | true |  | The expression for new feature computation. |
newFeatureName | string | true |  | The new feature name which will hold results of expression evaluation. |
DataStoreExtendedColumnNoKeysResponse
{
"columnDefaultValue": "string",
"dataType": "string",
"dataTypeInt": 0,
"isInPrimaryKey": true,
"isNullable": "NO",
"name": "string",
"precision": 0,
"scale": 0
}
JDBC result column description
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
columnDefaultValue | string¦null | false |  | Default value of the column. |
dataType | string | true |  | DataType of the column. |
dataTypeInt | integer | false |  | Integer value of the column data type. |
isInPrimaryKey | boolean | false |  | True if the column is in the primary key. |
isNullable | string¦null | false |  | If the column values can be null. |
name | string | true |  | Name of the column. |
precision | integer | false |  | Precision of the column. |
scale | integer | false |  | Scale of the column. |
Enumerated Values¶
Property | Value |
---|---|
isNullable | [NO, UNKNOWN, YES] |
DatasetDefinition
{
"catalogId": "string",
"catalogVersionId": "string",
"featureListId": "string",
"identifier": "string",
"primaryTemporalKey": "string",
"snapshotPolicy": "specified"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
catalogId | string | true | ID of the catalog item. | |
catalogVersionId | string | true | ID of the catalog item version. | |
featureListId | string¦null | false | ID of the feature list. This decides which columns in the dataset are used for feature generation. | |
identifier | string | true | maxLength: 20, minLength: 1 | Short name of the dataset (used directly as part of the generated feature names). |
primaryTemporalKey | string¦null | false | Name of the column indicating time of record creation. | |
snapshotPolicy | string | false | Policy for using dataset snapshots when creating a project or making predictions. Must be one of the following values: 'specified': Use specific snapshot specified by catalogVersionId. 'latest': Use latest snapshot from the same catalog item. 'dynamic': Get data from the source (only applicable for JDBC datasets). |
Enumerated Values¶
Property | Value |
---|---|
snapshotPolicy | [specified , latest , dynamic ] |
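The restrictions above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name `make_dataset_definition` and its keyword arguments are hypothetical, and only the field names, the 1 to 20 character limit on `identifier`, and the `snapshotPolicy` values come from the tables above.

```python
from typing import Optional

SNAPSHOT_POLICIES = ("specified", "latest", "dynamic")


def make_dataset_definition(
    catalog_id: str,
    catalog_version_id: str,
    identifier: str,
    snapshot_policy: Optional[str] = None,
    feature_list_id: Optional[str] = None,
    primary_temporal_key: Optional[str] = None,
) -> dict:
    """Build a DatasetDefinition body (hypothetical helper)."""
    if not 1 <= len(identifier) <= 20:
        raise ValueError("identifier must be 1 to 20 characters long")
    if snapshot_policy is not None and snapshot_policy not in SNAPSHOT_POLICIES:
        raise ValueError(f"snapshotPolicy must be one of {SNAPSHOT_POLICIES}")
    body = {
        "catalogId": catalog_id,
        "catalogVersionId": catalog_version_id,
        "identifier": identifier,
    }
    # Optional fields are included only when the caller supplies them.
    optional = {
        "snapshotPolicy": snapshot_policy,
        "featureListId": feature_list_id,
        "primaryTemporalKey": primary_temporal_key,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```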
DatasetFeaturePlotDataResponse
{
"count": 0,
"label": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
count | number | true | Number of values in the bin. | |
label | string | true | Bin start for numerical/uncapped, or string value for categorical. The bin ==Missing== is created for rows that did not have the feature. |
DatasetInputCreate
{
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
Dataset configuration.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
sampling | DatasetInputSampling | false | Sampling data transformation. |
DatasetInputSampling
{
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
Sampling data transformation.
Properties¶
oneOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | RandomSampleArgumentsCreate | false | The interactive sampling config. | |
» directive | string | true | The directive name. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | LimitDirectiveArguments | true | The interactive sampling config. | |
» directive | string | true | The directive name. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | DatetimeSampleArgumentsCreate | false | The interactive sampling config. | |
» directive | string | true | The directive name. |
Enumerated Values¶
Property | Value |
---|---|
directive | random-sample |
directive | limit |
directive | datetime-sample |
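Because exactly one of the three variants applies, a client can select the variant by its `directive` value. The sketch below wraps a sampling variant in a DatasetInputCreate body; the function name `make_dataset_input` is an assumption made for illustration, and the `rows` range of 1 to 10000 is taken from the argument schemas documented for the three variants.

```python
SAMPLING_DIRECTIVES = ("random-sample", "limit", "datetime-sample")


def make_dataset_input(directive: str, **arguments) -> dict:
    """Wrap one sampling variant in a DatasetInputCreate body (hypothetical helper)."""
    if directive not in SAMPLING_DIRECTIVES:
        raise ValueError(f"directive must be one of {SAMPLING_DIRECTIVES}")
    rows = arguments.get("rows")
    if rows is not None and not 1 <= rows <= 10000:
        raise ValueError("rows must be between 1 and 10000")
    # The limit variant marks both arguments and rows as required.
    if directive == "limit" and rows is None:
        raise ValueError("the limit directive requires rows")
    return {"sampling": {"directive": directive, "arguments": arguments}}
```

For example, `make_dataset_input("random-sample", rows=5000, seed=42)` produces a body with the same shape as the JSON example above.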
DatetimeSampleArgumentsCreate
{
"datetimePartitionColumn": "string",
"multiseriesIdColumn": null,
"rows": 10000,
"selectedSeries": [
"string"
],
"strategy": "earliest"
}
The interactive sampling config.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
datetimePartitionColumn | string | true | The datetime partition column to order by. | |
multiseriesIdColumn | string¦null | false | The series ID column, if present. | |
rows | integer | false | maximum: 10000, minimum: 1 | The number of rows to be sampled. |
selectedSeries | [string]¦null | false | maxItems: 1000, minItems: 1 | The selected series to be sampled. Requires "multiseriesIdColumn". |
strategy | string | true | Sets whether to take the latest or earliest rows relative to the datetime partition column. |
Enumerated Values¶
Property | Value |
---|---|
strategy | [earliest , latest ] |
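The dependency between `selectedSeries` and `multiseriesIdColumn` is easy to miss, so it may help to check it before sending a request. This is a sketch under stated assumptions: the helper name and its parameters are hypothetical, and the checks mirror only the restrictions listed in the table above.

```python
from typing import Optional, Sequence


def make_datetime_sample_arguments(
    datetime_partition_column: str,
    strategy: str,
    rows: Optional[int] = None,
    multiseries_id_column: Optional[str] = None,
    selected_series: Optional[Sequence[str]] = None,
) -> dict:
    """Build DatetimeSampleArgumentsCreate (hypothetical helper)."""
    if strategy not in ("earliest", "latest"):
        raise ValueError("strategy must be 'earliest' or 'latest'")
    if rows is not None and not 1 <= rows <= 10000:
        raise ValueError("rows must be between 1 and 10000")
    if selected_series is not None:
        # selectedSeries is documented as requiring multiseriesIdColumn.
        if multiseries_id_column is None:
            raise ValueError("selectedSeries requires multiseriesIdColumn")
        if not 1 <= len(selected_series) <= 1000:
            raise ValueError("selectedSeries must hold 1 to 1000 items")
    arguments = {
        "datetimePartitionColumn": datetime_partition_column,
        "strategy": strategy,
    }
    if rows is not None:
        arguments["rows"] = rows
    if multiseries_id_column is not None:
        arguments["multiseriesIdColumn"] = multiseries_id_column
    if selected_series is not None:
        arguments["selectedSeries"] = list(selected_series)
    return arguments
```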
DownsamplingRandomDirectiveArguments
{
"rows": 0,
"seed": null
}
The downsampling configuration.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
rows | integer | true | The number of sampled rows. | |
seed | integer¦null | false | The starting number of the random number generator. |
DropColumnsArguments
{
"columns": [
"string"
]
}
The transformation description.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
columns | [string] | true | maxItems: 1000, minItems: 1 | The list of columns. |
ExperimentContainerUserResponse
{
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
User who created the Use Case
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
email | string¦null | true | The email address of the user. | |
fullName | string¦null | false | The full name of the user. | |
id | string | true | The ID of the user. | |
userhash | string¦null | false | The user's Gravatar hash. | |
username | string¦null | false | The username of the user. |
FeatureDerivationWindow
{
"end": 0,
"start": 0,
"unit": "MILLISECOND"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
end | integer | true | maximum: 0 | How many featureDerivationWindowUnits of each dataset's primary temporal key into the past relative to the datetimePartitionColumn the feature derivation window should end. Will be a non-positive integer, if present. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided. |
start | integer | true | maximum: 0 (exclusive) | How many featureDerivationWindowUnits of each dataset's primary temporal key into the past relative to the datetimePartitionColumn the feature derivation window should begin. Will be a negative integer, if present. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided. |
unit | string | true | Time unit of the feature derivation window. Supported values are MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR. If present, time-aware joins will be used. Only applicable when table1Identifier is not provided. |
Enumerated Values¶
Property | Value |
---|---|
unit | [MILLISECOND , SECOND , MINUTE , HOUR , DAY , WEEK , MONTH , QUARTER , YEAR ] |
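The sign rules are the part of this schema most likely to trip up a client: `start` must be strictly negative, while `end` may be zero. A minimal sketch of a client-side check follows; the helper name is hypothetical, and no relationship between `start` and `end` is enforced because the table does not state one.

```python
WINDOW_UNITS = (
    "MILLISECOND", "SECOND", "MINUTE", "HOUR", "DAY",
    "WEEK", "MONTH", "QUARTER", "YEAR",
)


def make_feature_derivation_window(start: int, end: int, unit: str) -> dict:
    """Build a FeatureDerivationWindow body (hypothetical helper)."""
    if start >= 0:
        raise ValueError("start must be a negative integer")
    if end > 0:
        raise ValueError("end must be a non-positive integer")
    if unit not in WINDOW_UNITS:
        raise ValueError(f"unit must be one of {WINDOW_UNITS}")
    return {"start": start, "end": end, "unit": unit}
```

For example, `make_feature_derivation_window(-30, 0, "DAY")` describes a window that begins 30 days in the past and ends at the reference time.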
FeatureDiscoverySettingResponse
{
"description": "string",
"family": "string",
"name": "string",
"settingType": "string",
"value": true,
"verboseName": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
description | string | true | Description of this feature discovery setting | |
family | string | true | Family of this feature discovery setting | |
name | string | true | maxLength: 100 | Name of this feature discovery setting |
settingType | string | true | Type of this feature discovery setting | |
value | boolean | true | Value of this feature discovery setting | |
verboseName | string | true | Human readable name of this feature discovery setting |
FeatureKeySummaryDetailsResponseValidatorMultilabel
{
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"pctRows": 0,
"stdDev": 0
}
Statistics of the key.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
max | number | true | Maximum value of the key. | |
mean | number | true | Mean value of the key. | |
median | number | true | Median value of the key. | |
min | number | true | Minimum value of the key. | |
pctRows | number | true | Percentage occurrence of key in the EDA sample of the feature. | |
stdDev | number | true | Standard deviation of the key. |
FeatureKeySummaryDetailsResponseValidatorSummarizedCategorical
{
"dataQualities": "ISSUES_FOUND",
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"pctRows": 0,
"stdDev": 0
}
Statistics of the key.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
dataQualities | string | true | The indicator of data quality assessment of the feature. | |
max | number | true | Maximum value of the key. | |
mean | number | true | Mean value of the key. | |
median | number | true | Median value of the key. | |
min | number | true | Minimum value of the key. | |
pctRows | number | true | Percentage occurrence of key in the EDA sample of the feature. | |
stdDev | number | true | Standard deviation of the key. |
Enumerated Values¶
Property | Value |
---|---|
dataQualities | [ISSUES_FOUND , NOT_ANALYZED , NO_ISSUES_FOUND ] |
FeatureKeySummaryResponseValidatorMultilabel
{
"key": "string",
"summary": {
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"pctRows": 0,
"stdDev": 0
}
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
key | string | true | Name of the key. | |
summary | FeatureKeySummaryDetailsResponseValidatorMultilabel | true | Statistics of the key. |
FeatureKeySummaryResponseValidatorSummarizedCategorical
{
"key": "string",
"summary": {
"dataQualities": "ISSUES_FOUND",
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"pctRows": 0,
"stdDev": 0
}
}
For Summarized Categorical columns, this contains statistics for the top 50 keys (truncated to 103 characters).
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
key | string | true | Name of the key. | |
summary | FeatureKeySummaryDetailsResponseValidatorSummarizedCategorical | true | Statistics of the key. |
FilterCondition
{
"column": "string",
"function": "between",
"functionArguments": []
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
column | string | true | The column name. | |
function | string | true | The function used to evaluate each value. | |
functionArguments | [anyOf] | false | maxItems: 2 | The arguments to use with the function. |
anyOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» anonymous | string | false | none |
or
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» anonymous | integer | false | none |
or
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» anonymous | number | false | none |
Enumerated Values¶
Property | Value |
---|---|
function | [between , contains , eq , gt , gte , lt , lte , neq , notnull , null ] |
FilterDirectiveArguments
{
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
}
The transformation description.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
conditions | [FilterCondition] | true | maxItems: 1000 | The list of conditions. |
keepRows | boolean | true | Determines whether matching rows should be kept or dropped. | |
operator | string | true | The operator to apply on multiple conditions. |
Enumerated Values¶
Property | Value |
---|---|
operator | [and , or ] |
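A filter is a list of FilterCondition objects combined with one operator. The sketch below assembles FilterDirectiveArguments from the enumerated values above. The helper names are hypothetical, and the reading of the `between` arguments as a lower and an upper bound is an assumption; the schema only limits `functionArguments` to two items.

```python
FILTER_FUNCTIONS = (
    "between", "contains", "eq", "gt", "gte",
    "lt", "lte", "neq", "notnull", "null",
)


def make_condition(column: str, function: str, *function_arguments) -> dict:
    """Build one FilterCondition (hypothetical helper)."""
    if function not in FILTER_FUNCTIONS:
        raise ValueError(f"function must be one of {FILTER_FUNCTIONS}")
    if len(function_arguments) > 2:
        raise ValueError("functionArguments accepts at most 2 items")
    return {
        "column": column,
        "function": function,
        "functionArguments": list(function_arguments),
    }


def make_filter_arguments(conditions, keep_rows: bool = True, operator: str = "and") -> dict:
    """Build FilterDirectiveArguments from a list of conditions."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    if len(conditions) > 1000:
        raise ValueError("conditions accepts at most 1000 items")
    return {"conditions": list(conditions), "keepRows": keep_rows, "operator": operator}


# Keep rows where age is in an assumed [18, 65] range and country is present.
arguments = make_filter_arguments(
    [make_condition("age", "between", 18, 65), make_condition("country", "notnull")]
)
```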
GenericRecipeFromDataset
{
"datasetId": "string",
"dialect": "snowflake",
"status": "preview"
}
Properties¶
oneOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» datasetId | string | true | Dataset ID to create a Recipe from. | |
» dialect | string | true | Source type data was retrieved from. Should be omitted for dataset rewrangling. | |
» status | string | true | Preview recipe |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» datasetId | string | true | Dataset ID to create a Recipe from. | |
» datasetVersionId | string¦null | false | Dataset version ID to create a Recipe from. | |
» dialect | string | false | Source type data was retrieved from. Should be omitted for dataset rewrangling and feature discovery recipes. | |
» experimentContainerId | string | false | [DEPRECATED - replaced with use_case_id] ID assigned to the Use Case, which is an experimental container for the recipe. | |
» inputs | [DatasetInputCreate] | false | maxItems: 1, minItems: 1 | List of recipe inputs. Should be omitted on dataset wrangling when dataset is created from recipe. |
» recipeType | string | true | Type of the recipe workflow. | |
» status | string | false | Wrangling recipe | |
» useCaseId | string | false | ID of the Use Case associated with the recipe. |
Enumerated Values¶
Property | Value |
---|---|
dialect | [snowflake , bigquery , databricks , spark , postgres ] |
status | preview |
dialect | [snowflake , bigquery , databricks , spark , postgres ] |
recipeType | [wrangling , Wrangling , WRANGLING , featureDiscovery , FeatureDiscovery , FEATURE_DISCOVERY , featureDiscoveryPrivatePreview , FeatureDiscoveryPrivatePreview , FEATURE_DISCOVERY_PRIVATE_PREVIEW ] |
status | draft |
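As an illustration of the second variant, the sketch below builds a wrangling recipe body from a dataset ID. The helper name and parameters are hypothetical, the endpoint that accepts this body is not shown here, and the helper sends `useCaseId` rather than the deprecated `experimentContainerId`.

```python
from typing import Optional

RECIPE_DIALECTS = ("snowflake", "bigquery", "databricks", "spark", "postgres")


def make_wrangling_recipe_body(
    dataset_id: str,
    use_case_id: Optional[str] = None,
    dialect: Optional[str] = None,
    dataset_version_id: Optional[str] = None,
    sampling: Optional[dict] = None,
) -> dict:
    """Build the wrangling variant of GenericRecipeFromDataset (hypothetical helper)."""
    body = {"datasetId": dataset_id, "recipeType": "wrangling"}
    if dialect is not None:
        if dialect not in RECIPE_DIALECTS:
            raise ValueError(f"dialect must be one of {RECIPE_DIALECTS}")
        body["dialect"] = dialect
    if use_case_id is not None:
        body["useCaseId"] = use_case_id
    if dataset_version_id is not None:
        body["datasetVersionId"] = dataset_version_id
    if sampling is not None:
        # inputs holds exactly one DatasetInputCreate item when present.
        body["inputs"] = [{"sampling": sampling}]
    return body
```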
JDBCTableDataSourceInputCreate
{
"canonicalName": "string",
"catalog": "string",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
},
"schema": "string",
"table": "string"
}
Data source configuration.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
canonicalName | string | true | Data source canonical name. | |
catalog | string | false | maxLength: 256 | Catalog name in the database if supported. |
sampling | SampleDirectiveCreate | false | The input data transformation steps. | |
schema | string | false | maxLength: 256 | Schema associated with the table or view in the database if the data source is not query based. |
table | string | true | maxLength: 256 | Table or view name in the database if the data source is not query based. |
JobErrorCode
0
JobErrorCode
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
JobErrorCode | integer | false | Possible job error codes. This enum exists for consistency with the DataRobot Status API. |
Enumerated Values¶
Property | Value |
---|---|
JobErrorCode | [0 , 1 ] |
JobExecutionState
"INITIALIZED"
JobExecutionState
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
JobExecutionState | string | false | Possible job states. Values match the DataRobot Status API. |
Enumerated Values¶
Property | Value |
---|---|
JobExecutionState | [INITIALIZED , RUNNING , COMPLETED , ERROR , ABORTED , EXPIRED ] |
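A client that polls a job usually needs to know when to stop. The sketch below treats every state other than INITIALIZED and RUNNING as final; that split is an assumption based on the state names, not something the schema states, and the helper name is hypothetical.

```python
JOB_STATES = ("INITIALIZED", "RUNNING", "COMPLETED", "ERROR", "ABORTED", "EXPIRED")

# Assumption: these states are final and will not change on later polls.
FINAL_STATES = frozenset({"COMPLETED", "ERROR", "ABORTED", "EXPIRED"})


def is_finished(state: str) -> bool:
    """Return True when polling can stop for the given JobExecutionState."""
    if state not in JOB_STATES:
        raise ValueError(f"unknown job state: {state!r}")
    return state in FINAL_STATES
```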
JoinArguments
{
"joinType": "inner",
"leftKeys": [
"string"
],
"rightDataSourceId": "string",
"rightKeys": [
"string"
],
"source": "table"
}
The transformation description.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
joinType | string | true | The join type between primary and secondary data sources. | |
leftKeys | [string] | true | maxItems: 10000, minItems: 1 | The list of columns to be used in the "ON" clause from the primary data source. |
rightDataSourceId | string | true | The ID of the input data source. | |
rightKeys | [string] | true | maxItems: 10000, minItems: 1 | The list of columns to be used in the "ON" clause from a secondary data source. |
source | string | true | The source type. |
Enumerated Values¶
Property | Value |
---|---|
joinType | [inner , left ] |
source | table |
LimitDirectiveArguments
{
"rows": 1000
}
The interactive sampling config.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
rows | integer | true | maximum: 10000, minimum: 1 | The number of rows to be selected. |
OneOfDirective
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
Properties¶
oneOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | FilterDirectiveArguments | true | The transformation description. | |
» directive | string | true | The single data transformation step. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | ReplaceDirectiveArguments | true | The transformation description. | |
» directive | string | true | The single data transformation step. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | ComputeNewDirectiveArguments | true | The transformation description. | |
» directive | string | true | The single data transformation step. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» directive | string | true | The single data transformation step. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | DropColumnsArguments | true | The transformation description. | |
» directive | string | true | The single data transformation step. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | RenameColumnsArguments | true | The transformation description. | |
» directive | string | true | The single data transformation step. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | JoinArguments | true | The transformation description. | |
» directive | string | true | The single data transformation step. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | AggregateDirectiveArguments | true | The aggregation description. | |
» directive | string | true | The single data transformation step. |
Enumerated Values¶
Property | Value |
---|---|
directive | filter |
directive | replace |
directive | compute-new |
directive | dedupe-rows |
directive | drop-columns |
directive | rename-columns |
directive | join |
directive | aggregate |
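Each variant pairs one `directive` value with its own argument schema, and the `dedupe-rows` variant lists no `arguments` property at all. The sketch below shows one way a client might check an operation before adding it to a recipe. The function name is hypothetical, and required keys are checked only for the argument schemas documented on this page that are covered here (filter, compute-new, drop-columns, join); the other directives are accepted without a key check.

```python
DIRECTIVES = (
    "filter", "replace", "compute-new", "dedupe-rows",
    "drop-columns", "rename-columns", "join", "aggregate",
)

# Required argument keys per directive, taken from the schemas covered above.
REQUIRED_ARGUMENT_KEYS = {
    "filter": {"conditions", "keepRows", "operator"},
    "compute-new": {"expression", "newFeatureName"},
    "drop-columns": {"columns"},
    "join": {"joinType", "leftKeys", "rightDataSourceId", "rightKeys", "source"},
}


def validate_operation(operation: dict) -> None:
    """Raise ValueError if the operation does not match its OneOfDirective variant."""
    directive = operation.get("directive")
    if directive not in DIRECTIVES:
        raise ValueError(f"unknown directive: {directive!r}")
    if directive == "dedupe-rows":
        return  # This variant documents no arguments property.
    arguments = operation.get("arguments")
    if arguments is None:
        raise ValueError(f"{directive} requires arguments")
    missing = REQUIRED_ARGUMENT_KEYS.get(directive, set()) - set(arguments)
    if missing:
        raise ValueError(f"{directive} is missing arguments: {sorted(missing)}")


operations = [
    {"directive": "drop-columns", "arguments": {"columns": ["internal_id"]}},
    {"directive": "dedupe-rows"},
    {
        "directive": "compute-new",
        "arguments": {"expression": "price * quantity", "newFeatureName": "total"},
    },
]
for operation in operations:
    validate_operation(operation)
```

The expression string in the last operation is a placeholder; the expression syntax itself is not described in this section.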
OneOfDownsamplingDirective
{
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
}
Data transformation step.
Properties¶
oneOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | DownsamplingRandomDirectiveArguments | true | The downsampling configuration. | |
» directive | string | true | The downsampling method. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | SmartDownsamplingArguments | true | The downsampling configuration. | |
» directive | string | true | The downsampling method. |
Enumerated Values¶
Property | Value |
---|---|
directive | random-sample |
directive | smart-downsampling |
OperationDetails
{}
Properties¶
None
PatchRecipe
{
"description": "string",
"name": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
description | string | false | maxLength: 1000 | New recipe description. |
name | string | false | maxLength: 255 | New recipe name. |
RandomSampleArgumentsCreate
{
"rows": 10000,
"seed": 0
}
The interactive sampling config.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
rows | integer | false | maximum: 10000, minimum: 1 | The number of rows to be sampled. |
seed | integer | false | minimum: 0 | The starting number of the random number generator. |
RecipeDownsamplingUpdate
{
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
}
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
downsampling | OneOfDownsamplingDirective | true | Data transformation step. |
RecipeFromDataSourceCreate
{
"dataSourceType": "dr-database-v1",
"dataStoreId": "string",
"dialect": "snowflake",
"experimentContainerId": "string",
"inputs": [
{
"canonicalName": "string",
"catalog": "string",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
},
"schema": "string",
"table": "string"
}
],
"useCaseId": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
dataSourceType | string | true | Data source type. | |
dataStoreId | string | true | Data store ID for this data source. | |
dialect | string | true | Source type data was retrieved from. | |
experimentContainerId | string | false | [DEPRECATED - replaced with use_case_id] ID assigned to the Use Case, which is an experimental container for the recipe. | |
inputs | [JDBCTableDataSourceInputCreate] | true | maxItems: 1000, minItems: 1 | List of recipe inputs. |
useCaseId | string | false | ID of the Use Case associated with the recipe. |
Enumerated Values¶
Property | Value |
---|---|
dataSourceType | [dr-database-v1 , jdbc ] |
dialect | [snowflake , bigquery , databricks , spark , postgres ] |
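The sketch below assembles a RecipeFromDataSourceCreate body from one or more table inputs. Both helper names are hypothetical; the length limits and enumerated values are copied from the tables above, and sampling is left out for brevity.

```python
from typing import Optional, Sequence

DATA_SOURCE_TYPES = ("dr-database-v1", "jdbc")
DIALECTS = ("snowflake", "bigquery", "databricks", "spark", "postgres")


def make_table_input(
    canonical_name: str,
    table: str,
    schema: Optional[str] = None,
    catalog: Optional[str] = None,
) -> dict:
    """Build one JDBCTableDataSourceInputCreate item (hypothetical helper)."""
    item = {"canonicalName": canonical_name, "table": table}
    for key, value in (("schema", schema), ("catalog", catalog)):
        if value is not None:
            item[key] = value
    for key in ("table", "schema", "catalog"):
        if key in item and len(item[key]) > 256:
            raise ValueError(f"{key} must be at most 256 characters long")
    return item


def make_recipe_from_data_source(
    data_source_type: str,
    data_store_id: str,
    dialect: str,
    inputs: Sequence[dict],
    use_case_id: Optional[str] = None,
) -> dict:
    """Build a RecipeFromDataSourceCreate body (hypothetical helper)."""
    if data_source_type not in DATA_SOURCE_TYPES:
        raise ValueError(f"dataSourceType must be one of {DATA_SOURCE_TYPES}")
    if dialect not in DIALECTS:
        raise ValueError(f"dialect must be one of {DIALECTS}")
    if not 1 <= len(inputs) <= 1000:
        raise ValueError("inputs must hold 1 to 1000 items")
    body = {
        "dataSourceType": data_source_type,
        "dataStoreId": data_store_id,
        "dialect": dialect,
        "inputs": list(inputs),
    }
    if use_case_id is not None:
        body["useCaseId"] = use_case_id
    return body
```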
RecipeFromRecipeCreate
{
"name": "string",
"recipeId": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
name | string | false | maxLength: 255 | The recipe name. |
recipeId | string | true | Recipe ID to create a Recipe from. |
RecipeInputResponse
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
Properties¶
oneOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» alias | string | true | maxLength: 256 | The alias for the data source table. |
» dataSourceId | string | true | The ID of the input data source. | |
» dataStoreId | string | true | The ID of the input data store. | |
» inputType | string | true | The data that comes from a database connection. | |
» sampling | SampleDirectiveCreate | false | The input data transformation steps. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» alias | string | true | maxLength: 256 | The alias for the data source table. |
» datasetId | string | true | The ID of the input dataset. | |
» datasetVersionId | string | true | The ID of the input dataset version. | |
» inputType | string | true | The data that comes from the Data Registry. | |
» sampling | SampleDirectiveCreate | false | The input data transformation steps. |
Enumerated Values¶
Property | Value |
---|---|
inputType | datasource |
inputType | dataset |
RecipeInputStatsResponse
{
"columnCount": 0,
"connectionName": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"name": "string",
"rowCount": 0,
"status": "ABORTED"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
columnCount | integer¦null | true | Number of features in original (not sampled) data source | |
connectionName | string¦null | true | The user-friendly name of the data store. | |
dataSourceId | string | true | ID of the input data source | |
dataStoreId | string | true | ID of the input data store | |
inputType | string | true | Source type data came from | |
name | string¦null | true | Combination of "catalog", "schema" and "table" from data source | |
rowCount | integer¦null | true | Number of rows in original (not sampled) data source | |
status | string | true | Input preparation status |
Enumerated Values¶
Property | Value |
---|---|
inputType | [datasource , dataset ] |
status | [ABORTED , COMPLETED , ERROR , EXPIRED , INITIALIZED , RUNNING ] |
RecipeInputsResponse
{
"inputs": [
{
"columnCount": 0,
"connectionName": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"name": "string",
"rowCount": 0,
"status": "ABORTED"
}
]
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
inputs | [RecipeInputStatsResponse] | true | maxItems: 1000, minItems: 1 | List of recipe inputs. |
RecipePreviewResponse
{
"byteSize": 0,
"columns": [
"string"
],
"count": 0,
"data": [
[
"string"
]
],
"estimatedSizeExceedsLimit": true,
"next": "http://example.com",
"previous": "http://example.com",
"resultSchema": [
{
"columnDefaultValue": "string",
"dataType": "string",
"dataTypeInt": 0,
"isInPrimaryKey": true,
"isNullable": "NO",
"name": "string",
"precision": 0,
"scale": 0
}
],
"totalCount": 0
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
byteSize | integer | true | minimum: 0 | Data memory usage. |
columns | [string] | true | maxItems: 10000 | List of columns in data preview. |
count | integer | false | Number of items returned on this page. | |
data | [array] | true | maxItems: 1000 | List of records output by the query. |
estimatedSizeExceedsLimit | boolean | true | Defines if downsampling should be done based on sample size | |
next | string(uri)¦null | true | URL pointing to the next page (if null, there is no next page). | |
previous | string(uri)¦null | true | URL pointing to the previous page (if null, there is no previous page). | |
resultSchema | [DataStoreExtendedColumnNoKeysResponse]¦null | true | maxItems: 10000 | JDBC result schema. |
totalCount | integer | true | The total number of items across all pages. |
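The preview returns column names and row values as separate arrays. Assuming each inner array in `data` lists its values in the same order as `columns` (the schema does not state this explicitly), the rows can be turned into dictionaries as sketched below. The function name is hypothetical.

```python
def preview_records(preview: dict) -> list:
    """Pair each row in a RecipePreviewResponse with the column names."""
    columns = preview["columns"]
    records = []
    for row in preview["data"]:
        if len(row) != len(columns):
            raise ValueError("row length does not match the number of columns")
        records.append(dict(zip(columns, row)))
    return records


sample_preview = {
    "columns": ["id", "city"],
    "data": [["1", "Boston"], ["2", "Kyiv"]],
}
records = preview_records(sample_preview)
```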
RecipeResponse
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
createdAt | string(date-time)¦null | true | ISO 8601-formatted date/time when the recipe was created. | |
createdBy | ExperimentContainerUserResponse | true | User who created the Use Case | |
description | string | true | maxLength: 1000 | The recipe description. |
dialect | string | true | Source type data was retrieved from. | |
downsampling | OneOfDownsamplingDirective | true | Data transformation step. | |
errorMessage | string¦null | true | Error message related to the specific operation | |
failedOperationsIndex | integer¦null | true | Index of the first operation where error appears. | |
inputs | [RecipeInputResponse] | true | maxItems: 1000 | List of data sources. |
name | string¦null | true | maxLength: 255 | The recipe name. |
operations | [OneOfDirective] | true | maxItems: 1000 | List of transformations. |
recipeId | string | true | The ID of the recipe. | |
recipeType | string | true | Type of the recipe workflow. | |
settings | RecipeSettingsResponse | true | Recipe settings reusable at a modeling stage. | |
status | string | true | Recipe publication status. | |
updatedAt | string(date-time)¦null | true | ISO 8601-formatted date/time when the recipe was last updated. | |
updatedBy | ExperimentContainerUserResponse | true | User who created the Use Case |
Enumerated Values¶
Property | Value |
---|---|
dialect | [snowflake , bigquery , spark-feature-discovery , databricks , spark , postgres ] |
recipeType | [wrangling , Wrangling , WRANGLING , featureDiscovery , FeatureDiscovery , FEATURE_DISCOVERY , featureDiscoveryPrivatePreview , FeatureDiscoveryPrivatePreview , FEATURE_DISCOVERY_PRIVATE_PREVIEW ] |
status | [draft , preview , published ] |
RecipeRunPreviewAsync
{
"credentialId": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
credentialId | string | false | The ID of the credentials to use for the connection. If not given, the default credentials for the connection will be used. |
RecipeSettingsResponse
{
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
}
Recipe settings reusable at a modeling stage.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
featureDiscoveryProjectId | string¦null | true | Associated feature discovery project ID. | |
featureDiscoverySupervisedFeatureReduction | boolean¦null | true | Run supervised feature reduction for Feature Discovery. | |
predictionPoint | string¦null | true | maxLength: 255 | The date column to be used as the prediction point for time-based feature engineering. |
relationshipsConfigurationId | string¦null | true | Associated relationships configuration ID. | |
target | string¦null | false | The feature to use as the target at the modeling stage. | |
weightsFeature | string¦null | false | The weights feature. |
RecipeSettingsUpdate
{
"featureDiscoverySupervisedFeatureReduction": true,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
featureDiscoverySupervisedFeatureReduction | boolean¦null | false | Run supervised feature reduction for Feature Discovery. | |
predictionPoint | string¦null | false | maxLength: 255 | The date column to be used as the prediction point for time-based feature engineering. |
relationshipsConfigurationId | string¦null | false | [Deprecated] No effect. The relationships configuration ID field is immutable. | |
target | string¦null | false | The feature to use as the target at the modeling stage. | |
weightsFeature | string¦null | false | The weights feature. |
RecipesListResponse
{
"count": 0,
"data": [
{
"createdAt": "2019-08-24T14:15:22Z",
"createdBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
},
"description": "string",
"dialect": "snowflake",
"downsampling": {
"arguments": {
"rows": 0,
"seed": null
},
"directive": "random-sample"
},
"errorMessage": null,
"failedOperationsIndex": null,
"inputs": [
{
"alias": "string",
"dataSourceId": "string",
"dataStoreId": "string",
"inputType": "datasource",
"sampling": {
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
}
],
"name": "string",
"operations": [
{
"arguments": {
"conditions": [
{
"column": "string",
"function": "between",
"functionArguments": []
}
],
"keepRows": true,
"operator": "and"
},
"directive": "filter"
}
],
"recipeId": "string",
"recipeType": "wrangling",
"settings": {
"featureDiscoveryProjectId": "string",
"featureDiscoverySupervisedFeatureReduction": null,
"predictionPoint": "string",
"relationshipsConfigurationId": "string",
"target": "string",
"weightsFeature": "string"
},
"status": "draft",
"updatedAt": "2019-08-24T14:15:22Z",
"updatedBy": {
"email": "string",
"fullName": "string",
"id": "string",
"userhash": "string",
"username": "string"
}
}
],
"next": "http://example.com",
"previous": "http://example.com",
"totalCount": 0
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
count | integer | false | Number of items returned on this page. | |
data | [RecipeResponse] | true | The list of recipes returned on this page. | |
next | string(uri)¦null | true | URL pointing to the next page (if null, there is no next page). | |
previous | string(uri)¦null | true | URL pointing to the previous page (if null, there is no previous page). | |
totalCount | integer | true | The total number of items across all pages. |
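Because `next` is null on the last page, a client can walk the full list by following `next` until it is empty. The sketch below keeps the HTTP call out of scope: `fetch_page` is a hypothetical callable that takes a URL and returns the decoded RecipesListResponse, so any HTTP client can be plugged in.

```python
from typing import Callable, Iterator, Optional


def iter_recipes(first_url: str, fetch_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every RecipeResponse across pages by following the next links."""
    url: Optional[str] = first_url
    while url is not None:
        page = fetch_page(url)
        yield from page["data"]
        url = page.get("next")  # None on the last page ends the loop.


# In-memory stand-in for an HTTP client, for illustration only.
fake_pages = {
    "page-1": {"data": [{"recipeId": "a"}, {"recipeId": "b"}], "next": "page-2"},
    "page-2": {"data": [{"recipeId": "c"}], "next": None},
}
recipe_ids = [recipe["recipeId"] for recipe in iter_recipes("page-1", fake_pages.__getitem__)]
```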
RefinexInsightsResponse
{
"count": 0,
"data": [
{
"datasetId": "string",
"datasetVersionId": "string",
"dateFormat": "string",
"featureType": "Boolean",
"id": 0,
"isZeroInflated": true,
"keySummary": {
"key": "string",
"summary": {
"dataQualities": "ISSUES_FOUND",
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"pctRows": 0,
"stdDev": 0
}
},
"language": "string",
"lowInformation": true,
"majorityClassCount": 0,
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"minorityClassCount": 0,
"naCount": 0,
"name": "string",
"plot": [
{
"count": 0,
"label": "string"
}
],
"stdDev": 0,
"timeSeriesEligibilityReason": "string",
"timeSeriesEligibilityReasonAggregation": "string",
"timeSeriesEligible": true,
"timeSeriesEligibleAggregation": true,
"timeStep": 0,
"timeStepAggregation": 0,
"timeUnit": "string",
"timeUnitAggregation": "string",
"uniqueCount": 0
}
],
"message": "string",
"next": "http://example.com",
"previous": "http://example.com",
"status": "ABORTED",
"totalCount": 0
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
count | integer | false | | Number of items returned on this page. |
data | [WranglingFeatureResponse] | true | maxItems: 100 | The list of features related to the requested dataset. |
message | string | false | | Status message. |
next | string(uri)¦null | true | | URL pointing to the next page (if null, there is no next page). |
previous | string(uri)¦null | true | | URL pointing to the previous page (if null, there is no previous page). |
status | string | false | | Job status. |
totalCount | integer | true | | The total number of items across all pages. |
Enumerated Values¶
Property | Value |
---|---|
status | [ABORTED , COMPLETED , ERROR , EXPIRED , INITIALIZED , RUNNING ] |
Relationship
{
"dataset1Identifier": "string",
"dataset1Keys": [
"string"
],
"dataset2Identifier": "string",
"dataset2Keys": [
"string"
],
"featureDerivationWindowEnd": 0,
"featureDerivationWindowStart": 0,
"featureDerivationWindowTimeUnit": "MILLISECOND",
"featureDerivationWindows": [
{
"end": 0,
"start": 0,
"unit": "MILLISECOND"
}
],
"predictionPointRounding": 0,
"predictionPointRoundingTimeUnit": "MILLISECOND"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
dataset1Identifier | string¦null | false | maxLength: 20, minLength: 1 | Identifier of the first dataset in the relationship. If this is not provided, it represents the primary dataset. |
dataset1Keys | [string] | true | maxItems: 10, minItems: 1 | Column(s) in the first dataset that are used to join to the second dataset. |
dataset2Identifier | string | true | maxLength: 20, minLength: 1 | Identifier of the second dataset in the relationship. |
dataset2Keys | [string] | true | maxItems: 10, minItems: 1 | Column(s) in the second dataset that are used to join to the first dataset. |
featureDerivationWindowEnd | integer | false | maximum: 0 | How many units of featureDerivationWindowTimeUnit into the past, relative to the datetimePartitionColumn, the feature derivation window should end, measured on each dataset's primary temporal key. Will be a non-positive integer, if present. If present, time-aware joins will be used. Only applicable when dataset1Identifier is not provided. |
featureDerivationWindowStart | integer | false | maximum: 0 (exclusive) | How many units of featureDerivationWindowTimeUnit into the past, relative to the datetimePartitionColumn, the feature derivation window should begin, measured on each dataset's primary temporal key. Will be a negative integer, if present. If present, time-aware joins will be used. Only applicable when dataset1Identifier is not provided. |
featureDerivationWindowTimeUnit | string | false | | Time unit of the feature derivation window. Supported values are MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR. If present, time-aware joins will be used. Only applicable when dataset1Identifier is not provided. |
featureDerivationWindows | [FeatureDerivationWindow] | false | maxItems: 3 | List of feature derivation window definitions that will be used. |
predictionPointRounding | integer | false | maximum: 30, minimum: 0 (exclusive) | Closest value of predictionPointRoundingTimeUnit to round the prediction point into the past when applying the feature derivation window. Will be a positive integer, if present. Only applicable when dataset1Identifier is not provided. |
predictionPointRoundingTimeUnit | string | false | | Time unit of the prediction point rounding. Supported values are MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR. Only applicable when dataset1Identifier is not provided. |
Enumerated Values¶
Property | Value |
---|---|
featureDerivationWindowTimeUnit | [MILLISECOND , SECOND , MINUTE , HOUR , DAY , WEEK , MONTH , QUARTER , YEAR ] |
predictionPointRoundingTimeUnit | [MILLISECOND , SECOND , MINUTE , HOUR , DAY , WEEK , MONTH , QUARTER , YEAR ] |
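The restrictions in the Relationship table can be checked on the client before a request is sent. The sketch below is illustrative only: `make_relationship` and its parameter names are hypothetical helpers, the dataset and column names are placeholders, and the server remains the authority on validation.

```python
from typing import Dict, List, Optional

TIME_UNITS = {"MILLISECOND", "SECOND", "MINUTE", "HOUR", "DAY",
              "WEEK", "MONTH", "QUARTER", "YEAR"}


def make_relationship(
    dataset2_identifier: str,
    dataset1_keys: List[str],
    dataset2_keys: List[str],
    window_start: Optional[int] = None,
    window_end: Optional[int] = None,
    window_unit: Optional[str] = None,
) -> Dict:
    """Build a Relationship payload, checking the documented restrictions."""
    if not 1 <= len(dataset2_identifier) <= 20:
        raise ValueError("dataset2Identifier must be 1 to 20 characters")
    for keys in (dataset1_keys, dataset2_keys):
        if not 1 <= len(keys) <= 10:
            raise ValueError("each key list must hold 1 to 10 columns")
    payload: Dict = {
        "dataset2Identifier": dataset2_identifier,
        "dataset1Keys": list(dataset1_keys),
        "dataset2Keys": list(dataset2_keys),
    }
    if window_start is not None:
        if window_start >= 0:  # maximum: 0 (exclusive)
            raise ValueError("featureDerivationWindowStart must be negative")
        payload["featureDerivationWindowStart"] = window_start
    if window_end is not None:
        if window_end > 0:  # maximum: 0
            raise ValueError("featureDerivationWindowEnd must not be positive")
        payload["featureDerivationWindowEnd"] = window_end
    if window_unit is not None:
        if window_unit not in TIME_UNITS:
            raise ValueError(f"unsupported time unit: {window_unit}")
        payload["featureDerivationWindowTimeUnit"] = window_unit
    return payload


# Placeholder names: join the primary dataset to "transactions" on customer_id,
# deriving features from the 30 days before the prediction point.
relationship = make_relationship(
    "transactions", ["customer_id"], ["customer_id"],
    window_start=-30, window_end=0, window_unit="DAY",
)
```

Omitting `dataset1Identifier` here follows the table: when it is not provided, the first dataset is the primary dataset.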
RelationshipQualityAssessmentsCreate
{
"credentials": [
{
"catalogVersionId": "string",
"credentialId": "string",
"url": "string"
}
],
"datetimePartitionColumn": "string",
"featureEngineeringPredictionPoint": "string",
"relationshipsConfiguration": {
"datasetDefinitions": [
{
"catalogId": "string",
"catalogVersionId": "string",
"featureListId": "string",
"identifier": "string",
"primaryTemporalKey": "string",
"snapshotPolicy": "specified"
}
],
"featureDiscoveryMode": "default",
"featureDiscoverySettings": [
{
"description": "string",
"family": "string",
"name": "string",
"settingType": "string",
"value": true,
"verboseName": "string"
}
],
"id": "string",
"relationships": [
{
"dataset1Identifier": "string",
"dataset1Keys": [
"string"
],
"dataset2Identifier": "string",
"dataset2Keys": [
"string"
],
"featureDerivationWindowEnd": 0,
"featureDerivationWindowStart": 0,
"featureDerivationWindowTimeUnit": "MILLISECOND",
"featureDerivationWindows": [
{
"end": 0,
"start": 0,
"unit": "MILLISECOND"
}
],
"predictionPointRounding": 0,
"predictionPointRoundingTimeUnit": "MILLISECOND"
}
],
"snowflakePushDownCompatible": true
},
"userId": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
credentials | [oneOf]¦null | false | maxItems: 30 | Credentials for dynamic policy secondary datasets. |
oneOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» anonymous | StoredCredentials | false | none |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» anonymous | CatalogPasswordCredentials | false | none |
continued
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
datetimePartitionColumn | string¦null | false | | If a datetime partition column was used, the name of the column. |
featureEngineeringPredictionPoint | string¦null | false | | The date column to be used as the prediction point for time-based feature engineering. |
relationshipsConfiguration | RelationshipsConfigPayload | true | | Object describing how secondary datasets are related to the primary dataset. |
userId | string | false | | Mongo ID of the user who created the request. |
RelationshipsConfigPayload
{
"datasetDefinitions": [
{
"catalogId": "string",
"catalogVersionId": "string",
"featureListId": "string",
"identifier": "string",
"primaryTemporalKey": "string",
"snapshotPolicy": "specified"
}
],
"featureDiscoveryMode": "default",
"featureDiscoverySettings": [
{
"description": "string",
"family": "string",
"name": "string",
"settingType": "string",
"value": true,
"verboseName": "string"
}
],
"id": "string",
"relationships": [
{
"dataset1Identifier": "string",
"dataset1Keys": [
"string"
],
"dataset2Identifier": "string",
"dataset2Keys": [
"string"
],
"featureDerivationWindowEnd": 0,
"featureDerivationWindowStart": 0,
"featureDerivationWindowTimeUnit": "MILLISECOND",
"featureDerivationWindows": [
{
"end": 0,
"start": 0,
"unit": "MILLISECOND"
}
],
"predictionPointRounding": 0,
"predictionPointRoundingTimeUnit": "MILLISECOND"
}
],
"snowflakePushDownCompatible": true
}
Object describing how secondary datasets are related to the primary dataset
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
datasetDefinitions | [DatasetDefinition] | true | maxItems: 30, minItems: 1 | A list of datasets. |
featureDiscoveryMode | string¦null | false | | Mode of feature discovery. Supported values are 'default' and 'manual'. |
featureDiscoverySettings | [FeatureDiscoverySettingResponse]¦null | false | maxItems: 100 | List of feature discovery settings used to customize the feature discovery process. |
id | string | true | | ID of the relationships configuration. |
relationships | [Relationship] | true | maxItems: 70, minItems: 1 | A list of relationships. |
snowflakePushDownCompatible | boolean¦null | false | | Flag indicating if the relationships configuration is compatible with Snowflake push down processing. |
Enumerated Values¶
Property | Value |
---|---|
featureDiscoveryMode | [default , manual ] |
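The item-count limits and the `featureDiscoveryMode` enum above lend themselves to a quick pre-flight check. This is a rough sketch under the assumption that catching these problems locally is useful; `check_relationships_config` is a hypothetical helper, and the sample config is abbreviated with placeholder values.

```python
from typing import Dict, List


def check_relationships_config(config: Dict) -> List[str]:
    """Return the violated restrictions; an empty list means none were found."""
    problems: List[str] = []
    if not config.get("id"):
        problems.append("id is required")
    definitions = config.get("datasetDefinitions") or []
    if not 1 <= len(definitions) <= 30:
        problems.append("datasetDefinitions must contain 1 to 30 items")
    relationships = config.get("relationships") or []
    if not 1 <= len(relationships) <= 70:
        problems.append("relationships must contain 1 to 70 items")
    settings = config.get("featureDiscoverySettings") or []
    if len(settings) > 100:
        problems.append("featureDiscoverySettings must contain at most 100 items")
    mode = config.get("featureDiscoveryMode")
    if mode is not None and mode not in ("default", "manual"):
        problems.append("featureDiscoveryMode must be 'default' or 'manual'")
    return problems


# Abbreviated, deliberately invalid example: no relationships, unknown mode.
config = {
    "id": "example-config-id",  # placeholder value
    "datasetDefinitions": [{"identifier": "transactions"}],  # other fields omitted
    "relationships": [],
    "featureDiscoveryMode": "automatic",
}
problems = check_relationships_config(config)
```

The check only covers restrictions listed in the table; nested DatasetDefinition and Relationship objects would need their own validation.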
RenameColumn
{
"newName": "string",
"originalName": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
newName | string | true | | The new column name. |
originalName | string | true | | The original column name. |
RenameColumnsArguments
{
"columnMappings": [
{
"newName": "string",
"originalName": "string"
}
]
}
The transformation description.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
columnMappings | [RenameColumn] | true | maxItems: 1000, minItems: 1 | The list of name mappings. |
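A plain old-name-to-new-name mapping converts directly into the `columnMappings` list. The helper below is a hypothetical sketch (the column names are placeholders) that also enforces the documented 1 to 1000 item limit.

```python
from typing import Dict


def rename_columns_arguments(renames: Dict[str, str]) -> Dict:
    """Turn {original: new} pairs into a RenameColumnsArguments object."""
    if not 1 <= len(renames) <= 1000:
        raise ValueError("columnMappings must contain 1 to 1000 items")
    return {
        "columnMappings": [
            {"originalName": old, "newName": new} for old, new in renames.items()
        ]
    }


arguments = rename_columns_arguments({"cust_id": "customerId", "amt": "amount"})
```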
ReplaceDirectiveArguments
{
"isCaseSensitive": true,
"matchMode": "partial",
"origin": "string",
"replacement": "",
"searchFor": "string"
}
The transformation description.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
isCaseSensitive | boolean | false | | The flag indicating if the searchFor value is case-sensitive. |
matchMode | string | true | | The match mode to use when detecting searchFor values. |
origin | string | true | | The place name to look for in values. |
replacement | string | false | | The replacement value. |
searchFor | string | true | | Indicates what needs to be replaced. |
Enumerated Values¶
Property | Value |
---|---|
matchMode | [partial , exact , regex ] |
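As a rough illustration, the arguments object can be assembled with the `matchMode` enum checked up front. `replace_arguments` is a hypothetical helper; its defaults simply mirror the example JSON above and are not documented server defaults, and the `origin` value shown is a placeholder passed through unchanged.

```python
from typing import Dict

MATCH_MODES = ("partial", "exact", "regex")


def replace_arguments(
    origin: str,
    search_for: str,
    replacement: str = "",          # assumption: mirrors the example JSON
    match_mode: str = "partial",    # assumption: mirrors the example JSON
    is_case_sensitive: bool = True,  # assumption: mirrors the example JSON
) -> Dict:
    """Build a ReplaceDirectiveArguments object, validating matchMode."""
    if match_mode not in MATCH_MODES:
        raise ValueError(f"matchMode must be one of {MATCH_MODES}")
    return {
        "origin": origin,
        "searchFor": search_for,
        "replacement": replacement,
        "matchMode": match_mode,
        "isCaseSensitive": is_case_sensitive,
    }


arguments = replace_arguments("city", "N.Y.", "New York", match_mode="exact")
```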
SampleDirectiveCreate
{
"arguments": {
"rows": 10000,
"seed": 0
},
"directive": "random-sample"
}
The input data transformation steps.
Properties¶
oneOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | RandomSampleArgumentsCreate | false | | The interactive sampling config. |
» directive | string | true | | The directive name. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | DatetimeSampleArgumentsCreate | false | | The interactive sampling config. |
» directive | string | true | | The directive name. |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
anonymous | object | false | none | |
» arguments | LimitDirectiveArguments | true | | The interactive sampling config. |
» directive | string | true | | The directive name. |
Enumerated Values¶
Property | Value |
---|---|
directive | random-sample |
directive | datetime-sample |
directive | limit |
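The three variants differ in whether `arguments` is required: it is optional for `random-sample` and `datetime-sample` but required for `limit`. A small sketch of that rule, using a hypothetical `sample_directive` helper and the `rows` and `seed` values from the example JSON above:

```python
from typing import Dict, Optional

DIRECTIVES = ("random-sample", "datetime-sample", "limit")


def sample_directive(directive: str, arguments: Optional[Dict] = None) -> Dict:
    """Build a SampleDirectiveCreate object for one of the three variants."""
    if directive not in DIRECTIVES:
        raise ValueError(f"directive must be one of {DIRECTIVES}")
    if directive == "limit" and arguments is None:
        raise ValueError("arguments is required for the limit directive")
    body: Dict = {"directive": directive}
    if arguments is not None:
        body["arguments"] = arguments  # contents are not validated here
    return body


sampling = sample_directive("random-sample", {"rows": 10000, "seed": 0})
```

The contents of `arguments` are left unvalidated because their schemas are defined separately for each variant.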
SmartDownsamplingArguments
{
"method": "binary",
"rows": 2,
"seed": null
}
The downsampling configuration.
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
method | string | true | | The smart downsampling method. |
rows | integer | true | minimum: 2 | The number of sampled rows. |
seed | integer¦null | false | | The starting number for the random number generator. |
Enumerated Values¶
Property | Value |
---|---|
method | [binary , zero-inflated ] |
StatusResponse
{
"code": 0,
"created": "2019-08-24T14:15:22Z",
"description": "",
"message": "",
"status": "INITIALIZED",
"statusId": "e900225c-0629-4e96-be6e-86a17a309645",
"statusType": ""
}
StatusResponse
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
code | JobErrorCode | true | | The error code associated with the job. |
created | string(date-time) | true | | The creation date of the job (ISO 8601 formatted). |
description | string | false | | The description associated with the job. |
message | string | false | | The error message associated with the job. |
status | JobExecutionState | true | | The execution status of the job. |
statusId | string(uuid) | true | | ID that can be used with GET /api/v2/status/{statusId}/ to poll for the testing job's status. |
statusType | string | false | | The type of the status object. |
StoredCredentials
{
"catalogVersionId": "string",
"credentialId": "string",
"url": "string"
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
catalogVersionId | string | false | | Identifier of the catalog version. |
credentialId | string | true | | ID of the credentials object in credential store. Can only be used along with catalogVersionId. |
url | string¦null | false | | URL that is subject to credentials. |
WranglingFeatureResponse
{
"datasetId": "string",
"datasetVersionId": "string",
"dateFormat": "string",
"featureType": "Boolean",
"id": 0,
"isZeroInflated": true,
"keySummary": {
"key": "string",
"summary": {
"dataQualities": "ISSUES_FOUND",
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"pctRows": 0,
"stdDev": 0
}
},
"language": "string",
"lowInformation": true,
"majorityClassCount": 0,
"max": 0,
"mean": 0,
"median": 0,
"min": 0,
"minorityClassCount": 0,
"naCount": 0,
"name": "string",
"plot": [
{
"count": 0,
"label": "string"
}
],
"stdDev": 0,
"timeSeriesEligibilityReason": "string",
"timeSeriesEligibilityReasonAggregation": "string",
"timeSeriesEligible": true,
"timeSeriesEligibleAggregation": true,
"timeStep": 0,
"timeStepAggregation": 0,
"timeUnit": "string",
"timeUnitAggregation": "string",
"uniqueCount": 0
}
Properties¶
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
datasetId | string | true | | The ID of the dataset the feature belongs to. |
datasetVersionId | string | true | | The ID of the dataset version the feature belongs to. |
dateFormat | string¦null | true | | The date format string for how this feature was interpreted (or null if not a date feature). If not null, it will be compatible with https://docs.python.org/2/library/time.html#time.strftime. |
featureType | string | true | | Feature type. |
id | integer | true | | The number of the column in the dataset. |
isZeroInflated | boolean¦null | false | | Whether the feature has an excessive number of zeros. |
keySummary | any | false | | Per-key summaries for Summarized Categorical or Multicategorical columns. |
oneOf
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» anonymous | FeatureKeySummaryResponseValidatorSummarizedCategorical | false | | For a Summarized Categorical column, this will contain statistics for the top 50 keys (truncated to 103 characters). |
xor
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
» anonymous | [FeatureKeySummaryResponseValidatorMultilabel] | false | | For a Multicategorical column, this will contain statistics for the top classes. |
continued
Name | Type | Required | Restrictions | Description |
---|---|---|---|---|
language | string | false | | Detected language of the feature. |
lowInformation | boolean | false | | Whether the feature has too few values to be informative. |
majorityClassCount | integer¦null | true | | The number of rows with a majority class value if smart downsampling is applicable to this feature. |
max | number¦null | false | | Maximum value of the EDA sample of the feature. |
mean | number¦null | false | | Arithmetic mean of the EDA sample of the feature. |
median | number¦null | false | | Median of the EDA sample of the feature. |
min | number¦null | false | | Minimum value of the EDA sample of the feature. |
minorityClassCount | integer¦null | true | | The number of rows with neither null nor majority class value if smart downsampling is applicable to this feature. |
naCount | integer¦null | false | | Number of missing values. |
name | string | true | | Feature name. |
plot | [DatasetFeaturePlotDataResponse]¦null | false | | Plot data based on feature values. |
stdDev | number¦null | false | | Standard deviation of the EDA sample of the feature. |
timeSeriesEligibilityReason | string¦null | false | | Why the feature is ineligible for time series projects, or 'suitable' if it is eligible. |
timeSeriesEligibilityReasonAggregation | string¦null | false | | Why the feature is ineligible for aggregation, or 'suitable' if it is eligible. |
timeSeriesEligible | boolean | false | | Whether this feature can be used as a datetime partitioning feature for time series projects. Only sufficiently regular date features can be selected as the datetime feature for time series projects. Always false for non-date features. Date features that cannot be used in datetime partitioning for a time series project may be eligible for an OTV project, which has less stringent requirements. |
timeSeriesEligibleAggregation | boolean | false | | Whether this feature can be used as a datetime feature for aggregation for time series data prep. Always false for non-date features. |
timeStep | integer¦null | false | | The minimum time step that can be used to specify time series windows. The units for this value are the timeUnit. When specifying windows for time series projects, all windows must have durations that are integer multiples of this number. Only present for date features that are eligible for time series projects and null otherwise. |
timeStepAggregation | integer¦null | false | | The minimum time step that can be used to aggregate using this feature for time series data prep. The units for this value are the timeUnit. Only present for date features that are eligible for aggregation in time series data prep and null otherwise. |
timeUnit | string¦null | false | | The unit for the interval between values of this feature, e.g. DAY, MONTH, HOUR. When specifying windows for time series projects, the windows are expressed in terms of this unit. Only present for date features eligible for time series projects, and null otherwise. |
timeUnitAggregation | string¦null | false | | The unit for the interval between values of this feature, e.g. DAY, MONTH, HOUR. Only present for date features eligible for aggregation, and null otherwise. |
uniqueCount | integer¦null | false | | Number of unique values. |
Enumerated Values¶
Property | Value |
---|---|
featureType | [Boolean , Categorical , Currency , Date , Date Duration , Document , Image , Interaction , Length , Location , Multicategorical , Numeric , Percentage , Summarized Categorical , Text , Time ] |
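The `timeStep` rule (window durations must be integer multiples of the step, expressed in `timeUnit`) can be checked with simple modular arithmetic. A minimal sketch, where `window_duration_ok` is a hypothetical helper and the feature object is an abbreviated placeholder:

```python
from typing import Dict


def window_duration_ok(feature: Dict, duration: int) -> bool:
    """Check that a window duration (in the feature's timeUnit) fits its timeStep."""
    step = feature.get("timeStep")
    # timeStep is null for features that are not eligible for time series projects.
    if not feature.get("timeSeriesEligible") or not step:
        return False
    return duration % step == 0


# Abbreviated placeholder feature: a weekly date column measured in days.
feature = {
    "name": "purchase_date",
    "featureType": "Date",
    "timeSeriesEligible": True,
    "timeStep": 7,
    "timeUnit": "DAY",
}
ok_14 = window_duration_ok(feature, 14)  # 14 days is two full steps
ok_10 = window_duration_ok(feature, 10)  # 10 days is not a multiple of 7
```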