DataRobot apicore package walkthrough
In v2.29 of the R Client, available for public preview, DataRobot has introduced a new dependency: the datarobot.apicore package. The package gives you access to the DataRobot platform capabilities that were previously unavailable in the R client. It is generated from the OpenAPI specification of the DataRobot Public API.
Use the following snippet to install the two packages:
    # Install the packages directly from GitHub
    library(remotes)
    install_github("datarobot/rsdk", subdir = "datarobot.apicore", ref = github_release())
    install_github("datarobot/rsdk", subdir = "datarobot", ref = github_release())

    # Alternatively, download the tarballs directly from the GitHub Releases page at
    # https://github.com/datarobot/rsdk/releases
    # Then run install.packages()
The datarobot.apicore package is loaded into your R session when you load the datarobot package.
    library(datarobot)
    #> Loading required package: datarobot.apicore
    #> Authenticating with config at: /Users/druser/.config/datarobot/drconfig.yaml
    #> Authentication token saved
The following code lets you check the version of the API server that you are connected to.
    iapi <- datarobot.apicore::InfrastructureApi$new()
    iapi$VersionList()
    #> $versionString
    #> [1] "2.30.0"
    #>
    #> $minor
    #> [1] 30
    #>
    #> $major
    #> [1] 2
    #>
    #> attr(,"class")
    #> [1] "VersionRetrieveResponse"
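If a script relies on endpoints that only exist from a certain release onward, the versionString field can be compared against a minimum before continuing. The following is a minimal sketch in base R: the `version` list is a hard-coded stand-in for the response printed above, and the `minimum` value is a hypothetical threshold chosen for illustration.

```r
# Stand-in for the response printed above; in a live session this value
# would come from iapi$VersionList().
version <- list(versionString = "2.30.0", minor = 30, major = 2)

# Hypothetical minimum version, chosen only for illustration.
minimum <- "2.29.0"

# package_version() compares each component numerically, so "2.30.0"
# is treated as newer than "2.9.0" (a plain string comparison would not).
server_ok <- package_version(version$versionString) >= package_version(minimum)

if (!server_ok) {
  stop("API server ", version$versionString, " is older than ", minimum)
}
```

Comparing parsed versions rather than raw strings avoids ordering mistakes once a minor version reaches two digits.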
Next, try using the endpoint GET /datasets. It retrieves all of the datasets that you have access to in the AI Catalog (a new method introduced in v2.29).
    catalogapi <- datarobot.apicore::AiCatalogApi$new()
    try(catalogapi$DatasetsList())
    #> Error in private$DatasetsListWithHttpInfo(limit, offset, category, orderBy, :
    #>   Missing required parameter `limit`.
v2.29 also introduces request parameter validation with helpful error messaging. Fill in the parameters used below (both required and optional).
    datasets <- try(catalogapi$DatasetsList(
      limit = 2,
      offset = 0,
      category = "TRAINING",
      orderBy = "created"
    ))
    dataset <- datasets$data[[1]]
    dataset[c("name", "datasetId", "datasetSize", "creationDate")]
    #> $name
    #> [1] "SPI 2016-2019.csv"
    #>
    #> $datasetId
    #> [1] "600f45bba65b448826884d5f"
    #>
    #> $datasetSize
    #> [1] 8795275
    #>
    #> $creationDate
    #> [1] "2021-01-25 22:27:07 UTC"
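The required-parameter check behind the earlier error message can be pictured with a small base R function. This is only a sketch of the general idea, not the package's implementation: `datasets_list()` is a hypothetical stand-in, and treating `limit` as the only required parameter (with `offset` defaulting to 0) is an assumption based solely on the error shown above.

```r
# Hypothetical stand-in for a generated API method; not datarobot.apicore code.
# Assumption: only `limit` is required, and `offset` defaults to 0.
datasets_list <- function(limit, offset = 0, category = NULL, orderBy = NULL) {
  # missing() is TRUE when the caller did not supply the argument.
  if (missing(limit)) {
    stop("Missing required parameter `limit`.")
  }
  # Collect the query parameters and drop optional ones that were left unset.
  params <- list(limit = limit, offset = offset,
                 category = category, orderBy = orderBy)
  Filter(Negate(is.null), params)
}

# try() captures the error instead of halting the script, as in the walkthrough.
result <- try(datasets_list(), silent = TRUE)

# Supplying the required parameter succeeds and returns the cleaned query list.
query <- datasets_list(limit = 2, category = "TRAINING")
```

Failing early with a named parameter in the message is what makes the error actionable before any HTTP request is sent.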
The datarobot.apicore package is very expressive and can provide R-specific functionality around the DataRobot Public API. Generally speaking, though, you may not need this additional customization, so the datarobot package provides several conveniences to simplify your development.
Access the API
The datarobot environment hosts a singleton list, dr, containing instances of all of the different API classes in datarobot.apicore.
    exists("dr")
    #> [1] TRUE
    print(names(dr))
    #>  [1] "AiCatalogApi"            "AnalyticsApi"
    #>  [3] "ApplicationsApi"         "BlueprintsApi"
    #>  [5] "CommentsApi"             "CredentialsApi"
    #>  [7] "CustomTasksApi"          "DataConnectivityApi"
    #>  [9] "DatetimePartitioningApi" "DeploymentsApi"
    #> [11] "DocumentationApi"        "GovernanceApi"
    #> [13] "ImagesApi"               "InfrastructureApi"
    #> [15] "InsightsApi"             "JobsApi"
    #> [17] "MlopsApi"                "ModelsApi"
    #> [19] "NotificationsApi"        "PredictionsApi"
    #> [21] "ProjectsApi"             "SsoConfigurationApi"
    #> [23] "UseCaseApi"              "UserManagementApi"
    #> [25] "UtilitiesApi"
You can use this list to quickly access API methods and avoid constructing new objects every time. Try checking the API server version again.
    # Server version
    dr$InfrastructureApi$VersionList()
    #> $versionString
    #> [1] "2.30.0"
    #>
    #> $minor
    #> [1] 30
    #>
    #> $major
    #> [1] 2
    #>
    #> attr(,"class")
    #> [1] "VersionRetrieveResponse"

    # AI catalog datasets
    datasets <- try(dr$AiCatalogApi$DatasetsList(
      limit = 2,
      offset = 0,
      category = "TRAINING",
      orderBy = "created"
    ))
    dataset <- datasets$data[[1]]
    dataset[c("name", "datasetId", "datasetSize", "creationDate")]
    #> $name
    #> [1] "SPI 2016-2019.csv"
    #>
    #> $datasetId
    #> [1] "600f45bba65b448826884d5f"
    #>
    #> $datasetSize
    #> [1] 8795275
    #>
    #> $creationDate
    #> [1] "2021-01-25 22:27:07 UTC"
As the example above shows, going through the dr list saves one line of code and one object construction per API call.
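The idea of a pre-built list of API objects can be illustrated without a server connection. The sketch below uses base R only; `new_infrastructure_api()` and `apis` are hypothetical names standing in for `InfrastructureApi$new()` and `dr`, and the returned version values simply echo the output shown above.

```r
# Hypothetical constructor standing in for a datarobot.apicore class such as
# InfrastructureApi$new(); it counts how many times it has been called.
constructed <- 0
new_infrastructure_api <- function() {
  constructed <<- constructed + 1
  list(
    VersionList = function() {
      list(versionString = "2.30.0", major = 2, minor = 30)
    }
  )
}

# Build each API object once and keep it in a named list, mirroring `dr`.
apis <- list(InfrastructureApi = new_infrastructure_api())

# Repeated calls reuse the stored object instead of constructing a new one.
first <- apis$InfrastructureApi$VersionList()
second <- apis$InfrastructureApi$VersionList()
```

After both calls, `constructed` is still 1, which is the saving the list is meant to provide.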
Note
The API classes in the dr list all use the default authentication method, ConnectToDataRobot().
Convenience wrapper functions
DataRobot provides a set of wrapper functions around every API endpoint. These functions:
- Follow the saner naming convention of VerbObject() that has existed in the R API Client. For example, ListDatasets() rather than DatasetsList().
- Reuse the old names for functions that were already implemented in the R API Client before v2.29. For example, GetServerVersion() rather than VersionList().
- Set default values if they were provided in the OpenAPI spec.
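The three properties listed above can be sketched as a thin wrapper around a generated-style function. Everything here is hypothetical: `DatasetsListRaw()` and `ListDatasetsSketch()` are illustrative names, and the default values (`limit = 100`, `offset = 0`) are assumptions rather than values taken from the OpenAPI spec.

```r
# Hypothetical generated-style function: ObjectVerb name, no defaults.
# It just echoes its arguments so the example runs without a server.
DatasetsListRaw <- function(limit, offset, category, orderBy) {
  list(limit = limit, offset = offset, category = category, orderBy = orderBy)
}

# Hypothetical wrapper: VerbObject name, with default values filled in.
# The defaults below are assumed for illustration, not taken from the spec.
ListDatasetsSketch <- function(limit = 100, offset = 0,
                               category = NULL, orderBy = NULL) {
  DatasetsListRaw(limit = limit, offset = offset,
                  category = category, orderBy = orderBy)
}

# The caller only names the argument that differs from the defaults.
request <- ListDatasetsSketch(category = "TRAINING")
```

The wrapper adds no behavior of its own; it only renames the call and supplies defaults, so either entry point reaches the same underlying function.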
Try checking the API server version one more time:
    GetServerVersion()
    #> $major
    #> [1] 2
    #>
    #> $minor
    #> [1] 30
    #>
    #> $versionString
    #> [1] "2.30.0"
    #>
    #> $releasedVersion
    #> [1] "2.29.0"
Now, try looking up training datasets.
    trainingDatasets <- try(ListDatasets(
      category = "TRAINING",
      orderBy = "created",
      datasetVersionIds = c(),
      offset = 0,
      limit = 2
    ))
    dataset <- trainingDatasets$data[[1]]
    dataset[c("name", "datasetId", "datasetSize", "creationDate")]
    #> $name
    #> [1] "SPI 2016-2019.csv"
    #>
    #> $datasetId
    #> [1] "600f45bba65b448826884d5f"
    #>
    #> $datasetSize
    #> [1] 8795275
    #>
    #> $creationDate
    #> [1] "2021-01-25 22:27:07 UTC"
DataRobot recommends that you use whichever pattern you’re most comfortable with; the convenience wrapper functions offer the most concise syntax.