Platform (V9.0)¶
The following table lists each new feature.
Platform enhancements¶
- With release 9.0, DataRobot can be deployed only on Kubernetes. Version 9.0 supports OpenShift 4.10 and AWS EKS with K8s v1.23. Older installation options (Dockerized, RPM, and Hadoop) are no longer supported. If you are not on a supported version of Kubernetes, you will need to use the 8.x versions of DataRobot with Dockerized, RPM, or Hadoop installs.
- MinIO is no longer packaged with the DataRobot installation. You will need to provide and manage an S3 API-compatible object store to use with DataRobot.
- DataRobot no longer packages a container registry. You will need to provide a Docker registry for DataRobot containers.
Preview: DataRobot Notebooks¶
The DataRobot application now includes an in-browser editor to create and execute notebooks for data science analysis and modeling. Notebooks display computation results in various formats, including text, images, graphs, plots, tables, and more. You can customize output display by using open-source plugins. Cells can also contain Markdown rich text for commentary and explanation of the coding workflow. As you develop and edit a notebook, DataRobot stores a history of revisions that you can return to at any time.
DataRobot Notebooks offer a dashboard that hosts notebook creation, upload, and management. Individual notebooks have containerized, built-in environments with commonly used machine learning libraries that you can easily set up in a few clicks. Notebook environments seamlessly integrate with DataRobot's API, allowing a robust coding experience supported by keyboard shortcuts for cell functions, in-line documentation, and saved environment variables for secrets management and automatic authentication.
Preview documentation.
API enhancements¶
The following is a summary of new API features and enhancements. Go to the API Documentation home for more information on each client.
Tip
DataRobot highly recommends updating to the latest API client for Python and R.
Access DataRobot REST API documentation from docs.datarobot.com¶
DataRobot now offers REST API documentation available directly from the public documentation hub. Previously, REST API docs were only accessible through the application. Now, you can access information about REST endpoints and parameters in the API reference section of the public documentation site.
Python client v3.0¶
Version 3.0 of the DataRobot Python client is now generally available. This version introduces significant changes to common methods and usage of the client. Many prominent changes are listed below; see the changelog for a complete list of changes introduced in version 3.0.
Python client v3.0 new features¶
Some of the new features in version 3.0 are outlined below:
- Version 3.0 of the Python client does not support Python 3.6 and earlier versions. Version 3.0 currently supports Python 3.7+.
- The default Autopilot mode for the `project.start_autopilot` method has changed to `AUTOPILOT_MODE.QUICK`.
- Pass a file, file path, or DataFrame to a deployment to easily make batch predictions and return the results as a DataFrame using the new method `Deployment.predict_batch`.
- You can use a new method to retrieve the canonical URI for a project, model, deployment, or dataset:
  - `Project.get_uri`
  - `Model.get_uri`
  - `Deployment.get_uri`
  - `Dataset.get_uri`
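As a rough illustration of the batch prediction and URI additions in the list above, the helper below scores a data source with a deployment and returns the predictions together with the deployment's canonical URI. This is a sketch under stated assumptions rather than the client's documented workflow: `score_and_link` is a hypothetical name, and the function assumes only that it receives an object exposing `predict_batch(source)` and `get_uri()`.

```python
from typing import Any, Tuple


def score_and_link(deployment: Any, source: Any) -> Tuple[Any, str]:
    """Score `source` with `deployment`; return (predictions, deployment URI).

    `deployment` is assumed to expose `predict_batch` and `get_uri`, as the
    v3.0 `Deployment` class is described as doing. `source` may be a file,
    a file path, or a DataFrame.
    """
    predictions = deployment.predict_batch(source)
    return predictions, deployment.get_uri()


# Hypothetical usage with the real client (needs credentials; not run here):
#   import datarobot as dr
#   deployment = dr.Deployment.get("<deployment-id>")
#   predictions, uri = score_and_link(deployment, "to_score.csv")
```

Keeping the helper duck-typed means it can be exercised with a stand-in object before pointing it at a live deployment.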
New methods for DataRobot projects¶
Review the new methods available for `datarobot.models.Project`:
- `Project.get_options` allows you to retrieve saved modeling options.
- `Project.set_options` saves `AdvancedOptions` values for use in modeling.
- `Project.analyze_and_model` initiates Autopilot or data analysis using data that has been uploaded to DataRobot.
- `Project.get_dataset` retrieves the dataset used to create the project.
- `Project.set_partitioning_method` creates the correct Partition class for a regular project based on input arguments.
- `Project.set_datetime_partitioning` creates the correct Partition class for a time series project.
- `Project.get_top_model` returns the highest scoring model for a metric of your choice.
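The methods above can be chained into a short modeling flow. The sketch below is illustrative only: `model_and_pick_best` is a hypothetical helper, the call to `wait_for_autopilot` is an assumption about how you would wait for models before ranking them, and the keyword names (`target`, `metric`) should be checked against the client reference for your installed version.

```python
from typing import Any, Optional


def model_and_pick_best(
    project: Any,
    target: str,
    options: Optional[Any] = None,
    metric: Optional[str] = None,
) -> Any:
    """Run modeling on an existing project and return its top model."""
    if options is not None:
        # Save advanced options so the modeling run picks them up.
        project.set_options(options)
    # Start Autopilot on data that is already uploaded to DataRobot.
    project.analyze_and_model(target=target)
    # Assumption: block until Autopilot finishes before ranking models.
    project.wait_for_autopilot()
    return project.get_top_model(metric=metric)


# Hypothetical usage with the real client (needs credentials; not run here):
#   import datarobot as dr
#   from datarobot.helpers import AdvancedOptions
#   project = dr.Project.get("<project-id>")
#   best = model_and_pick_best(project, "churn", AdvancedOptions(seed=42), "AUC")
```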
Python client v3.1¶
The following API enhancements are introduced with version 3.1 of DataRobot's Python client:
- Added new methods `BatchPredictionJob.apply_time_series_data_prep_and_score` and `BatchPredictionJob.apply_time_series_data_prep_and_score_to_file` that apply time series data prep to a file or dataset and make batch predictions with a deployment.
- Added new methods `DataEngineQueryGenerator.prepare_prediction_dataset` and `DataEngineQueryGenerator.prepare_prediction_dataset_from_catalog` that apply time series data prep to a file or catalog dataset and upload the prediction dataset to a project.
- Added new `max_wait` parameter to the method `Project.create_from_dataset`. Values larger than the default can be specified to avoid timeouts when creating a project from a dataset.
- Added new method `Project.create_segmented_project_from_clustering_model` for creating a segmented modeling project from an existing clustering project and model. Switch to this method if you previously used ModelPackage for segmented modeling.
- Added new method `is_unsupervised_clustering_or_multiclass` for checking whether the clustering or multiclass parameters are used. The check is quick and makes no extra API calls.
- Added value `PREPARED_FOR_DEPLOYMENT` to the `RECOMMENDED_MODEL_TYPE` enum.
- Added two new methods to the `ImageAugmentationList` class: `ImageAugmentationList.list` and `ImageAugmentationList.update`.
- Added `format` key to Batch Prediction intake and output settings for S3, GCP, and Azure.
- The method `PredictionExplanations.is_multiclass` now makes an additional API call to check for multiclass target validity, which adds a small delay.
- The `AdvancedOptions` parameter `blend_best_models` defaults to false.
- The `AdvancedOptions` parameter `consider_blenders_in_recommendation` defaults to false.
- `DatetimePartitioning` now has the parameter `unsupervised_mode`.
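To make the new `format` key concrete, here is a minimal sketch of a builder for S3 intake or output settings. The helper name `s3_settings` is hypothetical, and the accepted values (`"csv"` and `"parquet"`) and the other dictionary keys (`type`, `url`, `credential_id`) are assumptions to verify against the Batch Prediction API reference.

```python
from typing import Dict, Optional

# Assumption: these are the values accepted for the `format` key.
ALLOWED_FORMATS = ("csv", "parquet")


def s3_settings(
    url: str,
    credential_id: Optional[str] = None,
    file_format: str = "csv",
) -> Dict[str, str]:
    """Build an S3 intake/output settings dict that includes `format`."""
    if file_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {file_format!r}")
    settings = {"type": "s3", "url": url, "format": file_format}
    if credential_id is not None:
        # Only include credentials when the caller supplies them.
        settings["credential_id"] = credential_id
    return settings


# Hypothetical usage with the real client (needs credentials; not run here):
#   import datarobot as dr
#   dr.BatchPredictionJob.score(
#       "<deployment-id>",
#       intake_settings=s3_settings("s3://bucket/in.parquet", "<cred-id>", "parquet"),
#       output_settings=s3_settings("s3://bucket/out.parquet", "<cred-id>", "parquet"),
#   )
```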
Preview: R client v2.29¶
Version 2.29 of the DataRobot R client is now available for preview. This version brings parity between the R client and version 2.29 of the Public API. As a result, it introduces significant changes to common methods and usage of the client. These changes are encapsulated in a new library (in addition to the `datarobot` library): `datarobot.apicore`, which provides auto-generated functions to access the Public API. The `datarobot` package provides a number of API wrapper functions around the `apicore` package to make it easier to use.
Reference the v2.29 documentation for more details on the new R client, including installation instructions, detailed method overviews, and reference documentation.
New R Functions¶
- Generated API wrapper functions are organized into categories based on their tags from the OpenAPI specification, which were themselves redone for the entire DataRobot Public API in v2.27.
- API wrapper functions use camel-cased argument names to be consistent with the rest of the package.
- Most function names follow a `VerbObject` pattern based on the OpenAPI specification.
- Some function names match "legacy" functions that existed in v2.18 of the R client if they invoked the same underlying endpoint. For example, the wrapper function is called `GetModel`, not `RetrieveProjectsModels`, since the former is what was implemented in the R client for the endpoint `/projects/{mId}/models/{mId}`.
- Similarly, these functions use the same arguments as the corresponding "legacy" functions to ensure DataRobot does not break existing code calling those functions.
- The R client (both `datarobot` and `datarobot.apicore` packages) outputs a warning when you attempt to access certain resources (projects, models, deployments, etc.) that are deprecated or disabled by the DataRobot platform migration to Python 3.
- Added the helper function `EditConfig` that allows you to interactively modify `drconfig.yaml`.
- Added the `DownloadDatasetAsCsv` function to retrieve a dataset as a CSV file using `catalogId`.
- Added the `GetFeatureDiscoveryRelationships` function to get the feature discovery relationships for a project.
R enhancements¶
- The function `RequestFeatureImpact` now accepts a `rowCount` argument, which changes the sample size used for Feature Impact calculations.
- The internal helper function `ValidateModel` was renamed to `ValidateAndReturnModel` and now works with model classes from the `apicore` package.
- The `quickrun` argument has been removed from the function `SetTarget`. Set `mode = AutopilotMode.Quick` instead.
- The Transferable Models family of functions (`ListTransferableModels`, `GetTransferableModel`, `RequestTransferableModel`, `DownloadTransferableModel`, `UploadTransferableModel`, `UpdateTransferableModel`, `DeleteTransferableModel`) has been removed. The underlying endpoints, which had long been deprecated, were removed from the Public API with the removal of the Standalone Scoring Engine (SSE).
- Removed files (code, tests, doc) representing parts of the Public API not present in v2.27-2.29.
Calculate Feature Impact for each backtest¶
Feature Impact provides a transparent overview of a model, especially in a model's compliance documentation. Time-dependent models trained on different backtests and holdout partitions can have different Feature Impact calculations for each backtest. Now generally available, you can calculate Feature Impact for each backtest using DataRobot's REST API, allowing you to inspect model stability over time by comparing Feature Impact scores from different backtests.
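One way to compare Feature Impact across backtests, once the per-backtest scores have been retrieved, is to rank features by how much their impact varies. The sketch below is illustrative only: it does not call the REST API, the input shape (a mapping from backtest name to per-feature scores) and the function name `impact_stability` are assumptions, and the sample numbers are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def impact_stability(
    impact_by_backtest: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float, float]]:
    """Return (feature, mean impact, std dev), least stable feature first."""
    features = set()
    for scores in impact_by_backtest.values():
        features.update(scores)
    rows = []
    for feature in features:
        # Assumption: a feature missing from a backtest counts as zero impact.
        values = [s.get(feature, 0.0) for s in impact_by_backtest.values()]
        rows.append((feature, mean(values), pstdev(values)))
    # Sort by spread (descending), then by name for a deterministic order.
    return sorted(rows, key=lambda row: (-row[2], row[0]))


# Hypothetical normalized impact scores, for illustration only.
example = {
    "backtest_0": {"price": 1.0, "promo": 0.4},
    "backtest_1": {"price": 1.0, "promo": 0.8},
}
for feature, avg, spread in impact_stability(example):
    print(f"{feature}: mean={avg:.2f} spread={spread:.2f}")
```

Features with a large spread are the ones whose influence shifts between backtests, which is usually where a stability review should start.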
Deprecation announcements¶
API deprecations¶
R deprecations¶
Review the breaking changes introduced in version 2.29:
- The `quickrun` argument has been removed from the function `SetTarget`. Set `mode = AutopilotMode.Quick` instead.
- The Transferable Models functions have been removed. Note that the underlying endpoints were also removed from the Public API with the removal of the Standalone Scoring Engine (SSE). The affected functions are listed below:
  - `ListTransferableModels`
  - `GetTransferableModel`
  - `RequestTransferableModel`
  - `DownloadTransferableModel`
  - `UploadTransferableModel`
  - `UpdateTransferableModel`
  - `DeleteTransferableModel`
Review the deprecations introduced in version 2.29:
- The Compliance Documentation API is deprecated. Use the Automated Documentation API instead.
Python deprecations¶
Review the deprecations introduced in version 3.0:
- `Project.set_target` has been removed. Use `Project.analyze_and_model` instead.
- `PredictJob.create` has been removed. Use `Model.request_predictions` instead.
- `Model.get_leaderboard_ui_permalink` has been removed. Use `Model.get_uri` instead.
- `Project.open_leaderboard_browser` has been removed. Use `Project.open_in_browser` instead.
- `ComplianceDocumentation` has been removed. Use `AutomatedDocument` instead.
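For code that has to run against both older and newer client versions during a migration, a small compatibility shim can prefer the replacement methods listed above and fall back to the removed ones. This is a sketch, not part of the client: `start_modeling` and `model_link` are hypothetical helpers, and they assume the old and new methods accept the same `target` keyword.

```python
from typing import Any


def start_modeling(project: Any, target: str, **kwargs: Any) -> Any:
    """Start modeling with whichever method the installed client provides."""
    if hasattr(project, "analyze_and_model"):
        # Python client v3.0 and later.
        return project.analyze_and_model(target=target, **kwargs)
    # Older clients that still ship the removed method.
    return project.set_target(target=target, **kwargs)


def model_link(model: Any) -> str:
    """Return a browser link for a model on either client generation."""
    if hasattr(model, "get_uri"):
        return model.get_uri()
    return model.get_leaderboard_ui_permalink()
```

Once every environment is on v3.0 or later, the fallback branches can be deleted and the replacement methods called directly.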
The following deprecations are introduced in version 3.1:
- Deprecated method `Project.create_from_hdfs`.
- Deprecated method `DatetimePartitioning.generate`.
- Deprecated parameter `in_use` from `ImageAugmentationList.create` as DataRobot will take care of it automatically.
- Deprecated property `Deployment.capabilities` from `Deployment`.
- `ImageAugmentationSample.compute` was removed in v3.1. You can get the same information with the method `ImageAugmentationList.compute_samples`.
- The `sample_id` parameter is now removed from `ImageAugmentationSample.list`. Please use `auglist_id` instead.
Hadoop is no longer available¶
Starting with version 9.0, you can only install DataRobot on Kubernetes. Dockerized, RPM, and Hadoop installations will no longer be available. Also, the ability to directly ingest data from HDFS for modeling and prediction is deprecated.