Platform (V9.0)¶
The following sections describe each new feature and enhancement.
Platform enhancements¶
- With DataRobot release 9.0, deployments are supported only on Kubernetes. Version 9.0 supports OpenShift 4.10 and AWS EKS with Kubernetes v1.23. Older installation options (Dockerized, RPM, and Hadoop) are no longer supported. If you are not on a supported version of Kubernetes, continue to use the 8.x versions of DataRobot with Dockerized, RPM, or Hadoop installs.
- MinIO is no longer packaged with the DataRobot installation. You must provide and manage an S3 API-compatible object store for use with DataRobot.
- DataRobot no longer packages a container registry. You must provide a Docker registry for DataRobot containers.
Preview: DataRobot Notebooks¶
The DataRobot application now includes an in-browser editor to create and execute notebooks for data science analysis and modeling. Notebooks display computation results in various formats, including text, images, graphs, plots, tables, and more. You can customize output display by using open-source plugins. Cells can also contain Markdown rich text for commentary and explanation of the coding workflow. As you develop and edit a notebook, DataRobot stores a history of revisions that you can return to at any time.
DataRobot Notebooks offer a dashboard that hosts notebook creation, upload, and management. Individual notebooks have containerized, built-in environments with commonly used machine learning libraries that you can easily set up in a few clicks. Notebook environments seamlessly integrate with DataRobot's API, allowing a robust coding experience supported by keyboard shortcuts for cell functions, in-line documentation, and saved environment variables for secrets management and automatic authentication.
Preview documentation.
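As an illustration of the integrated coding experience, a notebook cell can call the DataRobot Python client directly. The sketch below assumes the notebook environment supplies the endpoint and API token (for example, via environment variables); it is not taken from the product documentation.

```python
# Minimal sketch of a notebook cell, assuming the environment provides
# DATAROBOT_ENDPOINT and DATAROBOT_API_TOKEN for automatic authentication.
import datarobot as dr

dr.Client()  # picks up credentials from environment variables or a config file

# List a few recent projects to confirm the connection works.
for project in dr.Project.list()[:5]:
    print(project.id, project.project_name)
```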
API enhancements¶
The following is a summary of new API features and enhancements. Go to the API Documentation home for more information on each client.
Tip
DataRobot highly recommends updating to the latest API client for Python and R.
Access DataRobot REST API documentation from docs.datarobot.com¶
DataRobot now offers REST API documentation available directly from the public documentation hub. Previously, REST API docs were only accessible through the application. Now, you can access information about REST endpoints and parameters in the API reference section of the public documentation site.
Python client v3.0¶
Version 3.0 of the DataRobot Python client is now generally available. This version introduces significant changes to common methods and usage of the client. The most prominent changes are listed below; view the changelog for a complete list of changes introduced in version 3.0.
Python client v3.0 new features¶
Some of the new features in version 3.0 are outlined below; a short usage sketch follows the list:
- Version 3.0 of the Python client does not support Python 3.6 and earlier versions. Version 3.0 currently supports Python 3.7+.
- The default Autopilot mode for the `project.start_autopilot` method has changed to `AUTOPILOT_MODE.QUICK`.
- Pass a file, file path, or DataFrame to a deployment to easily make batch predictions and return the results as a DataFrame using the new method `Deployment.predict_batch`.
- You can use a new method to retrieve the canonical URI for a project, model, deployment, or dataset: `Project.get_uri`, `Model.get_uri`, `Deployment.get_uri`, and `Dataset.get_uri`.
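A minimal sketch of the new batch prediction helper, assuming an existing deployment ID and a local scoring file (both placeholders below):

```python
import datarobot as dr
import pandas as pd

dr.Client()  # reads endpoint/token from a config file or environment variables

deployment = dr.Deployment.get("YOUR_DEPLOYMENT_ID")
scoring_df = pd.read_csv("to_score.csv")

# New in v3.0: score a DataFrame (or a file / file path) and get a DataFrame back.
predictions = deployment.predict_batch(scoring_df)
print(predictions.head())

# Also new in v3.0: canonical URIs for key objects.
print(deployment.get_uri())
```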
New methods for DataRobot projects¶
Review the new methods available for `datarobot.models.Project` (a short example follows the list):

- `Project.get_options` allows you to retrieve saved modeling options.
- `Project.set_options` saves `AdvancedOptions` values for use in modeling.
- `Project.analyze_and_model` initiates Autopilot or data analysis using data that has been uploaded to DataRobot.
- `Project.get_dataset` retrieves the dataset used to create the project.
- `Project.set_partitioning_method` creates the correct partition class for a regular project based on input arguments.
- `Project.set_datetime_partitioning` creates the correct partition class for a time series project.
- `Project.get_top_model` returns the highest-scoring model for a metric of your choice.
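A minimal sketch of the new project workflow, assuming a local training file and a target column name (placeholders), and assuming `get_top_model` accepts a `metric` keyword as described above:

```python
import datarobot as dr

dr.Client()

project = dr.Project.create(sourcedata="training.csv", project_name="v3 example")

# New in v3.0: start modeling (replaces the removed Project.set_target).
project.analyze_and_model(target="target_column")
project.wait_for_autopilot()

# New in v3.0: inspect saved modeling options.
print(project.get_options())

# New in v3.0: grab the top model for a metric and its canonical URI.
best_model = project.get_top_model(metric="LogLoss")
print(best_model.get_uri())
```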
Python client v3.1¶
The following API enhancements are introduced with version 3.1 of DataRobot's Python client:
- Added new methods `BatchPredictionJob.apply_time_series_data_prep_and_score` and `BatchPredictionJob.apply_time_series_data_prep_and_score_to_file`, which apply time series data prep to a file or dataset and make batch predictions with a deployment.
- Added new methods `DataEngineQueryGenerator.prepare_prediction_dataset` and `DataEngineQueryGenerator.prepare_prediction_dataset_from_catalog`, which apply time series data prep to a file or catalog dataset and upload the prediction dataset to a project.
- Added a new `max_wait` parameter to the method `Project.create_from_dataset`. Specify values larger than the default to avoid timeouts when creating a project from a dataset (see the sketch after this list).
- Added a new method, `Project.create_segmented_project_from_clustering_model`, for creating a segmented modeling project from an existing clustering project and model. Switch to this function if you previously used ModelPackage for segmented modeling.
- Added a new method, `is_unsupervised_clustering_or_multiclass`, for checking whether clustering or multiclass parameters are used without making extra API calls.
- Added the value `PREPARED_FOR_DEPLOYMENT` to the `RECOMMENDED_MODEL_TYPE` enum.
- Added two new methods to the `ImageAugmentationList` class: `ImageAugmentationList.list` and `ImageAugmentationList.update`.
- Added a `format` key to Batch Prediction intake and output settings for S3, GCP, and Azure.
- The method `PredictionExplanations.is_multiclass` now makes an additional API call to check for multiclass target validity, which adds a small delay.
- The `AdvancedOptions` parameter `blend_best_models` now defaults to false.
- The `AdvancedOptions` parameter `consider_blenders_in_recommendation` now defaults to false.
- `DatetimePartitioning` now has the parameter `unsupervised_mode`.
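A minimal sketch of the new `max_wait` parameter, assuming an AI Catalog dataset ID (placeholder below):

```python
import datarobot as dr

dr.Client()

# max_wait (seconds) is new in v3.1; raise it above the default to avoid
# timeouts when creating a project from a large catalog dataset.
project = dr.Project.create_from_dataset(
    dataset_id="YOUR_DATASET_ID",
    project_name="From AI Catalog",
    max_wait=3600,
)
print(project.get_uri())
```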
Preview: R client v2.29¶
Version 2.29 of the DataRobot R client is now available for preview. This version brings parity between the R client and version 2.29 of the Public API. As a result, it introduces significant changes to common methods and usage of the client. These changes are encapsulated in a new library (in addition to the `datarobot` library): `datarobot.apicore`, which provides auto-generated functions to access the Public API. The `datarobot` package provides a number of API wrapper functions around the `apicore` package to make it easier to use.
Reference the v2.29 documentation for more details on the new R client, including installation instructions, detailed method overviews, and reference documentation.
New R Functions¶
- Generated API wrapper functions are organized into categories based on their tags from the OpenAPI specification, which were themselves redone for the entire DataRobot Public API in v2.27.
- API wrapper functions use camel-cased argument names to be consistent with the rest of the package.
- Most function names follow a `VerbObject` pattern based on the OpenAPI specification.
- Some function names match "legacy" functions that existed in v2.18 of the R client when they invoke the same underlying endpoint. For example, the wrapper function for the endpoint `/projects/{projectId}/models/{modelId}` is called `GetModel`, not `RetrieveProjectsModels`, because `GetModel` is the name already implemented in the R client.
- Similarly, these functions use the same arguments as the corresponding "legacy" functions to ensure DataRobot does not break existing code calling those functions.
- The R client (both the `datarobot` and `datarobot.apicore` packages) outputs a warning when you attempt to access certain resources (projects, models, deployments, etc.) that are deprecated or disabled by the DataRobot platform migration to Python 3.
- Added the helper function `EditConfig`, which allows you to interactively modify `drconfig.yaml`.
- Added the `DownloadDatasetAsCsv` function to retrieve a dataset as a CSV file using `catalogId`.
- Added the `GetFeatureDiscoveryRelationships` function to get the feature discovery relationships for a project.
R enhancements¶
- The function `RequestFeatureImpact` now accepts a `rowCount` argument, which changes the sample size used for Feature Impact calculations.
- The internal helper function `ValidateModel` was renamed to `ValidateAndReturnModel` and now works with model classes from the `apicore` package.
- The `quickrun` argument has been removed from the function `SetTarget`. Set `mode = AutopilotMode.Quick` instead.
- The Transferable Models family of functions (`ListTransferableModels`, `GetTransferableModel`, `RequestTransferableModel`, `DownloadTransferableModel`, `UploadTransferableModel`, `UpdateTransferableModel`, `DeleteTransferableModel`) has been removed. The underlying endpoints, long deprecated, were removed from the Public API with the removal of the Standalone Scoring Engine (SSE).
- Removed files (code, tests, docs) representing parts of the Public API not present in v2.27-2.29.
Calculate Feature Impact for each backtest¶
Feature Impact provides a transparent overview of which features most influence a model's predictions, and it is a key component of a model's compliance documentation. Time-dependent models trained on different backtests and holdout partitions can have different Feature Impact results for each backtest. Now generally available, you can calculate Feature Impact for each backtest using DataRobot's REST API, allowing you to inspect model stability over time by comparing Feature Impact scores across backtests.
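As an illustration only: the route and request body below are placeholders, not the documented endpoint; consult the REST API reference for the exact per-backtest Feature Impact request. The sketch reuses the Python client's low-level REST helper so that authentication is handled for you.

```python
import datarobot as dr

client = dr.Client()  # the Client object doubles as a low-level REST helper

PROJECT_ID = "YOUR_PROJECT_ID"  # placeholder
MODEL_ID = "YOUR_MODEL_ID"      # placeholder

# Hypothetical route and payload, shown only to illustrate making a raw REST
# call; check the API reference for the actual per-backtest Feature Impact path.
response = client.post(
    f"projects/{PROJECT_ID}/models/{MODEL_ID}/featureImpact/",
    data={"backtestIndex": "0"},
)
print(response.status_code)
```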
Deprecation announcements¶
API deprecations¶
R deprecations¶
Review the breaking changes introduced in version 2.29:
- The `quickrun` argument has been removed from the function `SetTarget`. Set `mode = AutopilotMode.Quick` instead.
- The Transferable Models functions have been removed. Note that the underlying endpoints were also removed from the Public API with the removal of the Standalone Scoring Engine (SSE). The affected functions are listed below:
    - `ListTransferableModels`
    - `GetTransferableModel`
    - `RequestTransferableModel`
    - `DownloadTransferableModel`
    - `UploadTransferableModel`
    - `UpdateTransferableModel`
    - `DeleteTransferableModels`
Review the deprecations introduced in version 2.29:
- The Compliance Documentation API is deprecated. Use the Automated Documentation API instead.
Python deprecations¶
Review the deprecations introduced in version 3.0:
- `Project.set_target` has been removed. Use `Project.analyze_and_model` instead.
- `PredictJob.create` has been removed. Use `Model.request_predictions` instead (see the migration sketch after this list).
- `Model.get_leaderboard_ui_permalink` has been removed. Use `Model.get_uri` instead.
- `Project.open_leaderboard_browser` has been removed. Use `Project.open_in_browser` instead.
- `ComplianceDocumentation` has been removed. Use `AutomatedDocument` instead.
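A minimal migration sketch for project-based predictions, assuming an existing project ID and a local scoring file (placeholders below):

```python
import datarobot as dr

dr.Client()

project = dr.Project.get("YOUR_PROJECT_ID")
model = project.get_top_model()  # new in v3.0

# PredictJob.create was removed in v3.0; request predictions from the model instead.
prediction_dataset = project.upload_dataset("to_score.csv")
predict_job = model.request_predictions(prediction_dataset.id)
predictions = predict_job.get_result_when_complete()
print(predictions.head())
```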
The following deprecations are introduced in version 3.1:
- Deprecated the method `Project.create_from_hdfs`.
- Deprecated the method `DatetimePartitioning.generate`.
- Deprecated the parameter `in_use` from `ImageAugmentationList.create`; DataRobot now handles it automatically.
- Deprecated the property `Deployment.capabilities` from `Deployment`.
- `ImageAugmentationSample.compute` was removed in v3.1. You can get the same information with the method `ImageAugmentationList.compute_samples`.
- The `sample_id` parameter has been removed from `ImageAugmentationSample.list`. Use `auglist_id` instead.
Hadoop is no longer available¶
Starting with version 9.0, you can only install DataRobot on Kubernetes. Dockerized, RPM, and Hadoop installations will no longer be available. Also, the ability to directly ingest data from HDFS for modeling and prediction is deprecated.
