November 2022¶
November 22, 2022
The latest deployment of DataRobot's managed AI Platform delivered the following new GA and preview features. See the deployment history for past feature announcements.
Features grouped by capability
Name | GA | Preview |
---|---|---|
Modeling | | |
Text Prediction Explanations now GA | ✔ | |
Changes to blender model defaults | ✔ | |
Japanese compliance documentation now generally available, more complete | ✔ | |
Prediction Explanations for cluster models | | ✔ |
Use Cases tab renamed to Value Tracker | ✔ | |
Predictions and MLOps | | |
Environment limit management for custom models | ✔ | |
Dynamically load required agent spoolers in a Java application | ✔ | |
API enhancements | | |
R client v2.29 | | ✔ |
NumPy library to be upgraded in December¶
DataRobot is upgrading the Python library `numpy` during the week of December 11, 2022. Users should not experience backward compatibility issues as a result of this change.
The `numpy` library handles various numerical transformations related to data processing and preparation in the platform. Upgrading the `numpy` library is a proactive step to address common vulnerabilities and exposures (CVEs). DataRobot regularly upgrades libraries to improve speed, security, and predictive performance.
Testing indicates that a subset of users may experience small changes in model predictions as a result of the upgrade. Only users who have trained and deployed a model using `.xpt` or `.xport` file formats may see predictions change. In cases where predictions change, the difference in prediction values is typically less than 1%. These changes are due to incremental differences in the treatment of floats between the current and target upgrade versions of the `numpy` library.
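One way to check whether a deployment is affected is to score the same rows before and after the upgrade and compare the results. The following is a minimal sketch, assuming both sets of predictions have already been exported as plain lists of floats. The function name `max_relative_difference` and the sample values are hypothetical, and the 1% threshold mirrors the typical bound stated above rather than a guaranteed limit.

```python
def max_relative_difference(before, after):
    """Return the largest relative change between paired predictions."""
    if len(before) != len(after):
        raise ValueError("prediction lists must have the same length")
    worst = 0.0
    for old, new in zip(before, after):
        # Guard against division by zero for predictions that are exactly 0.
        denom = max(abs(old), 1e-12)
        worst = max(worst, abs(new - old) / denom)
    return worst


# Hypothetical predictions scored before and after the upgrade.
before = [0.250, 0.800, 0.125]
after = [0.251, 0.800, 0.125]

diff = max_relative_difference(before, after)
print(f"largest relative change: {diff:.4%}")
print("within 1%" if diff < 0.01 else "exceeds 1%")
```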
GA¶
Text Prediction Explanations now GA¶
Text Prediction Explanations show how individual words (n-grams) in a text feature influence predictions, helping you validate and understand the model and the importance it places on words. They use the standard color bar spectrum of blue (negative) to red (positive) impact to visualize your text, and display n-grams not recognized by the model in grey. Text Prediction Explanations, either XEMP or SHAP, are run by default when text is present in a dataset.
Changes to blender model defaults¶
This release brings changes to the default behavior of blender models. A blender (or ensemble) model combines the predictions of two or more models, potentially improving accuracy. DataRobot can automatically create these models at the end of Autopilot when the Create blenders from top models advanced option is enabled. Previously the default setting was to enable creating blenders automatically; now, the default is not to build these models.
Additionally, the number of models allowed when creating blenders, either automatically or manually, has changed. Previously, there was first no limit and later a three-model maximum on the number of contributing models; the limit now allows up to eight models per blender.
Finally, the automatic creation of advanced blenders has been removed. These blenders used a backward stage-wise process to eliminate models when doing so benefited the blend's cross-validation score. The affected blender types are:
- Advanced Average (AVG) Blend
- Advanced Generalized Linear Model (GLM) Blend
- Advanced Elastic Net (ENET) Blend
The following blender types are currently in the process of deprecation:
Blender | Deprecation status |
---|---|
Random Forest Blend (RF) | Existing RF blenders continue to work; you cannot create new RF blenders. |
Light Gradient Boosting Machine Blend (LGBM) | Existing LGBM blenders continue to work; you cannot create new LGBM blenders. |
TensorFlow Blend (TF) | Existing TF blenders do not work; you cannot create new TF blenders. |
These changes have been made in response to customer feedback. Because blenders can extend build times and cause deployment issues, the changes ensure that these impacts only affect those users needing the capability. Testing has determined that, in most cases, the accuracy gain does not justify the extended runtimes imposed on Autopilot. For data scientists who need blender capabilities, manual blending is not affected.
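As a rough illustration of what an average blend computes, the sketch below averages row-wise predictions from several models and enforces the eight-model maximum described in this section. It is not DataRobot's implementation; the function name `average_blend`, the constant `MAX_BLENDER_MODELS`, and the sample predictions are hypothetical.

```python
MAX_BLENDER_MODELS = 8  # per-blender maximum stated in this release note


def average_blend(model_predictions):
    """Average row-wise predictions from two or more models."""
    n_models = len(model_predictions)
    if n_models < 2:
        raise ValueError("a blender needs at least two models")
    if n_models > MAX_BLENDER_MODELS:
        raise ValueError(f"at most {MAX_BLENDER_MODELS} models per blender")
    n_rows = len(model_predictions[0])
    if any(len(preds) != n_rows for preds in model_predictions):
        raise ValueError("all models must score the same rows")
    # zip(*...) groups the predictions of every model for each row.
    return [sum(row) / n_models for row in zip(*model_predictions)]


# Hypothetical predictions from two models on the same three rows.
preds_a = [0.25, 0.5, 1.0]
preds_b = [0.75, 0.5, 0.5]
blended = average_blend([preds_a, preds_b])
print(blended)
```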
Japanese compliance documentation now generally available, more complete¶
With this release, model compliance documentation is generally available in Japanese. Japanese-language users can generate, for each model, individualized documentation that provides comprehensive guidance on what constitutes effective model risk management, and download it as an editable Microsoft Word document. In the preview version, some sections were untranslated and therefore removed from the report. The following previously untranslated sections are now translated and available for binary classification and multiclass projects:
- Bias and Fairness
- Lift Chart
- Accuracy
Anomaly detection compliance information is not yet translated and is not included. It is available in English if the information is required. Compliance Reports are a premium feature; contact your DataRobot representative for information on availability.
Environment limit management for custom models¶
The execution environment limit allows administrators to control how many custom model environments a user can add to the Custom Model Workshop. In addition, the execution environment version limit allows administrators to control how many versions a user can add to each of those environments. These limits can be:
- Directly applied to the user: Set in a user's permissions. Overrides the limits set in the group and organization permissions (if the user limit value is lower).
- Inherited from a user group: Set in the permissions of the group a user belongs to. Overrides the limits set in organization permissions (if the user group limit value is lower).
- Inherited from an organization: Set in the permissions of the organization a user belongs to.
If the environment or environment version limits are defined for an organization or a group, the users within that organization or group inherit the defined limits. However, a more specific definition of those limits at a lower level takes precedence. For example, an organization may have the environment limits set to 5, a group to 4, and the user to 3; in this scenario, the final limit for the individual user is 3. For more information on adding custom model execution environments, see the Custom model environment documentation.
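The precedence rule can be sketched in a few lines. This is an illustration only, assuming that a more specific value overrides a broader one only when it is lower, so that the smallest defined value applies; the function name `effective_limit` and the use of `None` for an unset limit are hypothetical and not part of the DataRobot API.

```python
def effective_limit(user=None, group=None, organization=None):
    """Resolve the limit that applies to a user; None means 'not set'."""
    defined = [v for v in (user, group, organization) if v is not None]
    # Assumption: a more specific limit only takes precedence when it is
    # lower, so the smallest defined value is the one that applies.
    return min(defined) if defined else None


# The example from the text: organization 5, group 4, user 3.
print(effective_limit(user=3, group=4, organization=5))  # → 3
# With no user-level limit, the group value applies.
print(effective_limit(group=4, organization=5))  # → 4
```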
Any user can view their environment and environment version limits. On the Custom Models > Environments tab, next to the Add new environment and the New version buttons, a badge indicates how many environments (or environment versions) you've added and how many environments (or environment versions) you can add based on the environment limit:
The following status categories are available for this badge:
Badge | Description |
---|---|
(badge image) | The number of environments (or versions) is less than 75% of the limit. |
(badge image) | The number of environments (or versions) is equal to or greater than 75% of the limit. |
(badge image) | The number of environments (or versions) has reached the limit. |
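The three badge categories map onto a simple threshold check. The sketch below is illustrative only; the function name `badge_status` and the returned labels are hypothetical, since the badge appearance is not described in the text.

```python
def badge_status(count, limit):
    """Classify usage against a limit using the 75% threshold."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if count >= limit:
        return "limit reached"
    # Integer comparison equivalent to count / limit >= 0.75,
    # which avoids floating-point rounding at the boundary.
    if count * 4 >= limit * 3:
        return "at or above 75%"
    return "below 75%"


for used in (3, 4, 5):
    print(used, badge_status(used, 5))
```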
With the correct permissions, an administrator can set these limits at a user or group level. For a user or a group, on the Permissions tab, click Platform, and then click Admin Controls. Next, under Admin Controls, set either or both of the following settings:
- Execution Environments limit: The maximum number of custom model execution environments users in this group can add.
- Execution Environments versions limit: The maximum number of versions users in this group can add to each custom model execution environment.
For more information, see the Manage user execution environment limits documentation (or the Manage group execution environment limits documentation).
Dynamically load required agent spoolers in a Java application¶
Dynamically loading third-party Monitoring Agent spoolers in your Java application improves security by removing unused code. This functionality works by loading a separate JAR file for the Amazon SQS, RabbitMQ, Google Cloud Pub/Sub, and Apache Kafka spoolers, as needed. The natively supported file system spooler is still configurable without loading a JAR file. Previously, the `datarobot-mlops` and `mlops-agent` packages included all spooler types by default.
To use a third-party spooler in your MLOps Java application, you must include the required spoolers as dependencies in your POM (Project Object Model) file, along with `datarobot-mlops`:
```xml
<properties>
    <mlops.version>8.3.0</mlops.version>
</properties>
<dependency>
    <groupId>com.datarobot</groupId>
    <artifactId>datarobot-mlops</artifactId>
    <version>${mlops.version}</version>
</dependency>
<dependency>
    <groupId>com.datarobot</groupId>
    <artifactId>spooler-sqs</artifactId>
    <version>${mlops.version}</version>
</dependency>
```
The spooler JAR files are included in the MLOps agent tarball. They are also available individually as downloadable JAR files in the public Maven repository for the DataRobot MLOps Agent.
To use a third-party spooler with the executable agent JAR file, add the path to the spooler to the classpath:
```sh
java ... -cp path/to/mlops-agent-8.3.0.jar:path/to/spooler-kafka-8.3.0.jar com.datarobot.mlops.agent.Agent
```
The `start-agent.sh` script provided as an example automatically performs this task, adding any spooler JAR files found in the `lib` directory to the classpath. If your spooler JAR files are in a different directory, set the `MLOPS_SPOOLER_JAR_PATH` environment variable.
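The classpath assembly can be illustrated as follows. This is a sketch of the idea, not the contents of `start-agent.sh`: it assumes that spooler JAR files match the pattern `spooler-*.jar` (inferred from the file name in the command above) and that `MLOPS_SPOOLER_JAR_PATH` points at a directory. The helper names `build_classpath` and `spooler_dir_from_env` are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def spooler_dir_from_env(default="lib"):
    """Return the spooler directory, honoring MLOPS_SPOOLER_JAR_PATH if set."""
    return os.environ.get("MLOPS_SPOOLER_JAR_PATH", default)


def build_classpath(agent_jar, spooler_dir):
    """Join the agent JAR and any spooler JARs into a Java classpath string."""
    # Assumption: spooler JAR files are named spooler-<type>-<version>.jar.
    spoolers = sorted(Path(spooler_dir).glob("spooler-*.jar"))
    # os.pathsep is ":" on Linux and macOS, matching the command above.
    return os.pathsep.join([str(agent_jar), *map(str, spoolers)])


# Demonstration with a temporary directory standing in for lib/.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("spooler-kafka-8.3.0.jar", "spooler-sqs-8.3.0.jar", "notes.txt"):
        (Path(tmp) / name).touch()
    classpath = build_classpath("path/to/mlops-agent-8.3.0.jar", tmp)
    print(classpath)
```

The resulting string would be passed to `java -cp`, followed by the `com.datarobot.mlops.agent.Agent` main class as in the command above.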
For more information, see the Dynamically load required spoolers in a Java application documentation.
Use Cases tab renamed to Value Tracker¶
With this release, the Use Cases tab at the top of DataRobot is now the Value Tracker. While the functionality remains the same, all instances of “use cases” in this feature have been replaced by “value tracker.”
See the Value Tracker documentation for more information.
Preview¶
Prediction Explanations for cluster models¶
Now available for preview, you can use Prediction Explanations with clustering to uncover which factors most contributed to any given row’s cluster assignment. With this insight, you can easily explain clustering model outcomes to stakeholders and identify high-impact factors to help focus their business strategies.
Functioning very much like multiclass Prediction Explanations—but reporting on clusters instead of classes—cluster explanations are available from both the Leaderboard and deployments when enabled. They are available for all XEMP-based clustering projects and are not available with time series.
Required feature flag: Enable Clustering Prediction Explanations
Preview documentation.
API enhancements¶
The following is a summary of new API features and enhancements. Go to the API documentation user guide for more information on each client.
Tip
DataRobot highly recommends updating to the latest API client for Python and R.
Preview: R client v2.29¶
Now available for preview, DataRobot has released version 2.29 of the R client. This version brings parity between the R client and version 2.29 of the Public API. As a result, it introduces significant changes to common methods and usage of the client. These changes are encapsulated in a new library (in addition to the `datarobot` library): `datarobot.apicore`, which provides auto-generated functions to access the Public API. The `datarobot` package provides a number of API wrapper functions around the `apicore` package to make it easier to use.
Reference the v2.29 documentation for more details on the new R client, including installation instructions, detailed method overviews, and reference documentation.
New R Functions¶
- Generated API wrapper functions are organized into categories based on their tags from the OpenAPI specification, which were themselves redone for the entire DataRobot Public API in v2.27.
- API wrapper functions use camel-cased argument names to be consistent with the rest of the package.
- Most function names follow a `VerbObject` pattern based on the OpenAPI specification.
- Some function names match "legacy" functions that existed in v2.18 of the R Client if they invoked the same underlying endpoint. For example, the wrapper function is called `GetModel`, not `RetrieveProjectsModels`, since the former is what was implemented in the R client for the endpoint `/projects/{mId}/models/{mId}`.
- Similarly, these functions use the same arguments as the corresponding "legacy" functions so that existing code calling those functions does not break.
- The R client (both `datarobot` and `datarobot.apicore` packages) outputs a warning when you attempt to access certain resources (projects, models, deployments, etc.) that are deprecated or disabled by the DataRobot platform migration to Python 3.
- Added the helper function `EditConfig` that allows you to interactively modify `drconfig.yaml`.
- Added the `DownloadDatasetAsCsv` function to retrieve a dataset as a CSV file using `catalogId`.
- Added the `GetFeatureDiscoveryRelationships` function to get the feature discovery relationships for a project.
R enhancements¶
- The function `RequestFeatureImpact` now accepts a `rowCount` argument, which will change the sample size used for Feature Impact calculations.
- The internal helper function `ValidateModel` was renamed to `ValidateAndReturnModel` and now works with model classes from the `apicore` package.
- The `quickrun` argument has been removed from the function `SetTarget`. Set `mode = AutopilotMode.Quick` instead.
- The Transferable Models family of functions (`ListTransferableModels`, `GetTransferableModel`, `RequestTransferableModel`, `DownloadTransferableModel`, `UploadTransferableModel`, `UpdateTransferableModel`, `DeleteTransferableModel`) have been removed. The underlying endpoints, long deprecated, were removed from the Public API with the removal of the Standalone Scoring Engine (SSE).
- Removed files (code, tests, doc) representing parts of the Public API not present in v2.27-2.29.
R deprecations¶
Review the breaking changes introduced in version 2.29:
- The `quickrun` argument has been removed from the function `SetTarget`. Set `mode = AutopilotMode.Quick` instead.
- The Transferable Models functions have been removed. Note that the underlying endpoints were also removed from the Public API with the removal of the Standalone Scoring Engine (SSE). The affected functions are listed below:
    - `ListTransferableModels`
    - `GetTransferableModel`
    - `RequestTransferableModel`
    - `DownloadTransferableModel`
    - `UploadTransferableModel`
    - `UpdateTransferableModel`
    - `DeleteTransferableModel`
Review the deprecations introduced in version 2.29:
- Compliance Documentation API is deprecated. Instead use the Automated Documentation API.
Deprecation announcements¶
Current status of Python 2 deprecation and removal¶
As of the November 2022 release, the following describes the state of the Python 2 removal:
- Python 2 projects and models are disabled and no longer support Leaderboard predictions.
- Python 2-based model deployments are disabled with the exception of organizations that requested an extension for the frozen runtime.
See the guide for detailed information on Python 2 deprecation and migration to Python 3.
All product and company names are trademarks™ or registered® trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.