
AutoML (V7.2)¶

September 13, 2021

The DataRobot v7.2.0 release includes many new AutoML features and enhancements described in this section. See also the new features described in the time series (AutoTS) and MLOps release notes.

See these important deprecation announcements for information about changes to DataRobot's support for older, expiring functionality. This document also describes DataRobot's fixed issues.

In the spotlight...¶

The following features are some of the highlights of Release 7.2:

Purpose-built AI applications with the AI App Builder
Preview: External prediction insights
Preview: Bias and Fairness monitoring for deployments

User interface enhancements¶

This release introduces a new login experience for DataRobot platform application users. The page has been redesigned to reflect the innovation behind the company and product, without affecting the existing login workflow.

ROC Curve redesign¶

The ROC Curve tab has been redesigned to streamline the model evaluation strategies you can perform. Along with the Prediction Distribution graph, ROC curve, confusion matrix, and a summary of metrics, you can now generate profit curves, precision-recall curves, and custom charts in the ROC Curve tab.

For details, see ROC Curve.
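As a rough illustration of what a confusion matrix and a profit curve summarize, the sketch below counts outcomes at a classification threshold and weights them with a payoff matrix. It is not DataRobot's implementation; the function names, the sample data, and the payoff values are assumptions made for this example.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, tn, fn) for binary labels at the given threshold."""
    tp = fp = tn = fn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision_recall(tp, fp, fn):
    """Precision and recall, returning 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def profit(counts, payoff):
    """Total payoff for one confusion matrix (payoff values are hypothetical)."""
    tp, fp, tn, fn = counts
    return (tp * payoff["tp"] + fp * payoff["fp"]
            + tn * payoff["tn"] + fn * payoff["fn"])


# Made-up labels, predicted probabilities, and payoff per outcome.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
payoff = {"tp": 100.0, "fp": -20.0, "tn": 0.0, "fn": -50.0}

# A profit curve is the total payoff evaluated across thresholds.
profit_curve = [
    (t, profit(confusion_counts(y_true, y_prob, t), payoff))
    for t in (0.25, 0.5, 0.75)
]
```

Evaluating the same counts across many thresholds yields the points of a precision-recall curve or a profit curve. The threshold with the highest total payoff is a natural candidate for the operating point.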

To improve navigation, this release brings a new home for the project sharing and project name editing tools. The tools remain available from the project control center (Manage Projects), and you can now reach them more quickly from the project dropdown.

Data enhancements¶

New Spark version for improved performance¶

Release 7.2 upgrades the Spark version used for Feature Discovery and Spark SQL to Spark 3.0. In addition to Spark performance improvements, the upgrade brings improved JDBC compatibility with the AI Catalog (which uses Java 11) and a smaller shippable codebase. DataRobot now supports all drivers that are compatible with Java 8 or later.

Connect to Snowflake and Google BigQuery using OAuth¶

Snowflake and Google BigQuery users can now set up a data connection using OAuth single sign-on. Once configured, you can read data from production databases to use for model building and predictions. For details, see Data connection with OAuth.

Feature Discovery features¶

Feature Discovery Relationship Editor setup guide¶

With Feature Discovery, DataRobot generates new features from multiple datasets, so you don't need to consolidate them through manual feature engineering. Use the Relationship Editor to join the datasets in preparation for Feature Discovery.

The Relationship Editor setup guide is a new intermediate screen that displays when you click the Add datasets button on the EDA (Data) page. It walks you through the process of specifying prediction points for time-aware features and adding the datasets to be joined for Feature Discovery. For details, see Create a Feature Discovery project.

Feature Discovery engineering controls¶

Feature Discovery engineering controls, now publicly available, let you influence how DataRobot conducts feature engineering.

You can enable specific controls to use your domain knowledge to guide feature engineering or to improve accuracy. You might want to exclude specific transformations that slow down processing or are difficult to explain to stakeholders. For details, see Set feature engineering controls.

Feature Discovery settings enhanced¶

The Feature Discovery tab on the Data page provides dataset relationship details, a feature derivation summary, and a feature derivation log. You can now see the number of secondary datasets, explored features, and derived features that resulted from Feature Discovery. Click Show more to see which feature engineering controls were used during Feature Discovery and to learn about each.

For details, see Define relationships.

Categorical Statistics feature type¶

Categorical Statistics let you explore numeric statistics like sum, max, and average for each category of a categorical feature. In the following example, during Feature Discovery, DataRobot explores Spending numeric statistics for each category of the Product_Type feature:

Spending(30 days min)
Spending(30 days min by Product_Type = A)
Spending(30 days min by Product_Type = B)
Spending(30 days min by Product_Type = C)
...

Categorical Statistics aggregation is turned off by default. You can enable it on the Feature Engineering tab of the Feature Discovery Settings page. For details, see Categorical Statistics.
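To make the idea concrete, the following sketch computes a 30-day minimum of Spending overall and per Product_Type category, producing feature names in the style shown above. It only illustrates the aggregation and is not how DataRobot derives the features; the function name, the row layout, and the sample data are assumptions.

```python
from datetime import date, timedelta


def categorical_min_stats(rows, prediction_point, window_days=30):
    """Minimum Spending in the window before prediction_point, overall and per category.

    rows: iterable of (transaction_date, product_type, spending) tuples.
    """
    window_start = prediction_point - timedelta(days=window_days)
    overall = None
    by_category = {}
    for when, product_type, spending in rows:
        # Keep only rows inside the lookback window, before the prediction point.
        if not (window_start <= when < prediction_point):
            continue
        overall = spending if overall is None else min(overall, spending)
        current = by_category.get(product_type)
        by_category[product_type] = (
            spending if current is None else min(current, spending)
        )

    features = {}
    if overall is not None:
        features[f"Spending({window_days} days min)"] = overall
    for category in sorted(by_category):
        name = f"Spending({window_days} days min by Product_Type = {category})"
        features[name] = by_category[category]
    return features


# Made-up transactions; the last row falls outside the 30-day window.
rows = [
    (date(2021, 9, 1), "A", 40.0),
    (date(2021, 9, 5), "B", 15.0),
    (date(2021, 9, 10), "A", 25.0),
    (date(2021, 7, 1), "C", 5.0),
]
features = categorical_min_stats(rows, prediction_point=date(2021, 9, 13))
```

Because the only category C transaction is older than 30 days, no category C feature is produced for this prediction point.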

Modeling features¶

Purpose-built AI applications with the AI App Builder¶

The AI App Builder, available from the Applications tab, provides a no-code platform to enable core DataRobot services (making predictions, optimizing outcomes, simulating scenarios, and more) without having to build models and evaluate their performance in DataRobot.

Each application starts with a template and a data source, which is either a deployment or a dataset in the AI Catalog. From there, the App Builder lets you configure additional widgets, custom features, and pages to tailor the application to a specific use case.

Once deployed, applications can be shared easily, and users do not need a full DataRobot license to use them. This makes DataRobot's functionality available to more people across your organization.

Widgets¶

Applications are composed of widgets that create visual, interactive, and purpose-driven end-user applications. There are two types of widgets—chart widgets and header widgets. Chart widgets add visualizations to an application and can be configured to surface important insights in your data and prediction results. Header widgets provide additional filtering options for your application.

The What-if and Optimizer widget provides two tools for interacting with prediction results:

What-if: A decision-support tool that allows you to create and compare multiple prediction simulations to identify the option that provides the best outcome. You can also make predictions, then change one or more inputs to create a new simulation, and see how those changes affect the target feature.
Optimizer: Identifies the maximum or minimum predicted value for a target by varying the values of a selection of flexible features in the model.
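The Optimizer idea can be sketched as a search over combinations of flexible feature values, scoring each candidate row with the model. The example below uses an exhaustive grid search and a stand-in prediction function. The search strategy, the function names, and the feature names are all assumptions for illustration, not a description of how the widget works internally.

```python
from itertools import product


def optimize(predict, fixed, flexible, maximize=True):
    """Try every combination of flexible feature values and return the best row.

    predict:  callable taking a feature dict and returning a number.
    fixed:    features held constant.
    flexible: mapping of feature name to the candidate values to try.
    """
    names = list(flexible)
    best_row, best_value = None, None
    for combo in product(*(flexible[name] for name in names)):
        row = {**fixed, **dict(zip(names, combo))}
        value = predict(row)
        is_better = (
            best_value is None
            or (value > best_value if maximize else value < best_value)
        )
        if is_better:
            best_row, best_value = row, value
    return best_row, best_value


def toy_predict(row):
    """Hypothetical stand-in for a deployed model's prediction."""
    return 100 - (row["price"] - 12) ** 2 + 5 * row["promo"]


fixed = {"region": "north"}
flexible = {"price": [8, 10, 12, 14], "promo": [0, 1]}

best_max = optimize(toy_predict, fixed, flexible, maximize=True)
best_min = optimize(toy_predict, fixed, flexible, maximize=False)
```

A What-if simulation corresponds to a single call of `toy_predict` on a row with one or more inputs changed by hand, while the optimizer automates that comparison across many candidate rows.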

Word Cloud blueprints for multiclass projects¶

All Stochastic Gradient Descent (SGD) blueprints now create a Word Cloud when at least one text feature is present in a multiclass project. Previously, you had to manually run a specialized SGD blueprint from the Repository. Access the new visualizations from either the model's Describe > Word Cloud or Insights > Word Cloud tabs.

New Keras DeepCTR models available in the Repository¶

To support data scientists working with click-through rate (CTR) data, which typically contains high-cardinality categorical features, DataRobot introduces three DeepCTR models, available from the Repository. These models (Neural Factorization Machine, AutoInt, and Deep & Cross Network) can be particularly useful when building click-through rate or recommendation models.

Bias and Fairness improvements¶

With this release, DataRobot has upgraded the user experience for calculating Bias and Fairness for your models. The first improvement allows you to enable Bias and Fairness insights after modeling has already started. Select a model and navigate to Bias and Fairness > Settings. Once configured, Bias and Fairness insights are enabled for every model on the Leaderboard.

The second improvement is the ability to view multiple fairness metrics on the Per-Class Bias page. Use the dropdown menu to view fairness scores for each of the five fairness metrics.

For details, see the Bias and Fairness documentation.
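As a loose illustration of what a per-class fairness score measures, the sketch below compares the rate of favorable predictions across the classes of a protected feature and scales each rate against the highest one. This section does not list the five metrics, so treat the choice of metric, the function names, and the sample data as assumptions rather than DataRobot's definitions.

```python
def favorable_rate_by_class(protected, predictions, favorable=1):
    """Share of favorable predictions for each class of the protected feature."""
    totals = {}
    favorable_counts = {}
    for cls, pred in zip(protected, predictions):
        totals[cls] = totals.get(cls, 0) + 1
        if pred == favorable:
            favorable_counts[cls] = favorable_counts.get(cls, 0) + 1
    return {cls: favorable_counts.get(cls, 0) / totals[cls] for cls in totals}


def relative_scores(rates):
    """Scale each class rate by the highest rate, so the top class scores 1.0."""
    top = max(rates.values())
    return {cls: (rate / top if top else 0.0) for cls, rate in rates.items()}


# Made-up protected feature values and binary predictions (1 = favorable).
protected = ["F", "F", "F", "F", "M", "M", "M", "M"]
predictions = [1, 0, 0, 1, 1, 1, 1, 0]

rates = favorable_rate_by_class(protected, predictions)
scores = relative_scores(rates)
```

A class whose relative score falls well below 1.0 receives favorable predictions less often than the best-treated class, which is the kind of gap a per-class view is meant to surface.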

TLS options for Portable Prediction Server¶

By default, the Portable Prediction Server (PPS) serves predictions over an insecure listener on port :8080 (clear-text HTTP over TCP). You can now also serve predictions over a secure listener on port :8443 (HTTP over TLS/SSL, or simply HTTPS). When the secure listener is enabled, the insecure listener becomes unavailable. You configure the listener using environment variables, which are described in the documentation along with accompanying examples.
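On the client side, the practical effect is that the base URL changes with the listener in use. The helper below is a minimal sketch of that choice. The function name and the `tls_enabled` flag are hypothetical, and the server-side environment variables are deliberately left out because this section does not name them.

```python
def pps_base_url(host, tls_enabled):
    """Build the PPS base URL for the listener that is active.

    tls_enabled is a hypothetical client-side flag. Per the release note,
    enabling the secure listener (:8443) makes the insecure one (:8080)
    unavailable, so a client should target exactly one of them.
    """
    scheme, port = ("https", 8443) if tls_enabled else ("http", 8080)
    return f"{scheme}://{host}:{port}"


insecure_url = pps_base_url("localhost", tls_enabled=False)
secure_url = pps_base_url("localhost", tls_enabled=True)
```

Deriving the scheme and port from a single flag keeps the client from mixing them, for example by sending HTTPS requests to port 8080.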