

February 2024

February 28, 2024

This page provides announcements of newly released features available in DataRobot's SaaS single- and multi-tenant AI Platform, with links to additional resources. More release information is available from the release center.

In the spotlight

Manage notebook file systems with Codespaces

Codespaces have been added to DataRobot Workbench to enhance the code-first experience, especially when working with DataRobot Notebooks. A codespace, similar to the file tree of a repository or folder, can contain any number of files and nested folders. Within a codespace, you can open, view, and edit multiple notebook and non-notebook files at the same time. You can also execute multiple notebooks in the same container session (with each notebook running on its own kernel). In addition to persistent file storage, the codespace interface includes a file editor and integrated terminal for an advanced code development experience. Because codespaces are Git-compatible, you can version your notebook and non-notebook files in external Git repositories using the Git CLI.

Video: Codespaces

Preview documentation.

Feature flag ON by default: Enable Codespaces

February features

The new features for this release are described below, grouped by release status (GA and preview). Premium features are noted in their descriptions.

GA

All wrangling operations generally available

The Join and Aggregate wrangling operations, previously available for preview, are now generally available in Workbench.

New secure sharing workflow

To reinforce data security, further protect your sensitive information, and streamline the sharing process, resources can now only be shared between users from the same organization. To learn how to add new users to your organization, as well as manage existing users, see the documentation on tenant isolation and collaboration.

Note that these changes only impact multi-tenant SaaS users. If they are extended to other deployment environments, you will be notified and the documentation will be updated.

Sliced insights introduces two new operators

With this deployment, the data slice functionality, which allows you to view a subpopulation of a model's data based on feature values, adds two new filter options: between and not between. Selecting either opens a modal that lets you set an inclusive range of values.

Real-time notifications for deployments

DataRobot provides automated monitoring with a notification system, allowing you to configure alerts triggered when service health, data drift status, model accuracy, or fairness values deviate from your organization's accepted values. Real-time notifications for these status alerts are now generally available, allowing your organization to quickly respond to changes in model health without waiting for scheduled health status notifications.

Timeliness indicators for predictions and actuals

Deployments have several statuses to define their general health, including service health, data drift, and accuracy. These statuses are calculated based on the most recent available data. For deployments relying on batch predictions made in intervals greater than 24 hours, this method can result in an unknown status value on the prediction health indicators in the deployment inventory. Now generally available, deployment health indicators can retain the most recently calculated health status, presented along with timeliness status indicators to reveal when they are based on old data. You can determine the appropriate timeliness intervals for your deployments on a case-by-case basis. Once you've enabled timeliness tracking on a deployment's Usage > Settings tab, you can view timeliness indicators on the Usage tab and in the Deployments inventory.

Column name remapping for batch predictions

When configuring one-time or recurring batch predictions, you can change column names in the prediction job's output by mapping them to entries added in the Column names remapping section of the Prediction options. Click + Add column name remapping and define the Input column name to replace with the specified Output column name in the prediction output.
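
If you configure the same job through the DataRobot Python client instead of the UI, the remapping can be passed when scoring. The sketch below is illustrative only: the deployment ID, file paths, and column names are placeholders, and column_names_remapping is assumed to accept a dict of input-to-output column names.

```python
import datarobot as dr

dr.Client()  # reads the endpoint and token from your DataRobot configuration

deployment = dr.Deployment.get("<deployment-id>")  # placeholder deployment ID

job = dr.BatchPredictionJob.score(
    deployment,
    intake_settings={"type": "localFile", "file": "to_score.csv"},
    output_settings={"type": "localFile", "path": "scored.csv"},
    # Assumed shape: map input column names to the names written in the output.
    column_names_remapping={"readmitted_1_PREDICTION": "readmission_risk"},
)
job.wait_for_completion()
```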

Global models in the Registry

Now available as a premium feature, global models are pre-trained models that you can deploy for predictive or generative use cases from the Registry (NextGen) and Model Registry (Classic). These high-quality, open-source models are trained and ready for deployment, allowing you to make predictions immediately after installing DataRobot. For GenAI use cases, you can find classifiers that identify prompt injection, toxicity, and sentiment, as well as a regressor that outputs a refusal score. Global models are available to all users; however, only administrators have edit rights. To identify global models, locate the Global column on the Registry > Model directory page and look for models marked Yes.

New environment variables for custom models

When you use a drop-in environment or a custom environment built on DRUM, your custom model code can reference several environment variables injected to facilitate access to the DataRobot client and MLOps connected client. The DATAROBOT_ENDPOINT and DATAROBOT_API_TOKEN environment variables require public network access, a premium feature available in NextGen and DataRobot Classic.

  • MLOPS_DEPLOYMENT_ID: If a custom model is running in deployment mode (i.e., the custom model is deployed), the deployment ID is available.
  • DATAROBOT_ENDPOINT: If a custom model has public network access, the DataRobot endpoint URL is available.
  • DATAROBOT_API_TOKEN: If a custom model has public network access, your DataRobot API token is available.
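
For illustration, a minimal sketch of custom model code that reads these variables to connect the DataRobot Python client, assuming the datarobot package is installed in the model's environment:

```python
import os

import datarobot as dr

# DATAROBOT_ENDPOINT and DATAROBOT_API_TOKEN are injected only when the
# custom model has public network access enabled.
endpoint = os.environ.get("DATAROBOT_ENDPOINT")
token = os.environ.get("DATAROBOT_API_TOKEN")
if endpoint and token:
    client = dr.Client(endpoint=endpoint, token=token)

# MLOPS_DEPLOYMENT_ID is available only when the custom model is deployed.
deployment_id = os.environ.get("MLOPS_DEPLOYMENT_ID")
if deployment_id:
    print(f"Running as deployment {deployment_id}")
```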

Custom apps now generally available

Custom applications are now generally available. You can create custom applications in DataRobot to share machine learning projects using web application frameworks such as Streamlit, Dash, and R Shiny, from a Docker image. You can also use DRApps (a simple command line interface) to host a custom application in DataRobot using a DataRobot execution environment, allowing you to run apps without building your own Docker image. Custom applications don't provide any storage; however, you can access the full DataRobot API and other services. With this release, custom applications are paused after a period of inactivity; the first time you access a paused custom application, a loading screen appears while it restarts.

Preview

Additional support for ingestion of Parquet files

DataRobot now supports ingestion of Parquet files in the AI Catalog, training datasets, and predictions datasets. The following Parquet file types are supported:

  • Single Parquet files
  • Single zipped Parquet files
  • Multiple Parquet files (registered as separate datasets)
  • Zipped multi-Parquet file (merged to create a single dataset in DataRobot)
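
As a minimal sketch, a single Parquet file can also be registered programmatically with the DataRobot Python client; the file path below is a placeholder.

```python
import datarobot as dr

dr.Client()  # reads the endpoint and token from your DataRobot configuration

# Upload a local Parquet file; DataRobot registers it as an AI Catalog dataset.
dataset = dr.Dataset.create_from_file(file_path="transactions.parquet")
print(dataset.id, dataset.name)
```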

Feature flag ON by default: Enable Parquet File Ingestion

ADLS Gen2 connector added to DataRobot

Support for the ADLS Gen2 native connector has been added to both DataRobot Classic and Workbench, allowing you to:

  • Create and configure data connections.
  • Add ADLS Gen2 datasets.

Preview documentation.

Feature flag ON by default: Enable ADLS Gen2 Connector

Automated deployment and replacement in SageMaker

You can now create a DataRobot-managed SageMaker prediction environment to deploy custom models and Scoring Code in SageMaker with real-time inference and serverless inference. With DataRobot management enabled, the external SageMaker deployment has access to MLOps management, including automatic model replacement.

For more information, see the documentation.

Feature flag OFF by default: Enable the Automated Deployment and Replacement of Custom Models in Sagemaker

Updated layout for the NextGen Console

This update to the NextGen Console provides important monitoring, predictions, and mitigation features in a modern user interface with a new and intuitive layout. This updated layout provides a seamless transition from model experimentation in Workbench and registration in Registry, to model monitoring and management in Console—while maintaining the features and functionality available in DataRobot Classic.

Preview documentation.

Feature flag OFF by default: Enable Updated Console Layout

Additional columns in custom model output

The score() hook can return any number of extra columns, containing data of types string, int, float, bool, or datetime. When additional columns are returned through the score() method, the prediction response is as follows:

  • For a tabular response (CSV), the additional columns are returned as part of the response table or dataframe.
  • For a JSON response, the extraModelOutput key is returned alongside each row. This key is a dictionary containing the values of each additional column in the row.
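
For example, a DRUM score() hook for a regression custom model might return an extra column alongside the predictions. This is a minimal sketch; the extra column name is an assumption.

```python
import pandas as pd

def score(data: pd.DataFrame, model, **kwargs) -> pd.DataFrame:
    """DRUM scoring hook: return predictions plus one extra string column."""
    predictions = model.predict(data)
    return pd.DataFrame({
        "Predictions": predictions,  # standard regression prediction column
        # Extra column returned with every row; it appears in the CSV output
        # or under the extraModelOutput key in a JSON response.
        "scored_at": pd.Timestamp.now(tz="UTC").isoformat(),
    })
```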

Preview documentation.

Feature flag OFF by default: Enable Additional Custom Model Output in Prediction Responses

New runtime parameter definition options for custom models

When you create runtime parameters for custom models through the model metadata, you can now set the type key to boolean or numeric, in addition to string or credential. You can also add the following new, optional, runtimeParameterDefinitions in model-metadata.yaml:

  • defaultValue: Set the default string value for the runtime parameter (the credential type doesn't support default values).
  • minValue: For numeric runtime parameters, set the minimum numeric value allowed in the runtime parameter.
  • maxValue: For numeric runtime parameters, set the maximum numeric value allowed in the runtime parameter.
  • allowEmpty: Set the empty field policy for the runtime parameter:
      • True: (Default) Allows an empty runtime parameter.
      • False: Enforces providing a value for the runtime parameter before deployment.
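
Once defined, these parameters can be read in custom model code with the RuntimeParameters accessor from the DRUM library; in this minimal sketch, the parameter names max_rows and verbose_logging are assumptions.

```python
from datarobot_drum import RuntimeParameters

# Read a numeric runtime parameter defined in model-metadata.yaml; fall back
# to a local default if allowEmpty permitted an empty value.
max_rows = RuntimeParameters.get("max_rows") or 1000

# Read a boolean runtime parameter.
verbose_logging = bool(RuntimeParameters.get("verbose_logging"))
```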

For more information, see the documentation.

Feature flag OFF by default: Enable the Injection of Runtime Parameters for Custom Models

All product and company names are trademarks™ or registered® trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.


Updated November 8, 2024