
September 2024

September 25, 2024

This page provides announcements of newly released features available in DataRobot's SaaS single- and multi-tenant AI Platform, with links to additional resources.

September features

The following sections describe each new feature, grouped by capability.

GA

Azure OpenAI GPT-4o LLM now available

With this deployment, the Azure OpenAI GPT-4o (“omni”) LLM is available from the playground. The multimodal GPT-4o handles text inputs efficiently, with faster text generation, lower overhead, and better support for non-English languages. Its addition reflects DataRobot’s commitment to delivering newly released LLMs as they become available. A list of available LLMs is maintained here.

Wrangling enhancements added to Workbench

This release introduces the following improvements to data wrangling in Workbench:

  • The Remove features operation allows you to select all/deselect all features.
  • You can import operations from an existing recipe, either at the beginning or during a wrangling session.
  • Access settings for the live preview from the Preview settings button on the wrangling page.
  • Additional actions are available from the Actions menu for individual operations, including adding an operation above/below, importing a recipe above/below, duplicating an operation, and previewing up to a specific operation, which lets you quickly see how different combinations of operations affect the live sample.
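The per-operation actions above can be pictured as edits to an ordered list of recipe steps. The following is a minimal sketch, not DataRobot's implementation; the `Operation` and `Recipe` classes and operation names are hypothetical, used only to illustrate insert-above/below, duplicate, and preview-up-to semantics:

```python
from dataclasses import dataclass, field

@dataclass
class Operation:
    name: str
    params: dict = field(default_factory=dict)

@dataclass
class Recipe:
    operations: list = field(default_factory=list)

    def insert(self, index, op):
        # "Add operation above" inserts at index; "below" inserts at index + 1.
        self.operations.insert(index, op)

    def duplicate(self, index):
        # Copy an operation and place the copy directly below the original.
        op = self.operations[index]
        self.operations.insert(index + 1, Operation(op.name, dict(op.params)))

    def preview_up_to(self, index):
        # Apply only the first index + 1 operations to the live sample.
        return [op.name for op in self.operations[: index + 1]]

recipe = Recipe([Operation("remove-features", {"columns": ["id"]}),
                 Operation("filter-rows", {"expr": "amount > 0"})])
recipe.duplicate(1)
assert recipe.preview_up_to(1) == ["remove-features", "filter-rows"]
```

Previewing up to a chosen step leaves later operations defined but unapplied, which is what makes comparing operation combinations cheap.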

ADLS Gen2 connector is GA in DataRobot

Support for the native ADLS Gen2 connector is now generally available in DataRobot. Additionally, you can create and share Azure service principal and Azure OAuth credentials using secure configurations.

Compute Prediction Explanations for data in OTV and time series projects

Computing Prediction Explanations for time series and OTV projects is now generally available. Specifically, you can get XEMP Prediction Explanations for the holdout partition and sections of the training data. Within the training data, DataRobot computes Prediction Explanations only for the validation partition of backtest one.
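XEMP-style explanations attach a signed strength to each feature for a given row's prediction. The snippet below is an illustrative sketch only; the feature names, strength values, and `top_explanations` helper are made up to show how such output is typically ranked:

```python
# Hypothetical XEMP-style output for one row: feature -> strength,
# where positive values push the prediction up and negative values down.
row_explanation = {
    "temperature": 0.42,
    "day_of_week": -0.17,
    "promo_flag": 0.08,
}

def top_explanations(strengths, n=3):
    # Return the n most impactful features, largest |strength| first.
    return sorted(strengths.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]

print(top_explanations(row_explanation, n=2))
# [('temperature', 0.42), ('day_of_week', -0.17)]
```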

Clustering in Incremental Learning

This deployment adds support for K-Means clustering models to DataRobot’s incremental learning capabilities. Incremental learning (IL) is a model training method specifically tailored for large datasets—those between 10GB and 100GB—that chunks data and creates training iterations. With this support, you can build non-time series clustering projects with larger datasets, helping you to explore your data by grouping and identifying natural segments.
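The core idea of incremental learning (chunk the data, then update the model one training iteration at a time) can be shown with a toy one-dimensional online k-means pass. This is a sketch of the general technique only, not DataRobot's implementation, and the chunk size and initialization are arbitrary:

```python
import random

def chunks(rows, size):
    # Yield fixed-size chunks, mimicking IL's training iterations.
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

def update_centroids(centroids, counts, chunk):
    # One incremental pass: assign each point to its nearest centroid,
    # then move that centroid toward the point (online k-means update).
    for x in chunk:
        j = min(range(len(centroids)), key=lambda k: (x - centroids[k]) ** 2)
        counts[j] += 1
        centroids[j] += (x - centroids[j]) / counts[j]
    return centroids, counts

random.seed(0)
data = ([random.gauss(0, 1) for _ in range(500)]
        + [random.gauss(10, 1) for _ in range(500)])
random.shuffle(data)

centroids, counts = [0.0, 10.0], [0, 0]   # crude initialization near each mode
for chunk in chunks(data, 100):           # 10 training iterations over chunks
    centroids, counts = update_centroids(centroids, counts, chunk)
print(sorted(centroids))
```

Each iteration touches only one chunk, so memory stays bounded regardless of dataset size, which is the property that makes the approach workable for the 10GB–100GB range mentioned above.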

Increased training sizes for geospatial modeling

With this deployment, DataRobot has increased the maximum number of rows supported for geospatial modeling (Location AI) from 100,000 rows to 10,000,000 rows in DataRobot Classic. Location AI allows ingesting common geospatial formats, automatically recognizing geospatial coordinates to support geospatial analysis modeling. The increased training data size improves your ability to find geospatial patterns in your models.

Manage custom applications in the Registry

Now generally available, the Applications page in the NextGen Registry is home to all custom applications and application sources available to you. You can now create application sources—which contain the files, environment, and runtime parameters for custom applications you want to build—and build custom applications directly from these sources. You can also use the Applications page to manage applications by sharing or deleting them.

With general availability, you can open and manage application sources in a codespace, allowing you to directly edit a source's files, upload new files to it, and use all the codespace's functionality.

Open Prediction API snippets in a codespace

You can now open a Prediction API code snippet in a codespace to edit the snippet directly, share it with other users, and incorporate additional files. When selected, DataRobot generates a codespace instance and populates the snippet inside it as a Python file. The codespace provides full access to file storage. You can use the Upload button to add additional datasets for scoring, and have the prediction output (output.json, output.csv, etc.) return to the codespace file directory after executing the snippet.
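A Prediction API snippet generally boils down to an authenticated POST of scoring data to a deployment endpoint. The sketch below only assembles the pieces rather than sending them; the host, keys, deployment ID, and exact URL path are placeholders, since the real snippet DataRobot generates is pre-filled with your deployment's values:

```python
# Placeholder values -- the generated snippet supplies the real ones.
API_KEY = "YOUR_API_TOKEN"
DATAROBOT_KEY = "YOUR_DATAROBOT_KEY"
DEPLOYMENT_ID = "abc123"
HOST = "https://example.datarobot.com"

# Typical shape of a deployment prediction request (assumed path).
url = f"{HOST}/predApi/v1.0/deployments/{DEPLOYMENT_ID}/predictions"
headers = {
    "Content-Type": "text/csv; charset=UTF-8",
    "Authorization": f"Bearer {API_KEY}",
    "DataRobot-Key": DATAROBOT_KEY,
}
scoring_csv = "feature_a,feature_b\n1,2\n"
# To score for real from the codespace:
#     requests.post(url, data=scoring_csv, headers=headers)
# and write the response body to output.csv / output.json in file storage.
```

Because the codespace has persistent file storage, datasets uploaded via the Upload button can be read by the snippet, and its outputs land back in the same directory.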

Convert standalone notebooks to codespaces

Now generally available, you can use DataRobot to convert a standalone notebook into a codespace to incorporate additional workflow capabilities such as persistent file storage and Git compatibility. These types of features require a codespace. When converting a notebook, DataRobot maintains a number of notebook assets, including the environment configuration, the notebook contents, scheduled job definitions, and more.

Time series model package prediction intervals

To run a DataRobot time series model in a remote prediction environment and compute time series prediction intervals (from 1 to 100) for that model, download a model package (.mlpkg file) from the model's deployment or the Leaderboard with Compute prediction intervals enabled. You can then run prediction jobs with a portable prediction server (PPS) outside DataRobot.

For more information, see the documentation.

Configure maximum compute instances for a serverless platform

Admins can now increase the maximum compute instances limit for deployments on a per-organization basis. If not specified, the default is 8. To limit compute resource usage, set the maximum value equal to the minimum.
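The scaling rule above can be sketched in a few lines. This is an illustration of the described behavior only; the function name and validation are hypothetical, and the real control is an admin setting, not code you write:

```python
DEFAULT_MAX_INSTANCES = 8  # default when no per-organization limit is set

def resolve_scaling(min_instances, max_instances=None):
    # Fall back to the default maximum when none is specified.
    if max_instances is None:
        max_instances = DEFAULT_MAX_INSTANCES
    if max_instances < min_instances:
        raise ValueError("max compute instances must be >= min")
    return min_instances, max_instances

# Setting max equal to min pins the deployment to a fixed size,
# capping compute usage as the note describes.
assert resolve_scaling(2) == (2, 8)
assert resolve_scaling(3, 3) == (3, 3)
```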

Preview

New operations, automated derivation plan available for time series data wrangling

This deployment extends the existing data wrangling framework with tools to help prepare input for time series modeling, allowing you to perform time series feature engineering during the data preparation phase. This change works in conjunction with the fundamental wrangler improvements announced this month. Use the Derive time series features operation to execute lags and rolling statistics on the input data, either using a suggested derivation plan that automates feature generation or by manually selecting features and applying tasks. DataRobot then builds the new features and applies them to the live data sample.

While these operations can be added to any recipe, setting the preview sample method to date/time enables an option to have DataRobot suggest feature transformations based on the configuration you provide. With the automated option, DataRobot expands the data according to forecast distances, adds known-in-advance columns (if specified) and naive baseline features, and then replaces the original sample. Once complete, you can modify the plan as needed.
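The kinds of features the derivation plan produces (lags, rolling statistics, and a naive baseline) can be sketched with pandas. The column names, toy data, and window sizes below are illustrative assumptions, not DataRobot's generated schema:

```python
import pandas as pd

# Toy daily series; "sales" stands in for the modeling target.
df = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=6, freq="D"),
    "sales": [10, 12, 11, 15, 14, 16],
}).sort_values("date")

df["sales_lag_1"] = df["sales"].shift(1)                 # value one step back
df["sales_lag_7"] = df["sales"].shift(7)                 # weekly lag (all NaN on 6 rows)
df["sales_roll_mean_3"] = df["sales"].rolling(3).mean()  # 3-step rolling mean
df["sales_naive"] = df["sales"].shift(1)                 # naive baseline feature

print(df[["date", "sales", "sales_lag_1", "sales_roll_mean_3"]])
```

Shifting and rolling only ever look backward in time, which is what keeps derived features usable at forecast time without leaking future values.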

Feature flag ON by default: Enable Time Series Data Wrangling

Preview documentation.

Create categorical custom metrics

In the NextGen Console, on a deployment’s Custom metrics tab, you can define categorical metrics when you create an external metric. For each categorical metric, you can define up to 10 classes.

By default, these metrics are visualized in a bar chart on the Custom metrics tab; however, you can configure the chart type from the settings menu.
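A categorical metric reduces to per-class counts that a bar chart can render directly. The sketch below is hypothetical; the function, class names, and drop-unknowns behavior are illustrative, with only the 10-class limit taken from the release note:

```python
from collections import Counter

MAX_CLASSES = 10  # per the release note, up to 10 classes per categorical metric

def bucket_metric_values(values, classes):
    # Count occurrences of each defined class; undefined values are dropped.
    if len(classes) > MAX_CLASSES:
        raise ValueError("a categorical metric supports at most 10 classes")
    counts = Counter(v for v in values if v in classes)
    # Bar-chart-ready: one count per class, in definition order.
    return {c: counts.get(c, 0) for c in classes}

observed = ["approved", "denied", "approved", "review", "approved", "other"]
print(bucket_metric_values(observed, ["approved", "denied", "review"]))
# {'approved': 3, 'denied': 1, 'review': 1}
```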

Feature flag ON by default: Enable Categorical Custom Metrics

Preview documentation.

Additional support added to wrangling for DataRobot datasets

The ability to wrangle datasets stored in the Workbench Data Registry, first introduced for preview in July, is now supported by all environments.

Manage network policies to limit access to public resources

By default, some DataRobot capabilities, including Notebooks, have full public internet access from within the cluster DataRobot is deployed on; however, admins can limit the public resources users can access within DataRobot by setting network access controls. To do so, open User settings > Policies and enable the toggle to the left of Enable network policy control. When this toggle is enabled, by default, users cannot access public resources from within DataRobot.

Feature flag ON by default: Enable Network Policy Enforcement

Preview documentation.

Deprecations and migrations

Accuracy over Time data storage for Python 3 projects

DataRobot has changed the storage type for Accuracy over Time data from MongoDB to S3.

On the managed AI Platform (cloud), DataRobot uses blob storage by default. The feature flag BLOB_STORAGE_FOR_ACCURACY_OVER_TIME has been removed from the feature access settings.


Updated November 15, 2024