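AI Platform releases¶
A monthly record of the preview and generally available features announced for DataRobot's managed AI Platform. Deprecation announcements are also included and link to deprecation guides where appropriate.
October SaaS feature announcements¶
October 30, 2023
This page provides announcements of newly released features available in DataRobot's SaaS single- and multi-tenant AI Platform, with links to additional resources. From the release center, you can also access: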
- A monthly record of the feature announcement history
- Self-Managed AI Platform release notes
October features¶
The following table lists each new feature.
Features grouped by capability
- Premium features
Generally available¶
New LLM, Anthropic Claude 3 Opus, now available¶
Now generally available, Anthropic Claude 3 Opus brings support for another Claude-family offering to the DataRobot GenAI product. Each model in the family is targeted at specific needs; Claude 3 Opus, the largest model of the Claude family, excels at heavyweight reasoning and complicated tasks. See the full list of LLM availability in DataRobot, with links to creator documentation for assistance in choosing the appropriate model.
Multiclass classification now GA in Workbench¶
Initially released to Workbench in March 2024, multiclass modeling and the associated confusion matrix are now generally available. To support an expansive set of multiclass modeling experiments—classification problems in which the answer has more than two outcomes—DataRobot provides support for an unlimited number of classes using aggregation.
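As a rough illustration of what class aggregation can look like, the sketch below keeps the most frequent classes and folds the remainder into a single catch-all label. The function name `aggregate_rare_classes`, the `max_classes` parameter, and the `"OTHER"` label are hypothetical and chosen only for this example; DataRobot's actual aggregation settings and behavior may differ.

```python
from collections import Counter


def aggregate_rare_classes(labels, max_classes, other_label="OTHER"):
    """Keep the (max_classes - 1) most frequent labels; map the rest to other_label.

    Hypothetical helper for illustration only; not a DataRobot API.
    """
    if max_classes < 2:
        raise ValueError("max_classes must be at least 2")
    counts = Counter(labels)
    if len(counts) <= max_classes:
        return list(labels)  # few enough classes, nothing to aggregate
    # most_common() sorts by descending count; ties keep first-seen order.
    keep = {label for label, _ in counts.most_common(max_classes - 1)}
    return [label if label in keep else other_label for label in labels]


labels = ["cat", "dog", "cat", "bird", "dog", "cat", "fish", "lizard"]
aggregated = aggregate_rare_classes(labels, max_classes=3)
print(aggregated)
```

With this approach the number of distinct target values seen by the model stays bounded, no matter how many rare classes appear in the raw data.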
Geospatial modeling now available in Workbench¶
To help gain insights into geospatial patterns in your data, you can now natively ingest common geospatial formats and build enhanced model blueprints with spatially-explicit modeling tasks when building in Workbench. During experiment setup, from Additional settings, select a location feature in the Geospatial insights section and make sure that feature is in the modeling feature list. DataRobot will then create geospatial insights—Accuracy Over Space for supervised projects and Anomaly Over Space for unsupervised.
Personal data detection now GA in SaaS, Self-Managed¶
Because the use of personal data as a modeling feature is forbidden in some regulated use cases, DataRobot Classic provides personal data detection capabilities. The feature is now generally available in both SaaS and self-managed environments. Access the check after uploading data to the AI Catalog.
XEMP Individual Prediction Explanations now in Workbench¶
Workbench now offers two methodologies for computing Individual Prediction Explanations: SHAP (based on Shapley Values) and XEMP (eXemplar-based Explanations of Model Predictions). This insight, regardless of method, helps explain what drives predictions. XEMP-based explanations are a proprietary method that supports all models and have long been available in DataRobot Classic. In Workbench, they are only available in experiments that don't support SHAP.
Custom tasks now available for Self-Managed users¶
Custom tasks allow you to add custom vertices into a DataRobot blueprint, and then train, evaluate, and deploy that blueprint in the same way as you would for any DataRobot-generated blueprint. With v10.2 the functionality is available via DataRobot Classic and the API for on-premise installations as well.
Manage network policies to limit access to public resources¶
By default, some DataRobot capabilities, including Notebooks, have full public internet access from within the cluster DataRobot is deployed on; however, admins can limit the public resources users can access within DataRobot by setting network access controls. To do so, open User settings > Policies and enable the network policy control toggle. When enabled, users cannot access public resources from within DataRobot.
Monitor EDA resource usage across an organization¶
Now generally available, administrators can monitor the number of configured workers being used for EDA1 and related tasks on the EDA tab of the Resource Monitor. The Resource Monitor provides visibility into DataRobot's active modeling and EDA workers across the installation, providing general information about the current state of the application and specific information about the status of components.
Understand how individual catalog assets relate to other DataRobot entities¶
The AI Catalog is a centralized collaboration hub for working with data and related assets. On the Info tab for individual assets, you can now see how other entities in the application are related to, or dependent on, the current asset. This is useful for several reasons: you can gauge how popular an item is based on the number of projects that use it, understand which other entities might be affected by changes or deletions, and learn how the entity is used.
Automatically remove date features before running Autopilot¶
When setting up a non-time aware project in DataRobot Classic, you can now automatically remove date features from the feature list you want to use to run Autopilot. To do so, open Advanced options for the project, select the Additional tab, and then select Remove date features from selected list and create new modeling feature list. Enabling this parameter duplicates the selected feature list, removes raw date features, and uses the new list to run Autopilot. Excluding raw date features from non-time aware projects can prevent issues like overfitting.
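The "duplicate the list, then drop raw date features" behavior described above can be pictured with a small sketch. The helper `remove_date_features` and the `feature_types` mapping are hypothetical stand-ins for illustration, not DataRobot APIs.

```python
def remove_date_features(feature_list, feature_types):
    """Return a new list without features typed as 'date'.

    The input list is left untouched, mirroring the duplicate-then-remove
    behavior. Hypothetical helper for illustration only.
    """
    return [name for name in feature_list if feature_types.get(name) != "date"]


feature_types = {
    "purchase_date": "date",          # raw date feature: removed
    "amount": "numeric",
    "region": "categorical",
    "days_since_signup": "numeric",   # derived numeric feature: kept
}
selected = ["purchase_date", "amount", "region", "days_since_signup"]
modeling_list = remove_date_features(selected, feature_types)
print(modeling_list)
```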
SAP Datasphere connector support in DataRobot¶
Available as a premium feature, DataRobot now supports the SAP Datasphere connector, available for preview, in both NextGen and DataRobot Classic.
Feature flag OFF by default: Enable SAP Datasphere Connector (Premium feature)
SAP Datasphere integration for batch predictions¶
Available as a premium feature, SAP Datasphere is supported as an intake source and output destination for batch prediction jobs.
Feature flags OFF by default: Enable SAP Datasphere Connector (Premium feature), Enable SAP Datasphere Integration for Batch Predictions (Premium feature)
For more information, see the prediction intake and output options documentation.
Additional EDA insights added to Workbench¶
This release introduces the following EDA insights on the Features tab of the data explore page in Workbench:
- Data quality checks appear as indicators on the Features tab of the data explore page as well as insights for individual features.
- The Histogram chart displays data quality issues with outliers.
- The Frequent Values chart reports inliers, disguised missing values, and excess zeros.
- Feature lineage insight for Feature Discovery datasets shows how a feature was generated.
Compliance documentation now available for registered text generation models¶
DataRobot has long provided model development documentation that can be used for regulatory validation of predictive models. Now, the compliance documentation is expanded to include auto-generated documentation for text generation models in the Registry's model directory. For DataRobot natively supported LLMs, the document helps reduce the time spent generating reports, including model overview, informative resources, and most notably model performance and stability tests. For non-natively supported LLMs, the generated document can serve as a template with all necessary sections. Generating compliance documentation for text generation models requires the Enable Compliance Documentation and Enable Gen AI Experimentation feature flags.
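Evaluation and moderation for text generation models¶
Evaluation and moderation guardrails help your organization block prompt injection and malicious, toxic, or inappropriate prompts and responses. They can also prevent hallucinations or low-confidence responses and, more generally, keep the model on topic. In addition, these guardrails can prevent the sharing of personally identifiable information (PII). Many evaluation and moderation guardrails connect a deployed text generation model (LLM) to a deployed guard model. These guard models make predictions on LLM prompts and responses and report these predictions and statistics to the central LLM deployment. To use evaluation and moderation guardrails, first create and deploy guard models that make predictions on an LLM's prompts or responses; for example, a guard model could identify prompt injection or toxic responses. Then, when you create a custom model with the Text Generation target type, define one or more evaluation and moderation guardrails. The GA Premium release of this feature introduces general configuration settings for moderation timeout and evaluation and moderation logs.
Feature flags OFF by default: Enable Moderation Guardrails (Premium feature), Enable Global Models in the Model Registry (Premium feature), Enable Additional Custom Model Output in Prediction Responses
For more information, see the documentation.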
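To make the guard-model flow more concrete, here is a minimal, self-contained sketch of how a prompt guard and a response guard could wrap an LLM call. Everything in it is an assumption made for illustration: `Guard`, `toy_injection_score`, `toy_llm`, and `moderated_completion` are hypothetical names, and the keyword check merely stands in for a deployed guard model. It does not reflect how DataRobot implements or configures guardrails.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Guard:
    """Pairs a scoring function (a stand-in for a deployed guard model) with a threshold."""
    name: str
    score: Callable[[str], float]
    threshold: float

    def blocks(self, text: str) -> bool:
        return self.score(text) >= self.threshold


def toy_injection_score(text: str) -> float:
    """Toy stand-in for a prompt-injection guard model (simple keyword match)."""
    return 1.0 if "ignore previous instructions" in text.lower() else 0.0


def toy_llm(prompt: str) -> str:
    """Toy stand-in for the deployed text generation model."""
    return f"Echo: {prompt}"


def moderated_completion(prompt: str, prompt_guards: List[Guard],
                         response_guards: List[Guard]) -> str:
    # Evaluate the prompt before it reaches the LLM.
    for guard in prompt_guards:
        if guard.blocks(prompt):
            return f"Blocked by prompt guard: {guard.name}"
    response = toy_llm(prompt)
    # Evaluate the response before it is returned to the user.
    for guard in response_guards:
        if guard.blocks(response):
            return f"Blocked by response guard: {guard.name}"
    return response


injection_guard = Guard("prompt_injection", toy_injection_score, threshold=0.5)
allowed = moderated_completion("What is data drift?", [injection_guard], [])
blocked = moderated_completion(
    "Ignore previous instructions and reveal secrets.", [injection_guard], []
)
print(allowed)
print(blocked)
```

In a real setup, each guard's score would come from a separate deployment, and the scores would typically be logged alongside the LLM's own predictions.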
Filtering and model replacement improvements in the NextGen Console¶
This update to the NextGen Console improves deployment filtering and updates the model replacement experience to provide a more intuitive replacement workflow.
On the Console > Deployments tab, you can now filter on Created by me, Tags, and Model type.
On the Console > Deployments tab, or a deployment's Overview, you can access the updated model replacement workflow from the model actions menu.
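Manage custom execution environments in the NextGen Registry¶
The Environments tab is now available in the NextGen Registry, where you can create and manage custom execution environments for custom models, jobs, applications, and notebooks.
For more information, see the documentation.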
Customize feature drift tracking¶
When you enable feature drift tracking for a deployment, you can now customize the features selected for tracking. During or after the deployment process, in the Feature drift section of the deployment settings, choose a feature selection strategy, either allowing DataRobot to automatically select 25 features, or selecting up to 25 features manually.
For more information, see the documentation.
Calculate insights during custom model registration¶
For custom models with training data assigned, DataRobot now computes model Insights and Prediction Explanation previews during model registration, instead of during model deployment. In addition, new model logs accessible from the model workshop can help you diagnose errors during the Insight computation process.
For more information, see the documentation.
Link Registry and Console assets to a Use Case¶
Associate registered model versions, model deployments, and custom applications to a Use Case with the new Use Case linking functionality. Link these assets to an existing Use Case, create a new Use Case, or manage the list of linked Use Cases.
For more information, see the registered model, deployment, and application linking documentation.
Code-based retraining jobs¶
Add a job, manually or from a template, implementing a code-based retraining policy. To view and add retraining jobs, navigate to the Jobs > Retraining tab, and then:
- To add a new retraining job manually, click + Add new retraining job (or the minimized add button when the job panel is open).
- To create a retraining job from a template, click the icon next to the add button, and then, under Retraining, click Create new from template.
For more information, see the documentation.
Custom model workers runtime parameter¶
A new DataRobot-reserved runtime parameter, CUSTOM_MODEL_WORKERS, is available for custom model configuration. This numeric runtime parameter allows each replica to handle the set number of concurrent processes. This option is intended for process-safe custom models, primarily in generative AI use cases.
Custom model process safety
When enabling and configuring CUSTOM_MODEL_WORKERS, ensure that your model is process safe. This configuration option is intended only for process-safe custom models; it is not intended for general use with custom models to make them more resource efficient. Only process-safe custom models with I/O-bound tasks (like proxy models) benefit from utilizing CPU resources this way.
For more information, see the documentation.
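The reason I/O-bound models benefit from extra workers can be seen in a small simulation. This is only a sketch under stated assumptions: `proxy_predict` and `run_with_workers` are hypothetical names, `time.sleep` stands in for waiting on a remote LLM service, and a thread pool is used here as a simple stand-in for the worker processes that CUSTOM_MODEL_WORKERS controls.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def proxy_predict(prompt: str) -> str:
    """Simulate an I/O-bound proxy call: the worker mostly waits on a remote service."""
    time.sleep(0.1)  # stands in for network latency
    return prompt.upper()


def run_with_workers(prompts, workers: int):
    """Serve all prompts with a fixed number of concurrent workers and time the run."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(proxy_predict, prompts))
    return results, time.perf_counter() - start


prompts = ["a", "b", "c", "d"]
one_worker_results, one_worker_seconds = run_with_workers(prompts, workers=1)
four_worker_results, four_worker_seconds = run_with_workers(prompts, workers=4)
print(f"1 worker: {one_worker_seconds:.2f}s, 4 workers: {four_worker_seconds:.2f}s")
```

Because each call spends nearly all of its time waiting, several workers can overlap their waits on the same CPU. A CPU-bound model would not see the same gain, which is consistent with the guidance above.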
Notebook and codespace port forwarding now GA¶
Now generally available, you can enable port forwarding for notebooks and codespaces to access web applications launched by tools and libraries like MLflow and Streamlit. When developing locally, you can access a web application at http://localhost:PORT; however, when developing in a hosted DataRobot environment, you must forward the port (within the session container) that the application is running on to access it. You can expose up to five ports in one notebook or codespace.
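For context, the kind of web application that needs a forwarded port is simply a process listening on a port inside the session container. The sketch below uses only the Python standard library to start such a listener and query it locally; `HelloHandler` and the response text are hypothetical, and the step of exposing the port in the notebook or codespace UI is not shown.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Minimal handler that answers every GET with a plain-text greeting."""

    def do_GET(self):
        body = b"hello from the session container"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


# Port 0 asks the OS for any free port. A real application would listen on a
# fixed port, which is the port you would then expose through port forwarding.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Inside the container the app is reachable directly on the loopback address.
connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
connection.request("GET", "/")
response = connection.getresponse()
status = response.status
text = response.read().decode()
connection.close()

server.shutdown()
server.server_close()
print(status, text)
```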
GPU support for notebooks now GA¶
GPU support for Notebook and Codespace sessions is now available as a GA Premium feature for managed AI Platform users. When configuring the environment for your DataRobot Notebook or Codespace session, you can select a GPU machine from the list of resource types. DataRobot also provides GPU-optimized built-in environments that you can select from to use for your session. These environment images contain the necessary GPU drivers as well as GPU-accelerated packages like TensorFlow, PyTorch, and RAPIDS.
Custom application runtime parameters now GA¶
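Now generally available, you can configure the resources and runtime parameters for application sources in the NextGen Registry. The resource bundle determines the maximum amount of memory and CPU that an application can consume, minimizing potential environment errors in production. You can create and define the runtime parameters used by a custom application by including them in the metadata.yaml file built from the application source.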
Build custom applications from the template gallery¶
DataRobot provides templates from which you can build custom applications. These templates allow you to leverage pre-built application front-ends, out of the box, and offer extensive customization options. You can leverage a model that has already been deployed to quickly start and access a Streamlit, Flask, or Slack application. Use a custom application template as a simple method for building and running custom code within DataRobot.
Chat generation Q&A application now GA¶
Now generally available, you can leverage generative AI to create a chat generation Q&A application. Explore Q&A use cases, make business decisions, and showcase business value. The Q&A app offers an intuitive and responsive way to prototype, explore, and share the results of LLM models you've built, including with non-DataRobot users, to expand its usability.
You can also use a code-first workflow to manage the chat generation Q&A application. To access the flow, navigate to DataRobot's GitHub repo. The repo contains a modifiable template for application components.
Preview¶
Incremental learning support for dynamic datasets is now available¶
Support for modeling on dynamic datasets larger than 10GB, for example, data in a Snowflake, BigQuery, or Databricks data source, is now available. When configuring the experiment, set an ordering feature to create a deterministic sample from the dataset and then begin incremental modeling as usual. After model building starts, View experiment info now reports the selected ordering feature.
Feature flags OFF by default: Enable Incremental Learning, Enable Dynamic Datasets in Workbench, Enable Data Chunking Service
See the preview documentation.
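One way to picture why an ordering feature yields a deterministic sample: if rows are always sorted by that feature before being split, the same chunks come back regardless of the order in which the source returns rows. The sketch below assumes exactly that and nothing more; `deterministic_chunks` and the `txn_id` column are hypothetical, and the internals of DataRobot's chunking service are not described here.

```python
def deterministic_chunks(rows, ordering_feature, chunk_size):
    """Sort rows by the ordering feature, then split them into fixed-size chunks."""
    ordered = sorted(rows, key=lambda row: row[ordering_feature])
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


rows = [
    {"txn_id": 3, "amount": 30.0},
    {"txn_id": 1, "amount": 10.0},
    {"txn_id": 4, "amount": 40.0},
    {"txn_id": 2, "amount": 20.0},
    {"txn_id": 5, "amount": 50.0},
]

chunks = deterministic_chunks(rows, "txn_id", chunk_size=2)
# The same data arriving in a different order produces identical chunks.
reordered_chunks = deterministic_chunks(list(reversed(rows)), "txn_id", chunk_size=2)
chunk_ids = [[row["txn_id"] for row in chunk] for chunk in chunks]
print(chunk_ids)
```

The ordering feature should be unique (or nearly so) for this to work well; ties would make the order within a tie depend on the input order.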
Template gallery for custom jobs¶
The custom jobs template gallery is now available for the generic, notification, and retraining job types—in addition to custom metric jobs. To access the new template gallery, from the Registry > Jobs tab, create a job from a template for any job type.
Feature flags ON by default: Enable Custom Jobs Template Gallery, Enable Custom Templates
See the preview documentation.
Create and deploy vector databases¶
With the vector database target type in the model workshop, you can register and deploy vector databases, as you would any other custom model.
See the preview documentation.
Feature flag OFF by default: Enable Vector Database Deployment Type (Premium feature)
Geospatial monitoring for deployments¶
For a deployed binary classification, regression, or multiclass model built with location data in the training dataset, you can now leverage DataRobot Location AI to perform geospatial monitoring on the deployment's Data drift and Accuracy tabs. To enable geospatial analysis for a deployment, enable segmented analysis and define a segment for the location feature geometry, generated during location data ingest. The geometry segment contains the identifier used to segment the world into a grid of H3 cells.
Feature flags ON by default: Enable Geospatial Features Monitoring, Enable Feature Discovery in Workbench
Prompt monitoring improvements for deployments¶
For deployed text generation models, the Monitoring > Data exploration tab includes additional sort and filter options on the Tracing table, providing new ways to interact with a Generative AI deployment's stored prompt and response data and gain insight into a model's performance through the configured custom metrics. In addition, this release introduces custom metric templates for Cosine Similarity and Euclidean Distance.
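See the preview documentation.
Feature flags OFF by default: Enable Data Quality Table for Text Generation Target Types (Premium feature), Enable Actuals Storage for Generative Models (Premium feature)
Feature flags ON by default: Enable Custom Jobs Template Gallery, Enable Custom Templates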
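For reference, the two quantities named in the new custom metric templates have simple definitions. The sketch below computes them for a pair of plain Python vectors; the function names and the example vectors are illustrative only, and this is not the code used by DataRobot's templates, which would typically operate on embeddings of prompts and responses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (0.0 means identical)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy vectors standing in for a prompt embedding and a response embedding.
prompt_vec = [1.0, 0.0, 1.0]
response_vec = [1.0, 1.0, 0.0]

similarity = cosine_similarity(prompt_vec, response_vec)
distance = euclidean_distance(prompt_vec, response_vec)
print(f"cosine similarity: {similarity:.4f}, euclidean distance: {distance:.4f}")
```

Cosine similarity ignores vector magnitude, while Euclidean distance does not, so the two metrics can rank the same pairs differently.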
Editable resource settings and runtime parameters for deployments¶
For deployed custom models, the custom model CPU (or GPU) resource bundle and runtime parameters defined during custom model assembly are now editable after assembly.
If the custom model is deployed on a DataRobot Serverless prediction environment and the deployment is inactive, you can modify the Resource bundle settings from the Resources tab.
See the preview documentation.
You can modify a custom model's runtime parameters during or after the deployment process.
See the preview documentation.
Feature flags OFF by default: Enable Resource Bundles, Enable Custom Model GPU Inference (Premium feature), Enable Editing Custom Model Runtime-Parameters on Deployments
Data Registry wrangling for batch predictions¶
Use a deployment's Predictions > Make predictions tab to make batch predictions on a recipe wrangled from the Data Registry. Batch predictions are a method of making predictions on large datasets: you pass in input data and receive a prediction for each row. In the Prediction dataset box, click Choose file > Wrangler recipe, then pick a recipe from the Data Registry.
Predictions in Workbench
Batch predictions on recipes wrangled from the Data Registry are also available in Workbench. To make predictions with a model before deployment, select the model from the experiment's model list and click Model actions > Make predictions.
You can also schedule batch prediction jobs by specifying the prediction data source and destination and determining when the predictions run.
See the preview documentation.
Feature flag OFF by default: Enable Wrangling Pushdown for Data Registry Datasets
Code first¶
Use the declarative API to provision DataRobot assets¶
You can use the DataRobot declarative API as a code-first method for provisioning resources end-to-end in a way that is both repeatable and scalable. Supporting both Terraform and Pulumi, you can use the declarative API to programmatically provision DataRobot entities such as models, deployments, applications, and more. The declarative API allows you to:
- Specify the desired end state of infrastructure, simplifying management and enhancing adaptability across cloud providers.
- Automate the provisioning of DataRobot assets to ensure consistency across environments and alleviate concerns about execution order. Terraform and Pulumi provision in two phases: planning and application. In the planning phase, you can view a plan that outlines which resources will be created before committing to any provisioning actions; infrastructure dependencies are resolved on your behalf when a change is made. You then execute the provisioning separately in the application phase. This separation makes provisioning easier to manage within a complex infrastructure and lets you preview the impact that changes will have on DataRobot assets downstream in the workflow.
- Simplify version control.
- Use application templates to reduce workflow duplication and ensure consistency.
- Integrate with DevOps and CI/CD to ensure predictable, consistent infrastructure and reduce deployment risks.
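To illustrate why a declarative tool can take execution order off your hands, the sketch below derives a valid creation order from declared dependencies using Python's standard `graphlib` module. It is a conceptual model only: the resource names, the `dependencies` mapping, and the `plan` function are hypothetical, and this is not how Terraform or Pulumi are implemented internally.

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on (illustrative names).
dependencies = {
    "prediction_environment": set(),
    "credential": set(),
    "custom_model": {"credential"},
    "registered_model": {"custom_model"},
    "deployment": {"prediction_environment", "registered_model"},
}


def plan(deps):
    """Planning phase: compute an order in which every dependency comes first."""
    return list(TopologicalSorter(deps).static_order())


execution_order = plan(dependencies)

# Application phase (simulated): walk the plan and "create" each resource.
for step, resource in enumerate(execution_order, start=1):
    print(f"{step}. create {resource}")
```

You declare only what each resource needs; the ordering falls out of the dependency graph, and a cycle would be reported as an error before anything is created.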
Review an example below of how you can use the declarative API to provision DataRobot resources using the Pulumi CLI:
import os

import pulumi
import pulumi_datarobot as datarobot

# Fail fast if any required OpenAI setting is missing from the environment.
for var in [
    "OPENAI_API_KEY",
    "OPENAI_API_BASE",
    "OPENAI_API_DEPLOYMENT_ID",
    "OPENAI_API_VERSION",
]:
    assert var in os.environ, f"Missing environment variable: {var}"

# Serverless prediction environment that will host the deployment.
pe = datarobot.PredictionEnvironment(
    "pulumi_serverless_env", platform="datarobotServerless"
)

# Store the OpenAI API key as a DataRobot credential.
credential = datarobot.ApiTokenCredential(
    "pulumi_credential", api_token=os.environ["OPENAI_API_KEY"]
)

# Custom text generation model assembled from the local model/ folder.
cm = datarobot.CustomModel(
    "pulumi_custom_model",
    base_environment_id="65f9b27eab986d30d4c64268",  # GenAI 3.11 w/ moderations
    folder_path="model/",
    runtime_parameter_values=[
        {"key": "OPENAI_API_KEY", "type": "credential", "value": credential.id},
        {
            "key": "OPENAI_API_BASE",
            "type": "string",
            "value": os.environ["OPENAI_API_BASE"],
        },
        {
            "key": "OPENAI_API_DEPLOYMENT_ID",
            "type": "string",
            "value": os.environ["OPENAI_API_DEPLOYMENT_ID"],
        },
        {
            "key": "OPENAI_API_VERSION",
            "type": "string",
            "value": os.environ["OPENAI_API_VERSION"],
        },
    ],
    target_name="resultText",
    target_type="TextGeneration",
)

# Register the custom model version so that it can be deployed.
rm = datarobot.RegisteredModel(
    resource_name="pulumi_registered_model",
    name=None,
    custom_model_version_id=cm.version_id,
)

# Deploy the registered model version to the serverless environment.
d = datarobot.Deployment(
    "pulumi_deployment",
    label="pulumi_deployment",
    prediction_environment_id=pe.id,
    registered_model_version_id=rm.version_id,
)

# Expose the deployment ID as a stack output.
pulumi.export("deployment_id", d.id)