Changelog
3.6.0b0
New features
- Added `OCRJobResource` for running OCR jobs.
- Added a new Jina V2 embedding model in `VectorDatabaseEmbeddingModel`.
- Added a new Small MultiLingual embedding model in `VectorDatabaseEmbeddingModel`.
- Added `Deployment.get_segment_attributes` to retrieve segment attributes.
- Added `Deployment.get_segment_values` to retrieve segment values.
- Added `AutomatedDocument.list_all_available_document_types` to return a list of document types.
- Added `Model.request_per_class_fairness_insights` to return per-class bias and fairness insights.
- Added `MLOpsEvent` to report MLOps events. Currently, only `moderation` MLOps events are supported.
- Added `Deployment.get_moderation_events` to retrieve moderation events for a deployment.
- Extended the advanced options available when setting a target to include a new parameter, `number_of_incremental_learning_iterations_before_best_model_selection` (part of the `AdvancedOptions` object). This parameter allows you to specify how long the top 5 models run before best model selection.
- Added support for `connector_type` in `Connector.create`.
- Deprecated `file_path` for `Connector.create` and `Connector.update`.
- Added `DataQualityExport` and `Deployment.list_data_quality_exports` to retrieve a list of data quality records.
- Added secure config support for Azure Service Principal credentials.
- Added support for categorical custom metrics in `CustomMetric`.
- Added `NemoConfiguration` to manage Nemo configurations.
- Added `NemoConfiguration.create` to create or update a Nemo configuration.
- Added `NemoConfiguration.get` to retrieve a Nemo configuration.
- Added a new class `ShapDistributions` to interact with SHAP distribution insights.
- Added the `MODEL_COMPLIANCE_GEN_AI` value to the `document_type` attribute of `DocumentOption` to generate compliance documentation for LLMs in the Registry.
- Added new attribute `prompts_count` to `Chat`.
- Added `Recipe` modules for Data Wrangling.
- Added `RecipeOperation` and a set of subclasses to represent a single `Recipe.operations` operation.
- Added new attribute `similarity_score` to `Citation`.
- Added new attributes `retriever` and `add_neighbor_chunks` to `VectorDatabaseSettings`.
- Added new attribute `metadata` to `Citation`.
- Added new attribute `metadata_filter` to `ChatPrompt`.
- Added new attribute `metadata_filter` to `ComparisonPrompt`.
- Added new attribute `custom_chunking` to `ChunkingParameters`.
- Added new attribute `custom_chunking` to `VectorDatabase`.
- Added a new class `LLMTestConfiguration` for LLM test configurations:
  - `LLMTestConfiguration.get` to retrieve an LLM test configuration.
  - `LLMTestConfiguration.list` to list LLM test configurations.
  - `LLMTestConfiguration.create` to create an LLM test configuration.
  - `LLMTestConfiguration.update` to update an LLM test configuration.
  - `LLMTestConfiguration.delete` to delete an LLM test configuration.
- Added a new class `LLMTestConfigurationSupportedInsights` for LLM test configuration supported insights:
  - `LLMTestConfigurationSupportedInsights.list` to list LLM test configuration supported insights.
- Added a new class `LLMTestResult` for LLM test results:
  - `LLMTestResult.get` to retrieve an LLM test result.
  - `LLMTestResult.list` to list LLM test results.
  - `LLMTestResult.create` to create an LLM test result.
  - `LLMTestResult.delete` to delete an LLM test result.
- Added new attribute `dataset_name` to `OOTBDatasetDict`.
- Added new attribute `rows_count` to `OOTBDatasetDict`.
- Added new attribute `max_num_prompts` to `DatasetEvaluationDict`.
- Added new attribute `prompt_sampling_strategy` to `DatasetEvaluationDict`.
- Added a new class `DatasetEvaluationRequestDict` for Dataset Evaluations in create/edit requests.
- Added new attribute `evaluation_dataset_name` to `InsightEvaluationResult`.
- Added new attribute `chat_name` to `InsightEvaluationResult`.
- Added new attribute `llm_test_configuration_name` to `LLMTestResult`.
- Added new attribute `creation_user_name` to `LLMTestResult`.
- Added new attribute `pass_percentage` to `LLMTestResult`.
- Added new attribute `evaluation_dataset_name` to `DatasetEvaluation`.
- Added new attribute `datasets_compatibility` to `LLMTestConfigurationSupportedInsights`.
- Added a new class `NonOOTBDataset` for non-out-of-the-box (OOTB) dataset entities:
  - `NonOOTBDataset.list` to retrieve non-OOTB datasets for compliance testing.
- Added a new class `OOTBDataset` for OOTB dataset entities:
  - `OOTBDataset.list` to retrieve OOTB datasets for compliance testing.
- Added a new class `TraceMetadata` to retrieve trace metadata.
- Added new attributes to `VectorDatabase`: `parent_id`, `family_id`, `metadata_columns`, `added_dataset_ids`, `added_dataset_names`, and `version`. Also added:
  - `VectorDatabase.get_supported_retrieval_settings` to retrieve supported retrieval settings.
  - `VectorDatabase.submit_export_dataset_job` to submit the vector database as a dataset to the AI Catalog.
- Updated the method `VectorDatabase.create` to create a new vector database version.
- Added a new class `SupportedRetrievalSettings` for supported vector database retrieval settings.
- Added a new class `SupportedRetrievalSetting` for a supported vector database retrieval setting.
- Added a new class `VectorDatabaseDatasetExportJob` for vector database dataset export jobs.
- Added new attribute `playground_id` to `CostMetricConfiguration`.
- Added new attribute `name` to `CostMetricConfiguration`.
- Added a new class `SupportedInsights` for supported insights:
  - `SupportedInsights.list` to list supported insights.
- Added a new class `MetricInsights` for the new metric insights routes:
  - `MetricInsights.list` to list metric insights.
  - `MetricInsights.copy_to_playground` to copy metrics to another playground.
- Added a new class `PlaygroundOOTBMetricConfiguration` for OOTB metric configurations.
- Updated the schema for `EvaluationDatasetMetricAggregation` to include the new attributes `ootb_dataset_name`, `dataset_id`, and `dataset_name`.
- Updated the method `EvaluationDatasetMetricAggregation.list` with additional optional filter parameters.
- Added new attribute `warning` to `OOTBDataset`.
- Added new attribute `warning` to `OOTBDatasetDict`.
- Added new attribute `warnings` to `LLMTestConfiguration`.
- Added a new parameter `playground_id` to `SidecarModelMetricValidation.create` to support the transition of sidecar model metrics to playgrounds.
- Updated the schema for `NemoConfiguration` to include the new attributes `prompt_pipeline_template_id` and `response_pipeline_template_id`.
- Added new attributes to `EvaluationDatasetConfiguration`: `rows_count` and `playground_id`.
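Several entries above add metadata support to citations and prompts (`metadata` on `Citation`, `metadata_filter` on `ChatPrompt` and `ComparisonPrompt`). As a concept sketch only, with illustrative field names rather than the SDK's actual schema, a metadata filter restricts retrieval to chunks whose metadata matches the requested key/value pairs:

```python
def matches(metadata, metadata_filter):
    """True when every filter key is present in metadata with an equal value."""
    return all(metadata.get(key) == value for key, value in metadata_filter.items())

# Hypothetical citation chunks; real Citation objects carry more fields.
chunks = [
    {"text": "Q3 revenue grew 12%", "metadata": {"source": "2024_report.pdf", "year": 2024}},
    {"text": "Q3 2023 results were flat", "metadata": {"source": "2023_report.pdf", "year": 2023}},
]

hits = [c["text"] for c in chunks if matches(c["metadata"], {"year": 2024})]
# hits == ["Q3 revenue grew 12%"]
```

An empty filter matches everything, which is the usual semantics for "no filter supplied".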
API changes
- Updated `ServerError`'s `exc_message` to be constructed with a request ID to help with debugging.
- Added the method `Deployment.get_capabilities` to retrieve a list of `Capability` objects containing capability details.
- Renamed the advanced options parameters `modelGroupId`, `modelRegimeId`, and `modelBaselines` to `seriesId`, `forecastDistance`, and `forecastOffsets`.
- Added the parameter `use_sample_from_dataset` to `Project.create_from_dataset`. When set, this parameter uses the EDA sample of the dataset to start the project.
- Added the parameter `quick_compute` to `ShapImpact` functions.
- Added the parameter `copy_insights` to `Playground.create` to copy the insights from an existing playground to the new one.
- Added the parameter `llm_test_configuration_ids` to `LLMBlueprint.register_custom_model` to run LLM compliance tests when a blueprint is sent to the custom model workshop.
Enhancements
- Added standard pagination parameters (e.g. `limit`, `offset`) to `Deployment.list`, allowing you to retrieve deployment data in smaller chunks.
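The `limit`/`offset` pair follows standard offset pagination. A minimal sketch of the pattern, using a plain list as a stand-in for `Deployment.list` (the helper and its names are illustrative, not part of the SDK):

```python
def iterate_pages(fetch, limit):
    """Repeatedly call fetch(limit=..., offset=...) until an empty page is returned."""
    offset = 0
    while True:
        page = fetch(limit=limit, offset=offset)
        if not page:
            return
        yield from page
        offset += len(page)

# Stand-in for Deployment.list: slices a local list the way the API pages results.
deployments = [f"deployment-{i}" for i in range(7)]
fetch = lambda limit, offset: deployments[offset:offset + limit]

collected = list(iterate_pages(fetch, limit=3))
# collected == deployments, fetched in pages of 3
```

The same loop works against the real method once `fetch` is replaced by the SDK call.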
Bugfixes
- Fixed the field in `CustomTaskVersion` that controls network policies, renaming it from `outgoing_network_policy` to `outbound_network_policy`. On a `GET`, the incorrect field always resolved to `None`; on a `POST` or `PATCH`, it resulted in a 422 error. Also renamed `datarobot.enums.CustomTaskOutgoingNetworkPolicy` to `datarobot.enums.CustomTaskOutboundNetworkPolicy` to reflect the proper field name.
- Fixed the schema for `DataSliceSizeInfo` so that it now allows an empty list for the `messages` field.
Deprecation summary
- Removed the parameter `in_use` from `ImageAugmentationList.create`. This parameter was deprecated in v3.1.0.
- Deprecated `AutomatedDocument.list_available_document_types`. Please use `AutomatedDocument.list_all_available_document_types` instead.
- Deprecated `Model.request_fairness_insights`. Please use `Model.request_per_class_fairness_insights` instead, which returns a `StatusCheckJob` instead of a `status_id`.
- Deprecated `Model.get_prime_eligibility`. Prime models are no longer supported. The `eligibleForPrime` field will no longer be returned from `Model.get_supported_capabilities` and will be removed after version 3.8 is released.
- Deprecated the property `ShapImpact.row_count`; it will be removed after version 3.7 is released.
- Deprecated the advanced options parameters `modelGroupId`, `modelRegimeId`, and `modelBaselines`, which were renamed to `seriesId`, `forecastDistance`, and `forecastOffsets`; the old names will be removed after version 3.6 is released.
- Renamed `datarobot.enums.CustomTaskOutgoingNetworkPolicy` to `datarobot.enums.CustomTaskOutboundNetworkPolicy` to reflect the bug fix changes. The original enum was unusable.
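Deprecated-but-present methods like those above typically emit a `DeprecationWarning` and forward to their replacement. A minimal stdlib sketch of that pattern (illustrative only, with made-up return values; not the SDK's implementation):

```python
import warnings


def list_all_available_document_types():
    """The replacement method."""
    return ["MODEL_COMPLIANCE", "AUTOPILOT_SUMMARY"]  # illustrative values


def list_available_document_types():
    """Deprecated alias that warns, then forwards to the replacement."""
    warnings.warn(
        "list_available_document_types is deprecated; "
        "use list_all_available_document_types instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return list_all_available_document_types()
```

Callers keep working during the transition while the warning points them at the new name.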
Configuration changes
- Removed the upper bound pin on the `urllib3` package to allow versions 2.0.2 and above.
- Upgraded the `Pillow` library to version 10.3.0. Users installing DataRobot with the "images" extra (`pip install datarobot[images]`) should note that this is a required library.
Documentation changes
- The API Reference page has been split into multiple sections for better usability.
- Fixed the docs for `Project.refresh` to clarify that it does not return a value.
- Fixed the code example for `ExternalScores`.
- Added a copy button to code examples in the ReadTheDocs documentation, for convenience.
- Removed the outdated "examples" section from the documentation. Please refer to DataRobot's API Documentation Home for more examples.
- Removed the duplicate "getting started" section from the documentation.
- Updated to Sphinx RTD Theme v3.
Experimental changes
- Added the `force_update` parameter to the `update` method in `ChunkDefinition`.
- Removed the attribute `select_columns` from `ChunkDefinition`.
- Added initial experimental support for Chunking Service V2:
  - `DatasetDefinition`
  - `DatasetProps`
  - `DatasetInfo`
  - `DynamicDatasetProps`
  - `RowsChunkDefinition`
  - `FeaturesChunkDefinition`
  - `ChunkDefinitionStats`
  - `ChunkDefinition`
- Added a new method `update` to `ChunkDefinition`.
- Added experimental support for time series wrangling, including a usage template: `datarobot._experimental.models.time_series_wrangling_template.user_flow_template`. These experimental changes offer automated time series feature engineering for data in Snowflake or Postgres.
- Added the ability to use the Spark dialect when creating a recipe, allowing data wrangling support for files.
- Added new attribute `warning` to `Chat`.
- Moved all modules from `datarobot._experimental.models.genai` to `datarobot.models.genai`.
- Added a new method `Model.train_first_incremental_from_sample` that trains the first incremental learning iteration from an existing sample model. Requires the "Project Creation from a Dataset Sample" feature flag.
3.5.0b0
New features
- Added support for BYO LLMs using serverless predictions in `CustomModelLLMValidation`.
- Added the attribute `creation_user_name` to `LLMBlueprint`.
- Added a new class `HostedCustomMetricTemplate` for hosted custom metric templates:
  - `HostedCustomMetricTemplate.get` to retrieve a hosted custom metric template.
  - `HostedCustomMetricTemplate.list` to list hosted custom metric templates.
- Added `Job.create_from_custom_metric_gallery_template` to create a job from a custom metric gallery template.
- Added a new class `HostedCustomMetric` for hosted custom metrics:
  - `HostedCustomMetric.list` to list hosted custom metrics.
  - `HostedCustomMetric.update` to update a hosted custom metric.
  - `HostedCustomMetric.delete` to delete a hosted custom metric.
  - `HostedCustomMetric.create_from_custom_job` to create a hosted custom metric from an existing custom job.
  - `HostedCustomMetric.create_from_template` to create a hosted custom metric from a template.
- Added a new class `HostedCustomMetricBlueprint` (`datarobot.models.deployment.custom_metrics.HostedCustomMetricBlueprint`) for hosted custom metric blueprints:
  - `HostedCustomMetricBlueprint.get` to get a hosted custom metric blueprint.
  - `HostedCustomMetricBlueprint.create` to create a hosted custom metric blueprint.
  - `HostedCustomMetricBlueprint.update` to update a hosted custom metric blueprint.
- Added `Job.list_schedules` to list job schedules.
- Added a new class `JobSchedule` for Registry job schedules:
  - `JobSchedule.create` to create a job schedule.
  - `JobSchedule.update` to update a job schedule.
  - `JobSchedule.delete` to delete a job schedule.
- Added the attribute `credential_type` to `RuntimeParameter`.
- Added a new class `EvaluationDatasetConfiguration` for the configuration of evaluation datasets:
  - `EvaluationDatasetConfiguration.get` to get an evaluation dataset configuration.
  - `EvaluationDatasetConfiguration.list` to list the evaluation dataset configurations for a Use Case.
  - `EvaluationDatasetConfiguration.create` to create an evaluation dataset configuration.
  - `EvaluationDatasetConfiguration.update` to update an evaluation dataset configuration.
  - `EvaluationDatasetConfiguration.delete` to delete an evaluation dataset configuration.
- Added a new class `EvaluationDatasetMetricAggregation` for metric aggregation results:
  - `EvaluationDatasetMetricAggregation.list` to get the metric aggregation results.
  - `EvaluationDatasetMetricAggregation.create` to create a metric aggregation job.
  - `EvaluationDatasetMetricAggregation.delete` to delete metric aggregation results.
- Added a new class `SyntheticEvaluationDataset` for synthetic dataset generation. Use `SyntheticEvaluationDataset.create` to create a synthetic evaluation dataset.
- Added a new class `SidecarModelMetricValidation` for sidecar model metric validations:
  - `SidecarModelMetricValidation.create` to create a sidecar model metric validation.
  - `SidecarModelMetricValidation.list` to list sidecar model metric validations.
  - `SidecarModelMetricValidation.get` to get a sidecar model metric validation.
  - `SidecarModelMetricValidation.revalidate` to rerun a sidecar model metric validation.
  - `SidecarModelMetricValidation.update` to update a sidecar model metric validation.
  - `SidecarModelMetricValidation.delete` to delete a sidecar model metric validation.
- Added experimental support for Chunking Service.
- Added a new attribute, `is_descending_order`, to:
Bugfixes
- Updated the trafaret for the `prediction` column of `TrainingPredictionsIterator` to also support lists of strings.
Configuration changes
- Updated the black version to 23.1.0.
- Removed the dependency on the `mock` package, since it is part of the standard library.
Documentation changes
- Removed the incorrect `can_share` parameters from the Use Case sharing example.
- Added usage of `external_llm_context_size` in `llm_settings` in `genai_example.rst`.
- Updated the docstring for `llm_settings` to include the attribute `external_llm_context_size` for external LLMs.
- Updated `genai_example.rst` to link to DataRobot doc pages for external vector database and external LLM deployment creation.
API changes
- Removed the `ImportedModel` object, since it was the API for the standalone scoring engine (SSE), which is no longer part of DataRobot.
- Added a `number_of_clusters` parameter to `Project.get_model_records` to filter models by the number of clusters in unsupervised clustering projects.
- Removed the unsupported `NETWORK_EGRESS_POLICY.DR_API_ACCESS` value for custom models. This value was used by a feature that was never released as GA and is not supported in the current API.
- Implemented support for `dr-connector-v1` in `DataStore` and `DataSource`.
- Added a new parameter `name` to `DataStore.list` for searching data stores by name.
- Added a new parameter `entity_type` to the `compute` and `create` methods of the classes `ShapMatrix`, `ShapImpact`, and `ShapPreview`. Insights can be computed for custom models if the parameter `entity_type="customModel"` is passed. See also the SHAP insights overview in the User Guide.
Experimental changes
- Added experimental API support for Data Wrangling. See `Recipe`:
  - `Recipe.from_data_store` to create a recipe from a data store.
  - `Recipe.retrieve_preview` to get a sample of the data after the recipe is applied.
  - `Recipe.set_inputs` to set the inputs of the recipe.
  - `Recipe.set_operations` to set the operations of the recipe.
- Added a new experimental `DataStore` that adds `get_spark_session` for Databricks (`databricks-v1`) data stores to get a Spark session.
- Added the attribute `chunking_type` to `DatasetChunkDefinition`.
- Added OTV attributes to `DatasourceDefinition`.
- Added `DatasetChunkDefinition.patch_validation_dates` to patch the validation dates of OTV datasource definitions after a sampling job.
3.4.1
New features
Enhancements
Bugfixes
- Updated the validation logic of `RelationshipsConfiguration` to work with native database connections.
API changes
Deprecation summary
Configuration changes
Documentation changes
Experimental changes
3.4.0
New features
- Added the following classes for generative AI. Importing these from `datarobot._experimental.models.genai` is deprecated and will be removed by the release of DataRobot 10.1 and SDK 3.5:
  - `Playground` to manage generative AI playgrounds.
  - `LLMDefinition` to get information about supported LLMs.
  - `LLMBlueprint` to manage LLM blueprints.
  - `Chat` to manage chats for LLM blueprints.
  - `ChatPrompt` to submit prompts within a chat.
  - `ComparisonChat` to manage comparison chats across multiple LLM blueprints within a playground.
  - `ComparisonPrompt` to submit a prompt to multiple LLM blueprints within a comparison chat.
  - `VectorDatabase` to create vector databases from datasets in the AI Catalog for retrieval-augmented generation with an LLM blueprint.
  - `CustomModelVectorDatabaseValidation` to validate a deployment for use as a vector database.
  - `CustomModelLLMValidation` to validate a deployment for use as an LLM.
  - `UserLimits` to get counts of vector databases and LLM requests for a user.
- Extended the advanced options available when setting a target to include a new parameter, `incrementalLearningEarlyStoppingRounds` (part of the `AdvancedOptions` object). This parameter allows you to specify when to stop early during incremental learning automation.
- Added experimental support for Chunking Service:
  - `DatasetChunkDefinition` for defining how chunks are created from a data source.
  - `DatasetChunkDefinition.create` to create a new dataset chunk definition.
  - `DatasetChunkDefinition.get` to get a specific dataset chunk definition.
  - `DatasetChunkDefinition.list` to list all dataset chunk definitions.
  - `DatasetChunkDefinition.get_datasource_definition` to retrieve the data source definition.
  - `DatasetChunkDefinition.get_chunk` to get specific chunk metadata belonging to a dataset chunk definition.
  - `DatasetChunkDefinition.list_chunks` to list all chunk metadata belonging to a dataset chunk definition.
  - `DatasetChunkDefinition.create_chunk` to submit a job to retrieve the data from the origin data source.
  - `DatasetChunkDefinition.create_chunk_by_index` to submit a job to retrieve data from the origin data source by index.
  - `OriginStorageType`
  - `Chunk`
  - `ChunkStorageType`
  - `ChunkStorage`
  - `DatasourceDefinition`
  - `DatasourceAICatalogInfo` to define the AI Catalog datasource information needed to create a new dataset chunk definition.
  - `DatasourceDataWarehouseInfo` to define the data warehouse (Snowflake, BigQuery, etc.) datasource information needed to create a new dataset chunk definition.
- Added `RuntimeParameter` for retrieving runtime parameters assigned to a `CustomModelVersion`.
- Added `RuntimeParameterValue` to define a runtime parameter override value to be assigned to a `CustomModelVersion`.
- Added Snowflake key pair authentication for uploading datasets from Snowflake or creating a project from Snowflake data.
- Added `Project.get_model_records` to retrieve models. The method `Project.get_models` is deprecated and will be removed soon in favour of `Project.get_model_records`.
- Extended the advanced options available when setting a target to include a new parameter, `chunkDefinitionId` (part of the `AdvancedOptions` object). This parameter allows you to specify the chunking definition needed for incremental learning automation.
- Extended the advanced options available when setting a target to include new Autopilot parameters, `incrementalLearningOnlyMode` and `incrementalLearningOnBestModel` (part of the `AdvancedOptions` object). These parameters allow you to specify how Autopilot is performed with the chunking service.
- Added a new method `DatetimeModel.request_lift_chart` to support Lift chart calculation for datetime-partitioned projects, with support for Sliced Insights.
- Added a new method `DatetimeModel.get_lift_chart` to support Lift chart retrieval for datetime-partitioned projects, with support for Sliced Insights.
- Added a new method `DatetimeModel.request_roc_curve` to support ROC curve calculation for datetime-partitioned projects, with support for Sliced Insights.
- Added a new method `DatetimeModel.get_roc_curve` to support ROC curve retrieval for datetime-partitioned projects, with support for Sliced Insights.
- Updated the method `DatetimeModel.request_feature_impact` to support the use of Sliced Insights.
- Updated the method `DatetimeModel.get_feature_impact` to support the use of Sliced Insights.
- Updated the method `DatetimeModel.get_or_request_feature_impact` to support the use of Sliced Insights.
- Updated the method `DatetimeModel.request_feature_effect` to support the use of Sliced Insights.
- Updated the method `DatetimeModel.get_feature_effect` to support the use of Sliced Insights.
- Updated the method `DatetimeModel.get_or_request_feature_effect` to support the use of Sliced Insights.
- Added a new method `FeatureAssociationMatrix.create` to support the creation of feature association matrices for feature lists.
- Introduced a new method `Deployment.perform_model_replace` as a replacement for `Deployment.replace_model`.
- Introduced a new property, `model_package`, which provides an overview of the currently used model package in `datarobot.models.Deployment`.
- Added a new parameter `prediction_threshold` to `BatchPredictionJob.score_with_leaderboard_model` and `BatchPredictionJob.score` that automatically assigns the positive class label to any prediction exceeding the threshold.
- Added two new enum values to `datarobot.models.data_slice.DataSlicesOperators`, `BETWEEN` and `NOT_BETWEEN`, which are used to allow slicing.
- Added a new class `Challenger` for interacting with DataRobot challengers, supporting the following methods:
  - `Challenger.get` to retrieve challenger objects by ID.
  - `Challenger.list` to list all challengers.
  - `Challenger.create` to create a new challenger.
  - `Challenger.update` to update a challenger.
  - `Challenger.delete` to delete a challenger.
- Added a new method `Deployment.get_challenger_replay_settings` to retrieve the challenger replay settings of a deployment.
- Added a new method `Deployment.list_challengers` to retrieve the challengers of a deployment.
- Added a new method `Deployment.get_champion_model_package` to retrieve the champion model package from a deployment.
- Added a new method `Deployment.list_prediction_data_exports` to retrieve deployment prediction data exports.
- Added a new method `Deployment.list_actuals_data_exports` to retrieve deployment actuals data exports.
- Added a new method `Deployment.list_training_data_exports` to retrieve deployment training data exports.
- Manage deployment health settings with the following methods:
  - `Deployment.get_health_settings` to get health settings.
  - `Deployment.update_health_settings` to update health settings.
  - `Deployment.get_default_health_settings` to get default health settings.
- Added a new enum value to `datarobot.enums._SHARED_TARGET_TYPE` to support the Text Generation use case.
- Added a new enum value `datarobotServerless` to `datarobot.enums.PredictionEnvironmentPlatform` to support DataRobot Serverless prediction environments.
- Added a new enum value `notApplicable` to `datarobot.enums.PredictionEnvironmentHealthType` to support a new health status from the DataRobot API.
- Added new enum values to `datarobot.enums.TARGET_TYPE` and `datarobot.enums.CUSTOM_MODEL_TARGET_TYPE` to support text generation custom inference models.
- Updated `datarobot.CustomModel` to support the creation of text generation custom models.
- Added a new class `CustomMetric` for interacting with DataRobot custom metrics, supporting the following methods:
  - `CustomMetric.get` to retrieve a custom metric object by ID from a given deployment.
  - `CustomMetric.list` to list all custom metrics from a given deployment.
  - `CustomMetric.create` to create a new custom metric for a given deployment.
  - `CustomMetric.update` to update a custom metric for a given deployment.
  - `CustomMetric.delete` to delete a custom metric for a given deployment.
  - `CustomMetric.unset_baseline` to remove the baseline for a given custom metric.
  - `CustomMetric.submit_values` to submit aggregated custom metric values from code. The provided data should be in the form of a dict or a pandas DataFrame.
  - `CustomMetric.submit_single_value` to submit a single custom metric value.
  - `CustomMetric.submit_values_from_catalog` to submit aggregated custom metric values from a dataset via the AI Catalog.
  - `CustomMetric.get_values_over_time` to retrieve values of a custom metric over a time period.
  - `CustomMetric.get_summary` to retrieve the summary of a custom metric over a time period.
  - `CustomMetric.get_values_over_batch` to retrieve values of a custom metric over batches.
  - `CustomMetric.get_batch_summary` to retrieve the summary of a custom metric over batches.
- Added `CustomMetricValuesOverTime` to retrieve custom metric over-time information.
- Added `CustomMetricSummary` to retrieve the custom metric over-time summary.
- Added `CustomMetricValuesOverBatch` to retrieve custom metric over-batch information.
- Added `CustomMetricBatchSummary` to retrieve the custom metric batch summary.
- Added `Job` and `JobRun` to create, read, update, run, and delete jobs in the Registry.
- Added `KeyValue` to create, read, update, and delete key values.
- Added a new class `PredictionDataExport` for interacting with DataRobot deployment data exports, supporting the following methods:
  - `PredictionDataExport.get` to retrieve a prediction data export object by ID from a given deployment.
  - `PredictionDataExport.list` to list all prediction data exports from a given deployment.
  - `PredictionDataExport.create` to create a new prediction data export for a given deployment.
  - `PredictionDataExport.fetch_data` to retrieve prediction export data as a DataRobot dataset.
- Added a new class `ActualsDataExport` for interacting with DataRobot deployment data exports, supporting the following methods:
  - `ActualsDataExport.get` to retrieve an actuals data export object by ID from a given deployment.
  - `ActualsDataExport.list` to list all actuals data exports from a given deployment.
  - `ActualsDataExport.create` to create a new actuals data export for a given deployment.
  - `ActualsDataExport.fetch_data` to retrieve actuals export data as a DataRobot dataset.
- Added a new class `TrainingDataExport` for interacting with DataRobot deployment data exports, supporting the following methods:
  - `TrainingDataExport.get` to retrieve a training data export object by ID from a given deployment.
  - `TrainingDataExport.list` to list all training data exports from a given deployment.
  - `TrainingDataExport.create` to create a new training data export for a given deployment.
  - `TrainingDataExport.fetch_data` to retrieve training export data as a DataRobot dataset.
- Added a new parameter `base_environment_version_id` to `CustomModelVersion.create_clean` for overriding the default environment version selection behavior.
- Added a new parameter `base_environment_version_id` to `CustomModelVersion.create_from_previous` for overriding the default environment version selection behavior.
- Added a new class `PromptTrace` for interacting with DataRobot prompt traces, supporting the following methods:
  - `PromptTrace.list` to list all prompt traces from a given playground.
  - `PromptTrace.export_to_ai_catalog` to export prompt traces for the playground to the AI Catalog.
- Added a new class `InsightsConfiguration` for describing available insights and configured insights for a playground:
  - `InsightsConfiguration.list` to list the insights that are available to be configured.
- Added a new class `Insights` for configuring insights for a playground:
  - `Insights.get` to get the current insights configuration for a playground.
  - `Insights.create` to create or update the insights configuration for a playground.
- Added a new class `CostMetricConfiguration` for describing available cost metrics and configured cost metrics for a Use Case:
  - `CostMetricConfiguration.get` to get the cost metric configuration.
  - `CostMetricConfiguration.create` to create a cost metric configuration.
  - `CostMetricConfiguration.update` to update the cost metric configuration.
  - `CostMetricConfiguration.delete` to delete the cost metric configuration.
- Added a new class `LLMCostConfiguration` for the cost configuration of a specific LLM within a Use Case.
- Added new classes `ShapMatrix`, `ShapImpact`, and `ShapPreview` to interact with SHAP-based insights. See also the SHAP insights overview in the User Guide.
API changes¶
- Parameter Overrides: Users can now override most of the previously set configuration values directly through parameters when initializing the Client. Exceptions: The endpoint and token values must be initialized from one source (client params, environment, or config file) and cannot be overridden individually, for security and consistency reasons. The new configuration priority is as follows:
- Client Params
- Client config_path param
- Environment Variables
- Default to reading YAML config file from
~/.config/datarobot/drconfig.yaml
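The priority order above amounts to a first-non-None lookup across the four sources. A minimal illustration of that resolution logic (the function and argument names are hypothetical, not the SDK's internals):

```python
# Illustration of the configuration priority described above.
# resolve_setting and its argument names are hypothetical, not SDK internals.
def resolve_setting(client_param=None, config_path_value=None,
                    env_value=None, yaml_value=None):
    """Return the first configured value, mirroring the documented priority:
    client params > client config_path param > environment variables >
    the default YAML config file."""
    for value in (client_param, config_path_value, env_value, yaml_value):
        if value is not None:
            return value
    return None
```

For example, a value supplied directly as a client parameter wins over one set in an environment variable, and the YAML file is consulted only when nothing else is set.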
- DATAROBOT_API_CONSUMER_TRACKING_ENABLED now always defaults to True.
- Added Databricks personal access token and service principal (also shared credentials via secure config) authentication for uploading datasets from Databricks or creating a project from Databricks data.
- Added secure config support for AWS long-term credentials.
- Implemented support for dr-database-v1 in DataStore, DataSource, and DataDriver, and added enum classes to support the changes.
- You can retrieve the canonical URI for a Use Case using UseCase.get_uri.
- You can open a Use Case in a browser using UseCase.open_in_browser.
Enhancements¶
- Added a new parameter, sample_size, to Dataset.create_from_url to support fast dataset registration.
- Added a new parameter, sample_size, to Dataset.create_from_data_source to support fast dataset registration.
- Job.get_result_when_complete returns datarobot.models.DatetimeModel instead of datarobot.models.Model if a datetime model was trained.
- Dataset.get_as_dataframe can download Parquet files as well as CSV files.
- Implemented support for dr-database-v1 in DataStore.
- Added two new parameters, offset and limit, to BatchPredictionJobDefinition.list for paginating long job definition lists.
- Added two new parameters, deployment_id and search_name, to BatchPredictionJobDefinition.list for filtering job definitions.
- Added a new parameter, new_registered_model_version_id, to Deployment.validate_replacement_model to support replacement validation based on a model package ID.
- Added support for Native Connectors to Connector for everything other than Connector.create and Connector.update.
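The offset/limit pagination added to BatchPredictionJobDefinition.list behaves like standard list slicing, with limit=0 preserving the old return-everything behavior (per the deprecation note in this release). A pure-Python sketch of those semantics, not the SDK's implementation:

```python
def paginate(definitions, offset=0, limit=100):
    """Slice a job-definition list the way offset/limit pagination works:
    skip `offset` items, then return up to `limit` items.
    limit=0 returns everything after `offset`, matching the legacy behavior.
    Illustrative only; the real method pages through the REST API."""
    if limit == 0:
        return definitions[offset:]
    return definitions[offset:offset + limit]
```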
Deprecation summary¶
- Removed Model.get_leaderboard_ui_permalink and Model.open_model_browser.
- Deprecated Project.get_models in favour of Project.get_model_records.
- BatchPredictionJobDefinition.list will no longer return all job definitions after version 3.6 is released. To preserve the current behavior, pass limit=0.
- The new_model_id parameter in Deployment.validate_replacement_model will be removed after version 3.6 is released.
- Deployment.replace_model will be removed after version 3.6 is released. Use Deployment.perform_model_replace instead.
- CustomInferenceModel.assign_training_data was marked as deprecated in v3.2. The deprecation period has been extended, and the feature will now be removed in v3.5. Use CustomModelVersion.create_clean and CustomModelVersion.create_from_previous instead.
Documentation changes¶
- Updated genai_example.rst to use the latest GenAI features and methods introduced in the API client.
Experimental changes¶
- Added a new attribute, prediction_timeout, to CustomModelValidation.
- Added new attributes, feedback_result, metrics, and final_prompt, to ResultMetadata.
- Added use_case_id to CustomModelValidation.
- Added llm_blueprints_count and user_name to Playground.
- Added custom_model_embedding_validations to SupportedEmbeddings.
- Added embedding_validation_id and is_separator_regex to VectorDatabase.
- Added optional parameters use_case, name, and model to CustomModelValidation.create.
- Added a method, CustomModelValidation.list, to list custom model validations available to a user, with several optional parameters to filter the results.
- Added a method, CustomModelValidation.update, to update a custom model validation.
- Added an optional parameter, use_case, to LLMDefinition.list to also include in the returned LLMs the external LLMs available for the specified use_case.
- Added an optional parameter, playground, to VectorDatabase.list to list vector databases by playground.
- Added an optional parameter, comparison_chat, to ComparisonPrompt.list to list comparison prompts by comparison chat.
- Added an optional parameter, comparison_chat, to ComparisonPrompt.create to specify the comparison chat to create the comparison prompt in.
- Added an optional parameter, feedback_result, to ComparisonPrompt.update to update a comparison prompt with feedback.
- Added an optional parameter, is_starred, to LLMBlueprint.update to update the LLM blueprint's starred status.
- Added an optional parameter, is_starred, to LLMBlueprint.list to filter the returned LLM blueprints to those matching is_starred.
- Added a new enum, PromptType, to identify the LLM blueprint's prompting type.
- Added an optional parameter, prompt_type, to LLMBlueprint.create to specify the LLM blueprint's prompting type. This can be set with PromptType.
- Added an optional parameter, prompt_type, to LLMBlueprint.update to specify the updated LLM blueprint's prompting type. This can be set with PromptType.
- Added a new class, ComparisonChat, for interacting with DataRobot generative AI comparison chats. ComparisonChat.get retrieves a comparison chat object by ID. ComparisonChat.list lists all comparison chats available to the user. ComparisonChat.create creates a new comparison chat. ComparisonChat.update updates the name of a comparison chat. ComparisonChat.delete deletes a single comparison chat.
- Added optional parameters, playground and chat, to ChatPrompt.list to list chat prompts by playground and chat.
- Added an optional parameter, chat, to ChatPrompt.create to specify the chat to create the chat prompt in.
- Added a new method, ChatPrompt.update, to update a chat prompt with custom metrics and feedback.
- Added a new class, Chat, for interacting with DataRobot generative AI chats. Chat.get retrieves a chat object by ID. Chat.list lists all chats available to the user. Chat.create creates a new chat. Chat.update updates the name of a chat. Chat.delete deletes a single chat.
- Removed the model_package module. Use RegisteredModelVersion instead.
- Added a new class, UserLimits.
- Added support to get the count of users' LLM API requests: UserLimits.get_llm_requests_count.
- Added support to get the count of users' vector databases: UserLimits.get_vector_database_count.
- Added new methods to the class Notebook, including Notebook.run and Notebook.download_revision. See the documentation for example usage.
- Added a new class, NotebookScheduledJob.
- Added a new class, NotebookScheduledRun.
- Added a new method, Model.get_incremental_learning_metadata, that retrieves incremental learning metadata for a model.
- Added a new method, Model.start_incremental_learning, that starts incremental learning for a model.
- Updated the API endpoint prefix for all GenerativeAI routes to align with the publicly documented routes.
Bugfixes¶
- Fixed how the async URL is built in Model.get_or_request_feature_impact.
- Fixed setting ssl_verify via environment variables.
- Resolved a problem related to tilde-based paths in the Client's config_path attribute.
- Changed the force_size default of ImageOptions so that, by default, the same transformations are applied as when image archive datasets are uploaded to DataRobot.
3.3.0¶
New features¶
- Added support for Python 3.11.
- Added the new library "strenum" to add StrEnum support while maintaining backwards compatibility with Python 3.7-3.10. DataRobot does not use the native StrEnum class in Python 3.11.
- Added a new class, PredictionEnvironment, for interacting with DataRobot prediction environments.
- Extended the advanced options available when setting a target to include new parameters: 'modelGroupId', 'modelRegimeId', and 'modelBaselines' (part of the AdvancedOptions object). These parameters allow you to specify the user columns required to run time series models without feature derivation in OTV projects.
- Added a new method, PredictionExplanations.create_on_training_data, for computing prediction explanations on training data.
- Added a new class, RegisteredModel, for interacting with DataRobot registered models, supporting the following methods: RegisteredModel.get to retrieve a RegisteredModel object by ID; RegisteredModel.list to list all registered models; RegisteredModel.archive to permanently archive a registered model; RegisteredModel.update to update a registered model; RegisteredModel.get_shared_roles to retrieve access control information for a registered model; RegisteredModel.share to share a registered model; RegisteredModel.get_version to retrieve a RegisteredModelVersion object by ID; RegisteredModel.list_versions to list registered model versions; and RegisteredModel.list_associated_deployments to list deployments associated with a registered model.
- Added a new class, RegisteredModelVersion, for interacting with DataRobot registered model versions (also known as model packages), supporting the following methods: RegisteredModelVersion.create_for_external to create a new registered model version from an external model; RegisteredModelVersion.list_associated_deployments to list deployments associated with a registered model version; RegisteredModelVersion.create_for_leaderboard_item to create a new registered model version from a Leaderboard model; and RegisteredModelVersion.create_for_custom_model_version to create a new registered model version from a custom model version.
- Added a new method, Deployment.create_from_registered_model_version, to support creating deployments from a registered model version.
- Added a new method, Deployment.download_model_package_file, to support downloading model package files (.mlpkg) of the currently deployed model.
- Added support for retrieving document thumbnails: DocumentThumbnail, DocumentPageFile.
- Added support for retrieving document text extraction samples using: DocumentTextExtractionSample, DocumentTextExtractionSamplePage, DocumentTextExtractionSampleDocument.
- Added new fields to CustomTaskVersion for controlling network policies. The new fields were also added to the response. These can be set with datarobot.enums.CustomTaskOutgoingNetworkPolicy.
- Added a new method, BatchPredictionJob.score_with_leaderboard_model, to run batch predictions using a Leaderboard model instead of a deployment.
- Set IntakeSettings and OutputSettings to use IntakeAdapters and OutputAdapters enum values, respectively, for the property type.
- Added the method Deployment.get_predictions_vs_actuals_over_time to retrieve a deployment's predictions vs. actuals over time data.
Bugfixes¶
- Renamed the payload property subset to source in Model.request_feature_effect.
- Fixed an issue where Context.trace_context was not being set from environment variables or DataRobot config files.
- Project.refresh no longer sets Project.advanced_options to a dictionary.
- Fixed Dataset.modify to clarify when categories are preserved or cleared.
- Fixed an issue with enums in f-strings resulting in the enum class and property being printed instead of the enum property's value in Python 3.11 environments.
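The f-string issue stems from Python 3.11 changing Enum.__format__ to delegate to Enum.__str__, so a plain str-mixin enum formats as "Class.MEMBER" instead of its value. Pinning __str__ to the mixin's implementation (the pattern the strenum backport mirrors) keeps formatting stable across versions. A minimal sketch with a hypothetical enum, not an actual SDK class:

```python
from enum import Enum

class DataSubset(str, Enum):
    # Hypothetical example enum. Pinning __str__ to the str mixin keeps
    # f-string output equal to the member's value both on Python <= 3.10
    # and on 3.11+, where Enum.__format__ started delegating to __str__.
    __str__ = str.__str__
    VALIDATION = "validation"
    HOLDOUT = "holdout"
```

With this pattern, f"{DataSubset.HOLDOUT}" yields "holdout" on every supported Python version instead of "DataSubset.HOLDOUT" on 3.11+.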
Deprecation summary¶
- Project.refresh will no longer set Project.advanced_options to a dictionary after version 3.5 is released. All interactions with Project.advanced_options should be expected to go through the AdvancedOptions class.
Experimental changes¶
- Added a new class, VectorDatabase, for interacting with DataRobot vector databases. VectorDatabase.get retrieves a VectorDatabase object by ID. VectorDatabase.list lists all VectorDatabases available to the user. VectorDatabase.create creates a new VectorDatabase, and allows you to use a validated deployment of a custom model as your own vector database. VectorDatabase.update updates the name of a VectorDatabase. VectorDatabase.delete deletes a single VectorDatabase. VectorDatabase.get_supported_embeddings retrieves all supported embedding models. VectorDatabase.get_supported_text_chunkings retrieves all supported text chunking configurations. VectorDatabase.download_text_and_embeddings_asset downloads a parquet file with internal vector database data.
- Added a new class, CustomModelVectorDatabaseValidation, for validating custom model deployments for use as a vector database. CustomModelVectorDatabaseValidation.get retrieves a CustomModelVectorDatabaseValidation object by ID. CustomModelVectorDatabaseValidation.get_by_values retrieves a CustomModelVectorDatabaseValidation object by field values. CustomModelVectorDatabaseValidation.create starts validation of the deployment. CustomModelVectorDatabaseValidation.revalidate repairs an unlinked external vector database.
- Added a new class, Playground, for interacting with DataRobot generative AI playgrounds. Playground.get retrieves a playground object by ID. Playground.list lists all playgrounds available to the user. Playground.create creates a new playground. Playground.update updates the name and description of a playground. Playground.delete deletes a single playground.
- Added a new class, LLMDefinition, for interacting with DataRobot generative AI LLMs. LLMDefinition.list lists all LLMs available to the user.
- Added a new class, LLMBlueprint, for interacting with DataRobot generative AI LLM blueprints. LLMBlueprint.get retrieves an LLM blueprint object by ID. LLMBlueprint.list lists all LLM blueprints available to the user. LLMBlueprint.create creates a new LLM blueprint. LLMBlueprint.create_from_llm_blueprint creates a new LLM blueprint from an existing one. LLMBlueprint.update updates an LLM blueprint. LLMBlueprint.delete deletes a single LLM blueprint.
- Added a new class, ChatPrompt, for interacting with DataRobot generative AI chat prompts. ChatPrompt.get retrieves a chat prompt object by ID. ChatPrompt.list lists all chat prompts available to the user. ChatPrompt.create creates a new chat prompt. ChatPrompt.delete deletes a single chat prompt.
- Added a new class, CustomModelLLMValidation, for validating custom model deployments for use as a custom model LLM. CustomModelLLMValidation.get retrieves a CustomModelLLMValidation object by ID. CustomModelLLMValidation.get_by_values retrieves a CustomModelLLMValidation object by field values. CustomModelLLMValidation.create starts validation of the deployment. CustomModelLLMValidation.revalidate repairs an unlinked external custom model LLM.
- Added a new class, ComparisonPrompt, for interacting with DataRobot generative AI comparison prompts. ComparisonPrompt.get retrieves a comparison prompt object by ID. ComparisonPrompt.list lists all comparison prompts available to the user. ComparisonPrompt.create creates a new comparison prompt. ComparisonPrompt.update updates a comparison prompt. ComparisonPrompt.delete deletes a single comparison prompt.
- Extended UseCase, adding two new fields to represent the count of vector databases and playgrounds.
- Added a new method, ChatPrompt.create_llm_blueprint, to create an LLM blueprint from a chat prompt.
- Added a new method, CustomModelLLMValidation.delete, to delete a custom model LLM validation record.
- Added a new method, LLMBlueprint.register_custom_model, for registering a custom model from a generative AI LLM blueprint.
3.2.0¶
New features¶
- Added new methods to trigger batch monitoring jobs without providing a job definition: BatchMonitoringJob.run, BatchMonitoringJob.get_status, BatchMonitoringJob.cancel, and BatchMonitoringJob.download.
- Added Deployment.submit_actuals_from_catalog_async to submit actuals from the AI Catalog.
- Added a new class, StatusCheckJob, which represents a job for a status check of submitted async jobs.
- Added a new class, JobStatusResult, which represents the result of a status check for a submitted async task.
- Added DatetimePartitioning.datetime_partitioning_log_retrieve to download the datetime partitioning log.
- Added the method DatetimePartitioning.datetime_partitioning_log_list to list the datetime partitioning log.
- Added DatetimePartitioning.get_input_data to retrieve the input data used to create an optimized datetime partitioning.
- Added DatetimePartitioningId, which can be passed as a partitioning_method to Project.analyze_and_model.
- Added the ability to share deployments. See deployment sharing for more information on sharing deployments.
- Added new methods to retrieve or update bias and fairness settings: Deployment.get_bias_and_fairness_settings and Deployment.update_bias_and_fairness_settings.
- Added a new class, UseCase, for interacting with the DataRobot Use Cases API.
- Added a new class, Application, for retrieving DataRobot Applications available to the user.
- Added a new class, SharingRole, to hold user or organization access rights.
- Added a new class, BatchMonitoringJob, for interacting with batch monitoring jobs.
- Added a new class, BatchMonitoringJobDefinition, for interacting with batch monitoring job definitions.
- Added new methods for handling monitoring job definitions: BatchMonitoringJobDefinition.list, BatchMonitoringJobDefinition.get, BatchMonitoringJobDefinition.create, BatchMonitoringJobDefinition.update, BatchMonitoringJobDefinition.delete, BatchMonitoringJobDefinition.run_on_schedule, and BatchMonitoringJobDefinition.run_once.
- Added a new method to retrieve a monitoring job: BatchMonitoringJob.get.
- Added the ability to filter returned objects by a Use Case ID passed to the following methods: Dataset.list, Project.list.
- Added the ability to automatically add a newly created dataset or project to a Use Case by passing a UseCase, a list of UseCase objects, a UseCase ID, or a list of UseCase IDs using the keyword argument use_cases to the following methods: Dataset.create_from_file, Dataset.create_from_in_memory_data, Dataset.create_from_url, Dataset.create_from_data_source, Dataset.create_from_query_generator, Dataset.create_project, Project.create, Project.create_from_data_source, Project.create_from_dataset, Project.create_segmented_project_from_clustering_model, and Project.start.
- Added the ability to set a default UseCase for requests. It can be set in several ways:
- If the user configures the client via Client(...), then invoke Client(..., default_use_case=<id>).
- If the user configures the client via dr.config.yaml, then add the property default_use_case: <id>.
- If the user configures the client via env vars, then set the env var DATAROBOT_DEFAULT_USE_CASE.
- The default use case can also be set programmatically as a context manager via with UseCase.get(<id>).
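A scoped default like with UseCase.get(<id>): can be implemented with contextvars, which the client uses elsewhere (see the 3.0.2 ipykernel fix in this changelog). The sketch below is an illustrative mechanism with hypothetical names, not the SDK's actual code:

```python
from contextvars import ContextVar

# Hypothetical illustration of a context-managed default Use Case; the
# names here do not mirror the SDK's internals.
_default_use_case: ContextVar = ContextVar("default_use_case", default=None)

class UseCaseContext:
    def __init__(self, use_case_id):
        self.use_case_id = use_case_id
        self._token = None

    def __enter__(self):
        # Requests made inside the block read their default from the ContextVar.
        self._token = _default_use_case.set(self.use_case_id)
        return self

    def __exit__(self, *exc):
        # Restore whatever default was active before the block.
        _default_use_case.reset(self._token)

def current_default_use_case():
    return _default_use_case.get()
```

Using a ContextVar rather than a module-level global keeps nested blocks and concurrent tasks isolated: leaving the block restores the previous default.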
- Added the ability to configure the collection of client usage metrics to send to DataRobot. Note that this feature only tracks which DataRobot package methods are called and does not collect any user data. You can configure collection with the following settings:
- If the user configures the client via Client(...), then invoke Client(..., enable_api_consumer_tracking=<True/False>).
- If the user configures the client via dr.config.yaml, then add the property enable_api_consumer_tracking: <True/False>.
- If the user configures the client via env vars, then set the env var DATAROBOT_API_CONSUMER_TRACKING_ENABLED.
Currently the default value for enable_api_consumer_tracking is True.
- Added the method Deployment.get_predictions_over_time to retrieve a deployment's predictions over time data.
- Added a new class, FairnessScoresOverTime, to retrieve fairness over time information.
- Added a new method, Deployment.get_fairness_scores_over_time, to retrieve the fairness scores over time of a deployment.
- Added a new use_gpu parameter to the method Project.analyze_and_model to set whether the project should allow usage of GPUs.
- Added a new use_gpu parameter to the class Project with information on whether the project allows usage of GPUs.
- Added a new class, TrainingData, for retrieving training data assigned to a CustomModelVersion.
- Added a new class, HoldoutData, for retrieving holdout data assigned to a CustomModelVersion.
- Added the ability to retrieve the model and blueprint JSON using the following methods: Model.get_model_blueprint_json, Blueprint.get_json.
- Added Credential.update, which allows you to update existing credential resources.
- Added a new optional parameter, trace_context, to datarobot.Client to provide additional information on the DataRobot code being run. This parameter defaults to None.
- Updated methods in Model to support the use of sliced insights: Model.get_feature_effect, Model.request_feature_effect, Model.get_or_request_feature_effect, Model.get_lift_chart, Model.get_all_lift_charts, Model.get_residuals_chart, Model.get_all_residuals_charts, Model.request_lift_chart, Model.request_residuals_chart, Model.get_roc_curve, Model.get_feature_impact, Model.request_feature_impact, and Model.get_or_request_feature_impact.
- Added support for SharingRole to the following methods: DataStore.share.
- Added new methods for retrieving SharingRole information for the following classes: DataStore.get_shared_roles.
- Added a new method for calculating a sliced ROC curve: Model.request_roc_curve.
- Added the new class DataSlice to support the following methods: DataSlice.list to retrieve all data slices in a project; DataSlice.create to create a new data slice; DataSlice.delete to delete the data slice calling this method; DataSlice.request_size to submit a request to calculate a data slice size on a source; DataSlice.get_size_info to get the data slice's info when applied to a source; and DataSlice.get to retrieve a specific data slice.
- Added the new class DataSliceSizeInfo to define the result of a data slice applied to a source.
- Added a new method for retrieving all available feature impacts for a model: Model.get_all_feature_impacts.
- Added a new method, datarobot.models.StatusCheckJob.get_result_when_complete(), for StatusCheckJob to wait for and return the completed object once it is generated.
Enhancements¶
- Improved the error message of SampleImage.list to clarify that a selected parameter cannot be used when a project has not reached the correct stage prior to calling this method.
- Extended SampleImage.list with two parameters to filter for a target value range in regression projects.
- Added text explanations data to PredictionExplanations and ensured it is returned by both the datarobot.PredictionExplanations.get_all_as_dataframe() and datarobot.PredictionExplanations.get_rows() methods.
- Added two new parameters, credential_id and credential_data, to Project.upload_dataset_from_catalog.
- Implemented training and holdout data assignment for the custom model version creation APIs CustomModelVersion.create_clean and CustomModelVersion.create_from_previous. The parameters added to both APIs are: training_dataset_id, partition_column, holdout_dataset_id, keep_training_holdout_data, and max_wait.
- Extended CustomInferenceModel.create and CustomInferenceModel.update with the parameter is_training_data_for_versions_permanently_enabled.
- Added the value DR_API_ACCESS to the NETWORK_EGRESS_POLICY enum.
- Added a new parameter, low_memory, to Dataset.get_as_dataframe to allow a low-memory mode for larger datasets.
- Added two new parameters, offset and limit, to Project.list for paginating long project lists.
Bugfixes¶
- Fixed incompatibilities with pandas 2.0 in DatetimePartitioning.to_dataframe.
- Fixed a crash when using non-"latin-1" characters in a pandas DataFrame used as prediction data in BatchPredictionJob.score.
- Fixed an issue where failed authentication when invoking datarobot.client.Client() raised a misleading error about client-server compatibility.
- Fixed incompatibilities with pandas 2.0 in AccuracyOverTime.get_as_dataframe. The method now throws a ValueError if an empty list is passed to the metrics parameter.
API changes¶
- Added the parameter unsupervised_type to the class DatetimePartitioning.
- The sliced insight API endpoint GET: api/v2/insights/<insight_name>/ returns a paginated response. This means that it returns an empty response if no insights data is found, unlike GET: api/v2/projects/<pid>/models/<lid>/<insight_name>/, which returns 404 NOT FOUND in this case. To maintain backwards compatibility, all methods that retrieve insights data raise 404 NOT FOUND if the insights API returns an empty response.
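The backwards-compatibility behavior described above (treating an empty paginated insights response as 404 NOT FOUND) can be sketched as a small shim; the names below are hypothetical, not the client's actual internals:

```python
class NotFoundError(Exception):
    """Stands in for the client error raised on HTTP 404."""

def insights_or_404(paginated_response):
    """Mimic the compatibility shim: the paginated insights endpoint returns
    an empty 'data' list when nothing is found, but callers of the older
    per-model endpoint expect a 404, so raise one for empty responses."""
    data = paginated_response.get("data", [])
    if not data:
        raise NotFoundError("404 NOT FOUND: no insights data")
    return data
```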
Deprecation summary¶
- Model.get_feature_fit_metadata has been removed. Use Model.get_feature_effect_metadata instead.
- DatetimeModel.get_feature_fit_metadata has been removed. Use DatetimeModel.get_feature_effect_metadata instead.
- Model.request_feature_fit has been removed. Use Model.request_feature_effect instead.
- DatetimeModel.request_feature_fit has been removed. Use DatetimeModel.request_feature_effect instead.
- Model.get_feature_fit has been removed. Use Model.get_feature_effect instead.
- DatetimeModel.get_feature_fit has been removed. Use DatetimeModel.get_feature_effect instead.
- Model.get_or_request_feature_fit has been removed. Use Model.get_or_request_feature_effect instead.
- DatetimeModel.get_or_request_feature_fit has been removed. Use DatetimeModel.get_or_request_feature_effect instead.
- Deprecated the use of SharingAccess in favor of SharingRole for sharing in the following classes: DataStore.share.
- Deprecated the following methods for retrieving SharingAccess information: DataStore.get_access_list. Please use DataStore.get_shared_roles instead.
- CustomInferenceModel.assign_training_data was marked as deprecated and will be removed in v3.4. Use CustomModelVersion.create_clean and CustomModelVersion.create_from_previous instead.
Configuration changes¶
- Pinned the dependency on the package urllib3 to less than version 2.0.0.
Deprecation summary¶
- Deprecated the parameter user_agent_suffix in datarobot.Client. user_agent_suffix will be removed in v3.4. Please use trace_context instead.
Documentation changes¶
- Fixed the in-line documentation of DataRobotClientConfig.
- Fixed documentation around client configuration from environment variables or the config file.
Experimental changes¶
- Added experimental support for data matching: DataMatching, DataMatchingQuery.
- Added a new method, DataMatchingQuery.get_result, for returning data matching query results as pandas DataFrames.
- Changed behavior for returning results in DataMatching. Instead of saving the results to a file, a pandas DataFrame is returned by the following methods: DataMatching.get_closest_data, DataMatching.get_closest_data_for_model, DataMatching.get_closest_data_for_featurelist.
- Added experimental support for model lineage: ModelLineage.
- Changed the behavior of the methods that search for the closest data points in DataMatching. If the index is missing, instead of throwing an error, the methods try to create the index and then query it. This is enabled by default; if this is not the intended behavior, pass False to the new build_index parameter added to DataMatching.get_closest_data, DataMatching.get_closest_data_for_model, and DataMatching.get_closest_data_for_featurelist.
- Added a new class, Notebook, for retrieving DataRobot Notebooks available to the user.
- Added experimental support for data wrangling: Recipe.
3.1.1¶
Configuration changes¶
- Removed the dependency on the package contextlib2, since the datarobot package now requires Python 3.7+.
- Updated typing-extensions to allow versions from 4.3.0 to < 5.0.0.
3.1.0¶
Enhancements¶
- Added new methods BatchPredictionJob.apply_time_series_data_prep_and_score and BatchPredictionJob.apply_time_series_data_prep_and_score_to_file that apply time series data prep to a file or dataset and make batch predictions with a deployment.
- Added new methods DataEngineQueryGenerator.prepare_prediction_dataset and DataEngineQueryGenerator.prepare_prediction_dataset_from_catalog that apply time series data prep to a file or catalog dataset and upload the prediction dataset to a project.
- Added a new max_wait parameter to the method Project.create_from_dataset. Values larger than the default can be specified to avoid timeouts when creating a project from a dataset.
- Added a new method, Project.create_segmented_project_from_clustering_model, for creating a segmented modeling project from an existing clustering project and model. Please switch to this function if you were previously using ModelPackage for segmented modeling purposes.
- Added a new method, PredictionExplanations.is_unsupervised_clustering_or_multiclass, for checking whether the clustering or multiclass parameters are used, quickly and without extra API calls.
- Retry idempotent requests which result in HTTP 502 and HTTP 504 (in addition to the previous HTTP 413, HTTP 429 and HTTP 503)
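The retry policy above boils down to a membership test on the response status, applied only to idempotent requests. A pure-Python sketch of that predicate (the function name is hypothetical, not the client's internals):

```python
# Statuses retried for idempotent requests, per the note above: 502 and 504
# were added alongside the previously retried 413, 429, and 503.
RETRYABLE_STATUSES = frozenset({413, 429, 502, 503, 504})

def should_retry(status_code, idempotent=True):
    """Retry only idempotent requests that failed with a retryable status."""
    return idempotent and status_code in RETRYABLE_STATUSES
```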
- Added the value PREPARED_FOR_DEPLOYMENT to the RECOMMENDED_MODEL_TYPE enum.
- Added two new methods to the ImageAugmentationList class: ImageAugmentationList.list and ImageAugmentationList.update.
Bugfixes¶
- Added a format key to the Batch Prediction intake and output settings for S3, GCP, and Azure.
API changes¶
- The method PredictionExplanations.is_multiclass now adds an additional API call to check for multiclass target validity, which adds a small delay.
- The AdvancedOptions parameter blend_best_models now defaults to False.
- The AdvancedOptions parameter consider_blenders_in_recommendation now defaults to False.
- DatetimePartitioning now has the parameter unsupervised_mode.
Deprecation summary¶
- Deprecated the method Project.create_from_hdfs.
- Deprecated the method DatetimePartitioning.generate.
- Deprecated the parameter in_use of ImageAugmentationList.create, as DataRobot now handles it automatically.
- Deprecated the property Deployment.capabilities of Deployment.
- ImageAugmentationSample.compute was removed in v3.1. You can get the same information with the method ImageAugmentationList.compute_samples.
- The sample_id parameter was removed from ImageAugmentationSample.list. Please use auglist_id instead.
Documentation changes¶
- Updated the documentation to suggest that setting use_backtest_start_end_format of DatetimePartitioning.to_specification to True will mirror the behavior of the Web UI.
- Updated the documentation to suggest that setting use_start_end_format of Backtest.to_specification to True will mirror the behavior of the Web UI.
3.0.3¶
Bugfixes¶
- Fixed an issue affecting backwards compatibility in datarobot.models.DatetimeModel, where an unexpected keyword from the DataRobot API would break class deserialization.
3.0.2¶
Bugfixes¶
- Restored Model.get_leaderboard_ui_permalink and Model.open_model_browser. These methods were accidentally removed instead of deprecated.
- Fixed an issue with ipykernel < 6.0.0, which does not persist contextvars across cells.
Deprecation summary¶
- Deprecated the method Model.get_leaderboard_ui_permalink. Please use Model.get_uri instead.
- Deprecated the method Model.open_model_browser. Please use Model.open_in_browser instead.
3.0.1¶
Bugfixes¶
- Added typing-extensions as a required dependency for the DataRobot Python SDK.
3.0.0¶
New features¶
- Version 3.0 of the Python client does not support Python 3.6 and earlier versions. Version 3.0 currently supports Python 3.7+.
- The default Autopilot mode for project.start_autopilot has changed to Quick mode.
- For datetime-aware models, you can now calculate and retrieve feature impact for backtests other than zero and holdout: DatetimeModel.get_feature_impact, DatetimeModel.request_feature_impact, DatetimeModel.get_or_request_feature_impact.
- Added a backtest field to feature impact metadata: Model.get_or_request_feature_impact. This field is null for non-datetime-aware models and greater than or equal to zero, or holdout, for datetime-aware models.
- You can use a new method to retrieve the canonical URI for a project, model, deployment, or dataset: Project.get_uri, Model.get_uri, Deployment.get_uri, Dataset.get_uri.
- You can use a new method to open a class in a browser based on its URI (project, model, deployment, or dataset): Project.open_in_browser, Model.open_in_browser, Deployment.open_in_browser, Dataset.open_in_browser.
- Added a new method for opening DataRobot in a browser: datarobot.rest.RESTClientObject.open_in_browser(). Invoke the method via dr.Client().open_in_browser().
- Altered method Project.create_featurelist to accept five new parameters (please see the documentation for information about usage): starting_featurelist, starting_featurelist_id, starting_featurelist_name, features_to_include, features_to_exclude.
- Added a new method to retrieve a feature list by name: Project.get_featurelist_by_name.
- Added a new convenience method to create datasets: Dataset.upload.
- Altered the method Model.request_predictions to accept four new parameters: dataset, file, file_path, dataframe. Note that the method already supports the parameter dataset_id, and all data source parameters are mutually exclusive.
- Added a new method to datarobot.models.Dataset, Dataset.get_as_dataframe, which retrieves all the originally uploaded data in a pandas DataFrame.
- Added a new method to datarobot.models.Dataset, Dataset.share, which allows the sharing of a dataset with another user.
- Added new convenience methods to datarobot.models.Project for dealing with partition classes. Both methods should be called before Project.analyze_and_model. Project.set_partitioning_method intelligently creates the correct partition class for a regular project, based on input arguments. Project.set_datetime_partitioning creates the correct partition class for a time series project.
- Added a new method to datarobot.models.Project, Project.get_top_model, which returns the highest scoring model for a metric of your choice.
- Use the new method Deployment.predict_batch to pass a file, file path, or DataFrame to datarobot.models.Deployment to easily make batch predictions and return the results as a DataFrame.
- Added support for passing in a credentials ID or credentials data to Project.create_from_data_source as an alternative to providing a username and password.
- You can now pass in a max_wait value to AutomatedDocument.generate.
- Added a new method to datarobot.models.Project, Project.get_dataset, which retrieves the dataset used during creation of a project.
- Added two new properties to datarobot.models.Project: catalog_id and catalog_version_id.
- Added a new Autopilot method to datarobot.models.Project, Project.analyze_and_model, which allows you to initiate Autopilot or data analysis against data uploaded to DataRobot.
- Added a new convenience method to datarobot.models.Project, Project.set_options, which allows you to save AdvancedOptions values for use in modeling.
- Added a new convenience method to datarobot.models.Project, Project.get_options, which allows you to retrieve saved modeling options.
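The note above that Model.request_predictions treats dataset_id and the new data source parameters as mutually exclusive can be sketched as a small plain-Python validation. This mirrors the documented behavior only; it is not the SDK's actual implementation, and resolve_data_source is a hypothetical helper name.

```python
def resolve_data_source(dataset=None, dataset_id=None, file=None,
                        file_path=None, dataframe=None):
    """Return the single provided data source, or raise if zero or several are given."""
    provided = {
        name: value
        for name, value in {
            "dataset": dataset,
            "dataset_id": dataset_id,
            "file": file,
            "file_path": file_path,
            "dataframe": dataframe,
        }.items()
        if value is not None
    }
    if len(provided) != 1:
        raise ValueError(
            "Exactly one data source must be supplied, got: %s" % sorted(provided)
        )
    # Return the (parameter name, value) pair that was chosen.
    return next(iter(provided.items()))
```

A caller passing both file_path and dataset_id would get a ValueError rather than an ambiguous request.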
Enhancements¶
- Refactored the global singleton client connection (datarobot.client.Client()) to use ContextVar instead of a global variable for better concurrency support.
- Added support for creating monotonic feature lists for time series projects. Set skip_datetime_partition_column to True to create a monotonic feature list. For more information, see datarobot.models.Project.create_modeling_featurelist().
- Added information about the vertex to advanced tuning parameters: datarobot.models.Model.get_advanced_tuning_parameters().
- Added the ability to automatically use saved AdvancedOptions set using Project.set_options in Project.analyze_and_model.
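The ContextVar pattern behind this refactor (and the client_configuration context manager added in this release) can be sketched in plain Python. The names below are illustrative, not the SDK's internals: each thread or asyncio task sees its own value, which is why the change improves concurrency support.

```python
import contextvars
from contextlib import contextmanager

# A module-level ContextVar instead of a plain global: asyncio tasks and
# threads each observe their own value rather than sharing one mutable slot.
_client = contextvars.ContextVar("datarobot_client", default=None)


@contextmanager
def client_configuration(client):
    """Temporarily install a client connection, restoring the previous one on exit."""
    token = _client.set(client)
    try:
        yield client
    finally:
        _client.reset(token)


def get_client():
    """Return the client currently in effect for this context."""
    return _client.get()
```

Inside the with block, get_client() sees the temporary connection; outside it, the previous value is restored.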
Bugfixes¶
- Dataset.list no longer throws errors when listing datasets with no owner.
- Fixed an issue with the creation of BatchPredictionJobDefinitions containing a schedule.
- Fixed error handling in datarobot.helpers.partitioning_methods.get_class.
- Fixed an issue with portions of the payload not using camelCase in Project.upload_dataset_from_catalog.
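The camelCasing fix above concerns the usual snake_case-to-camelCase key conversion applied to request payloads before they are sent to the API. A minimal sketch of that conversion (the helper names are illustrative, not SDK code):

```python
def to_camel_case(snake: str) -> str:
    """Convert a snake_case key to camelCase, as JSON API payloads expect."""
    head, *rest = snake.split("_")
    return head + "".join(word.capitalize() for word in rest)


def camelize_payload(payload: dict) -> dict:
    """Rename all top-level keys of a request payload to camelCase."""
    return {to_camel_case(key): value for key, value in payload.items()}
```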
API changes¶
- The Python client now outputs a DataRobotProjectDeprecationWarning when you attempt to access certain resources (projects, models, deployments, etc.) that are deprecated or disabled as a result of the DataRobot platform’s migration to Python 3.
- The Python client now raises a TypeError when you try to retrieve a labelwise ROC on a binary model or a binary ROC on a multilabel model.
- The method Dataset.create_from_data_source now raises InvalidUsageError if username and password are not passed as a pair together.
Deprecation summary¶
- Model.get_leaderboard_ui_permalink has been removed. Use Model.get_uri instead.
- Model.open_model_browser has been removed. Use Model.open_in_browser instead.
- Project.get_leaderboard_ui_permalink has been removed. Use Project.get_uri instead.
- Project.open_leaderboard_browser has been removed. Use Project.open_in_browser instead.
- Enum VARIABLE_TYPE_TRANSFORM.CATEGORICAL has been removed.
- Instantiation of Blueprint using a dict has been removed. Use Blueprint.from_data instead.
- Specifying an environment to use for testing with CustomModelTest has been removed.
- CustomModelVersion’s required_metadata parameter has been removed. Use required_metadata_values instead.
- CustomTaskVersion’s required_metadata parameter has been removed. Use required_metadata_values instead.
- Instantiation of Feature using a dict has been removed. Use Feature.from_data instead.
- Instantiation of Featurelist using a dict has been removed. Use Featurelist.from_data instead.
- Instantiation of Model using a dict, tuple, or the data parameter has been removed. Use Model.from_data instead.
- Instantiation of Project using a dict has been removed. Use Project.from_data instead.
- Project’s quickrun parameter has been removed. Pass AUTOPILOT_MODE.QUICK as the mode instead.
- Project’s scaleout_max_train_pct and scaleout_max_train_rows parameters have been removed.
- ComplianceDocumentation has been removed. Use AutomatedDocument instead.
- The Deployment method create_from_custom_model_image was removed. Use Deployment.create_from_custom_model_version instead.
- PredictJob.create has been removed. Use Model.request_predictions instead.
- Model.fetch_resource_data has been removed. Use Model.get instead.
- The class CustomInferenceImage was removed. Use CustomModelVersion with base_environment_id instead.
- Project.set_target has been deprecated. Use Project.analyze_and_model instead.
Configuration changes¶
- Added a context manager, client_configuration, that can be used to change the connection configuration temporarily, for use in asynchronous or multithreaded code.
- Upgraded the Pillow library to version 9.2.0. Users installing DataRobot with the “images” extra (pip install datarobot[images]) should note that this is a required library.
Experimental changes¶
- Added experimental support for retrieving document thumbnails: DocumentThumbnail, DocumentPageFile.
- Added experimental support to retrieve document text extraction samples using: DocumentTextExtractionSample, DocumentTextExtractionSamplePage, DocumentTextExtractionSampleDocument.
- Added experimental deployment improvements:
  - RetrainingPolicy can be used to manage retraining policies associated with a deployment.
  - RetrainingPolicyRun can be used to manage retraining policy runs for a retraining policy associated with a deployment.
  - Added new methods to RetrainingPolicy: use RetrainingPolicy.get to get a retraining policy associated with a deployment, and RetrainingPolicy.delete to delete one.
2.29.0b0¶
New features¶
- Added support to pass the max_ngram_explanations parameter in batch predictions, which triggers the computation of text prediction explanations: BatchPredictionJob.score.
- Added support to pass a calculation mode to prediction explanations (the mode parameter in PredictionExplanations.create) as well as batch scoring (explanations_mode in BatchPredictionJob.score) for multiclass models. Supported modes: TopPredictionsMode, ClassListMode.
- Added method datarobot.CalendarFile.create_calendar_from_dataset() to the calendar file that allows you to create a calendar from a dataset.
- Added experimental support for the n_clusters parameter in Model.train_datetime and DatetimeModel.retrain, which allows you to specify the number of clusters when creating models in a Time Series Clustering project.
- Added new parameter clone to datarobot.CombinedModel.set_segment_champion() that allows you to set a new champion model in a cloned model instead of the original one, leaving the latter unmodified.
- Added new property is_active_combined_model to datarobot.CombinedModel that indicates whether the selected combined model is currently the active one in the segmented project.
- Added new datarobot.models.Project.get_active_combined_model() that allows users to get the currently active combined model in the segmented project.
- Added new parameter read_timeout to the method ShapMatrix.get_as_dataframe. Values larger than the default can be specified to avoid timeouts when requesting large files.
- Added support for bias mitigation with the following methods: Project.get_bias_mitigated_models, Project.apply_bias_mitigation, Project.request_bias_mitigation_feature_info, and Project.get_bias_mitigation_feature_info; by adding the new bias mitigation parameters bias_mitigation_feature_name, bias_mitigation_technique, and include_bias_mitigation_feature_as_predictor_variable to the existing method Project.start; and by adding the enum datarobot.enums.BiasMitigationTechnique to supply parameters to some of the above functionality.
- Added new property status to datarobot.models.Deployment that represents model deployment status.
- Added new Deployment.activate and Deployment.deactivate that allow deployment activation and deactivation.
- Added new Deployment.delete_monitoring_data to delete deployment monitoring data.
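The two multiclass explanation modes listed above differ in how the classes to explain are selected for each row. A plain-Python sketch of that selection logic, written as an assumption about what the modes mean rather than the SDK's implementation ("top" is akin to TopPredictionsMode, "list" to ClassListMode):

```python
def classes_to_explain(probabilities, mode, k=None, class_names=None):
    """Select which classes get prediction explanations for one multiclass row.

    probabilities: mapping of class name -> predicted probability.
    mode: "top" explains the k highest-probability classes;
          "list" explains an explicitly supplied set of classes.
    """
    if mode == "top":
        ranked = sorted(probabilities, key=probabilities.get, reverse=True)
        return ranked[:k]
    if mode == "list":
        # Keep the caller's ordering, dropping names absent from the prediction.
        return [name for name in class_names if name in probabilities]
    raise ValueError("unknown mode: %r" % mode)
```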
Enhancements¶
- Added support for specifying custom endpoint URLs for S3 access in batch predictions: BatchPredictionJob.score. See the endpoint_url parameter.
- Added a guide on :ref:`working with binary data <binary_data>`.
- Added multithreading support to binary data helper functions.
- Aligned binary data helper image defaults with the application’s image preprocessing.
- Added the following accuracy metrics that can be retrieved for a deployment: TPR, PPV, F1, and MCC. See :ref:`Deployment monitoring <deployment_monitoring>`.
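For reference, the four deployment metrics named above (TPR, PPV, F1, MCC) are standard binary confusion-matrix statistics. A self-contained computation of all four (illustrative only; the SDK retrieves them from the deployment, it does not expose this helper):

```python
import math


def binary_accuracy_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute TPR, PPV, F1, and MCC from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # true positive rate (recall / sensitivity)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    f1 = 2 * ppv * tpr / (ppv + tpr)  # harmonic mean of precision and recall
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )  # Matthews correlation coefficient, in [-1, 1]
    return {"TPR": tpr, "PPV": ppv, "F1": f1, "MCC": mcc}
```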
Bugfixes¶
- Don’t include holdout start date, end date, or duration in datetime partitioning payload when holdout is disabled.
- Removed ICE Plot capabilities from Feature Fit.
- Handle undefined calendar_name in CalendarFile.create_calendar_from_dataset
- Raise ValueError for submitted calendar names that are not strings
API changes¶
- The version field is removed from the ImportedModel object.
Deprecation summary¶
- Reason Codes objects deprecated in 2.13 version were removed. Please use Prediction Explanations instead.
Configuration changes¶
- The upper version constraint on pandas has been removed.
Documentation changes¶
- Fixed a minor typo in the example for Dataset.create_from_data_source.
- Update the documentation to suggest that feature_derivation_window_end of the datarobot.DatetimePartitioningSpecification class should be negative or zero.
2.28.0¶
New features¶
- Added new parameter upload_read_timeout to BatchPredictionJob.score and BatchPredictionJob.score_to_file to indicate how many seconds to wait until the intake dataset uploads to the server. The default value is 600 seconds.
- Added the ability to turn off supervised feature reduction for Time Series projects. The option use_supervised_feature_reduction can be set in AdvancedOptions.
- Allow maximum_memory to be input for custom task versions. This sets the limit to which a custom task prediction container’s memory can grow.
- Added method datarobot.models.Project.get_multiseries_names() to the project service, which returns all the distinct entries in the multiseries column.
- Added new segmentation_task_id attribute to datarobot.models.Project.set_target() that allows you to start a project as a Segmented Modeling project.
- Added new property is_segmented to datarobot.models.Project that indicates whether a project is a regular project or a Segmented Modeling project.
- Added method datarobot.models.Project.restart_segment() to the project service that allows you to restart a single segment that hasn’t reached the modeling phase.
- Added the ability to interact with Combined Models in Segmented Modeling projects, available with the new class datarobot.CombinedModel. Functionality: datarobot.CombinedModel.get(), datarobot.CombinedModel.get_segments_info(), datarobot.CombinedModel.get_segments_as_dataframe(), datarobot.CombinedModel.get_segments_as_csv(), datarobot.CombinedModel.set_segment_champion().
- Added the ability to create and retrieve segmentation tasks used in Segmented Modeling projects, available with the new class datarobot.SegmentationTask. Functionality: datarobot.SegmentationTask.create(), datarobot.SegmentationTask.list(), datarobot.SegmentationTask.get().
- Added new class datarobot.SegmentInfo that allows you to get information on all segments of Segmented Modeling projects, i.e. segment project ID, model counts, and Autopilot status. Functionality: datarobot.SegmentInfo.list().
- Added new methods to the base APIObject to assist with dictionary and JSON serialization of child objects. Functionality: APIObject.to_dict, APIObject.to_json.
- Added new methods to ImageAugmentationList for interacting with image augmentation samples. Functionality: ImageAugmentationList.compute_samples, ImageAugmentationList.retrieve_samples.
- Added the ability to set a prediction threshold when creating a deployment from a learning model.
- Added support for the governance, owners, predictionEnvironment, and fairnessHealth fields when querying for a Deployment object.
- Added helper methods for working with files, images, and documents. The methods support conversion of file contents into base64 string representations; the image methods also provide resize and transformation support. Functionality: get_encoded_file_contents_from_urls, get_encoded_file_contents_from_paths, get_encoded_image_contents_from_paths, get_encoded_image_contents_from_urls.
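The core of the base64 helpers listed above is reading bytes and encoding them to a base64 string. A simplified single-path sketch of that step (the actual SDK functions accept lists of paths or URLs and add image resizing; the function below is a hypothetical single-item variant):

```python
import base64
from pathlib import Path


def get_encoded_file_contents_from_path(path: str) -> str:
    """Return a file's bytes as a base64 string suitable for embedding in a request."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")
```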
Enhancements¶
- Request metadata instead of the actual data of datarobot.PredictionExplanations to reduce the amount of data transferred.
Bugfixes¶
- Fixed a bug in Job.get_result_when_complete for the Prediction Explanations job type to populate all attributes of datarobot.PredictionExplanations instead of just one.
- Fixed a bug in datarobot.models.ShapImpact where row_count was not optional.
- Allow blank values for schema and catalog in RelationshipsConfiguration response data.
- Fixed a bug where credentials were incorrectly formatted in Project.upload_dataset_from_catalog and Project.upload_dataset_from_data_source.
- Reject downloads of Batch Prediction data that was not written to the localfile output adapter.
- Fixed a bug in datarobot.models.BatchPredictionJobDefinition.create() where schedule was not optional for all cases.
API changes¶
- Users can include ICE plot data in the response when requesting Feature Effects/Feature Fit. Extended methods: Model.get_feature_effect, Model.get_feature_fit, DatetimeModel.get_feature_effect, and DatetimeModel.get_feature_fit.
Deprecation summary¶
- The attrs library is removed from the library dependencies.
- ImageAugmentationSample.compute was marked as deprecated and will be removed in v2.30. You can get the same information with the newly introduced method ImageAugmentationList.compute_samples.
- Deprecated ImageAugmentationSample.list using sample_id.
- Deprecating scaleout parameters for projects/models. Includes scaleout_modeling_mode, scaleout_max_train_pct, and scaleout_max_train_rows.
Configuration changes¶
- The pandas upper version constraint is updated to include version 1.3.5.
Documentation changes¶
- Fixed “from datarobot.enums” import in Unsupervised Clustering example provided in docs.
2.27.0¶
New features¶
- datarobot.UserBlueprint is now mature with full support of functionality. Users are encouraged to use the Blueprint Workshop instead of this class directly.
- Added the arguments attribute in datarobot.CustomTaskVersion.
- Added the ability to retrieve detected errors in potentially multicategorical feature types that prevented a feature from being identified as multicategorical: Project.download_multicategorical_data_format_errors.
- Added support for listing/updating user roles on a custom task: datarobot.CustomTask.get_access_list(), datarobot.CustomTask.share().
- Added a method datarobot.models.Dataset.create_from_query_generator(). This creates a dataset in the AI Catalog from a datarobot.DataEngineQueryGenerator.
- Added the new functionality of creating a user blueprint with a custom task version ID: datarobot.UserBlueprint.create_from_custom_task_version_id().
- The DataRobot Python Client is no longer published under the Apache-2.0 software license, but rather under the terms of the DataRobot Tool and Utility Agreement.
- Added a new class, datarobot.DataEngineQueryGenerator. This class generates a Spark SQL query to apply time series data prep to a dataset in the AI Catalog. Functionality: datarobot.DataEngineQueryGenerator.create(), datarobot.DataEngineQueryGenerator.get(), datarobot.DataEngineQueryGenerator.create_dataset(). See the :ref:`time series data prep documentation <time_series_data_prep>` for more information.
- Added the ability to upload a prediction dataset into a project from the AI Catalog: Project.upload_dataset_from_catalog.
- Added the ability to specify the number of training rows to use in SHAP-based Feature Impact computation. Extended method: ShapImpact.create.
- Added the ability to retrieve and restore features that have been reduced using the time series feature generation and reduction functionality. The functionality comes with a new class, datarobot.models.restore_discarded_features.DiscardedFeaturesInfo. Functionality: DiscardedFeaturesInfo.retrieve(), DiscardedFeaturesInfo.restore().
- Added the ability to control class mapping aggregation in multiclass projects via ClassMappingAggregationSettings passed as a parameter to Project.set_target.
- Added support for :ref:`unsupervised clustering projects <unsupervised_clustering>`.
- Added the ability to compute and retrieve Feature Effects for a multiclass model using the datarobot.models.Model.request_feature_effects_multiclass(), datarobot.models.Model.get_feature_effects_multiclass(), or datarobot.models.Model.get_or_request_feature_effects_multiclass() methods. For datetime models, use datarobot.models.DatetimeModel.request_feature_effects_multiclass(), datarobot.models.DatetimeModel.get_feature_effects_multiclass(), or datarobot.models.DatetimeModel.get_or_request_feature_effects_multiclass() with backtest_index specified.
- Added the ability to get and update challenger model settings for a deployment (datarobot.models.Deployment). Functionality: datarobot.models.Deployment.get_challenger_models_settings(), datarobot.models.Deployment.update_challenger_models_settings().
- Added the ability to get and update segment analysis settings for a deployment (datarobot.models.Deployment). Functionality: datarobot.models.Deployment.get_segment_analysis_settings(), datarobot.models.Deployment.update_segment_analysis_settings().
- Added the ability to get and update predictions by forecast date settings for a deployment (datarobot.models.Deployment). Functionality: datarobot.models.Deployment.get_predictions_by_forecast_date_settings(), datarobot.models.Deployment.update_predictions_by_forecast_date_settings().
- Added the ability to specify multiple feature derivation windows when creating a Relationships Configuration using RelationshipsConfiguration.create.
- Added the ability to manipulate a legacy conversion for a custom inference model, using the class CustomModelVersionConversion. Functionality: CustomModelVersionConversion.run_conversion, CustomModelVersionConversion.stop_conversion, CustomModelVersionConversion.get, CustomModelVersionConversion.get_latest, CustomModelVersionConversion.list.
Enhancements¶
- Project.get returns the query_generator_id used for time series data prep when applicable.
- Feature Fit & Feature Effects can return datetime instead of numeric for the feature_type field for numeric features that are derived from dates.
- SHAP-based Feature Impact results now provide an additional field, rowCount.
- Improved performance when downloading prediction dataframes for multilabel projects using: Predictions.get_all_as_dataframe, PredictJob.get_predictions, Job.get_result.
Bugfixes¶
- Fixed datarobot.CustomTaskVersion and datarobot.CustomModelVersion to correctly format required_metadata_values before sending them via the API.
- Fixed response validation that could cause DataError when using datarobot.models.Dataset for a dataset with a description that is an empty string.
API changes¶
- RelationshipsConfiguration.create will include a new key, data_source_id, in the data_source field when applicable.
Deprecation summary¶
- Model.get_all_labelwise_roc_curves has been removed. You can get the same information with multiple calls of Model.get_labelwise_roc_curves, one per data source.
- Model.get_all_multilabel_lift_charts has been removed. You can get the same information with multiple calls of Model.get_multilabel_lift_charts, one per data source.
Documentation changes¶
- This release introduces a new documentation organization. The organization has been modified to better reflect the end-to-end modeling workflow. The new “Tutorials” section has 5 major topics that outline the major components of modeling: Data, Modeling, Predictions, MLOps, and Administration.
- The Getting Started workflow is now hosted at DataRobot’s API Documentation Home.
- Added an example of how to set up optimized datetime partitioning for time series projects.
2.26.0¶
New features¶
- Added the ability to use external baseline predictions for time series projects. The external dataset can be validated using datarobot.models.Project.validate_external_time_series_baseline(). An option can be set in AdvancedOptions to scale DataRobot models’ accuracy performance using the external dataset’s accuracy performance. See the :ref:`external baseline predictions documentation <external_baseline_predictions>` for more information.
- Added the ability to generate exponentially weighted moving average features for time series projects. The option can be set in AdvancedOptions and controls the alpha parameter used in the exponentially weighted moving average operation.
- Added the ability to request that a specific model be prepared for deployment using Project.start_prepare_model_for_deployment.
- Added a new class, datarobot.CustomTask. This class is a custom task that you can use as part (or all) of your blueprint for training models. It needs a datarobot.CustomTaskVersion before it can properly be used. Functionality: create, copy, update, or delete with datarobot.CustomTask.create(), datarobot.CustomTask.copy(), datarobot.CustomTask.update(), and datarobot.CustomTask.delete(); list, get, and refresh current tasks with datarobot.CustomTask.get(), datarobot.CustomTask.list(), and datarobot.CustomTask.refresh(); download the latest datarobot.CustomTaskVersion of the datarobot.CustomTask with datarobot.CustomTask.download_latest_version().
- Added a new class, datarobot.CustomTaskVersion, for management of specific versions of a custom task. Functionality: create new custom task versions with datarobot.CustomTaskVersion.create_clean() and datarobot.CustomTaskVersion.create_from_previous(); list, get, and refresh current available versions with datarobot.CustomTaskVersion.list(), datarobot.CustomTaskVersion.get(), and datarobot.CustomTaskVersion.refresh(); datarobot.CustomTaskVersion.download() downloads a tarball of the files used to create the custom task; datarobot.CustomTaskVersion.update() updates the metadata for a custom task.
- Added the ability to compute batch predictions for an in-memory DataFrame using BatchPredictionJob.score.
- Added the ability to specify feature discovery settings when creating a Relationships Configuration using RelationshipsConfiguration.create.
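The alpha parameter mentioned for the new exponentially weighted moving average features controls the standard recursive smoothing: each new point pulls the running average toward it with weight alpha. A minimal illustration of the operation itself (not of how DataRobot derives the features):

```python
def ewma(values, alpha):
    """Exponentially weighted moving average with smoothing factor 0 < alpha <= 1.

    Higher alpha weights recent observations more heavily; lower alpha
    smooths more aggressively.
    """
    out = []
    avg = None
    for v in values:
        avg = v if avg is None else alpha * v + (1 - alpha) * avg
        out.append(avg)
    return out
```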
Enhancements¶
- Improved performance when downloading prediction dataframes using: Predictions.get_all_as_dataframe, PredictJob.get_predictions, Job.get_result.
- Added new max_wait parameter to methods: Dataset.create_from_url, Dataset.create_from_in_memory_data, Dataset.create_from_data_source, Dataset.create_version_from_in_memory_data, Dataset.create_version_from_url, Dataset.create_version_from_data_source.
Bugfixes¶
- Model.get will return a DatetimeModel instead of a Model whenever the project is datetime partitioned. This enables ModelRecommendation.get_model to return a DatetimeModel instead of a Model whenever the project is datetime partitioned.
- Try to read the Feature Impact result if the existing jobId is None in Model.get_or_request_feature_impact.
- Set upper version constraints for pandas.
- RelationshipsConfiguration.create will return a catalog in the data_source field.
- The argument required_metadata_keys was not properly being sent in update and create requests for datarobot.ExecutionEnvironment.
- Fixed an issue with the datarobot.ExecutionEnvironment create method failing when used against older versions of the application.
- datarobot.CustomTaskVersion was not properly handling required_metadata_values from the API response.
API changes¶
- Updated Project.start to use AUTOPILOT_MODE.QUICK when the autopilot_on param is set to True. This brings it in line with Project.set_target.
- Updated project.start_autopilot to accept the following new GA parameters that are already in the public API: consider_blenders_in_recommendation, run_leakage_removed_feature_list.
Deprecation summary¶
- The required_metadata property of datarobot.CustomModelVersion has been deprecated. required_metadata_values should be used instead.
- The required_metadata property of datarobot.CustomTaskVersion has been deprecated. required_metadata_values should be used instead.
Configuration changes¶
- Now requires dependency on package scikit-learn rather than sklearn. Note: This dependency is only used in example code. See this scikit-learn issue for more information.
- Now permits dependency on package attrs to be less than version 21. This fixes compatibility with apache-airflow.
- Allow setting up an Authorization: <type> <token> header for OAuth2 Bearer tokens.
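The header shape referenced above is the standard HTTP Authorization scheme. A one-line sketch of constructing it (illustrative helper, not SDK code):

```python
def build_auth_header(token: str, token_type: str = "Bearer") -> dict:
    """Build an 'Authorization: <type> <token>' header, e.g. for OAuth2 Bearer tokens."""
    return {"Authorization": f"{token_type} {token}"}
```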
Documentation changes¶
- Update the documentation with respect to the permission that controls AI Catalog dataset snapshot behavior.
2.25.0¶
New features¶
- There is a new AnomalyAssessmentRecord object that implements public API routes to work with the anomaly assessment insight. This also adds explanations and predictions preview classes. The insight is available for anomaly detection models in time series unsupervised projects that also support calculation of Shapley values. Functionality: initialize an anomaly assessment insight for the specified subset; get anomaly assessment records, SHAP explanations, and predictions previews (DatetimeModel.get_anomaly_assessment_records lists available records, AnomalyAssessmentRecord.get_predictions_preview gets the predictions preview for the record, AnomalyAssessmentRecord.get_latest_explanations gets the latest predictions along with SHAP explanations for the most anomalous records, AnomalyAssessmentRecord.get_explanations gets predictions along with SHAP explanations for the most anomalous records for the specified range); delete an anomaly assessment record with AnomalyAssessmentRecord.delete.
- Added the ability to calculate and retrieve datetime trend plots for DatetimeModel. This includes Accuracy over Time, Forecast vs Actual, and Anomaly over Time. Plots can be calculated using a common method, and metadata, plots, and preview plots can be retrieved. Preview plots can be retrieved using the following methods: DatetimeModel.get_accuracy_over_time_plot_preview, DatetimeModel.get_forecast_vs_actual_plot_preview, DatetimeModel.get_anomaly_over_time_plot_preview.
- Support for Batch Prediction Job Definitions has now been added through the class BatchPredictionJobDefinition. You can create, update, list, and delete definitions using the following methods: BatchPredictionJobDefinition.list, BatchPredictionJobDefinition.create, BatchPredictionJobDefinition.update, BatchPredictionJobDefinition.delete.
Enhancements¶
- Added new helper functions to create the Dataset Definition, Relationship, and Secondary Dataset used by a Feature Discovery project. They are accessible via DatasetDefinition, Relationship, and SecondaryDataset.
- Added a new helper function to projects to retrieve the recommended model: Project.recommended_model.
- Added a method to download feature discovery recipe SQLs (limited beta feature): Project.download_feature_discovery_recipe_sqls.
- Added docker_context_size and docker_image_size to datarobot.ExecutionEnvironmentVersion.
Bugfixes¶
- Removed the deprecation warnings when using the latest versions of urllib3.
- FeatureAssociationMatrix.get now uses the correct query param name when featurelist_id is specified.
- Handle scalar values in shapBaseValue while converting a predictions response to a data frame.
- Ensure that if a configured endpoint ends in a trailing slash, the resulting full URL does not end up with double slashes in the path.
- Model.request_frozen_datetime_model now implements correct validation of the input parameter training_start_date.
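The trailing-slash fix above amounts to normalizing the join between the configured endpoint and each route. A minimal sketch of the idea (not the client's actual code):

```python
def join_url(endpoint: str, path: str) -> str:
    """Join an endpoint and a route without producing double slashes,
    regardless of whether the configured endpoint ends in '/'."""
    return endpoint.rstrip("/") + "/" + path.lstrip("/")
```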
API changes¶
- The secondary_datasets argument now accepts SecondaryDataset to create secondary dataset configurations: SecondaryDatasetConfigurations.create.
- The dataset_definitions and relationships arguments now accept DatasetDefinition and Relationship to create and replace relationships configurations: RelationshipsConfiguration.create creates a new relationships configuration between datasets; RelationshipsConfiguration.retrieve retrieves the requested relationships configuration.
- The argument required_metadata_keys has been added to datarobot.ExecutionEnvironment. This should be used to define a list of RequiredMetadataKey. datarobot.CustomModelVersion objects that use a base environment with required_metadata_keys must define values for these fields in their respective required_metadata.
- The argument required_metadata has been added to datarobot.CustomModelVersion. This should be set with relevant values defined by the base environment’s required_metadata_keys.
2.24.0¶
New features¶
- Partial history predictions can be made with time series multiseries models using the allow_partial_history_time_series_predictions attribute of datarobot.DatetimePartitioningSpecification. See the :ref:`Time Series <time_series>` documentation for more info.
- Multicategorical histograms are now retrievable. They are accessible via MulticategoricalHistogram or Feature.get_multicategorical_histogram.
- Added methods to retrieve per-class lift chart data for multilabel models: Model.get_multilabel_lift_charts and Model.get_all_multilabel_lift_charts.
- Added methods to retrieve labelwise ROC curves for multilabel models: Model.get_labelwise_roc_curves and Model.get_all_labelwise_roc_curves.
- Multicategorical pairwise statistics are now retrievable. They are accessible via PairwiseCorrelations, PairwiseJointProbabilities, and PairwiseConditionalProbabilities, or via Feature.get_pairwise_correlations, Feature.get_pairwise_joint_probabilities, and Feature.get_pairwise_conditional_probabilities.
- Added methods to retrieve the prediction results of a deployment: Deployment.get_prediction_results, Deployment.download_prediction_results.
- Added a method to download the scoring code of a deployment using Deployment.download_scoring_code.
- Added Automated Documentation: now you can automatically generate documentation about various entities within the platform, such as specific models or projects. Check out the :ref:`Automated Documentation overview <automated_documentation_overview>` and also refer to the :ref:`API Reference <automated_documentation_api>` for more details.
- Create a new dataset version for a given dataset by uploading from a file, URL, or in-memory data source: Dataset.create_version_from_file, Dataset.create_version_from_in_memory_data, Dataset.create_version_from_url, Dataset.create_version_from_data_source.
Enhancements¶
- Added a new status called FAILED to BatchPredictionJob, as this is a new status coming to Batch Predictions in an upcoming version of DataRobot.
- Added base_environment_version_id to datarobot.CustomModelVersion.
- Support for downloading feature discovery training or prediction datasets using Project.download_feature_discovery_dataset.
- Added datarobot.models.FeatureAssociationMatrix, datarobot.models.FeatureAssociationMatrixDetails and datarobot.models.FeatureAssociationFeaturelists, which can be used to retrieve feature associations data as an alternative to the Project.get_associations, Project.get_association_matrix_details and Project.get_association_featurelists methods.
Bugfixes¶
- Fixed response validation that could cause DataError when using the TrainingPredictions.list and TrainingPredictions.get_all_as_dataframe methods if there are training predictions computed with explanation_algorithm.
API changes¶
- Remove desired_memory param from the following classes: datarobot.CustomInferenceModel, datarobot.CustomModelVersion and datarobot.CustomModelTest.
- Remove desired_memory param from the following methods: CustomInferenceModel.create, CustomModelVersion.create_clean, CustomModelVersion.create_from_previous and CustomModelTest.create.
Deprecation summary¶
- Class ComplianceDocumentation will be deprecated in v2.24 and will be removed entirely in v2.27. Use AutomatedDocument instead. To start off, see the :ref:`Automated Documentation overview <automated_documentation_overview>` for details.
Documentation changes¶
- Remove reference to S3 for Project.upload_dataset since it is not supported by the server.
2.23.0¶
New features¶
- Calendars for time series projects can now be automatically generated by providing a country code to the method CalendarFile.create_calendar_from_country_code. A list of allowed country codes can be retrieved using CalendarFile.get_allowed_country_codes. For more information, see the :ref:`calendar documentation <preloaded_calendar_files>`.
- Added calculate_all_series param to DatetimeModel.compute_series_accuracy. This option allows users to compute series accuracy for all available series at once, while by default it is computed for the first 1000 series only.
- Added the ability to specify the sampling method when setting the target of an OTV project. The option can be set in AdvancedOptions and changes the way training data is defined in autopilot steps.
- Add support for custom inference model k8s resources management. This new feature enables users to control k8s resources allocation for their executed model in the k8s cluster. It adds the following new parameters: network_egress_policy, desired_memory, maximum_memory and replicas to the following classes: datarobot.CustomInferenceModel, datarobot.CustomModelVersion and datarobot.CustomModelTest.
- Add support for multiclass custom inference and training models. This enables users to create classification custom models with more than two class labels. The datarobot.CustomInferenceModel class can now use datarobot.TARGET_TYPE.MULTICLASS for its target_type parameter. Class labels for inference models can be set/updated using either a file or a list of labels.
- Support for listing all the secondary dataset configurations for a given project: SecondaryDatasetConfigurations.list.
- Add support for unstructured custom inference models. The datarobot.CustomInferenceModel class can now use datarobot.TARGET_TYPE.UNSTRUCTURED for its target_type parameter. The target_name parameter is optional for the UNSTRUCTURED target type.
- All per-class lift chart data is now available for multiclass models using Model.get_multiclass_lift_chart.
- AUTOPILOT_MODE.COMPREHENSIVE, a new mode, has been added to Project.set_target.
- Add support for anomaly detection custom inference models. The datarobot.CustomInferenceModel class can now use datarobot.TARGET_TYPE.ANOMALY for its target_type parameter. The target_name parameter is optional for the ANOMALY target type.
- Support for updating and retrieving the secondary dataset configuration for a Feature Discovery deployment: Deployment.update_secondary_dataset_config and Deployment.get_secondary_dataset_config.
- Add support for starting and retrieving Feature Impact information for datarobot.CustomModelVersion.
- Search for interaction features and supervised feature reduction for Feature Discovery projects can now be specified in AdvancedOptions.
- Feature Discovery projects can now be created using the Project.start method by providing relationships_configuration_id.
- Actions applied to input data during automated feature discovery can now be retrieved using FeatureLineage.get. The corresponding feature lineage id is available as a new datarobot.models.Feature field, feature_lineage_id.
- Lift charts and ROC curves are now calculated for backtests 2+ in time series and OTV models. The data can be retrieved for individual backtests using Model.get_lift_chart and Model.get_roc_curve.
- The following methods now accept a new argument called credential_data, the credentials to authenticate with the database, to use instead of user/password or credential ID: Dataset.create_from_data_source, Dataset.create_project and Project.create_from_dataset.
- Add support for DataRobot Connectors: datarobot.Connector provides a simple implementation to interface with connectors.
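For the credential_data argument above, here is a minimal sketch of an inline basic-auth payload. The exact field names are an assumption modeled on the public API's credential types; consult the client documentation for the authoritative schema.

```python
# Illustrative sketch of an inline credential_data payload for basic
# authentication, passed instead of a stored credential ID.
credential_data = {
    "credentialType": "basic",  # the kind of credential supplied inline (assumed name)
    "user": "alice",
    "password": "s3cret",
}

# The dictionary would then be passed straight through, e.g.:
# Dataset.create_from_data_source(data_source_id, credential_data=credential_data)
print(sorted(credential_data))  # ['credentialType', 'password', 'user']
```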
Enhancements¶
- Running Autopilot on the Leakage Removed feature list can now be specified in AdvancedOptions. By default, Autopilot will always run on the Informative Features - Leakage Removed feature list if it exists. If the parameter run_leakage_removed_feature_list is set to False, then Autopilot will run on Informative Features or an available custom feature list.
- The methods Project.upload_dataset and Project.upload_dataset_from_data_source support a new optional parameter secondary_datasets_config_id for Feature Discovery projects.
Bugfixes¶
- Added disable_holdout param in datarobot.DatetimePartitioning.
- Using Credential.create_gcp produced an incompatible credential.
- SampleImage.list now supports Regression & Multilabel projects.
- Using BatchPredictionJob.score could in some circumstances result in a crash from trying to abort the job if it failed to start.
- Using BatchPredictionJob.score would produce incomplete results in case a job was aborted while downloading. This will now raise an exception.
API changes¶
- New sampling_method param in Model.train_datetime and Project.train_datetime.
- New target_type param in datarobot.CustomInferenceModel.
- New arguments secondary_datasets, name, creator_full_name, creator_user_id, created, featurelist_id, credentials_ids, project_version and is_default in datarobot.models.SecondaryDatasetConfigurations.
- New arguments secondary_datasets, name and featurelist_id in SecondaryDatasetConfigurations.create.
- Class FeatureEngineeringGraph has been removed. Use datarobot.models.RelationshipsConfiguration instead.
- Param feature_engineering_graphs removed from Project.set_target.
- Param config removed from SecondaryDatasetConfigurations.create.
Deprecation summary¶
- supports_binary_classification and supports_regression are deprecated for datarobot.CustomInferenceModel and will be removed in v2.24.
- Argument config is deprecated for datarobot.models.SecondaryDatasetConfigurations and will be removed in v2.24.
- CustomInferenceImage has been deprecated and will be removed in v2.24. datarobot.CustomModelVersion with base_environment_id should be used in its place.
- environment_id and environment_version_id are deprecated for CustomModelTest.create.
Documentation changes¶
- feature_lineage_id is added as a new parameter in the response for retrieval of a datarobot.models.Feature created by automated feature discovery or time series feature derivation. This id is required to retrieve a datarobot.models.FeatureLineage instance.
2.22.1¶
New features¶
- Batch Prediction jobs now support :ref:`dataset <batch_predictions-intake-types-dataset>` as intake settings for BatchPredictionJob.score.
- Create a Dataset from a DataSource:
- Added support for Custom Model Dependency Management. Please see the :ref:`custom model documentation <custom_models>`. New features added:
  - Added new argument base_environment_id to methods CustomModelVersion.create_clean and CustomModelVersion.create_from_previous.
  - New fields base_environment_id and dependencies added to class datarobot.CustomModelVersion.
  - New class datarobot.CustomModelVersionDependencyBuild to prepare custom model versions with dependencies.
  - Made argument environment_id of CustomModelTest.create optional to enable using custom model versions with dependencies.
  - New field image_type added to class datarobot.CustomModelTest.
  - Deployment.create_from_custom_model_version can be used to create a deployment from a custom model version.
- Added new parameters for starting and re-running Autopilot with customizable settings within Project.start_autopilot.
- Added a new method to trigger Feature Impact calculation for a Custom Inference Image: CustomInferenceImage.calculate_feature_impact.
- Added a new method to retrieve the number of iterations trained for early stopping models: Model.get_num_iterations_trained. Currently supports only tree-based models.
Enhancements¶
- A description can now be added or updated for a project using Project.set_project_description.
- Added new parameters read_timeout and max_wait to method Dataset.create_from_file. Values larger than the default can be specified for both to avoid timeouts when uploading large files.
- Added new parameter metric to datarobot.models.deployment.TargetDrift, datarobot.models.deployment.FeatureDrift, Deployment.get_target_drift and Deployment.get_feature_drift.
- Added new parameter timeout to BatchPredictionJob.download to indicate how many seconds to wait for the download to start (in case the job doesn't start processing immediately). Set to -1 to disable. This parameter can also be sent as download_timeout to BatchPredictionJob.score. If the timeout occurs, the pending job will be aborted.
- Added new parameter read_timeout to BatchPredictionJob.download to indicate how many seconds to wait between each downloaded chunk. This parameter can also be sent as download_read_timeout to BatchPredictionJob.score.
- Added parameter catalog to BatchPredictionJob for both intake and output adapters of type jdbc.
- Considering blenders in the recommendation can now be specified in AdvancedOptions. Blenders will be included when Autopilot chooses a model to prepare and recommend for deployment.
- Added optional parameter max_wait to Deployment.replace_model to indicate the maximum time to wait for the model replacement job to complete before erroring.
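The two download timeouts above guard different phases: timeout bounds the wait for the job to start producing output, while read_timeout bounds the gap between consecutive chunks. A rough pure-Python sketch of that logic (not the client's implementation; chunks are simulated here as (delay, data) pairs):

```python
def download_with_timeouts(chunks, timeout=600, read_timeout=660):
    """Illustrative sketch of the two-phase timeout behavior.

    `timeout` bounds the wait for the first chunk (job start-up) and -1
    disables that check; `read_timeout` bounds the wait between chunks.
    """
    received = []
    limit = timeout  # first wait: time allowed for the job to start
    for delay, data in chunks:
        if limit != -1 and delay > limit:
            raise TimeoutError("download stalled; pending job would be aborted")
        received.append(data)
        limit = read_timeout  # later waits: time allowed between chunks
    return b"".join(received)

print(download_with_timeouts([(1, b"a"), (5, b"b")], timeout=10, read_timeout=10))  # b'ab'
```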
Bugfixes¶
- Handle null values in predictionExplanationMetadata["shapRemainingTotal"] while converting a predictions response to a data frame.
- Handle null values in customModel["latestVersion"].
- Removed an extra column status from BatchPredictionJob, as it caused issues with newer versions of Trafaret validation.
- Make predicted_vs_actual optional in Feature Effects data because a feature may have insufficient qualified samples.
- Make jdbc_url optional in Data Store data because some data stores will not have it.
- The method Project.get_datetime_models now correctly returns all DatetimeModel objects for the project, instead of just the first 100.
- Fixed a documentation error related to snake_case vs camelCase in the JDBC settings payload.
- Make the trafaret validator for datasets use a syntax that works properly with a wider range of trafaret versions.
- Handle extra keys in CustomModelTests and CustomModelVersions.
- ImageEmbedding and ImageActivationMap now support regression projects.
API changes¶
- The default value for the mode param in Project.set_target has been changed from AUTOPILOT_MODE.FULL_AUTO to AUTOPILOT_MODE.QUICK.
Documentation changes¶
- Added links to classes with duration parameters, such as validation_duration and holdout_duration, to provide duration string examples to users.
- The :ref:`models documentation <models>` has been revised to include a section on how to train a new model and how to run cross-validation or backtesting for a model.
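The duration parameters linked above take ISO-8601-style duration strings such as "P120D". A toy validator sketches that shape; the regex is illustrative only, not the client's own parsing:

```python
import re

# Durations look like P<n>Y<n>M<n>DT<n>H<n>M<n>S with unused parts omitted,
# e.g. "P120D" (120 days) or "P0Y1M0D" (one month).
DURATION_RE = re.compile(
    r"^P(?:\d+Y)?(?:\d+M)?(?:\d+D)?(?:T(?:\d+H)?(?:\d+M)?(?:\d+S)?)?$"
)

def looks_like_duration(value):
    """Illustrative check that a string resembles an ISO-8601 duration."""
    return value != "P" and bool(DURATION_RE.match(value))

print(looks_like_duration("P120D"))     # True
print(looks_like_duration("120 days"))  # False
```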
2.21.0¶
New features¶
- Added new arguments explanation_algorithm and max_explanations to method Model.request_training_predictions. New fields explanation_algorithm, max_explanations and shap_warnings have been added to class TrainingPredictions. New fields prediction_explanations and shap_metadata have been added to class TrainingPredictionsIterator, which is returned by method TrainingPredictions.iterate_rows.
- Added new arguments explanation_algorithm and max_explanations to method Model.request_predictions. New fields explanation_algorithm, max_explanations and shap_warnings have been added to class Predictions. Method Predictions.get_all_as_dataframe has a new argument serializer that specifies the retrieval and results validation method (json or csv) for the predictions.
- Added the ability to compute (ShapImpact.create) and request (ShapImpact.get) SHAP impact scores for features in a model.
- Added support for accessing Visual AI images and insights. See the Visual AI Projects section of the DataRobot Python Package documentation for details.
- Users can specify a custom row count when requesting Feature Effects. Extended methods are Model.request_feature_effect and Model.get_or_request_feature_effect.
- Users can request SHAP-based prediction explanations for models that support SHAP scores using ShapMatrix.create.
- Added two new methods to Dataset to lazily retrieve paginated responses: Dataset.iterate returns an iterator of the datasets that a user can view, and Dataset.iterate_all_features returns an iterator of the features of a dataset.
- It's possible to create an Interaction feature by combining two categorical features together using Project.create_interaction_feature. The operation result is represented by models.InteractionFeature. Specific information about an interaction feature may be retrieved by its name using models.InteractionFeature.get.
- Added the DatasetFeaturelist class to support featurelists on datasets in the AI Catalog. DatasetFeaturelists can be updated or deleted. Two new methods were also added to Dataset to interact with DatasetFeaturelists: Dataset.get_featurelists and Dataset.create_featurelist, which list existing featurelists and create new featurelists on a dataset, respectively.
- Added model_splits to DatetimePartitioningSpecification and to DatetimePartitioning. This allows users to control the jobs per model used when building models. A higher number of model_splits will result in less downsampling, allowing the use of more post-processed data.
- Added support for :ref:`unsupervised projects <unsupervised_anomaly>`.
- Added support for external test sets. Please see the :ref:`testset documentation <external_testset>`.
- A new workflow is available for assessing models on external test sets in time series unsupervised projects. More information can be found in the :ref:`documentation <unsupervised_external_dataset>`.
- Project.upload_dataset and Model.request_predictions now accept actual_value_column, the name of the actual value column; it can be passed only with a date range. PredictionDataset objects now contain the following new fields: actual_value_column (the actual value column which was selected for this dataset) and detected_actual_value_column (a list of detected actual value column info).
- A new warning was added to data_quality_warnings of datarobot.models.PredictionDataset: single_class_actual_value_column.
- Scores and insights on external test sets can be retrieved using ExternalScores, ExternalLiftChart and ExternalRocCurve.
- Users can create payoff matrices for generating profit curves for binary classification projects using PayoffMatrix.create.
- Deployment improvements: datarobot.models.deployment.TargetDrift can be used to retrieve target drift information; datarobot.models.deployment.FeatureDrift can be used to retrieve feature drift information; Deployment.submit_actuals will submit actuals in batches if the total number of actuals exceeds the limit of one single request; Deployment.create_from_custom_model_image can be used to create a deployment from a custom model image; deployments now support predictions data collection, which enables prediction requests and results to be saved in Predictions Data Storage (see Deployment.get_predictions_data_collection_settings and Deployment.update_predictions_data_collection_settings for usage).
- New arguments send_notification and include_feature_discovery_entities are added to Project.share.
- Now it is possible to specify the number of training rows to use in feature impact computation on supported project types (that is, everything except unsupervised, multiclass and time series). This does not affect SHAP-based feature impact. Extended methods:
- A new class FeatureImpactJob is added to retrieve Feature Impact records with metadata. The regular Job still works as before.
- Added support for custom models. Please see the :ref:`custom model documentation <custom_models>`. Classes added: datarobot.ExecutionEnvironment and datarobot.ExecutionEnvironmentVersion to create and manage custom model execution environments; datarobot.CustomInferenceModel and datarobot.CustomModelVersion to create and manage custom inference models; and datarobot.CustomModelTest to perform testing of custom models.
- Batch Prediction jobs now support forecast and historical Time Series predictions using the new argument timeseries_settings for BatchPredictionJob.score.
- Batch Prediction jobs now support scoring to Azure and Google Cloud Storage with methods BatchPredictionJob.score_azure and BatchPredictionJob.score_gcp.
- Now it's possible to create Relationships Configurations to introduce secondary datasets to projects. A configuration specifies additional datasets to be included in a project and how these datasets are related to each other and to the primary dataset. When a relationships configuration is specified for a project, Feature Discovery will create features automatically from these datasets. RelationshipsConfiguration.create creates a new relationships configuration between datasets, RelationshipsConfiguration.retrieve retrieves the requested relationships configuration, RelationshipsConfiguration.replace replaces the relationships configuration details with a new one, and RelationshipsConfiguration.delete deletes the relationships configuration.
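The automatic batching performed by Deployment.submit_actuals can be pictured in a few lines of plain Python. The 10,000-row limit used here is illustrative, not the server's documented cap; association_id and actual_value follow the fields the method accepts.

```python
def batch_actuals(actuals, limit=10_000):
    """Illustrative sketch: split actuals into request-sized batches."""
    return [actuals[i:i + limit] for i in range(0, len(actuals), limit)]

# 25,000 actuals would go out as three separate requests under the hood.
actuals = [{"association_id": str(i), "actual_value": i % 2} for i in range(25_000)]
batches = batch_actuals(actuals)
print([len(batch) for batch in batches])  # [10000, 10000, 5000]
```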
Enhancements¶
- Made creating projects from a dataset easier through the new Dataset.create_project.
- These methods now provide additional metadata fields in Feature Impact results if called with with_metadata=True. Fields added: rowCount, shapBased, ranRedundancyDetection, count.
- Secondary dataset configuration retrieval and deletion is easier now through new methods: SecondaryDatasetConfigurations.delete soft deletes a secondary dataset configuration, and SecondaryDatasetConfigurations.get retrieves a secondary dataset configuration.
- Retrieve the relationships configuration applied to a given Feature Discovery project using Project.get_relationships_configuration.
Bugfixes¶
- An issue with input validation of the Batch Prediction module has been fixed.
- parent_model_id was not visible for all frozen models.
- Batch Prediction jobs that used output types other than local_file failed when using .wait_for_completion().
- A race condition in the Batch Prediction file scoring logic has been fixed.
API changes¶
- Three new fields were added to the Dataset object, reflecting the updated fields in the public API routes at api/v2/datasets/. The added fields are:
  - processing_state: The current ingestion process state of the dataset.
  - row_count: The number of rows in the dataset.
  - size: The size of the dataset as a CSV in bytes.
Deprecation summary¶
- datarobot.enums.VARIABLE_TYPE_TRANSFORM.CATEGORICAL is deprecated for the following methods and will be removed in v2.22: Project.batch_features_type_transform and Project.create_type_transform_feature.
2.20.0¶
New features¶
- There is a new Dataset object that implements some of the public API routes at api/v2/datasets/. This also adds two new feature classes and a details class. Functionality:
  - Create a Dataset by uploading from a file, URL or in-memory datasource.
  - Get Datasets or elements of a Dataset: Dataset.list lists available Datasets; Dataset.get gets a specified Dataset; Dataset.update updates the Dataset with the latest server information; Dataset.get_details gets the DatasetDetails of the Dataset; Dataset.get_all_features gets a list of the Dataset's Features; Dataset.get_file downloads the Dataset as a csv file; Dataset.get_projects gets a list of Projects that use the Dataset.
  - Modify, delete or un-delete a Dataset: Dataset.modify changes the name and categories of the Dataset; Dataset.delete soft deletes a Dataset; Dataset.un_delete un-deletes the Dataset. You cannot retrieve the IDs of deleted Datasets, so if you want to un-delete a Dataset, you need to store its ID before deletion.
  - You can also create a Project using a Dataset with:
- It is possible to create an alternative configuration for the secondary dataset which can be used during prediction: SecondaryDatasetConfigurations.create allows creating a secondary dataset configuration.
- You can now filter the deployments returned by the Deployment.list command by passing an instance of the DeploymentListFilters class to the filters keyword argument. The currently supported filters are: role, service_health, model_health, accuracy_health, execution_environment_type and materiality.
- A new workflow is available for making predictions in time series projects. To that end, PredictionDataset objects now contain the following new fields:
  - forecast_point_range: The start and end date of the range of dates available for use as the forecast point, detected based on the uploaded prediction dataset.
  - data_start_date: A datestring representing the minimum primary date of the prediction dataset.
  - data_end_date: A datestring representing the maximum primary date of the prediction dataset.
  - max_forecast_date: A datestring representing the maximum forecast date of this prediction dataset.
  Additionally, users no longer need to specify a forecast_point or predictions_start_date and predictions_end_date when uploading datasets for predictions in time series projects. More information can be found in the :ref:`time series predictions <new_pred_ux>` documentation.
- Per-class lift chart data is now available for multiclass models using Model.get_multiclass_lift_chart.
- Unsupervised projects can now be created using the Project.start and Project.set_target methods by providing unsupervised_mode=True, provided that the user has access to unsupervised machine learning functionality. Contact support for more information.
- A new boolean attribute unsupervised_mode was added to datarobot.DatetimePartitioningSpecification. When it is set to True, datetime partitioning for unsupervised time series projects will be constructed for nowcasting: forecast_window_start=forecast_window_end=0.
- Users can now configure the start and end of the training partition as well as the end of the validation partition for backtests in a datetime-partitioned project. More information and example usage can be found in the :ref:`backtesting documentation <backtest_configuration>`.
Enhancements¶
- Updated the user agent header to show which Python version is in use.
- Model.get_frozen_child_models can be used to retrieve models that are frozen from a given model.
- Added datarobot.enums.TS_BLENDER_METHOD to make it clearer which blender methods are allowed for use in time series projects.
Bugfixes¶
- An issue where uploaded CSVs would lose quotes during serialization, causing issues when columns containing line terminators were loaded in a dataframe, has been fixed.
- Project.get_association_featurelists is now using the correct endpoint name, but the old one will continue to work.
- The Python API PredictionServer now supports the on-premise format of the API response.
2.19.0¶
New features¶
- Projects can be cloned using Project.clone_project.
- Calendars used in time series projects now support having series-specific events, for instance if a holiday only affects some stores. This can be controlled using a new argument of the CalendarFile.create method. If multiseries id columns are not provided, the calendar is considered to be single series and all events are applied to all series.
- We have expanded prediction intervals availability to the following use-cases:
  - Time series model deployments now support prediction intervals. See Deployment.get_prediction_intervals_settings and Deployment.update_prediction_intervals_settings for usage.
  - Prediction intervals are now supported for model exports for time series. To that end, a new optional parameter prediction_intervals_size has been added to Model.request_transferable_export.
  More details on prediction intervals can be found in the :ref:`prediction intervals documentation <prediction_intervals>`.
- Allowed pairwise interaction groups can now be specified in AdvancedOptions. They will be used in GAM models during training.
- New deployments features:
  - Update the label and description of a deployment using Deployment.update.
  - The :ref:`Association ID setting <deployment_association_id>` can be retrieved and updated.
  - Regression deployments now support :ref:`prediction warnings <deployment_prediction_warning>`.
- For multiclass models it is now possible to get feature impact for each individual target class using Model.get_multiclass_feature_impact.
- Added support for the new :ref:`Batch Prediction API <batch_predictions>`.
- It is now possible to create and retrieve basic, oauth and s3 credentials with Credential.
- It's now possible to get feature association statuses for featurelists using Project.get_association_featurelists.
- You can also pass a specific featurelist_id into Project.get_associations.
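Since prediction interval sizes are percentages, a small client-side sanity check can catch bad values before a request is made. The 1-100 bounds below reflect the percentage interpretation and are illustrative; the server performs its own validation:

```python
def validate_interval_size(size):
    """Illustrative guard: an interval size is a percentage between 1 and 100."""
    if not 1 <= size <= 100:
        raise ValueError(f"prediction_intervals_size must be within 1-100, got {size}")
    return size

# e.g. an 80% prediction interval passed to Model.request_transferable_export
print(validate_interval_size(80))  # 80
```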
Enhancements¶
- Added documentation to Project.get_metrics to detail the new ascending field that indicates how a metric should be sorted.
- Retraining of a model is processed asynchronously and returns a ModelJob immediately.
- Blender models can be retrained on a different set of data or a different feature list.
- Word cloud ngrams now have a variable field representing the source of the ngram.
- Method WordCloud.ngrams_per_class can be used to split ngrams for better usability in multiclass projects.
- Method Project.set_target supports new optional parameters featureEngineeringGraphs and credentials.
- Methods Project.upload_dataset and Project.upload_dataset_from_data_source support a new optional parameter credentials.
- Series accuracy retrieval methods (DatetimeModel.get_series_accuracy_as_dataframe and DatetimeModel.download_series_accuracy_as_csv) for multiseries time series projects now support additional parameters for specifying what data to retrieve, including:
  - metric: Which metric to retrieve scores for.
  - multiseries_value: Only return series with a matching multiseries ID.
  - order_by: An attribute by which to sort the results.
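The series accuracy parameters above act like a filter plus a sort over per-series rows. In plain Python terms (the row layout and field names are illustrative, not the client's response schema):

```python
def select_series(rows, metric=None, multiseries_value=None, order_by=None):
    """Illustrative sketch of metric/multiseries_value/order_by semantics."""
    if multiseries_value is not None:
        rows = [row for row in rows if row["multiseries_value"] == multiseries_value]
    if metric is not None:
        # Pull the requested metric's score up to a sortable column.
        rows = [{**row, "score": row["scores"][metric]} for row in rows]
    if order_by is not None:
        rows = sorted(rows, key=lambda row: row[order_by])
    return rows

rows = [
    {"multiseries_value": "store_1", "scores": {"RMSE": 2.0}},
    {"multiseries_value": "store_2", "scores": {"RMSE": 1.0}},
]
print([r["multiseries_value"] for r in select_series(rows, metric="RMSE", order_by="score")])
# ['store_2', 'store_1']
```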
Bugfixes¶
- An issue when using Feature.get and ModelingFeature.get to retrieve a summarized categorical feature has been fixed.
API changes¶
- The datarobot package is now no longer a namespace package.
- datarobot.enums.BLENDER_METHOD.FORECAST_DISTANCE is removed (deprecated in 2.18.0).
Documentation changes¶
- Updated the :ref:`Residuals charts <residuals_chart>` documentation to reflect that the data rows include row numbers from the source dataset for projects created in DataRobot 5.3 and newer.
2.18.0¶
New features¶
- :ref:`Residuals charts <residuals_chart>` can now be retrieved for non-time-aware regression models.
- :ref:`Deployment monitoring <deployment_monitoring>` can now be used to retrieve service stats, service health, accuracy info, permissions, and feature lists for deployments.
- :ref:`Time series <time_series>` projects now support the Average by Forecast Distance blender, configured with more than one Forecast Distance. The blender blends the selected models, selecting the best three models based on the backtesting score for each Forecast Distance and averaging their predictions. The new blender method FORECAST_DISTANCE_AVG has been added to datarobot.enums.BLENDER_METHOD.
- Deployment.submit_actuals can now be used to submit data about actual results from a deployed model, which can be used to calculate accuracy metrics.
Enhancements¶
- Monotonic constraints are now supported for OTV projects. To that end, the parameters monotonic_increasing_featurelist_id and monotonic_decreasing_featurelist_id can be specified in calls to Model.train_datetime or Project.train_datetime.
- When retrieving information about features, information about summarized categorical variables is now available in a new keySummary field.
- For Word Clouds in multiclass projects, values of the target class for the corresponding word or ngram can now be passed using the new class parameter.
- Listing deployments using Deployment.list now supports sorting and searching the results using the new order_by and search parameters.
- You can now get the model associated with a model job by getting the model variable on the model job object.
- The Blueprint class can now retrieve the recommended_featurelist_id, which indicates which feature list is recommended for this blueprint. If the field is not present, then there is no recommended feature list for this blueprint.
- The Model class can now be used to retrieve the model_number.
- The method Model.get_supported_capabilities now has an extra field supportsCodeGeneration to explain whether the model supports code generation.
- Calls to Project.start and Project.upload_dataset now support uploading data via S3 URI and pathlib.Path objects.
- Errors upon connecting to DataRobot are now clearer when an incorrect API Token is used.
- The datarobot package is now a namespace package.
Deprecation summary¶
- datarobot.enums.BLENDER_METHOD.FORECAST_DISTANCE is deprecated and will be removed in 2.19. Use FORECAST_DISTANCE_ENET instead.
Documentation changes¶
- Various typo and wording issues have been addressed.
- A new notebook showing regression-specific features has been added to the examples index.
- Documentation for :ref:`Access lists <sharing>` has been added.
2.17.0¶
New features¶
- :ref:`Deployments <deployments_overview>` can now be managed via the API by using the new Deployment class.
- Users can now list available prediction servers using PredictionServer.list.
- When specifying datetime partitioning settings, :ref:`time series <time_series>` projects can now mark individual features as excluded from feature derivation using the FeatureSettings.do_not_derive attribute. Any features not specified will be assigned according to the DatetimePartitioningSpecification.default_to_do_not_derive value.
- Users can now submit multiple feature type transformations in a single batch request using Project.batch_features_type_transform.
- :ref:`Advanced Tuning <advanced_tuning>` for non-Eureqa models (beta feature) is now enabled by default for all users. As of v2.17, all models are supported other than blenders, open source, prime, scaleout, baseline and user-created models.
- Information on feature clustering and the association strength between pairs of numeric or categorical features is now available. Project.get_associations can be used to retrieve pairwise feature association statistics, and Project.get_association_matrix_details can be used to get a sample of the actual values used to measure association strength.
Enhancements¶
- number_of_do_not_derive_features has been added to the datarobot.DatetimePartitioning class to specify the number of features that are marked as excluded from derivation.
- Users with PyYAML>=5.1 will no longer receive a warning when using the datarobot package.
- It is now possible to use files with unicode names for creating projects and prediction jobs.
- Users can now embed DataRobot-generated content in a ComplianceDocTemplate using keyword tags. See :ref:`here <automated_documentation_overview>` for more details.
- The field calendar_name has been added to datarobot.DatetimePartitioning to display the name of the calendar used for a project.
- :ref:`Prediction intervals <prediction_intervals>` are now supported for start-end retrained models in a time series project.
- Previously, all backtests had to be run before :ref:`prediction intervals <prediction_intervals>` for a time series project could be requested with predictions. Now, backtests will be computed automatically if needed when prediction intervals are requested.
Bugfixes¶
- An issue affecting time series project creation for irregularly spaced dates has been fixed.
- ComplianceDocTemplate now supports empty text blocks in user sections.
- An issue when using Predictions.get to retrieve predictions metadata has been fixed.
Documentation changes¶
- An overview on working with ComplianceDocumentation and ComplianceDocTemplate has been created. See :ref:`here <automated_documentation_overview>` for more details.
2.16.0¶
New features¶
- Three new methods for Series Accuracy have been added to the DatetimeModel class:
  - Start a request to calculate Series Accuracy with DatetimeModel.compute_series_accuracy
  - Once computed, Series Accuracy can be retrieved as a pandas.DataFrame using DatetimeModel.get_series_accuracy_as_dataframe
  - Or saved as a CSV using DatetimeModel.download_series_accuracy_as_csv
- Users can now access :ref:`prediction intervals <prediction_intervals>` data for each prediction with a DatetimeModel. For each model, prediction intervals estimate the range of values DataRobot expects actual values of the target to fall within. They are similar to a confidence interval of a prediction, but are based on the residual errors measured during backtesting for the selected model.
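The idea behind intervals built from backtest residuals can be illustrated with a standard-library sketch. The residuals and point forecast below are invented numbers, not DataRobot output, and this is a simplification of whatever DataRobot computes internally:

```python
import statistics

# Hypothetical residuals (actual - predicted) collected during backtesting
residuals = [-1.2, -0.7, -0.3, 0.4, 0.9, 1.1]
point_prediction = 25.0  # a hypothetical point forecast

# An ~80% interval from the 10th and 90th percentiles of the residuals
deciles = statistics.quantiles(residuals, n=10)
low = point_prediction + deciles[0]
high = point_prediction + deciles[-1]

assert low < point_prediction < high
```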
Enhancements¶
- Information on the effective feature derivation window is now available for :ref:`time series projects <time_series>` to specify the full span of historical data required at prediction time. It may be longer than the feature derivation window of the project, depending on the differencing settings used. Additionally, more of the project partitioning settings are available on the DatetimeModel class. The new attributes are: effective_feature_derivation_window_start, effective_feature_derivation_window_end, forecast_window_start, forecast_window_end, and windows_basis_unit.
- Prediction metadata is now included in the return of Predictions.get.
Documentation changes¶
- Various typo and wording issues have been addressed.
- The example data that was meant to accompany the Time Series examples has been added to the zip file download in the examples_index.
2.15.1¶
Enhancements¶
- CalendarFile.get_access_list has been added to the CalendarFile class to return a list of users with access to a calendar file.
- A role attribute has been added to the CalendarFile class to indicate the access level the current user has to a calendar file. For more information on the specific access levels, see the :ref:`sharing <sharing>` documentation.
Bugfixes¶
- Previously, attempting to retrieve the calendar_id of a project without a set target would result in an error. This has been fixed to return None instead.
2.15.0¶
New features¶
- Previously available only for Eureqa models, Advanced Tuning methods and objects, including Model.start_advanced_tuning_session, Model.get_advanced_tuning_parameters, Model.advanced_tune, and AdvancedTuningSession, now support all models other than blender, open source, and user-created models. Use of Advanced Tuning via the API for non-Eureqa models is in beta and not available by default, but can be enabled.
- Calendar Files for time series projects can now be created and managed through the CalendarFile class.
Enhancements¶
- The dataframe returned from datarobot.PredictionExplanations.get_all_as_dataframe() will now have each class label class_X be the same from row to row.
- The client is now more robust to networking issues by default. It will retry on more errors and respects Retry-After headers in HTTP 413, 429, and 503 responses.
- Added a Forecast Distance blender for time series projects configured with more than one Forecast Distance. It blends the selected models, creating separate linear models for each Forecast Distance.
- Project can now be :ref:`shared <sharing>` with other users.
- Project.upload_dataset and Project.upload_dataset_from_data_source will return a PredictionDataset with data_quality_warnings if potential problems exist with the uploaded dataset.
- relax_known_in_advance_features_check has been added to Project.upload_dataset and Project.upload_dataset_from_data_source to allow missing values in the known in advance features in the forecast window at prediction time.
- cross_series_group_by_columns has been added to datarobot.DatetimePartitioning to allow users to indicate how to further split series into related groups.
- Information retrieval for the ROC Curve has been extended to include fraction_predicted_as_positive, fraction_predicted_as_negative, lift_positive, and lift_negative.
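The new fraction_predicted_as_positive and fraction_predicted_as_negative fields report, for a given threshold, what share of rows a classifier would label positive or negative. A small standard-library illustration of the underlying computation (the scores are invented, not API output):

```python
# Hypothetical predicted probabilities from a binary classifier
scores = [0.9, 0.8, 0.6, 0.3, 0.2]
threshold = 0.5

# Count rows at or above the threshold, then convert to fractions
predicted_positive = sum(s >= threshold for s in scores)
fraction_predicted_as_positive = predicted_positive / len(scores)
fraction_predicted_as_negative = 1 - fraction_predicted_as_positive

assert fraction_predicted_as_positive == 0.6
assert abs(fraction_predicted_as_negative - 0.4) < 1e-9
```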
Bugfixes¶
- Fixed an issue where the client would not be usable if it could not verify that it was compatible with the configured server.
API changes¶
- Methods for creating datarobot.models.Project: create_from_mysql, create_from_oracle, and create_from_postgresql, deprecated in 2.11, have now been removed. Use datarobot.models.Project.create_from_data_source() instead.
- The datarobot.FeatureSettings attribute apriori, deprecated in 2.11, has been removed. Use datarobot.FeatureSettings.known_in_advance instead.
- The datarobot.DatetimePartitioning attribute default_to_a_priori, deprecated in 2.11, has been removed. Use datarobot.DatetimePartitioning.known_in_advance instead.
- The datarobot.DatetimePartitioningSpecification attribute default_to_a_priori, deprecated in 2.11, has been removed. Use datarobot.DatetimePartitioningSpecification.known_in_advance instead.
Configuration changes¶
- The package now requires requests version 2.21 or later.
- The package now requires urllib3 version 1.24 or later.
Documentation changes¶
- The Advanced Model Insights notebook has been extended to cover visualization of cumulative gains and lift charts.
2.14.2¶
Bugfixes¶
- Fixed an issue where searches of the HTML documentation would sometimes hang indefinitely
Documentation changes¶
- Python3 is now the primary interpreter used to build the docs (this does not affect the ability to use the package with Python2)
2.14.1¶
Documentation changes¶
- Documentation for the Model Deployment interface has been removed after the corresponding interface was removed in 2.13.0.
2.14.0¶
New features¶
- The new method Model.get_supported_capabilities retrieves a summary of the capabilities supported by a particular model, such as whether it is eligible for Prime and whether it has word cloud data available.
- A new class for working with the model compliance documentation feature of DataRobot: ComplianceDocumentation.
- A new class for working with compliance documentation templates: ComplianceDocTemplate.
- A new class, FeatureHistogram, has been added to retrieve feature histograms for a requested maximum bin count.
- Time series projects now support binary classification targets.
- Cross series features can now be created within time series multiseries projects using the use_cross_series_features and aggregation_type attributes of datarobot.DatetimePartitioningSpecification. See the :ref:`Time Series <time_series>` documentation for more info.
Enhancements¶
- Client instantiation now checks the endpoint configuration and provides more informative error messages. It also automatically corrects HTTP to HTTPS if the server responds with a redirect to HTTPS.
- Project.upload_dataset and Project.create now accept an optional dataset_filename parameter to specify a file name for the dataset. This is ignored for URL and file path sources.
- A new optional parameter fallback_to_parent_insights has been added to Model.get_lift_chart, Model.get_all_lift_charts, Model.get_confusion_chart, Model.get_all_confusion_charts, Model.get_roc_curve, and Model.get_all_roc_curves. When True, a frozen model with missing insights will attempt to retrieve the missing insight data from its parent model.
- A new number_of_known_in_advance_features attribute has been added to the datarobot.DatetimePartitioning class. The attribute specifies the number of features that are marked as known in advance.
- Project.set_worker_count can now update the worker count on a project to the maximum number available to the user.
- The :ref:`Recommended Models API <recommended_models>` can now be used to retrieve model recommendations for datetime partitioned projects.
- Time series projects can now accept feature derivation and forecast window intervals in terms of a number of rows rather than a fixed time unit. DatetimePartitioningSpecification and Project.set_target support the new optional parameter windowsBasisUnit, either 'ROW' or a detected time unit.
- Time series projects can now accept feature derivation intervals, forecast windows, forecast points, and prediction start/end dates in milliseconds.
- DataSources and DataStores can now be :ref:`shared <sharing>` with other users.
- Training predictions for datetime partitioned projects now support the new data subset dr.enums.DATA_SUBSET.ALL_BACKTESTS for requesting predictions for all backtest validation folds.
API changes¶
- The model recommendation type "Recommended" (deprecated in version 2.13.0) has been removed.
Documentation changes¶
- Example notebooks have been updated:
  - Notebooks now work in Python 2 and Python 3.
  - A notebook illustrating time series capability has been added.
  - The financial data example has been replaced with an updated introductory example.
- To supplement the embedded Python notebooks in both the PDF and HTML docs bundles, the notebook files and supporting data can now be downloaded from the HTML docs bundle.
- Fixed a minor typo in the code sample for get_or_request_feature_impact.
2.13.0¶
New features¶
- The new method Model.get_or_request_feature_impact will attempt to request feature impact and return the newly created feature impact object or the existing object, so two calls are no longer required.
- New methods and objects, including Model.start_advanced_tuning_session, Model.get_advanced_tuning_parameters, Model.advanced_tune, and AdvancedTuningSession, were added to support setting Advanced Tuning parameters. This is currently supported for Eureqa models only.
- A new is_starred attribute has been added to the Model class. The attribute specifies whether a model has been marked as starred by the user or not.
- A model can be starred or unstarred with Model.star_model and Model.unstar_model.
- When listing models with Project.get_models, the model list can now be filtered by the is_starred value.
- A custom prediction threshold may now be configured for each model via Model.set_prediction_threshold. When making predictions in binary classification projects, this value will be used when deciding between the positive and negative classes.
- Project.check_blendable can be used to confirm whether a particular group of models is eligible for blending, as some are not, e.g. scaleout models and datetime models with different training lengths.
- Individual cross validation scores can be retrieved for new models using Model.get_cross_validation_scores.
Enhancements¶
- Python 3.7 is now supported.
- Feature impact now returns not only the impact score for each feature but also whether it was detected to be redundant with other high-impact features.
- A new is_blocked attribute has been added to the Job class, specifying whether a job is blocked from execution because one or more dependencies are not yet met.
- The Featurelist object now has new attributes reporting its creation time, whether it was created by a user or by DataRobot, and the number of models using the featurelist, as well as a new description field.
- Featurelists can now be renamed and have their descriptions updated with Featurelist.update and ModelingFeaturelist.update.
- Featurelists can now be deleted with Featurelist.delete and ModelingFeaturelist.delete.
- ModelRecommendation.get now accepts an optional parameter of type datarobot.enums.RECOMMENDED_MODEL_TYPE which can be used to get a specific kind of recommendation.
- Previously computed predictions can now be listed and retrieved with the Predictions class, without requiring a reference to the original PredictJob.
Bugfixes¶
- The Model Deployment interface which was previously visible in the client has been removed to allow the interface to mature, although the raw API is available as a "beta" API without full backwards compatibility support.
API changes¶
- Added support for retrieving the Pareto Front of a Eureqa model. See ParetoFront.
- A new recommendation type "Recommended for Deployment" has been added to ModelRecommendation, which is now returned as the default recommended model when available. See :ref:`model_recommendation`.
Deprecation summary¶
- The feature previously referred to as "Reason Codes" has been renamed to "Prediction Explanations", to provide increased clarity and accessibility. The old ReasonCodes interface has been deprecated and replaced with PredictionExplanations.
- The recommendation type "Recommended" is deprecated and will no longer be returned in v2.14 of the API.
Documentation changes¶
- Added a new documentation section :ref:`model_recommendation`.
- Time series projects support multiseries as well as single series data. They are now documented in the :ref:`Time Series Projects <time_series>` documentation.
2.12.0¶
New features¶
- Some models now have Missing Value reports, allowing users with access to uncensored blueprints to retrieve a detailed breakdown of how numeric imputation and categorical converter tasks handled missing values. See the :ref:`documentation <missing_values_report>` for more information on the report.
2.11.0¶
New features¶
- The new ModelRecommendation class can be used to retrieve the recommended models for a project.
- A new helper method, cross_validate, was added to the Model class. This method can be used to request a model's cross validation score.
- Training a model with monotonic constraints is now supported. Training with monotonic constraints allows users to force models to learn monotonic relationships with respect to some features and the target. This helps users create accurate models that comply with regulations (e.g. insurance, banking). Currently, only certain blueprints (e.g. xgboost) support this feature, and it is only supported for regression and binary classification projects.
- DataRobot now supports "Database Connectivity", allowing databases to be used as the source of data for projects and prediction datasets. The feature works on top of the JDBC standard, so a variety of databases conforming to that standard are available; a list of databases with tested support for DataRobot is available in the user guide in the web application. See :ref:`Database Connectivity <database_connectivity_overview>` for details.
- Added a new feature to retrieve feature logs for time series projects. Check datarobot.DatetimePartitioning.feature_log_list() and datarobot.DatetimePartitioning.feature_log_retrieve() for details.
API changes¶
- New attributes supporting monotonic constraints have been added to the AdvancedOptions, Project, Model, and Blueprint classes. See :ref:`monotonic constraints <monotonic_constraints>` for more information on how to configure monotonic constraints.
- New parameters predictions_start_date and predictions_end_date have been added to Project.upload_dataset to support bulk predictions upload for time series projects.
Deprecation summary¶
- Methods for creating datarobot.models.Project: create_from_mysql, create_from_oracle, and create_from_postgresql have been deprecated and will be removed in 2.14. Use datarobot.models.Project.create_from_data_source() instead.
- The datarobot.FeatureSettings attribute apriori has been deprecated and will be removed in 2.14. Use datarobot.FeatureSettings.known_in_advance instead.
- The datarobot.DatetimePartitioning attribute default_to_a_priori has been deprecated and will be removed in 2.14. Use datarobot.DatetimePartitioning.known_in_advance instead.
- The datarobot.DatetimePartitioningSpecification attribute default_to_a_priori has been deprecated and will be removed in 2.14. Use datarobot.DatetimePartitioningSpecification.known_in_advance instead.
Configuration changes¶
- Retry settings compatible with those offered by urllib3's Retry interface can now be configured. By default, the client will now retry connection errors that prevented requests from arriving at the server.
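Because the retry settings are urllib3-compatible, a Retry object can be constructed directly with urllib3. A minimal sketch (the specific numbers are illustrative; how the object is passed to the DataRobot client depends on your configuration mechanism and is omitted here):

```python
from urllib3.util.retry import Retry

# Retry up to 5 times with exponential backoff, and retry on
# 413/429/503 responses; Retry-After headers are honored by default.
retries = Retry(
    total=5,
    backoff_factor=0.5,
    status_forcelist=[413, 429, 503],
)

assert retries.total == 5
assert 429 in retries.status_forcelist
```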
Documentation changes¶
- “Advanced Model Insights” example has been updated to properly handle bin weights when rebinning.
2.9.0¶
New features¶
- The new ModelDeployment class can be used to track the status and health of models deployed for predictions.
Enhancements¶
- The DataRobot API now supports creating 3 new blender types: Random Forest, TensorFlow, and LightGBM.
- Multiclass projects now support blender creation for the 3 new blender types as well as Average and ENET blenders.
- Models can be trained by requesting a particular row count using the new training_row_count argument with Project.train, Model.train, and Model.request_frozen_model in non-datetime partitioned projects, as an alternative to the previous option of specifying a desired percentage of the project dataset. Specifying model size by row count is recommended when the float precision of sample_pct could be problematic, e.g. when training on a small percentage of the dataset or when training up to partition boundaries.
- New attributes max_train_rows, scaleout_max_train_pct, and scaleout_max_train_rows have been added to Project. max_train_rows specifies the equivalent value to the existing max_train_pct as a row count. The scaleout fields can be used to see how far scaleout models can be trained on projects, which for projects taking advantage of scalable ingest may exceed the limits on the data available to non-scaleout blueprints.
- Individual features can now be marked as a priori or not a priori using the new feature_settings attribute when setting the target or specifying datetime partitioning settings on time series projects. Any features not specified in the feature_settings parameter will be assigned according to the default_to_a_priori value.
- Three new options have been made available in the datarobot.DatetimePartitioningSpecification class to fine-tune how time series projects derive modeling features. treat_as_exponential can control whether data is analyzed as an exponential trend and transformations like log-transform are applied. differencing_method can control which differencing method to use for stationary data. periodicities can be used to specify periodicities occurring within the data. All are optional, and defaults will be chosen automatically if they are unspecified.
API changes¶
- training_row_count is now available on non-datetime models as well as "rowCount"-based datetime models. It reports the number of rows used to train the model (equivalent to sample_pct).
- Features retrieved from Feature.get now include target_leakage.
2.8.1¶
Bugfixes¶
- The documented default connect_timeout will now be correctly set for all configuration mechanisms, so that requests that fail to reach the DataRobot server in a reasonable amount of time will now error instead of hanging indefinitely. If you observe that you have started seeing ConnectTimeout errors, please configure your connect_timeout to a larger value.
- The version of the trafaret library this package depends on is now pinned to trafaret>=0.7,<1.1, since versions outside that range are known to be incompatible.
2.8.0¶
New features¶
- The DataRobot API supports the creation, training, and predicting of multiclass classification projects. By default, DataRobot handles a dataset with a numeric target column as regression. If your numeric target has a cardinality of fewer than 11 distinct values, you can override this behavior to instead create a multiclass classification project from the data. To do so, use the set_target function, setting target_type='Multiclass'. If DataRobot recognizes your data as categorical and it has fewer than 11 classes, using multiclass will create a project that classifies which label the data belongs to.
- The DataRobot API now includes Rating Tables. A rating table is an exportable CSV representation of a model. Users can influence predictions by modifying it and creating a new model with the modified table. See the :ref:`documentation <rating_table>` for more information on how to use rating tables.
- scaleout_modeling_mode has been added to the AdvancedOptions class used when setting a project target. It can be used to control whether scaleout models appear in the autopilot and/or available blueprints. Scaleout models are only supported in the Hadoop environment with the corresponding user permission set.
- A new premium add-on product, Time Series, is now available. New projects can be created as time series projects, which automatically derive features from past data and forecast the future. See the :ref:`time series documentation <time_series>` for more information.
- The Feature object now returns the EDA summary statistics (i.e., mean, median, minimum, maximum, and standard deviation) for features where this is available (e.g., numeric, date, time, currency, and length features). These summary statistics are formatted in the same format as the data they summarize.
- The DataRobot API now supports the Training Predictions workflow. Training predictions are made by a model for a subset of data from the original dataset. Users can start a job which will make those predictions and retrieve them. See the :ref:`documentation <predictions>` for more information on how to use training predictions.
- DataRobot now supports retrieving a :ref:`model blueprint chart <model_blueprint_chart>` and :ref:`model blueprint docs <model_blueprint_doc>`.
- With the introduction of multiclass classification projects, DataRobot needed a better way to explain the performance of a multiclass model, so we created a new Confusion Chart. The API now supports retrieving and interacting with confusion charts.
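The fewer-than-11-classes rule for overriding the regression default can be checked ahead of time with a quick cardinality count. A sketch with made-up data (the target column here is invented for illustration):

```python
# Hypothetical numeric target column
target_values = [0, 1, 2, 0, 1, 2, 2, 1, 0, 2]

# DataRobot treats a numeric target as regression by default; with fewer
# than 11 distinct values it can instead be set up as multiclass via
# set_target(..., target_type='Multiclass').
n_classes = len(set(target_values))
eligible_for_multiclass = n_classes < 11

assert eligible_for_multiclass
assert n_classes == 3
```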
Enhancements¶
- DatetimePartitioningSpecification now includes the optional disable_holdout flag, which can be used to disable the holdout fold when creating a project with datetime partitioning.
- When retrieving reason codes on a project using an exposure column, predictions that are adjusted for exposure can be retrieved.
- File URIs can now be used as sourcedata when creating a project or uploading a prediction dataset. The file URI must refer to an allowed location on the server, which is configured as described in the user guide documentation.
- The advanced options available when setting the target have been extended to include the new parameter 'events_count' as part of the AdvancedOptions object, to allow specifying the events count column. See the user guide documentation in the webapp for more information on events count.
- PredictJob.get_predictions now returns the predicted probability for each class in the dataframe.
- PredictJob.get_predictions now accepts a prefix parameter to prefix the class names returned in the predictions dataframe.
API changes¶
- Added a target_type parameter to set_target() and start(), used to override the project default.
2.7.2¶
Documentation changes¶
- Updated link to the publicly hosted documentation.
2.7.1¶
Documentation changes¶
- Online documentation hosting has migrated from PythonHosted to Read The Docs. Minor code changes have been made to support this.
2.7.0¶
New features¶
- Lift chart data for models can be retrieved using the Model.get_lift_chart and Model.get_all_lift_charts methods.
- ROC curve data for models in classification projects can be retrieved using the Model.get_roc_curve and Model.get_all_roc_curves methods.
- Semi-automatic autopilot mode has been removed.
- Word cloud data for text processing models can be retrieved using the Model.get_word_cloud method.
- A scoring code JAR file can be downloaded for models supporting code generation.
Enhancements¶
- A __repr__ method has been added to the PredictionDataset class to improve readability when using the client interactively.
- Model.get_parameters now includes an additional key in the derived features it includes, showing the coefficients for individual stages of multistage models (e.g. Frequency-Severity models).
- When training a DatetimeModel on a window of data, a time_window_sample_pct can be specified to take a uniform random sample of the training data instead of using all data within the window.
- Installing the DataRobot package now has an "Extra Requirements" section that will install all of the dependencies needed to run the example notebooks.
Documentation changes¶
- A new example notebook describing how to visualize some of the newly available model insights, including lift charts, ROC curves, and word clouds, has been added to the examples section.
- A new section for Common Issues has been added to Getting Started to help debug issues related to client installation and usage.
2.6.1¶
Bugfixes¶
- Fixed a bug with Model.get_parameters raising an exception on some valid parameter values.
Documentation changes¶
- Fixed sorting order in Feature Impact example code snippet.
2.6.0¶
New features¶
- A new partitioning method (datetime partitioning) has been added. The recommended workflow is to preview the partitioning by creating a DatetimePartitioningSpecification and passing it into DatetimePartitioning.generate, inspect the results, adjust as needed for the specific project dataset by modifying the DatetimePartitioningSpecification and re-generating, and then set the target by passing the final DatetimePartitioningSpecification object to the partitioning_method parameter of Project.set_target.
- When interacting with datetime partitioned projects, DatetimeModel can be used to access more information specific to models in datetime partitioned projects. See :ref:`the documentation <datetime_modeling_workflow>` for more information on differences in the modeling workflow for datetime partitioned projects.
- The advanced options available when setting the target have been extended to include the new parameters 'offset' and 'exposure' (part of the AdvancedOptions object) to allow specifying offset and exposure columns to apply to predictions generated by models within the project. See the user guide documentation in the webapp for more information on offset and exposure columns.
- Blueprints can now be retrieved directly by project_id and blueprint_id via Blueprint.get.
- Blueprint charts can now be retrieved directly by project_id and blueprint_id via BlueprintChart.get. If you already have an instance of Blueprint, you can retrieve its chart using Blueprint.get_chart.
- Model parameters can now be retrieved using ModelParameters.get. If you already have an instance of Model, you can retrieve its parameters using Model.get_parameters.
- Blueprint documentation can now be retrieved using Blueprint.get_documents. It will contain information about the task, its parameters, and (when available) links and references to additional sources.
- The DataRobot API now includes Reason Codes. You can now compute reason codes for prediction datasets. You are able to specify thresholds on which rows to compute reason codes for, to speed up computation by skipping rows based on the predictions they generate. See the reason codes :ref:`documentation <reason_codes>` for more information.
Enhancements¶
- A new parameter has been added to the AdvancedOptions used with Project.set_target. By specifying accuracyOptimizedMb=True when creating AdvancedOptions, longer-running models that may have a high accuracy will be included in the autopilot and made available to run manually.
- A new option for Project.create_type_transform_feature has been added which explicitly truncates data when casting numerical data as categorical data.
- Added 2 new blenders for projects that use MAD or Weighted MAD as a metric. The MAE blender uses BFGS optimization to find linear weights for the blender that minimize mean absolute error (compared to the GLM blender, which finds linear weights that minimize RMSE), and the MAEL1 blender uses BFGS optimization to find linear weights that minimize MAE plus an L1 penalty on the coefficients (compared to the ENET blender, which minimizes RMSE plus a combination of the L1 and L2 penalties on the coefficients).
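The distinction the blender entry draws between minimizing MAE and minimizing RMSE comes down to the two loss functions, which weight large residuals differently. Computed on toy numbers (not a blender implementation, just the metrics themselves):

```python
import math

def mae(actual, predicted):
    # mean absolute error: average magnitude of the residuals
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # root mean squared error: penalizes large residuals more heavily
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [1.0, 2.0, 3.0, 10.0]
predicted = [1.0, 2.0, 3.0, 2.0]  # one large miss of 8

# The single large residual dominates RMSE far more than MAE
assert mae(actual, predicted) == 2.0
assert rmse(actual, predicted) == 4.0
```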
Bugfixes¶
- Fixed a bug (affecting Python 2 only) with printing any model (including frozen and prime models) whose model_type is not ASCII.
- FrozenModels were unable to correctly use methods inherited from Model. This has been fixed.
- When calling get_result for a Job, ModelJob, or PredictJob that has errored, AsyncProcessUnsuccessfulError will now be raised instead of JobNotFinished, consistent with the behavior of get_result_when_complete.
Deprecation summary¶
- Support for the experimental Recommender Problems projects has been removed. Any code relying on RecommenderSettings or the recommender_settings argument of Project.set_target and Project.start will error.
- Project.update, deprecated in v2.2.32, has been removed in favor of specific updates: rename, unlock_holdout, and set_worker_count.
Documentation changes¶
- The link to Configuration from the Quickstart page has been fixed.
2.5.1¶
Bugfixes¶
- Fixed a bug (affecting Python 2 only) with printing blueprints whose names are not ASCII.
- Fixed an issue where the weights column (for weighted projects) did not appear in the advanced_options of a Project.
2.5.0¶
New features¶
- Methods to work with blender models have been added. Use the Project.blend method to create new blenders, Project.get_blenders to get the list of existing blenders, and BlenderModel.get to retrieve a model with blender-specific information.
- Projects created via the API can now use smart downsampling when setting the target by passing smart_downsampled and majority_downsampling_rate into the AdvancedOptions object used with Project.set_target. The smart sampling options used with an existing project will be available as part of Project.advanced_options.
- Support for frozen models, which use tuning parameters from a parent model for more efficient training, has been added. Use Model.request_frozen_model to create a new frozen model, Project.get_frozen_models to get the list of existing frozen models, and FrozenModel.get to retrieve a particular frozen model.
Enhancements¶
- The inferred date format (e.g. "%Y-%m-%d %H:%M:%S") is now included in the Feature object. For non-date features, it will be None.
- When specifying the API endpoint in the configuration, the client will now behave correctly for endpoints with and without trailing slashes.
2.4.0¶
New features¶
- The premium add-on product DataRobot Prime has been added. You can now approximate a model on the leaderboard and download executable code for it. See the documentation for further details, or talk to your account representative if the feature is not available on your account.
- (Only relevant for on-premise users with a Standalone Scoring cluster.) Methods (request_transferable_export and download_export) have been added to the Model class for exporting models (which will only work if model export is turned on). There is a new class, ImportedModel, for managing imported models on a Standalone Scoring cluster.
- It is now possible to create projects from a WebHDFS, PostgreSQL, Oracle or MySQL data source. For more information, see the documentation for the relevant Project classmethods: create_from_hdfs, create_from_postgresql, create_from_oracle and create_from_mysql.
- Job.wait_for_completion, which waits for a job to complete without returning anything, has been added.
Enhancements¶
- The client will now check the API version offered by the server specified in configuration, and give a warning if the client version is newer than the server version. The DataRobot server is always backwards compatible with old clients, but new clients may have functionality that is not implemented on older server versions. This issue mainly affects users with on-premise deployments of DataRobot.
Bugfixes¶
- Fixed an issue where Model.request_predictions might raise an error when predictions finished very quickly, instead of returning the job.
API changes¶
- To set the target with quickrun autopilot, call Project.set_target with mode=AUTOPILOT_MODE.QUICK instead of specifying quickrun=True.
Deprecation summary¶
- Semi-automatic mode for autopilot has been deprecated and will be removed in 3.0. Use manual or fully automatic instead.
- Use of the quickrun argument in Project.set_target has been deprecated and will be removed in 3.0. Use mode=AUTOPILOT_MODE.QUICK instead.
Configuration changes¶
- It is now possible to control SSL certificate verification by setting the parameter ssl_verify in the config file.
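For reference, a minimal config file using this option might look like the sketch below. The endpoint and token values are placeholders, and treating ssl_verify as a boolean is an assumption based on the entry above; check the client's configuration documentation for the exact accepted values.

```yaml
# ~/.config/datarobot/drconfig.yaml -- values are placeholders
endpoint: https://app.datarobot.com/api/v2
token: YOUR_API_TOKEN
ssl_verify: false  # disable SSL certificate verification (not recommended outside testing)
```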
Documentation changes¶
- The “Modeling Airline Delay” example notebook has been updated to work with the new 2.3 enhancements.
- Documentation for the generic Job class has been added.
- Class attributes are now documented in the API Reference section of the documentation.
- The changelog now appears in the documentation.
- There is a new section dedicated to configuration, which lists all of the configuration options and their meanings.
2.3.0¶
New features¶
- The DataRobot API now includes Feature Impact, an approach to measuring the relevance of each feature that can be applied to any model. The Model class now includes the methods request_feature_impact (which creates and returns a feature impact job) and get_feature_impact (which can retrieve completed feature impact results).
- A new improved workflow for predictions now supports first uploading a dataset via Project.upload_dataset, then requesting predictions via Model.request_predictions. This allows us to better support predictions on larger datasets and non-ascii files.
- Datasets previously uploaded for predictions (represented by the PredictionDataset class) can be listed from Project.get_datasets and retrieved and deleted via PredictionDataset.get and PredictionDataset.delete.
- You can now create a new feature by re-interpreting the type of an existing feature in a project using the Project.create_type_transform_feature method.
- The Job class now includes a get method for retrieving a job and a cancel method for canceling a job.
- All of the job classes (Job, ModelJob, PredictJob) now include the following new methods: refresh (for refreshing the data in the job object), get_result (for getting the completed resource resulting from the job), and get_result_when_complete (which waits until the job is complete and returns the results, or times out).
- A new method, Project.refresh, can be used to update Project objects with the latest state from the server.
- A new function, datarobot.async.wait_for_async_resolution, can be used to poll for the resolution of any generic asynchronous operation on the server.
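The refresh / get_result / get_result_when_complete pattern described above can be sketched generically. The classes and statuses below are stand-ins for illustration, not the datarobot implementation:

```python
import time

class JobNotFinished(Exception):
    pass

class StubJob:
    """Stand-in for a DataRobot job that completes after a few refreshes."""
    def __init__(self, ticks_until_done=3):
        self._ticks = ticks_until_done
        self.status = 'inprogress'

    def refresh(self):
        # Re-fetch job state (here, just count down to completion).
        self._ticks -= 1
        if self._ticks <= 0:
            self.status = 'COMPLETED'

    def get_result(self):
        # Raises if the job is not yet finished.
        if self.status != 'COMPLETED':
            raise JobNotFinished(self.status)
        return {'model_id': 'abc123'}   # hypothetical completed resource

    def get_result_when_complete(self, max_wait=600, poll_interval=0.01):
        # Poll until the job finishes or the deadline passes.
        deadline = time.monotonic() + max_wait
        while time.monotonic() < deadline:
            self.refresh()
            try:
                return self.get_result()
            except JobNotFinished:
                time.sleep(poll_interval)
        raise TimeoutError('job did not finish in time')

result = StubJob().get_result_when_complete()
print(result)  # {'model_id': 'abc123'}
```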
Enhancements¶
- The JOB_TYPE enum now includes FEATURE_IMPACT.
- The QUEUE_STATUS enum now includes ABORTED and COMPLETED.
- The Project.create method now has a read_timeout parameter which can be used to keep the connection to DataRobot open while an uploaded file is being processed. For very large files this time can be substantial, so appropriately raising this value can help avoid timeouts when uploading large files.
- The method Project.wait_for_autopilot has been enhanced to error if the project enters a state where autopilot may not finish. This avoids a situation that existed previously where users could wait indefinitely on a project that was not going to finish. However, users are still responsible for making sure the project has more than zero workers and that the queue is not paused.
- Feature.get now supports retrieving features by feature name. (For backwards compatibility, feature IDs are still supported until 3.0.)
- File paths that have unicode directory names can now be used for creating projects and PredictJobs. The filename itself must still be ascii, but containing directory names can have other encodings.
- The client now raises the more specific JobAlreadyRequested exception when a model fitting request is refused as a duplicate. Users can explicitly catch this exception if they want it to be ignored.
- A file_name attribute has been added to the Project class, identifying the file name associated with the original project dataset. Note that if the project was created from a data frame, the file name may not be helpful.
- The connect timeout for establishing a connection to the server can now be set directly. This can be done in the yaml configuration of the client, or directly in the code. The default timeout has been lowered from 60 seconds to 6 seconds, which will make detecting a bad connection happen much quicker.
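The effect of a short connect timeout can be seen with a plain socket sketch; nothing here is DataRobot-specific. 192.0.2.1 is a reserved TEST-NET address that should never accept connections, so the attempt fails quickly instead of hanging:

```python
import socket
import time

def can_connect(host, port, timeout):
    """Return True if a TCP connection can be established within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refusals, and unreachable hosts
        return False

start = time.monotonic()
reachable = can_connect('192.0.2.1', 443, timeout=0.5)
elapsed = time.monotonic() - start
print(reachable)     # False: the bad connection is detected
print(elapsed < 5)   # the short timeout bounds how long we wait
```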
Bugfixes¶
- Fixed a bug (affecting Python 2 only) with printing features and featurelists whose names are not ascii.
API changes¶
- The Job class hierarchy has been rearranged to better express the relationship between these objects. See the documentation for datarobot.models.job for details.
- Featurelist objects now have a project_id attribute to indicate which project they belong to. Directly accessing the project attribute of a Featurelist object is now deprecated.
- Support for INI-style configuration, which was deprecated in v2.1, has been removed. yaml is the only supported configuration format.
- The Project.get_jobs method, which was deprecated in v2.1, has been removed. Users should use the Project.get_model_jobs method instead to get the list of model jobs.
Deprecation summary¶
- PredictJob.create has been deprecated in favor of the alternate workflow using Model.request_predictions.
- Feature.converter (used internally for object construction) has been made private.
- Model.fetch_resource_data has been deprecated and will be removed in 3.0. To fetch a model from its ID, use Model.get.
- The ability to use Feature.get with feature IDs (rather than names) is deprecated and will be removed in 3.0.
- Instantiating a Project, Model, Blueprint, Featurelist, or Feature instance from a dict of data is now deprecated. Please use the from_data classmethod of these classes instead. Additionally, instantiating a Model from a tuple or by using the keyword argument data is also deprecated.
- Use of the attribute Featurelist.project is now deprecated. You can use the project_id attribute of a Featurelist to instantiate a Project instance using Project.get.
- Use of the attributes Model.project, Model.blueprint, and Model.featurelist is now deprecated to avoid use of partially instantiated objects. Please use the ids of these objects instead.
- Using a Project instance as an argument in Featurelist.get is now deprecated. Please use a project_id instead. Similarly, using a Project instance in Model.get is also deprecated, and a project_id should be used in its place.
Configuration changes¶
- Previously it was possible (though unintended) that the client configuration could be mixed through environment variables, configuration files, and arguments to datarobot.Client. This logic is now simpler; please see the Getting Started section of the documentation for more information.
2.2.33¶
Bugfixes¶
- Fixed a bug with non-ascii project names using the package with Python 2.
- Fixed an error that occurred when printing projects that had been constructed from an ID only, or when printing models that had been constructed from a tuple (which impacted printing PredictJobs).
- Fixed a bug with project creation from non-ascii file names. Project creation from non-ascii file names is not supported, so this now raises a more informative exception. The project name is no longer used as the file name in cases where we do not have a file name, which prevents non-ascii project names from causing problems in those circumstances.
- Fixed a bug (affecting Python 2 only) with printing projects, features, and featurelists whose names are not ascii.
2.2.32¶
New features¶
- Project.get_features and Feature.get methods have been added for feature retrieval.
- A generic Job entity has been added for use in retrieving the entire queue at once. Calling Project.get_all_jobs will retrieve all (appropriately filtered) jobs from the queue. Those can be cancelled directly as generic jobs, or transformed into instances of the specific job class using ModelJob.from_job and PredictJob.from_job, which allow all functionality previously available via the ModelJob and PredictJob interfaces.
- Model.train now supports featurelist_id and scoring_type parameters, similar to Project.train.
Enhancements¶
- Deprecation warning filters have been updated. By default, a filter will be added ensuring that usage of deprecated features will display a warning once per new usage location. In order to hide deprecation warnings, a filter like warnings.filterwarnings('ignore', category=DataRobotDeprecationWarning) can be added to a script so no such warnings are shown. Watching for deprecation warnings to avoid reliance on deprecated features is recommended.
- If your client is misconfigured and does not specify an endpoint, the cloud production server is no longer used as the default, as in many cases this is not the correct default.
- This changelog is now included in the distributable of the client.
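The suppression filter described above uses the standard warnings machinery. The sketch below demonstrates it with a stand-in warning class so it runs without the client installed; in real code, import DataRobotDeprecationWarning from the datarobot package instead:

```python
import warnings

# Stand-in for the client's DataRobotDeprecationWarning; in actual code,
# import the real class from the datarobot package.
class DataRobotDeprecationWarning(DeprecationWarning):
    pass

with warnings.catch_warnings(record=True) as caught:
    # Hide all deprecation warnings from the client, as suggested above.
    warnings.filterwarnings('ignore', category=DataRobotDeprecationWarning)
    warnings.warn('this call is deprecated', DataRobotDeprecationWarning)

print(len(caught))  # 0: the warning was suppressed
```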
Bugfixes¶
- Fixed an issue where updating the global client would not affect existing objects with cached clients. Now the global client is used for every API call.
- An issue caused by mistyping a filepath for use in a file upload has been resolved. Now an error will be raised if it looks like the raw string content for modeling or predictions is just one single line.
API changes¶
- Use of username and password to authenticate is no longer supported - use an API token instead.
- Usage of the start_time and finish_time parameters in Project.get_models is no longer supported, both in filtering and ordering of models.
- The default value of the sample_pct parameter of the Model.train method is now None instead of 100. If the default value is used, models will be trained with all of the available training data based on project configuration, rather than with the entire dataset including holdout, as under the previous default value of 100.
- The order_by parameter of Project.list, which was deprecated in v2.0, has been removed.
- The recommendation_settings parameter of Project.start, which was deprecated in v0.2, has been removed.
- The Project.status method, which was deprecated in v0.2, has been removed.
- The Project.wait_for_aim_stage method, which was deprecated in v0.2, has been removed.
- The Delay, ConstantDelay, NoDelay, ExponentialBackoffDelay, and RetryManager classes from the retry module, which were deprecated in v2.1, have been removed.
- The package has been renamed to datarobot.
Deprecation summary¶
- Project.update deprecated in favor of specific updates: rename, unlock_holdout, set_worker_count.
Documentation changes¶
- A new use case involving financial data has been added to the examples directory.
- Added documentation for the partition methods.
2.1.31¶
Bugfixes¶
- In Python 2, using a unicode token to instantiate the client will now work correctly.
2.1.30¶
Bugfixes¶
- The minimum required version of trafaret has been upgraded to 0.7.1 to get around an incompatibility between it and setuptools.
2.1.29¶
Enhancements¶
- The minimum required version of the requests_toolbelt package has changed from 0.4 to 0.6.
2.1.28¶
New features¶
- Default to reading the YAML config file from ~/.config/datarobot/drconfig.yaml
- Allow a config_path argument to the client
- A wait_for_autopilot method has been added to Project. This method can be used to block execution until autopilot has finished running on the project.
- Support for specifying which featurelist to use with initial autopilot in Project.set_target
- A Project.get_predict_jobs method has been added, which looks up all prediction jobs for a project
- A Project.start_autopilot method has been added, which starts autopilot on a specified featurelist
- The schema for PredictJob in DataRobot API v2.1 now includes a message. This attribute has been added to the PredictJob class.
- PredictJob.cancel now exists to cancel prediction jobs, mirroring ModelJob.cancel
- Project.from_async is a new classmethod that can be used to wait for an async resolution in project creation. Most users will not need to know about it, as it is used behind the scenes in Project.create and Project.set_target, but power users who may run into periodic connection errors will be able to catch the new ProjectAsyncFailureError and decide if they would like to resume waiting for the async process to resolve.
Enhancements¶
- The AUTOPILOT_MODE enum now uses string names for autopilot modes instead of numbers
Deprecation summary¶
- The ConstantDelay, NoDelay, ExponentialBackoffDelay, and RetryManager utils are now deprecated
- INI-style config files are now deprecated (in favor of YAML config files)
- Several functions in the utils submodule are now deprecated (they are being moved elsewhere and are not considered part of the public interface)
- Project.get_jobs has been renamed Project.get_model_jobs for clarity, and the old name is deprecated
- Support for the experimental date partitioning has been removed in the DataRobot API, so it is being removed from the client immediately.
API changes¶
- In several places where AppPlatformError was being raised, TypeError, ValueError or InputNotUnderstoodError are now used. With this change, one can now safely assume that when catching an AppPlatformError it is because of an unexpected response from the server.
- AppPlatformError has gained two new attributes: status_code, which is the HTTP status code of the unexpected response from the server, and error_code, which is a DataRobot-defined error code. error_code is not used by any routes in DataRobot API 2.1, but will be in the future. In cases where it is not provided, the instance of AppPlatformError will have the attribute error_code set to None.
- Two new subclasses of AppPlatformError have been introduced: ClientError (for 400-level response status codes) and ServerError (for 500-level response status codes). These will make it easier to build automated tooling that can recover from periodic connection issues while polling.
- If a ClientError or ServerError occurs during a call to Project.from_async, then a ProjectAsyncFailureError (a subclass of AsyncFailureError) will be raised. That exception will have the status_code of the unexpected response from the server, and the location that was being polled to wait for the asynchronous process to resolve.
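The kind of automated tooling this hierarchy enables can be sketched as a retry loop that treats 500-level errors as transient and 400-level errors as fatal. The classes below are stand-ins mirroring the attributes described above; in real code, catch the exceptions exported by the datarobot package:

```python
# Stand-ins for the exception hierarchy described above (illustrative only).
class AppPlatformError(Exception):
    def __init__(self, message, status_code=None, error_code=None):
        super().__init__(message)
        self.status_code = status_code
        self.error_code = error_code

class ClientError(AppPlatformError):   # 400-level responses
    pass

class ServerError(AppPlatformError):   # 500-level responses
    pass

def poll_with_retries(poll, max_attempts=3):
    """Retry transient 500-level failures; let 400-level errors propagate,
    since a 4xx will not succeed on retry."""
    for attempt in range(max_attempts):
        try:
            return poll()
        except ServerError:
            if attempt == max_attempts - 1:
                raise

attempts = []
def flaky_poll():
    # Hypothetical poll that fails twice with a 504, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise ServerError('gateway timeout', status_code=504)
    return 'done'

result = poll_with_retries(flaky_poll)
print(result)  # done
```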
2.0.27¶
New features¶
- PredictJob class was added to work with prediction jobs
- wait_for_async_predictions function added to the predict_job module
Deprecation summary¶
- The order_by parameter of Project.list is now deprecated.
0.2.26¶
Enhancements¶
- Project.set_target will re-fetch the project data after it succeeds, keeping the client side in sync with the state of the project on the server
- Project.create_featurelist now throws a DuplicateFeaturesError exception if the passed list of features contains duplicates
- Project.get_models now supports snake_case arguments to its order_by keyword
Deprecation summary¶
- Project.wait_for_aim_stage is now deprecated, as the REST Async flow is a more reliable method of determining that project creation has completed successfully
- Project.status is deprecated in favor of Project.get_status
- The recommendation_settings parameter of Project.start is deprecated in favor of recommender_settings
Bugfixes¶
- Project.wait_for_aim_stage changed to support Python 3
- Fixed incorrect value of SCORING_TYPE.cross_validation
- Models returned by Project.get_models will now be correctly ordered when the order_by keyword is used
0.2.25¶
- Pinned versions of required libraries
0.2.24¶
Official release of v0.2
0.1.24¶
- Updated documentation
- Renamed parameter name of Project.create and Project.start to project_name
- Removed Model.predict method
- wait_for_async_model_creation function added to the modeljob module
- wait_for_async_status_service of the Project class renamed to _wait_for_async_status_service
- Can now use auth_token in config file to configure SDK
0.1.23¶
- Fixes a method that pointed to a removed route
0.1.22¶
- Added featurelist_id attribute to the ModelJob class
0.1.21¶
- Removes model attribute from the ModelJob class
0.1.20¶
- Project creation raises AsyncProjectCreationError if it was unsuccessful
- Removed Model.list_prime_rulesets and Model.get_prime_ruleset methods
- Removed Model.predict_batch method
- Removed Project.create_prime_model method
- Removed PrimeRuleSet model
- Adds backwards compatibility bridge for ModelJob async
- Adds ModelJob.get and ModelJob.get_model
0.1.19¶
- Minor bugfixes in wait_for_async_status_service
0.1.18¶
- Removes submit_model from Project until server-side implementation is improved
- Switches training URLs for the new resource-based route at /projects/ /models/
- Job renamed to ModelJob, now using the modelJobs route
- Fixes an inconsistency in argument order for train methods
0.1.17¶
- wait_for_async_status_service timeout increased from 60s to 600s
0.1.16¶
- Project.create will now handle both async and sync project creation
0.1.15¶
- All routes pluralized to sync with changes in the API
- Project.get_jobs will request all jobs when no param is specified
- Dataframes from the predict method will have pythonic names
- Project.get_status created; Project.status now deprecated
- Project.unlock_holdout created
- Added quickrun parameter to Project.set_target
- Added modelCategory to the Model schema
- Added permalinks feature to Project and Model objects
- Project.create_prime_model created
0.1.14¶
- Project.set_worker_count fix for compatibility with an API change in project update.
0.1.13¶
- Add positive class to set_target.
- Changed attribute names of Project, Model, Job and Blueprint:
  - features in Model, Job and Blueprint are now processes
  - dataset_id and dataset_name migrated to featurelist_id and featurelist_name
  - samplepct -> sample_pct
- Model now has blueprint, project, and featurelist attributes.
- Minor bugfixes.
0.1.12¶
- Minor fixes regarding renamed Job attributes: features attributes are now named processes, and samplepct is now sample_pct.
0.1.11¶
(May 27, 2015)
- Minor fixes regarding migrating API from under_score names to camelCase.
0.1.10¶
(May 20, 2015)
- Removed the Project.upload_file, Project.upload_file_from_url and Project.attach_file methods. All file-uploading logic has been moved into the Project.create method.
0.1.9¶
(May 15, 2015)
- Fixed file uploads causing excessive memory usage. Minor bugfixes.