# Data management

> Data management - These are features either included in the originally uploaded dataset or added to
> it via feature transformations. In time series projects, these will be distinct from
> the ModelingFeatures created during partitioning; otherwise, they will correspond to the same
> features. For more information about input and modeling features, see the time series documentation.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-06T18:17:09.833072+00:00` (UTC).

## Primary page

- [Data management](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html): Full documentation for this topic (HTML).

## Sections on this page

- [classdatarobot.models.Feature](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature): In-page section heading.
- [classmethodget(project_id, feature_name)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get): In-page section heading.
- [get_multiseries_properties(multiseries_id_columns, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_multiseries_properties): In-page section heading.
- [get_cross_series_properties(datetime_partition_column, cross_series_group_by_columns, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_cross_series_properties): In-page section heading.
- [get_multicategorical_histogram()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_multicategorical_histogram): In-page section heading.
- [get_pairwise_correlations()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_pairwise_correlations): In-page section heading.
- [get_pairwise_joint_probabilities()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_pairwise_joint_probabilities): In-page section heading.
- [get_pairwise_conditional_probabilities()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_pairwise_conditional_probabilities): In-page section heading.
- [classmethodfrom_data(data)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.from_data): In-page section heading.
- [classmethodfrom_server_data(data, keep_attrs=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.from_server_data): In-page section heading.
- [get_histogram(bin_limit=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_histogram): In-page section heading.
- [classdatarobot.models.ModelingFeature](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ModelingFeature): In-page section heading.
- [classmethodget(project_id, feature_name)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ModelingFeature.get): In-page section heading.
- [classdatarobot.models.DatasetFeature](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.DatasetFeature): In-page section heading.
- [get_histogram(bin_limit=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.DatasetFeature.get_histogram): In-page section heading.
- [classdatarobot.models.DatasetFeatureHistogram](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.DatasetFeatureHistogram): In-page section heading.
- [classmethodget(dataset_id, feature_name, bin_limit=None, key_name=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.DatasetFeatureHistogram.get): In-page section heading.
- [classdatarobot.models.FeatureHistogram](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.FeatureHistogram): In-page section heading.
- [classmethodget(project_id, feature_name, bin_limit=None, key_name=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.FeatureHistogram.get): In-page section heading.
- [classdatarobot.models.InteractionFeature](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.InteractionFeature): In-page section heading.
- [classmethodget(project_id, feature_name)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.InteractionFeature.get): In-page section heading.
- [classdatarobot.models.MulticategoricalHistogram](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.MulticategoricalHistogram): In-page section heading.
- [classmethodget(multilabel_insights_key)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.MulticategoricalHistogram.get): In-page section heading.
- [to_dataframe()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.MulticategoricalHistogram.to_dataframe): In-page section heading.
- [classdatarobot.models.PairwiseCorrelations](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseCorrelations): In-page section heading.
- [classmethodget(multilabel_insights_key)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseCorrelations.get): In-page section heading.
- [as_dataframe()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseCorrelations.as_dataframe): In-page section heading.
- [classdatarobot.models.PairwiseJointProbabilities](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseJointProbabilities): In-page section heading.
- [classmethodget(multilabel_insights_key)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseJointProbabilities.get): In-page section heading.
- [as_dataframe(relevance_configuration)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseJointProbabilities.as_dataframe): In-page section heading.
- [classdatarobot.models.PairwiseConditionalProbabilities](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseConditionalProbabilities): In-page section heading.
- [classmethodget(multilabel_insights_key)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseConditionalProbabilities.get): In-page section heading.
- [as_dataframe(relevance_configuration)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.PairwiseConditionalProbabilities.as_dataframe): In-page section heading.
- [Restoring Discarded Features](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#restoring-discarded-features): In-page section heading.
- [classdatarobot.models.restore_discarded_features.DiscardedFeaturesInfo](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.restore_discarded_features.DiscardedFeaturesInfo): In-page section heading.
- [classmethodrestore(project_id, features_to_restore, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.restore_discarded_features.DiscardedFeaturesInfo.restore): In-page section heading.
- [classmethodretrieve(project_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.restore_discarded_features.DiscardedFeaturesInfo.retrieve): In-page section heading.
- [classdatarobot.models.restore_discarded_features.FeatureRestorationStatus](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.restore_discarded_features.FeatureRestorationStatus): In-page section heading.
- [Feature lists](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#feature-lists): In-page section heading.
- [classdatarobot.DatasetFeaturelist](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.DatasetFeaturelist): In-page section heading.
- [classmethodget(dataset_id, featurelist_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.DatasetFeaturelist.get): In-page section heading.
- [delete()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.DatasetFeaturelist.delete): In-page section heading.
- [update(name=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.DatasetFeaturelist.update): In-page section heading.
- [classdatarobot.models.Featurelist](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Featurelist): In-page section heading.
- [classmethodfrom_data(data)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Featurelist.from_data): In-page section heading.
- [classmethodget(project_id, featurelist_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Featurelist.get): In-page section heading.
- [delete(dry_run=False, delete_dependencies=False)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Featurelist.delete): In-page section heading.
- [classmethodfrom_server_data(data, keep_attrs=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Featurelist.from_server_data): In-page section heading.
- [update(name=None, description=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Featurelist.update): In-page section heading.
- [classdatarobot.models.ModelingFeaturelist](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ModelingFeaturelist): In-page section heading.
- [classmethodget(project_id, featurelist_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ModelingFeaturelist.get): In-page section heading.
- [update(name=None, description=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ModelingFeaturelist.update): In-page section heading.
- [delete(dry_run=False, delete_dependencies=False)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ModelingFeaturelist.delete): In-page section heading.
- [classdatarobot.models.featurelist.DeleteFeatureListResult](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.featurelist.DeleteFeatureListResult): In-page section heading.
- [Dataset definition](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#dataset-definition): In-page section heading.
- [classdatarobot.helpers.feature_discovery.DatasetDefinition](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.helpers.feature_discovery.DatasetDefinition): In-page section heading.
- [Relationships](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#relationships): In-page section heading.
- [classdatarobot.helpers.feature_discovery.Relationship](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.helpers.feature_discovery.Relationship): In-page section heading.
- [Relationships configuration](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#relationships-configuration): In-page section heading.
- [classdatarobot.models.RelationshipsConfiguration](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.RelationshipsConfiguration): In-page section heading.
- [classmethodcreate(dataset_definitions, relationships, feature_discovery_settings=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.RelationshipsConfiguration.create): In-page section heading.
- [get()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.RelationshipsConfiguration.get): In-page section heading.
- [replace(dataset_definitions, relationships, feature_discovery_settings=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.RelationshipsConfiguration.replace): In-page section heading.
- [delete()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.RelationshipsConfiguration.delete): In-page section heading.
- [Feature lineage](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#feature-lineage): In-page section heading.
- [classdatarobot.models.FeatureLineage](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.FeatureLineage): In-page section heading.
- [classmethodget(project_id, id)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.FeatureLineage.get): In-page section heading.
- [OCR job resources](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#ocr-job-resources): In-page section heading.
- [classdatarobot.models.ocr_job_resource.OCRJobResource](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobResource): In-page section heading.
- [classmethodget(job_resource_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobResource.get): In-page section heading.
- [classmethodlist(offset=0, limit=10)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobResource.list): In-page section heading.
- [classmethodcreate(input_catalog_id, language, engine_specific_parameters=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobResource.create): In-page section heading.
- [start_job()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobResource.start_job): In-page section heading.
- [get_job_status()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobResource.get_job_status): In-page section heading.
- [download_error_report(download_file_path)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobResource.download_error_report): In-page section heading.
- [classmethodfrom_server_data(data, keep_attrs=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobResource.from_server_data): In-page section heading.
- [classdatarobot.models.ocr_job_resource.OCREngineSpecificParameters](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCREngineSpecificParameters): In-page section heading.
- [get_payload()](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCREngineSpecificParameters.get_payload): In-page section heading.
- [classdatarobot.models.ocr_job_resource.OCRJobDatasetLanguage](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobDatasetLanguage): In-page section heading.
- [classdatarobot.models.ocr_job_resource.DataRobotOCREngineType](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.DataRobotOCREngineType): In-page section heading.
- [classdatarobot.models.ocr_job_resource.DataRobotArynOutputFormat](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.DataRobotArynOutputFormat): In-page section heading.
- [classdatarobot.models.ocr_job_resource.OCRJobStatusEnum](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.OCRJobStatusEnum): In-page section heading.
- [classdatarobot.models.ocr_job_resource.StartOCRJobResponse](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ocr_job_resource.StartOCRJobResponse): In-page section heading.
- [Document text extraction](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#document-text-extraction): In-page section heading.
- [classdatarobot.models.documentai.document.FeaturesWithSamples](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.FeaturesWithSamples): In-page section heading.
- [document_task](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.FeaturesWithSamples.document_task): In-page section heading.
- [feature_name](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.FeaturesWithSamples.feature_name): In-page section heading.
- [model_id](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.FeaturesWithSamples.model_id): In-page section heading.
- [classdatarobot.models.documentai.document.DocumentPageFile](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentPageFile): In-page section heading.
- [propertythumbnail_bytes: bytes](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentPageFile.thumbnail_bytes): In-page section heading.
- [propertymime_type: str](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentPageFile.mime_type): In-page section heading.
- [classdatarobot.models.documentai.document.DocumentThumbnail](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentThumbnail): In-page section heading.
- [classmethodlist(project_id, feature_name, target_value=None, offset=None, limit=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentThumbnail.list): In-page section heading.
- [classdatarobot.models.documentai.document.DocumentTextExtractionSample](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSample): In-page section heading.
- [classmethodcompute(model_id, await_completion=True, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSample.compute): In-page section heading.
- [classmethodlist_features_with_samples(project_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSample.list_features_with_samples): In-page section heading.
- [classmethodlist_pages(model_id, feature_name, document_index=None, document_task=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSample.list_pages): In-page section heading.
- [classmethodlist_documents(model_id, feature_name)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSample.list_documents): In-page section heading.
- [classdatarobot.models.documentai.document.DocumentTextExtractionSampleDocument](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSampleDocument): In-page section heading.
- [classmethodlist(model_id, feature_name, document_task=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSampleDocument.list): In-page section heading.
- [classdatarobot.models.documentai.document.DocumentTextExtractionSamplePage](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSamplePage): In-page section heading.
- [classmethodlist(model_id, feature_name, document_index=None, document_task=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSamplePage.list): In-page section heading.
- [get_document_page_with_text_locations(line_color='blue', line_width=3, padding=3)](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.documentai.document.DocumentTextExtractionSamplePage.get_document_page_with_text_locations): In-page section heading.

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [API reference](https://docs.datarobot.com/en/docs/api/reference/index.html): Linked from this page.
- [Python API client](https://docs.datarobot.com/en/docs/api/reference/sdk/index.html): Linked from this page.
- [Data preparation](https://docs.datarobot.com/en/docs/api/reference/sdk/tag-data-prep.html): Linked from this page.
- [time series documentation](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/time_series.html#input-vs-modeling): Linked from this page.
- [datarobot.errors.InvalidUsageError](https://docs.datarobot.com/en/docs/api/reference/sdk/errors.html#datarobot.errors.InvalidUsageError): Linked from this page.

## Documentation content

### class datarobot.models.Feature

A feature from a project’s dataset

These are features either included in the originally uploaded dataset or added to it via
feature transformations.  In time series projects, these will be distinct from the [ModelingFeature](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.ModelingFeature) objects created during partitioning;
otherwise, they will correspond to the same features.  For more information about input and
modeling features, see the [time series documentation](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/time_series.html#input-vs-modeling).

The `min`, `max`, `mean`, `median`, and `std_dev` attributes provide information about
the distribution of the feature in the EDA sample data.  For non-numeric features or features
created prior to these summary statistics becoming available, they will be None.  For features
where the summary statistics are available, they will be in a format compatible with the data
type; for example, date type features will have their summary statistics expressed as ISO-8601
formatted date strings.

- Variables:

#### classmethod get(project_id, feature_name)

Retrieve a single feature

- Parameters:
- Returns: feature – The queried instance
- Return type: Feature
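
As a minimal sketch of how the summary statistics described above might be read, the helper below keeps only the `min`, `max`, `mean`, `median`, and `std_dev` attributes that are not None. The function names `fetch_feature` and `available_summary_stats` are hypothetical, and the stand-in object with made-up values exists only so the snippet runs without a DataRobot connection.

```python
from types import SimpleNamespace

SUMMARY_ATTRS = ("min", "max", "mean", "median", "std_dev")


def fetch_feature(project_id, feature_name):
    """Retrieve a Feature; needs the datarobot package and a configured client."""
    # Imported lazily so the rest of the sketch runs offline.
    from datarobot.models import Feature

    return Feature.get(project_id, feature_name)


def available_summary_stats(feature):
    """Return only the summary statistics that are not None."""
    stats = {}
    for attr in SUMMARY_ATTRS:
        value = getattr(feature, attr, None)
        if value is not None:
            stats[attr] = value
    return stats


# Hypothetical stand-in for a date feature: statistics are ISO-8601 strings,
# and std_dev is set to None purely for illustration.
date_feature = SimpleNamespace(
    min="2020-01-01",
    max="2020-12-31",
    mean="2020-07-01",
    median="2020-07-02",
    std_dev=None,
)
print(available_summary_stats(date_feature))
```

With a live connection, the object returned by `fetch_feature` could be passed to `available_summary_stats` in place of the stand-in.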

#### get_multiseries_properties(multiseries_id_columns, max_wait=600)

Retrieve time series properties for a potential multiseries datetime partition column

Multiseries time series projects use multiseries id columns to model multiple distinct
series within a single project.  This function returns the time series properties (time step
and time unit) of this column if it were used as a datetime partition column with the
specified multiseries id columns, running multiseries detection automatically if it has not
previously been run successfully.

- Parameters:
- Returns: properties – A dict with three keys:
  - time_series_eligible : bool, whether the column can be used as a partition column
  - time_unit : str or null, the inferred time unit if used as a partition column
  - time_step : int or null, the inferred time step if used as a partition column
- Return type: dict
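
The returned dict could be interpreted along the following lines. This is a sketch only: the helper name `describe_multiseries_properties` is hypothetical, and the sample values (including the `"DAY"` time unit) are made up for illustration rather than taken from a real project.

```python
def describe_multiseries_properties(properties):
    """Summarize the dict returned by get_multiseries_properties."""
    if not properties["time_series_eligible"]:
        return "not eligible as a datetime partition column"
    step, unit = properties["time_step"], properties["time_unit"]
    if step is None or unit is None:
        # The documentation allows both values to be null.
        return "eligible, but no time step or unit was inferred"
    return f"eligible with an inferred step of {step} {unit}"


# Hypothetical example values, not real server output.
sample_properties = {
    "time_series_eligible": True,
    "time_unit": "DAY",
    "time_step": 1,
}
print(describe_multiseries_properties(sample_properties))
```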

#### get_cross_series_properties(datetime_partition_column, cross_series_group_by_columns, max_wait=600)

Retrieve cross-series properties for multiseries ID column.

This function returns the cross-series properties (eligibility
as group-by column) of this column if it were used with the specified datetime partition column
and with the current multiseries id column, running cross-series group-by validation
automatically if it has not previously been run successfully.

- Parameters:
- Returns: properties – A dict with three keys:
  - name : str, column name
  - eligibility : str, reason for column eligibility
  - isEligible : bool, is column eligible as cross-series group-by
- Return type: dict
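
When several candidate columns are checked, the resulting dicts might be filtered as sketched below. The helper name `eligible_group_by_columns`, the column names, and the `eligibility` strings are all hypothetical. Each call to `get_cross_series_properties` returns a single dict, so the list here is assumed to be collected by the caller.

```python
def eligible_group_by_columns(properties_list):
    """Return the names of columns flagged as eligible cross-series group-by columns."""
    return [props["name"] for props in properties_list if props["isEligible"]]


# Hypothetical results gathered from separate calls, one dict per column.
collected = [
    {"name": "region", "eligibility": "example reason text", "isEligible": True},
    {"name": "notes", "eligibility": "example reason text", "isEligible": False},
]
print(eligible_group_by_columns(collected))
```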

#### get_multicategorical_histogram()

Retrieve multicategorical histogram for this feature

Added in version v2.24.

- Return type: datarobot.models.MulticategoricalHistogram
- Raises:

#### get_pairwise_correlations()

Retrieve pairwise label correlation for multicategorical features

Added in version v2.24.

- Return type: datarobot.models.PairwiseCorrelations
- Raises:

#### get_pairwise_joint_probabilities()

Retrieve pairwise label joint probabilities for multicategorical features

Added in version v2.24.

- Return type: datarobot.models.PairwiseJointProbabilities
- Raises:

#### get_pairwise_conditional_probabilities()

Retrieve pairwise label conditional probabilities for multicategorical features

Added in version v2.24.

- Return type: datarobot.models.PairwiseConditionalProbabilities
- Raises:

#### classmethod from_data(data)

Instantiate an object of this class using a dict.

- Parameters: data ( dict ) – Correctly snake_cased keys and their values.
- Return type: TypeVar ( T , bound= APIObject)

#### classmethod from_server_data(data, keep_attrs=None)

Instantiate an object of this class using the data directly from the server,
meaning that the keys may be camelCased rather than snake_cased

- Parameters:
- Return type: TypeVar ( T , bound= APIObject)
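
To illustrate the difference between server-style keys and the snake_cased keys that `from_data` expects, here is a small sketch of a key conversion. It is not the SDK's implementation: the helpers `camel_to_snake` and `snake_case_keys` are hypothetical, the sample payload is made up, and only top-level keys are converted.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase key such as 'featureType' to 'feature_type'."""
    # Insert an underscore before every uppercase letter except at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def snake_case_keys(data):
    """Return a copy of a dict with its top-level keys converted to snake_case."""
    return {camel_to_snake(key): value for key, value in data.items()}


# Hypothetical server-style payload.
server_data = {"featureType": "Numeric", "projectId": "abc123", "name": "age"}
print(snake_case_keys(server_data))
```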

#### get_histogram(bin_limit=None)

Retrieve a feature histogram

- Parameters: bin_limit ( int or None ) – Desired max number of histogram bins. If omitted, the
  endpoint uses 60 by default.
- Returns: featureHistogram – The requested histogram with the desired number of bins
- Return type: FeatureHistogram

### class datarobot.models.ModelingFeature

A feature used for modeling

In time series projects, a new set of modeling features is created after setting the
partitioning options.  These features are automatically derived from those in the project’s
dataset and are the features used for modeling.  Modeling features are only accessible once
the target and partitioning options have been set.  In projects that don’t use time series
modeling, once the target has been set, ModelingFeatures and Features will behave
the same.

For more information about input and modeling features, see the [time series documentation](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/time_series.html#input-vs-modeling).

As with the [Feature](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature) object, the `min`, `max`, `mean`,
`median`, and `std_dev` attributes provide information about the distribution of the feature in
the EDA sample data.  For non-numeric features, they will be None.  For features where the
summary statistics are available, they will be in a format compatible with the data type; for
example, date type features will have their summary statistics expressed as ISO-8601 formatted
date strings.

- Variables:

#### classmethod get(project_id, feature_name)

Retrieve a single modeling feature

- Parameters:
- Returns: feature – The requested feature
- Return type: ModelingFeature

### class datarobot.models.DatasetFeature

A feature from a project’s dataset

These are features either included in the originally uploaded dataset or added to it via
feature transformations.

The `min`, `max`, `mean`, `median`, and `std_dev` attributes provide information about
the distribution of the feature in the EDA sample data.  For non-numeric features or features
created prior to these summary statistics becoming available, they will be None.  For features
where the summary statistics are available, they will be in a format compatible with the data
type; for example, date type features will have their summary statistics expressed as ISO-8601
formatted date strings.

- Variables:

#### get_histogram(bin_limit=None)

Retrieve a feature histogram

- Parameters: bin_limit ( int or None ) – Desired max number of histogram bins. If omitted, the
  endpoint uses 60 by default.
- Returns: featureHistogram – The requested histogram with the desired number of bins
- Return type: DatasetFeatureHistogram

### class datarobot.models.DatasetFeatureHistogram

#### classmethod get(dataset_id, feature_name, bin_limit=None, key_name=None)

Retrieve a single feature histogram

- Parameters:
- Returns: featureHistogram – The queried instance with plot attribute in it.
- Return type: FeatureHistogram

### class datarobot.models.FeatureHistogram

#### classmethod get(project_id, feature_name, bin_limit=None, key_name=None)

Retrieve a single feature histogram

- Parameters:
- Returns: featureHistogram – The queried instance with plot attribute in it.
- Return type: FeatureHistogram

### class datarobot.models.InteractionFeature

Interaction feature data

Added in version v2.21.

- Variables:

#### classmethod get(project_id, feature_name)

Retrieve a single Interaction feature

- Parameters:
- Returns: feature – The queried instance
- Return type: InteractionFeature

### class datarobot.models.MulticategoricalHistogram

Histogram for Multicategorical feature.

Added in version v2.24.

> [!NOTE] Notes
> `HistogramValues` contains:
> 
> - `values.[].label` : string - Label name
> - `values.[].plot` : list - Histogram for label
> - `values.[].plot.[].label_relevance` : int - Label relevance value
> - `values.[].plot.[].row_count` : int - Row count where label has given relevance
> - `values.[].plot.[].row_pct` : float - Percentage of rows where label has given relevance

- Variables:

#### classmethod get(multilabel_insights_key)

Retrieves multicategorical histogram

You might find it more convenient to use [Feature.get_multicategorical_histogram](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_multicategorical_histogram) instead.

- Parameters: multilabel_insights_key ( string ) – Key for multilabel insights, unique for a project, feature and EDA stage combination.
  The multilabel_insights_key can be retrieved via Feature.multilabel_insights_key .
- Returns: The multicategorical histogram for multilabel_insights_key
- Return type: MulticategoricalHistogram

#### to_dataframe()

Convenience method to get all the information from this multicategorical_histogram instance
in the form of a `pandas.DataFrame`.

- Returns: Histogram information as a DataFrame. The dataframe will contain these
  columns: feature_name, label, label_relevance, row_count and row_pct
- Return type: pandas.DataFrame
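
The relationship between the nested `HistogramValues` structure and the flat columns listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the SDK's implementation: the helper `histogram_rows` is hypothetical, the sample numbers are made up, and rows are returned as dicts rather than a `pandas.DataFrame` so the snippet has no dependencies.

```python
def histogram_rows(feature_name, values):
    """Flatten nested histogram values into one row per (label, relevance) pair."""
    rows = []
    for entry in values:
        for point in entry["plot"]:
            rows.append(
                {
                    "feature_name": feature_name,
                    "label": entry["label"],
                    "label_relevance": point["label_relevance"],
                    "row_count": point["row_count"],
                    "row_pct": point["row_pct"],
                }
            )
    return rows


# Hypothetical values following the documented structure; the numbers are made up.
sample_values = [
    {
        "label": "sports",
        "plot": [
            {"label_relevance": 0, "row_count": 75, "row_pct": 75.0},
            {"label_relevance": 1, "row_count": 25, "row_pct": 25.0},
        ],
    }
]
for row in histogram_rows("topics", sample_values):
    print(row)
```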

### class datarobot.models.PairwiseCorrelations

Correlation of label pairs for multicategorical feature.

Added in version v2.24.

> [!NOTE] Notes
> `CorrelationValues` contain:
> 
> - `values.[].label_configuration` : list of length 2 - Configuration of the label pair
> - `values.[].label_configuration.[].label` : str – Label name
> - `values.[].statistic_value` : float – Statistic value

- Variables:

#### classmethod get(multilabel_insights_key)

Retrieves pairwise correlations

You might find it more convenient to use [Feature.get_pairwise_correlations](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_pairwise_correlations) instead.

- Parameters: multilabel_insights_key ( string ) – Key for multilabel insights, unique for a project, feature and EDA stage combination.
  The multilabel_insights_key can be retrieved via Feature.multilabel_insights_key .
- Returns: The pairwise label correlations
- Return type: PairwiseCorrelations

#### as_dataframe()

The pairwise label correlations as a (num_labels x num_labels) DataFrame.

- Returns: The pairwise label correlations. Index and column names allow the interpretation of the
  values.
- Return type: pandas.DataFrame
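
As a rough sketch of how the `CorrelationValues` structure maps onto a (num_labels x num_labels) layout, the snippet below builds a nested dict keyed by label. The helper `correlation_matrix` is hypothetical and the sample values are made up. It mirrors each pair in both directions, which relies on correlation being symmetric rather than on anything the server is known to return.

```python
def correlation_matrix(values):
    """Build a label-by-label nested dict from pairwise correlation values."""
    labels = sorted(
        {cfg["label"] for value in values for cfg in value["label_configuration"]}
    )
    matrix = {row: {col: None for col in labels} for row in labels}
    for value in values:
        first, second = (cfg["label"] for cfg in value["label_configuration"])
        # Fill both directions, since correlation is symmetric.
        matrix[first][second] = value["statistic_value"]
        matrix[second][first] = value["statistic_value"]
    return matrix


# Hypothetical values following the documented structure.
sample_values = [
    {"label_configuration": [{"label": "A"}, {"label": "A"}], "statistic_value": 1.0},
    {"label_configuration": [{"label": "A"}, {"label": "B"}], "statistic_value": 0.4},
    {"label_configuration": [{"label": "B"}, {"label": "B"}], "statistic_value": 1.0},
]
matrix = correlation_matrix(sample_values)
print(matrix["A"]["B"], matrix["B"]["A"])
```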

### class datarobot.models.PairwiseJointProbabilities

Joint probabilities of label pairs for multicategorical feature.

Added in version v2.24.

> [!NOTE] Notes
> `ProbabilityValues` contain:
> 
> values.[].label_configuration
> : list of length 2 - Configuration of the label pair
> values.[].label_configuration.[].relevance
> : int – 0 for absence of the labels,
>   1 for the presence of labels
> values.[].label_configuration.[].label
> : str – Label name
> values.[].statistic_value
> : float – Statistic value

- Variables:

#### classmethod get(multilabel_insights_key)

Retrieves pairwise joint probabilities

You might find it more convenient to use [Feature.get_pairwise_joint_probabilities](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_pairwise_joint_probabilities) instead.

- Parameters: multilabel_insights_key ( string ) – Key for multilabel insights, unique for a project, feature and EDA stage combination.
  The multilabel_insights_key can be retrieved via Feature.multilabel_insights_key .
- Returns: The pairwise joint probabilities
- Return type: PairwiseJointProbabilities

#### as_dataframe(relevance_configuration)

Joint probabilities of label pairs as a (num_labels x num_labels) DataFrame.

- Parameters: relevance_configuration ( tuple of length 2 ) – Valid options are (0, 0), (0, 1), (1, 0) and (1, 1). Values of 0 indicate absence of
  labels and 1 indicates presence of labels. The first value describes the
  presence for the labels in axis=0 and the second value describes the presence for the
  labels in axis=1. For example the matrix values for a relevance configuration of (0, 1) describe the
  probabilities of absent labels in the index axis and present labels in the column
  axis. E.g. The probability P(A=0,B=1) can be retrieved via: `pairwise_joint_probabilities.as_dataframe((0,1)).loc['A', 'B']`
- Returns: The joint probabilities for the requested relevance_configuration. Index and column
  names allow the interpretation of the values.
- Return type: pandas.DataFrame
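
To make the role of `relevance_configuration` concrete, the following sketch filters hypothetical `ProbabilityValues`-style records down to one configuration and arranges them by label. The records, the probabilities, and the `select_configuration` helper are illustrative assumptions rather than SDK code.

```python
# Hypothetical records shaped like the `ProbabilityValues` entries in the Notes.
# The four probabilities for the pair (A, B) are invented and sum to 1.
def _record(relevance_a, relevance_b, probability):
    return {
        "label_configuration": [
            {"label": "A", "relevance": relevance_a},
            {"label": "B", "relevance": relevance_b},
        ],
        "statistic_value": probability,
    }


values = [
    _record(0, 0, 0.4),
    _record(0, 1, 0.2),
    _record(1, 0, 0.1),
    _record(1, 1, 0.3),
]


def select_configuration(values, relevance_configuration):
    """Keep only entries matching (index relevance, column relevance)."""
    wanted = tuple(relevance_configuration)
    table = {}
    for entry in values:
        first, second = entry["label_configuration"]
        if (first["relevance"], second["relevance"]) == wanted:
            # Assumption: first label is the index axis, second is the column axis.
            table.setdefault(first["label"], {})[second["label"]] = entry["statistic_value"]
    return table


# P(A=0, B=1): label A absent (index axis), label B present (column axis).
p_a0_b1 = select_configuration(values, (0, 1))["A"]["B"]
print(p_a0_b1)
```

The nested-dict lookup mirrors the documented `as_dataframe((0, 1)).loc['A', 'B']` access pattern.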

### class datarobot.models.PairwiseConditionalProbabilities

Conditional probabilities of label pairs for a multicategorical feature.

Added in version v2.24.

> [!NOTE] Notes
> `ProbabilityValues` contain:
> 
> values.[].label_configuration
> : list of length 2 - Configuration of the label pair
> values.[].label_configuration.[].relevance
> : int – 0 for absence of the labels,
>   1 for the presence of labels
> values.[].label_configuration.[].label
> : str – Label name
> values.[].statistic_value
> : float – Statistic value

- Variables:

#### classmethod get(multilabel_insights_key)

Retrieves pairwise conditional probabilities

You might find it more convenient to use [Feature.get_pairwise_conditional_probabilities](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.Feature.get_pairwise_conditional_probabilities) instead.

- Parameters: multilabel_insights_key ( string ) – Key for multilabel insights, unique for a project, feature and EDA stage combination.
  The multilabel_insights_key can be retrieved via Feature.multilabel_insights_key .
- Returns: The pairwise conditional probabilities
- Return type: PairwiseConditionalProbabilities

#### as_dataframe(relevance_configuration)

Conditional probabilities of label pairs as a (num_labels x num_labels) DataFrame.
The label names in the columns are the events on which we condition. The label names in the
index are the events whose conditional probability, given the column event, is in the dataframe.

E.g. The probability P(A=0|B=1) can be retrieved via: `pairwise_conditional_probabilities.as_dataframe((0, 1)).loc['A', 'B']`

- Parameters: relevance_configuration ( tuple of length 2 ) – Valid options are (0, 0), (0, 1), (1, 0) and (1, 1). Values of 0 indicate absence of
  labels and 1 indicates presence of labels. The first value describes the
  presence for the labels in axis=0 and the second value describes the presence for the
  labels in axis=1. For example the matrix values for a relevance configuration of (0, 1) describe the
  probabilities of absent labels in the index axis given the
  presence of labels in the column axis.
- Returns: The conditional probabilities for the requested relevance_configuration.
  Index and column names allow the interpretation of the values.
- Return type: pandas.DataFrame
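
For intuition about what a value such as P(A=0|B=1) means, the sketch below applies the standard definition P(A=a | B=b) = P(A=a, B=b) / P(B=b) to an invented joint distribution. It does not describe how DataRobot computes its values; the numbers and the `conditional` helper are assumptions for illustration only.

```python
import math

# Invented joint distribution over two labels; keys are (relevance of A, relevance of B).
joint = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}


def conditional(joint, a, b):
    """Return P(A=a | B=b) from a joint distribution keyed by (a, b)."""
    # Marginal P(B=b): sum the joint probabilities over both states of A.
    marginal_b = sum(p for (_, b_state), p in joint.items() if b_state == b)
    if marginal_b == 0:
        raise ValueError("conditioning event has zero probability")
    return joint[(a, b)] / marginal_b


# Label A absent given label B present: index event A=0, column event B=1.
p_a0_given_b1 = conditional(joint, a=0, b=1)
print(p_a0_given_b1)
```

For a fixed conditioning event, the conditional probabilities of the two states of the index label sum to 1, which is a quick sanity check on values read from the dataframe.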

## Restoring Discarded Features

### class datarobot.models.restore_discarded_features.DiscardedFeaturesInfo

An object containing information about time series features that were discarded
during the time series feature generation process. These features can be restored to the
project. Restored features are included in All Time Series Features and can be used to create new
feature lists.

Added in version v2.27.

- Variables:

#### classmethod restore(project_id, features_to_restore, max_wait=600)

Restore features that were discarded during the time series feature generation process back to the
project. After restoration, the features are included in All Time Series Features.

Added in version v2.27.

- Parameters:
- Returns: status – information about features which were restored and which were not.
- Return type: FeatureRestorationStatus

#### classmethod retrieve(project_id)

Retrieve the discarded features information for a given project.

Added in version v2.27.

- Parameters: project_id ( string )
- Returns: info – information about features that were discarded during the feature generation process and
  the limits on how many features can be restored.
- Return type: DiscardedFeaturesInfo

### class datarobot.models.restore_discarded_features.FeatureRestorationStatus

Status of the feature restoration process.

Added in version v2.27.

- Variables:

## Feature lists

### class datarobot.DatasetFeaturelist

A set of features attached to a dataset in the AI Catalog

- Variables:

#### classmethod get(dataset_id, featurelist_id)

Retrieve a dataset featurelist

- Parameters:
- Returns: featurelist – the specified featurelist
- Return type: DatasetFeaturelist

#### delete()

Delete a dataset featurelist

Featurelists configured into the dataset as a default featurelist cannot be deleted.

- Return type: None

#### update(name=None)

Update the name of an existing featurelist

Note that only user-created featurelists can be renamed, and that names must not
conflict with names used by other featurelists.

- Parameters: name ( Optional[str] ) – the new name for the featurelist
- Return type: None

### class datarobot.models.Featurelist

A set of features used in modeling

- Variables:

#### classmethod from_data(data)

Overrides the parent method to ensure description is always populated

- Parameters: data ( dict ) – the data from the server, having gone through processing
- Return type: TypeVar ( TFeaturelist , bound= Featurelist)

#### classmethod get(project_id, featurelist_id)

Retrieve a known feature list

- Parameters:
- Returns: featurelist – The queried instance
- Return type: Featurelist
- Raises: ValueError – the passed project_id value is of an unsupported type

#### delete(dry_run=False, delete_dependencies=False)

Delete a featurelist, and any models and jobs using it

All models using a featurelist, whether as the training featurelist or as a monotonic
constraint featurelist, will also be deleted when the deletion is executed and any queued or
running jobs using it will be cancelled. Similarly, predictions made on these models will
also be deleted. All the entities that are to be deleted with a featurelist are described
as “dependencies” of it. To preview the results of deleting a featurelist, call `delete`
with `dry_run=True`.

When deleting a featurelist with dependencies, users must specify delete_dependencies=True
to confirm they want to delete the featurelist and all its dependencies. Without that
option, only featurelists with no dependencies may be successfully deleted and others will
error.

Featurelists configured into the project as a default featurelist or as a default monotonic
constraint featurelist cannot be deleted.

Featurelists used in a model deployment cannot be deleted until the model deployment is
deleted.

- Parameters:
- Returns: result – A dictionary describing the result of deleting the featurelist, with the following keys:
  - dry_run : bool, whether the deletion was a dry run or an actual deletion
  - can_delete : bool, whether the featurelist can actually be deleted
  - deletion_blocked_reason : str, why the featurelist can’t be deleted (if it can’t)
  - num_affected_models : int, the number of models using this featurelist
  - num_affected_jobs : int, the number of jobs using this featurelist
- Return type: dict
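
The preview-then-delete workflow described above could be wrapped roughly as follows. This is a sketch under stated assumptions: `delete_featurelist_safely` and `FakeFeaturelist` are hypothetical names, and the fake object only mimics the documented `delete(dry_run, delete_dependencies)` signature and result keys so that the example runs without a DataRobot connection.

```python
def delete_featurelist_safely(featurelist, allow_dependencies=False):
    """Preview a deletion with dry_run=True, then delete only if it is safe to do so."""
    preview = featurelist.delete(dry_run=True)
    if not preview["can_delete"]:
        # e.g. a default featurelist; see preview["deletion_blocked_reason"]
        return preview
    has_dependencies = (
        preview["num_affected_models"] > 0 or preview["num_affected_jobs"] > 0
    )
    if has_dependencies and not allow_dependencies:
        # Caller has not confirmed that dependent models and jobs may be removed.
        return preview
    return featurelist.delete(dry_run=False, delete_dependencies=has_dependencies)


class FakeFeaturelist:
    """Hypothetical stand-in that returns a dict with the documented keys."""

    def __init__(self, num_models):
        self.num_models = num_models
        self.deleted = False

    def delete(self, dry_run=False, delete_dependencies=False):
        if not dry_run:
            self.deleted = True
        return {
            "dry_run": dry_run,
            "can_delete": True,
            "deletion_blocked_reason": "",
            "num_affected_models": self.num_models,
            "num_affected_jobs": 0,
        }


in_use = FakeFeaturelist(num_models=2)
result = delete_featurelist_safely(in_use)  # stops after the dry run
print(result["dry_run"], in_use.deleted)
```

Passing `allow_dependencies=True` is the explicit confirmation step; without it, the helper never issues a real deletion for a featurelist that models or jobs still use.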

#### classmethod from_server_data(data, keep_attrs=None)

Instantiate an object of this class using the data directly from the server,
meaning that the keys may have the wrong camel casing

- Parameters:
- Return type: TypeVar ( T , bound= APIObject)

#### update(name=None, description=None)

Update the name or description of an existing featurelist

Note that only user-created featurelists can be renamed, and that names must not
conflict with names used by other featurelists.

- Parameters:
- Return type: None

### class datarobot.models.ModelingFeaturelist

A set of features that can be used to build a model

In time series projects, a new set of modeling features is created after setting the
partitioning options.  These features are automatically derived from those in the project’s
dataset and are the features used for modeling.  Modeling features are only accessible once
the target and partitioning options have been set.  In projects that don’t use time series
modeling, once the target has been set, ModelingFeaturelists and Featurelists will behave
the same.

For more information about input and modeling features, see the [time series documentation](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/time_series.html#input-vs-modeling).

- Variables:

#### classmethod get(project_id, featurelist_id)

Retrieve a modeling featurelist

Modeling featurelists can only be retrieved once the target and partitioning options have
been set.

- Parameters:
- Returns: featurelist – the specified featurelist
- Return type: ModelingFeaturelist

#### update(name=None, description=None)

Update the name or description of an existing featurelist

Note that only user-created featurelists can be renamed, and that names must not
conflict with names used by other featurelists.

- Parameters:
- Return type: None

#### delete(dry_run=False, delete_dependencies=False)

Delete a featurelist, and any models and jobs using it

All models using a featurelist, whether as the training featurelist or as a monotonic
constraint featurelist, will also be deleted when the deletion is executed and any queued or
running jobs using it will be cancelled. Similarly, predictions made on these models will
also be deleted. All the entities that are to be deleted with a featurelist are described
as “dependencies” of it. To preview the results of deleting a featurelist, call `delete`
with `dry_run=True`.

When deleting a featurelist with dependencies, users must specify delete_dependencies=True
to confirm they want to delete the featurelist and all its dependencies. Without that
option, only featurelists with no dependencies may be successfully deleted and others will
error.

Featurelists configured into the project as a default featurelist or as a default monotonic
constraint featurelist cannot be deleted.

Featurelists used in a model deployment cannot be deleted until the model deployment is
deleted.

- Parameters:
- Returns: result – A dictionary describing the result of deleting the featurelist, with the following keys:
  - dry_run : bool, whether the deletion was a dry run or an actual deletion
  - can_delete : bool, whether the featurelist can actually be deleted
  - deletion_blocked_reason : str, why the featurelist can’t be deleted (if it can’t)
  - num_affected_models : int, the number of models using this featurelist
  - num_affected_jobs : int, the number of jobs using this featurelist
- Return type: dict

### class datarobot.models.featurelist.DeleteFeatureListResult

## Dataset definition

### class datarobot.helpers.feature_discovery.DatasetDefinition

Dataset definition for the Feature Discovery

Added in version v2.25.

- Variables:

> [!NOTE] Examples
> ```
> import datarobot as dr
> dataset_definition = dr.DatasetDefinition(
>     identifier='profile',
>     catalog_id='5ec4aec1f072bc028e3471ae',
>     catalog_version_id='5ec4aec2f072bc028e3471b1',
> )
> 
> dataset_definition = dr.DatasetDefinition(
>     identifier='transaction',
>     catalog_id='5ec4aec1f072bc028e3471ae',
>     catalog_version_id='5ec4aec2f072bc028e3471b1',
>     primary_temporal_key='Date'
> )
> ```

## Relationships

### class datarobot.helpers.feature_discovery.Relationship

Relationship between datasets defined in DatasetDefinition

Added in version v2.25.

- Variables:

> [!NOTE] Examples
> ```
> import datarobot as dr
> relationship = dr.Relationship(
>     dataset1_identifier='profile',
>     dataset2_identifier='transaction',
>     dataset1_keys=['CustomerID'],
>     dataset2_keys=['CustomerID']
> )
> 
> relationship = dr.Relationship(
>     dataset2_identifier='profile',
>     dataset1_keys=['CustomerID'],
>     dataset2_keys=['CustomerID'],
>     feature_derivation_window_start=-14,
>     feature_derivation_window_end=-1,
>     feature_derivation_window_time_unit='DAY',
>     prediction_point_rounding=1,
>     prediction_point_rounding_time_unit='DAY'
> )
> ```
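
To give a feel for what the window settings in the second example select, here is a small date-arithmetic sketch. It assumes that the prediction point is rounded into the past to a whole day and that the window bounds are offsets from that rounded point; the exact boundary handling (inclusive or exclusive) is not stated on this page, and `derivation_window` is a hypothetical helper, not an SDK function.

```python
from datetime import datetime, timedelta


def derivation_window(prediction_point, window_start, window_end):
    """Return (start, end) datetimes of a DAY-based feature derivation window."""
    # Assumption: rounding of 1 DAY truncates the prediction point to midnight,
    # i.e. rounds it into the past.
    rounded = prediction_point.replace(hour=0, minute=0, second=0, microsecond=0)
    return rounded + timedelta(days=window_start), rounded + timedelta(days=window_end)


# feature_derivation_window_start=-14, feature_derivation_window_end=-1, unit 'DAY'
start, end = derivation_window(datetime(2024, 3, 15, 13, 45), window_start=-14, window_end=-1)
print(start.date(), end.date())
```

Under these assumptions, a prediction point on the afternoon of 15 March draws on secondary-dataset rows dated between 1 March and 14 March.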

## Relationships configuration

### class datarobot.models.RelationshipsConfiguration

A Relationships configuration specifies a set of secondary datasets as well as
the relationships among them. It is used to configure Feature Discovery for a project
to generate features automatically from these datasets.

- Variables:

#### classmethod create(dataset_definitions, relationships, feature_discovery_settings=None)

Create a Relationships Configuration

- Parameters:
- Returns: relationships_configuration – Created relationships configuration
- Return type: RelationshipsConfiguration

> [!NOTE] Examples
> ```
> import datarobot as dr
> dataset_definition = dr.DatasetDefinition(
>     identifier='profile',
>     catalog_id='5fd06b4af24c641b68e4d88f',
>     catalog_version_id='5fd06b4af24c641b68e4d88f'
> )
> relationship = dr.Relationship(
>     dataset2_identifier='profile',
>     dataset1_keys=['CustomerID'],
>     dataset2_keys=['CustomerID'],
>     feature_derivation_window_start=-14,
>     feature_derivation_window_end=-1,
>     feature_derivation_window_time_unit='DAY',
>     prediction_point_rounding=1,
>     prediction_point_rounding_time_unit='DAY'
> )
> dataset_definitions = [dataset_definition]
> relationships = [relationship]
> relationship_config = dr.RelationshipsConfiguration.create(
>     dataset_definitions=dataset_definitions,
>     relationships=relationships,
>     feature_discovery_settings = [
>         {'name': 'enable_categorical_statistics', 'value': True},
>         {'name': 'enable_numeric_skewness', 'value': True},
>     ]
> )
> >>> relationship_config.id
> '5c88a37770fc42a2fcc62759'
> ```

#### get()

Retrieve the Relationships configuration for a given id

- Returns: relationships_configuration – The requested relationships configuration
- Return type: RelationshipsConfiguration
- Raises: ClientError – Raised if an invalid relationships config id is provided.

> [!NOTE] Examples
> ```
> relationships_config = dr.RelationshipsConfiguration(valid_config_id)
> result = relationships_config.get()
> >>> result.id
> '5c88a37770fc42a2fcc62759'
> ```

#### replace(dataset_definitions, relationships, feature_discovery_settings=None)

Update a Relationships Configuration that is not used in
a Feature Discovery project

- Parameters:
- Returns: relationships_configuration – the updated relationships configuration
- Return type: RelationshipsConfiguration

#### delete()

Delete the Relationships configuration

- Raises: ClientError – Raised if an invalid relationships config id is provided.

> [!NOTE] Examples
> ```
> # Deleting with a valid id
> relationships_config = dr.RelationshipsConfiguration(valid_config_id)
> status_code = relationships_config.delete()
> >>> status_code
> 204
> >>> relationships_config.get()
> ClientError: Relationships Configuration not found
> ```

## Feature lineage

### class datarobot.models.FeatureLineage

Lineage of an automatically engineered feature.

- Variables: steps ( list ) – list of steps which were applied to build the feature.

`steps` structure is:

- id (int): step id starting with 0.
- step_type (str): one of the data/action/json/generatedData.
- name (str): name of the step.
- description (str): description of the step.
- parents (list[int]): references to other steps id.
- is_time_aware (bool): indicator of step being time aware. Mandatory only for `action` and `join` steps.
  `action` step provides additional information about feature derivation window
  in the `timeInfo` field.
- catalog_id (str): id of the catalog for a `data` step.
- catalog_version_id (str): id of the catalog version for a `data` step.
- group_by (list[str]): list of columns which this `action` step aggregated by.
- columns (list): names of columns involved into the feature generation. Available only for `data` steps.
- time_info (dict): description of the feature derivation window which was applied to this `action` step.
- join_info (list[dict]): `join` step details.

`columns` structure is:

- data_type (str): the type of the feature, e.g., ‘Categorical’, ‘Text’
- is_input (bool): indicates features which provided data to transform in this lineage.
- name (str): feature name.
- is_cutoff (bool): indicates a cutoff column.

`time_info` structure is:

- latest (dict): end of the feature derivation window applied.
- duration (dict): size of the feature derivation window applied.

`latest` and `duration` structure is:

- time_unit (str): time unit name like ‘MINUTE’, ‘DAY’, ‘MONTH’ etc.
- duration (int): value/size of this duration object.

`join_info` structure is:

- join_type (str): kind of join, left/right.
- left_table (dict): information about a dataset which was considered as left.
- right_table (str): information about a dataset which was considered as right.

`left_table` and `right_table` structure is:

- columns (list[str]): list of columns which datasets were joined by.
- datasteps (list[int]): list of `data` steps id which brought the `columns` into the current step dataset.
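
Because each step refers to its inputs through `parents`, a lineage can be walked like a small graph. The sketch below collects the `data` steps that feed a feature; the sample `steps` list, the step names, and the assumption that step 0 is the final generated feature are all invented for illustration.

```python
# Hypothetical lineage with the documented keys: id, step_type, name, parents.
steps = [
    {"id": 0, "step_type": "generatedData", "name": "engineered feature", "parents": [1]},
    {"id": 1, "step_type": "action", "name": "sum", "parents": [2, 3], "group_by": ["CustomerID"]},
    {"id": 2, "step_type": "data", "name": "profile", "parents": []},
    {"id": 3, "step_type": "data", "name": "transaction", "parents": []},
]


def source_data_steps(steps, start_id=0):
    """Return the sorted ids of all `data` steps reachable from `start_id` via parents."""
    by_id = {step["id"]: step for step in steps}
    seen, stack, sources = set(), [start_id], []
    while stack:
        step_id = stack.pop()
        if step_id in seen:
            continue  # a step can be the parent of several later steps
        seen.add(step_id)
        step = by_id[step_id]
        if step["step_type"] == "data":
            sources.append(step_id)
        stack.extend(step.get("parents", []))
    return sorted(sources)


data_step_ids = source_data_steps(steps)
print(data_step_ids)
```

With a real `FeatureLineage` instance, the same function could be pointed at its `steps` attribute, provided the entries are plain dicts with these keys.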

#### classmethod get(project_id, id)

Retrieve a single FeatureLineage.

- Parameters:
- Returns: lineage – The queried instance
- Return type: FeatureLineage

## OCR job resources

### class datarobot.models.ocr_job_resource.OCRJobResource

An OCR job resource container. It is used to:
- Get an existing OCR job resource.
- List available OCR job resources.
- Start an OCR job.
- Check the status of a started OCR job.
- Download the error report of a started OCR job.

Added in version v3.6.0b0.

- Variables:

#### classmethod get(job_resource_id)

Get an OCR job resource.

- Parameters: job_resource_id ( str ) – identifier of OCR job resource
- Returns: returned OCR job resource
- Return type: OCRJobResource

#### classmethod list(offset=0, limit=10)

Get a list of OCR job resources.

- Parameters:
- Returns: A list of OCR job resources.
- Return type: List[OCRJobResource]

#### classmethod create(input_catalog_id, language, engine_specific_parameters=None)

Create a new OCR job resource and return it.

- Parameters:
- Returns: The created OCR job resource.
- Return type: OCRJobResource

#### start_job()

Start an OCR job with this OCR job resource.

- Returns: The response of starting an OCR job.
- Return type: StartOCRJobResponse

#### get_job_status()

Get status of the OCR job associated with this OCR job resource.

- Returns: OCR job status enum
- Return type: OCRJobStatusEnum

#### download_error_report(download_file_path)

Download the error report of the OCR job associated with this OCR job resource.

- Parameters: download_file_path ( Path ) – path to download error report
- Return type: None
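
The methods above suggest a start, poll, and report workflow, sketched below. The status values are placeholders, because the members of `OCRJobStatusEnum` are not listed on this page; `run_ocr_job` and `FakeOCRJobResource` are hypothetical, and downloading the error report only on failure is an assumption.

```python
import time
from pathlib import Path


def run_ocr_job(resource, report_path, done_statuses, failed_statuses,
                poll_seconds=5.0, max_wait=600.0, sleep=time.sleep):
    """Start an OCR job, poll until a terminal status, and fetch the error report on failure."""
    resource.start_job()
    waited = 0.0
    while True:
        status = resource.get_job_status()
        if status in failed_statuses:
            resource.download_error_report(report_path)
            return status
        if status in done_statuses:
            return status
        if waited >= max_wait:
            raise TimeoutError(f"OCR job still not finished after {max_wait} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


class FakeOCRJobResource:
    """Hypothetical stand-in exposing the three documented instance methods."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.started = False
        self.report_path = None

    def start_job(self):
        self.started = True

    def get_job_status(self):
        return next(self._statuses)

    def download_error_report(self, download_file_path):
        self.report_path = download_file_path


fake = FakeOCRJobResource(["running", "running", "failed"])  # placeholder statuses
final_status = run_ocr_job(
    fake, Path("ocr_errors.txt"),
    done_statuses={"done"}, failed_statuses={"failed"},
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(final_status, fake.report_path)
```

With the real class, the two status sets would be built from the relevant `OCRJobStatusEnum` members.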

#### classmethod from_server_data(data, keep_attrs=None)

Instantiate an object of this class using the data directly from the server,
meaning that the keys may have the wrong camel casing

- Parameters:
- Return type: TypeVar ( TOCRJobResource , bound= OCRJobResource)

### class datarobot.models.ocr_job_resource.OCREngineSpecificParameters

Container of Engine Specific Parameters. It is used to specify required
OCR engine parameters when creating an OCR job resource.

Added in version v3.8.0.

- Variables:

#### get_payload()

Returns a dict containing the engine-specific parameters whose values are not None.

- Return type: Dict [ str , Optional [ str ]]

### class datarobot.models.ocr_job_resource.OCRJobDatasetLanguage

Supported OCR language

### class datarobot.models.ocr_job_resource.DataRobotOCREngineType

Supported OCR engine type

### class datarobot.models.ocr_job_resource.DataRobotArynOutputFormat

Supported ARYN OCR engine output format

### class datarobot.models.ocr_job_resource.OCRJobStatusEnum

OCR Job status enum

### class datarobot.models.ocr_job_resource.StartOCRJobResponse

Container of Start OCR Job API response

## Document text extraction

### class datarobot.models.documentai.document.FeaturesWithSamples

FeaturesWithSamples(model_id, feature_name, document_task)

#### document_task

Alias for field number 2

#### feature_name

Alias for field number 1

#### model_id

Alias for field number 0

### class datarobot.models.documentai.document.DocumentPageFile

Page of a document as an image file.

- Variables:

#### property thumbnail_bytes : bytes

Document thumbnail as bytes.

- Returns: Document thumbnail.
- Return type: bytes

#### property mime_type : str

Mime image type of the document thumbnail. Example: ‘image/png’

- Returns: Mime image type of the document thumbnail.
- Return type: str

### class datarobot.models.documentai.document.DocumentThumbnail

Thumbnail of document from the project’s dataset.

If `Project.stage` is `datarobot.enums.PROJECT_STAGE.EDA2` and it is a supervised project then the `target_*` attributes
of this class will have values, otherwise the values will all be None.

- Variables:

#### classmethod list(project_id, feature_name, target_value=None, offset=None, limit=None)

Get document thumbnails from a project.

- Parameters:
- Returns: documents – A list of DocumentThumbnail objects, each representing a single document.
- Return type: List[DocumentThumbnail]

> [!NOTE] Notes
> Actual document thumbnails are not fetched from the server by this method.
> Instead the data gets loaded lazily when `DocumentPageFile` object attributes
> are accessed.

> [!NOTE] Examples
> Fetch document thumbnails for the given `project_id` and `feature_name`.
> 
> ```
> from datarobot.models.documentai.document import DocumentThumbnail
> 
> # Fetch five documents from the EDA SAMPLE for the specified project and specific feature
> document_thumbs = DocumentThumbnail.list(project_id, feature_name, limit=5)
> 
> # Fetch five documents for the specified project with target value filtering
> # This option is only available after selecting the project target and starting modeling
> target1_thumbs = DocumentThumbnail.list(project_id, feature_name, target_value='target1', limit=5)
> ```

Preview the document thumbnail.

```
from datarobot.models.documentai.document import DocumentThumbnail
from datarobot.helpers.image_utils import get_image_from_bytes

# Fetch 3 documents
document_thumbs = DocumentThumbnail.list(project_id, feature_name, limit=3)

for doc_thumb in document_thumbs:
    thumbnail = get_image_from_bytes(doc_thumb.document.thumbnail_bytes)
    thumbnail.show()
```

### class datarobot.models.documentai.document.DocumentTextExtractionSample

Stateless class for computing and retrieving Document Text Extraction Samples.

> [!NOTE] Notes
> Actual document text extraction samples are not fetched from the server at the time of
> the function call. Detailed information on the documents, their pages, and the rendered page images
> is fetched on demand when accessed (lazy loading).

> [!NOTE] Examples
> 1) Compute text extraction samples for a specific model, and fetch all existing document text
> extraction samples for a specific project.
> 
> ```
> from datarobot.models.documentai.document import DocumentTextExtractionSample
> 
> SPECIFIC_MODEL_ID1 = "model_id1"
> SPECIFIC_MODEL_ID2 = "model_id2"
> SPECIFIC_PROJECT_ID = "project_id"
> 
> # Request computation of document text extraction samples for a specific model.
> # By default, the `compute` method waits for the computation to finish before returning.
> DocumentTextExtractionSample.compute(SPECIFIC_MODEL_ID1, await_completion=False)
> DocumentTextExtractionSample.compute(SPECIFIC_MODEL_ID2)
> 
> samples = DocumentTextExtractionSample.list_features_with_samples(SPECIFIC_PROJECT_ID)
> ```

2) Fetch document text extraction samples for a specific model_id and feature_name, and
display all document sample pages.

```
from datarobot.models.documentai.document import DocumentTextExtractionSample
from datarobot.helpers.image_utils import get_image_from_bytes

SPECIFIC_MODEL_ID = "model_id"
SPECIFIC_FEATURE_NAME = "feature_name"

samples = DocumentTextExtractionSample.list_pages(
    model_id=SPECIFIC_MODEL_ID,
    feature_name=SPECIFIC_FEATURE_NAME
)
for sample in samples:
    thumbnail = sample.document_page.thumbnail
    image = get_image_from_bytes(thumbnail.thumbnail_bytes)
    image.show()
```

3) Fetch document text extraction samples for specific model_id and feature_name and
display text extraction details for the first page. This example displays the image of the document
with bounding boxes of detected text lines. It also returns a list of all text
lines extracted from page along with their coordinates.

```
from datarobot.models.documentai.document import DocumentTextExtractionSample

SPECIFIC_MODEL_ID = "model_id"
SPECIFIC_FEATURE_NAME = "feature_name"

samples = DocumentTextExtractionSample.list_pages(SPECIFIC_MODEL_ID, SPECIFIC_FEATURE_NAME)
# Draw bounding boxes for first document page sample and display related text data.
image = samples[0].get_document_page_with_text_locations()
image.show()
# For each text block represented as bounding box object drawn on original image
# display its coordinates (top, left, bottom, right) and extracted text value
for text_line in samples[0].text_lines:
    print(text_line)
```

#### classmethod compute(model_id, await_completion=True, max_wait=600)

Starts computation of document text extraction samples for the model and, if successful,
returns computed text samples for it. This method allows calculation to continue for
a specified time and, if not complete, cancels the request.

- Parameters:
- Raises:
- Return type: None

#### classmethod list_features_with_samples(project_id)

Returns a list of feature and model_id pairs that have computed document text extraction samples.

- Parameters: project_id ( str ) – The project ID to retrieve the list of computed samples for.
- Return type: List[FeaturesWithSamples]

#### classmethod list_pages(model_id, feature_name, document_index=None, document_task=None)

Returns a list of document text extraction sample pages.

- Parameters:
- Return type: List[DocumentTextExtractionSamplePage]

#### classmethod list_documents(model_id, feature_name)

Returns a list of documents used for text extraction.

- Parameters:
- Return type: List[DocumentTextExtractionSampleDocument]

### class datarobot.models.documentai.document.DocumentTextExtractionSampleDocument

Document text extraction source.

Holds data that contains feature and model prediction values, as well as the thumbnail of the document.

- Variables:

#### classmethod list(model_id, feature_name, document_task=None)

List available documents with document text extraction samples.

- Parameters:
- Return type: List[DocumentTextExtractionSampleDocument]

### class datarobot.models.documentai.document.DocumentTextExtractionSamplePage

Document text extraction sample covering one document page.

Holds data about the document page, the recognized text, and the location of the text in the document page.

- Variables:

#### classmethod list(model_id, feature_name, document_index=None, document_task=None)

Returns a list of document text extraction sample pages.

- Parameters:
- Return type: List[DocumentTextExtractionSamplePage]

#### get_document_page_with_text_locations(line_color='blue', line_width=3, padding=3)

Returns the document page with bounding boxes drawn around the text lines as a PIL.Image.

- Parameters:
- Returns: A PIL.Image with bounding boxes drawn around the text lines.
- Return type: Image
