# Data Registry

> Data Registry - Represents a Dataset returned from the api/v2/datasets/ endpoints.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-06T18:17:09.825223+00:00` (UTC).

## Primary page

- [Data Registry](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html): Full documentation for this topic (HTML).

## Sections on this page

- [class datarobot.models.Dataset](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset): In-page section heading.
- [get_uri()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get_uri): In-page section heading.
- [classmethod upload(source)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.upload): In-page section heading.
- [classmethod create_from_file(cls, file_path=None, filelike=None, categories=None, read_timeout=600, max_wait=600, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_from_file): In-page section heading.
- [classmethod create_from_in_memory_data(cls, data_frame=None, records=None, categories=None, read_timeout=600, max_wait=600, fname=None, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_from_in_memory_data): In-page section heading.
- [classmethod create_from_url(cls, url, do_snapshot=None, persist_data_after_ingestion=None, categories=None, sample_size=None, max_wait=600, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_from_url): In-page section heading.
- [classmethod create_from_project(cls, project_id, categories=None, max_wait=600, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_from_project): In-page section heading.
- [classmethod create_from_datastage(cls, datastage_id, categories=None, max_wait=600, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_from_datastage): In-page section heading.
- [classmethod create_from_data_source(cls, data_source_id, username=None, password=None, do_snapshot=None, persist_data_after_ingestion=None, categories=None, credential_id=None, use_kerberos=None, credential_data=None, sample_size=None, max_wait=600, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_from_data_source): In-page section heading.
- [classmethod create_from_query_generator(cls, generator_id, dataset_id=None, dataset_version_id=None, max_wait=600, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_from_query_generator): In-page section heading.
- [classmethod create_from_recipe(cls, recipe, name=None, do_snapshot=None, persist_data_after_ingestion=None, categories=None, credential=None, credential_id=None, use_kerberos=None, materialization_destination=None, max_wait=600, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_from_recipe): In-page section heading.
- [classmethod get(dataset_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get): In-page section heading.
- [classmethod delete(dataset_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.delete): In-page section heading.
- [classmethod un_delete(dataset_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.un_delete): In-page section heading.
- [classmethod list(category=None, filter_failed=None, order_by=None, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.list): In-page section heading.
- [classmethod iterate(offset=None, limit=None, category=None, order_by=None, filter_failed=None, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.iterate): In-page section heading.
- [update()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.update): In-page section heading.
- [modify(name=None, categories=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.modify): In-page section heading.
- [share(access_list, apply_grant_to_linked_objects=False)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.share): In-page section heading.
- [get_details()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get_details): In-page section heading.
- [get_all_features(order_by=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get_all_features): In-page section heading.
- [iterate_all_features(offset=None, limit=None, order_by=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.iterate_all_features): In-page section heading.
- [get_featurelists()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get_featurelists): In-page section heading.
- [create_featurelist(name, features)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_featurelist): In-page section heading.
- [get_file(file_path=None, filelike=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get_file): In-page section heading.
- [get_as_dataframe(low_memory=False)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get_as_dataframe): In-page section heading.
- [get_projects()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get_projects): In-page section heading.
- [get_raw_sample_data()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.get_raw_sample_data): In-page section heading.
- [create_project(project_name=None, user=None, password=None, credential_id=None, use_kerberos=None, credential_data=None, *, use_cases=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_project): In-page section heading.
- [classmethod create_version_from_file(dataset_id, file_path=None, filelike=None, categories=None, read_timeout=600, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_version_from_file): In-page section heading.
- [classmethod create_version_from_in_memory_data(dataset_id, data_frame=None, records=None, categories=None, read_timeout=600, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_version_from_in_memory_data): In-page section heading.
- [classmethod create_version_from_url(dataset_id, url, categories=None, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_version_from_url): In-page section heading.
- [classmethod create_version_from_datastage(dataset_id, datastage_id, categories=None, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_version_from_datastage): In-page section heading.
- [classmethod create_version_from_data_source(dataset_id, data_source_id, username=None, password=None, categories=None, credential_id=None, use_kerberos=None, credential_data=None, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_version_from_data_source): In-page section heading.
- [classmethod create_version_from_recipe(dataset_id, recipe, credential=None, credential_id=None, use_kerberos=None, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.create_version_from_recipe): In-page section heading.
- [classmethod from_data(data)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.from_data): In-page section heading.
- [classmethod from_server_data(data, keep_attrs=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.from_server_data): In-page section heading.
- [open_in_browser()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset.open_in_browser): In-page section heading.
- [class datarobot.DatasetDetails](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DatasetDetails): In-page section heading.
- [classmethod get(dataset_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DatasetDetails.get): In-page section heading.
- [to_dataset()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DatasetDetails.to_dataset): In-page section heading.
- [class datarobot.models.dataset.ProjectLocation](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.dataset.ProjectLocation): In-page section heading.
- [id](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.dataset.ProjectLocation.id): In-page section heading.
- [url](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.dataset.ProjectLocation.url): In-page section heading.
- [Secondary datasets](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#secondary-datasets): In-page section heading.
- [class datarobot.helpers.feature_discovery.SecondaryDataset](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.helpers.feature_discovery.SecondaryDataset): In-page section heading.
- [Secondary dataset configurations](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#secondary-dataset-configurations): In-page section heading.
- [class datarobot.models.SecondaryDatasetConfigurations](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.SecondaryDatasetConfigurations): In-page section heading.
- [classmethod create(project_id, secondary_datasets, name, featurelist_id=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.SecondaryDatasetConfigurations.create): In-page section heading.
- [delete()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.SecondaryDatasetConfigurations.delete): In-page section heading.
- [get()](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.SecondaryDatasetConfigurations.get): In-page section heading.
- [classmethod list(project_id, featurelist_id=None, limit=None, offset=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.SecondaryDatasetConfigurations.list): In-page section heading.
- [Data engine query generator](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#data-engine-query-generator): In-page section heading.
- [class datarobot.DataEngineQueryGenerator](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DataEngineQueryGenerator): In-page section heading.
- [classmethod create(generator_type, datasets, generator_settings)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DataEngineQueryGenerator.create): In-page section heading.
- [classmethod get(generator_id)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DataEngineQueryGenerator.get): In-page section heading.
- [create_dataset(dataset_id=None, dataset_version_id=None, max_wait=600)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DataEngineQueryGenerator.create_dataset): In-page section heading.
- [prepare_prediction_dataset_from_catalog(project_id, dataset_id, dataset_version_id=None, max_wait=600, relax_known_in_advance_features_check=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DataEngineQueryGenerator.prepare_prediction_dataset_from_catalog): In-page section heading.
- [prepare_prediction_dataset(sourcedata, project_id, max_wait=600, relax_known_in_advance_features_check=None)](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.DataEngineQueryGenerator.prepare_prediction_dataset): In-page section heading.
- [Sharing access](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#sharing-access): In-page section heading.
- [class datarobot.SharingAccess](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.SharingAccess): In-page section heading.
- [Sharing role](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#sharing-role): In-page section heading.
- [class datarobot.models.sharing.SharingRole](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.sharing.SharingRole): In-page section heading.

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [API reference](https://docs.datarobot.com/en/docs/api/reference/index.html): Linked from this page.
- [Python API client](https://docs.datarobot.com/en/docs/api/reference/sdk/index.html): Linked from this page.
- [Data preparation](https://docs.datarobot.com/en/docs/api/reference/sdk/tag-data-prep.html): Linked from this page.
- [InvalidUsageError](https://docs.datarobot.com/en/docs/api/reference/sdk/errors.html#datarobot.errors.InvalidUsageError): Linked from this page.
- [DatasetFeature](https://docs.datarobot.com/en/docs/api/reference/sdk/features.html#datarobot.models.DatasetFeature): Linked from this page.
- [datarobot.models.Project](https://docs.datarobot.com/en/docs/api/reference/sdk/projects.html#datarobot.models.Project): Linked from this page.
- [Credential](https://docs.datarobot.com/en/docs/api/reference/sdk/credentials.html#datarobot.models.Credential): Linked from this page.
- [DataStores](https://docs.datarobot.com/en/docs/api/reference/sdk/data-connectivity.html#datarobot.DataStore): Linked from this page.
- [sharing](https://docs.datarobot.com/en/docs/api/dev-learning/python/admin/sharing.html#sharing): Linked from this page.

## Documentation content

### class datarobot.models.Dataset

Represents a Dataset returned from the api/v2/datasets/ endpoints.

- Variables:

#### get_uri()

- Returns: url – Permanent static hyperlink to this dataset in AI Catalog.
- Return type: str

#### classmethod upload(source)

Creates a Dataset from local data (a file or a DataFrame) or from a URL.

- Parameters: source ( str , pd.DataFrame or file object ) – Pass a URL, filepath, file or DataFrame to create and return a Dataset.
- Returns: response – The Dataset created from the uploaded data source.
- Return type: Dataset
- Raises: InvalidUsageError – If the source parameter cannot be determined to be a URL, filepath, file or DataFrame.

> [!NOTE] Examples
> ```
> import pandas as pd
> from datarobot.models.dataset import Dataset
> 
> # Upload a local file by path
> dataset_one = Dataset.upload("./data/examples.csv")
> 
> # Create a dataset from a URL
> dataset_two = Dataset.upload(
>     "https://raw.githubusercontent.com/curran/data/gh-pages/dbpedia/cities/data.csv"
> )
> 
> # Create a dataset from a pandas DataFrame
> my_df = pd.read_csv("./data/examples.csv")
> dataset_three = Dataset.upload(my_df)
> 
> # Create a dataset from an open file object; the with block closes the file
> with open("./data/examples.csv", "rb") as file_pointer:
>     dataset_four = Dataset.create_from_file(filelike=file_pointer)
> ```

#### classmethod create_from_file(cls, file_path=None, filelike=None, categories=None, read_timeout=600, max_wait=600, *, use_cases=None)

A blocking call that creates a new Dataset from a file. Returns when the dataset has
been successfully uploaded and processed.

Warning: This function does not clean up its open files. If you pass a filelike, you are
responsible for closing it. If you pass a file_path, this function creates a file object from
the file_path but does not close it.

- Parameters:
- Returns: response – A fully armed and operational Dataset
- Return type: Dataset
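
Because `create_from_file` leaves file handles open, one way to keep control of the handle is to open the file yourself in a `with` block and pass it as `filelike`. The sketch below relies only on the `create_from_file(filelike=...)` call documented above. The helper name `create_dataset_from_path` and its `dataset_cls` parameter are illustrative assumptions, not part of the SDK.

```python
from pathlib import Path


def create_dataset_from_path(dataset_cls, path):
    """Upload the file at `path` and close the handle afterwards.

    `dataset_cls` is expected to expose `create_from_file(filelike=...)`,
    as `datarobot.models.Dataset` does. It is passed in as an argument so
    the helper can be exercised without a DataRobot connection.
    """
    with Path(path).open("rb") as file_pointer:
        # The call blocks until upload and processing finish, so the
        # handle stays open for the whole upload and is closed on exit.
        return dataset_cls.create_from_file(filelike=file_pointer)
```

With the real client, this would be called as `create_dataset_from_path(Dataset, "./data/examples.csv")`.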

#### classmethod create_from_in_memory_data(cls, data_frame=None, records=None, categories=None, read_timeout=600, max_wait=600, fname=None, *, use_cases=None)

A blocking call that creates a new Dataset from in-memory data. Returns when the dataset has
been successfully uploaded and processed.

The data can be either a pandas DataFrame or a list of dictionaries with identical keys.

- Parameters:
- Returns: response – The Dataset created from the uploaded data.
- Return type: Dataset
- Raises: InvalidUsageError – If neither a DataFrame nor a list of records is passed.
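
Since `records` must be a list of dictionaries with identical keys, it can help to check that locally before starting a blocking upload. The following is a minimal sketch of such a check; `check_records` is a hypothetical helper, and the sample records are made up for illustration.

```python
def check_records(records):
    """Return the shared key set, or raise ValueError if the records differ."""
    if not records:
        raise ValueError("records must be a non-empty list of dictionaries")
    expected = set(records[0])
    for index, record in enumerate(records):
        if set(record) != expected:
            raise ValueError(
                f"record {index} has keys {sorted(record)}, "
                f"expected {sorted(expected)}"
            )
    return expected


records = [
    {"id": 1, "label": "a"},
    {"id": 2, "label": "b"},
]
shared_keys = check_records(records)
# With the real client, the upload would follow:
# Dataset.create_from_in_memory_data(records=records)
```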

#### classmethod create_from_url(cls, url, do_snapshot=None, persist_data_after_ingestion=None, categories=None, sample_size=None, max_wait=600, *, use_cases=None)

A blocking call that creates a new Dataset from data stored at a URL.
Returns when the dataset has been successfully uploaded and processed.

- Parameters:
- Returns: response – The Dataset created from the uploaded data
- Return type: Dataset

#### classmethod create_from_project(cls, project_id, categories=None, max_wait=600, *, use_cases=None)

A blocking call that creates a new dataset from project data.
Returns when the dataset has been successfully created.

- Parameters:
- Returns: response – The dataset created from the project dataset.
- Return type: Dataset

#### classmethod create_from_datastage(cls, datastage_id, categories=None, max_wait=600, *, use_cases=None)

A blocking call that creates a new Dataset from data stored as a DataStage.
Returns when the dataset has been successfully uploaded and processed.

- Parameters:
- Returns: response – The Dataset created from the uploaded data
- Return type: Dataset

#### classmethod create_from_data_source(cls, data_source_id, username=None, password=None, do_snapshot=None, persist_data_after_ingestion=None, categories=None, credential_id=None, use_kerberos=None, credential_data=None, sample_size=None, max_wait=600, *, use_cases=None)

A blocking call that creates a new Dataset from data stored at a DataSource.
Returns when the dataset has been successfully uploaded and processed.

Added in version v2.22.

- Parameters:
- Returns: response – The Dataset created from the uploaded data
- Return type: Dataset

#### classmethod create_from_query_generator(cls, generator_id, dataset_id=None, dataset_version_id=None, max_wait=600, *, use_cases=None)

A blocking call that creates a new Dataset from the query generator.
Returns when the dataset has been successfully processed. If the optional
parameters are not specified, the query is applied to the dataset_id
and dataset_version_id stored in the query generator. If they are specified,
they override the stored dataset_id/dataset_version_id, e.g., to prepare a
prediction dataset.

- Parameters:
- Returns: response – The Dataset created from the query generator
- Return type: Dataset

#### classmethod create_from_recipe(cls, recipe, name=None, do_snapshot=None, persist_data_after_ingestion=None, categories=None, credential=None, credential_id=None, use_kerberos=None, materialization_destination=None, max_wait=600, *, use_cases=None)

A blocking call that creates a new Dataset from the recipe.
Returns when the dataset has been successfully uploaded and processed.

Added in version 3.6.

- Returns: response – The Dataset created from the uploaded data
- Return type: Dataset

#### classmethod get(dataset_id)

Get information about a dataset.

- Parameters: dataset_id ( string ) – the ID of the dataset
- Returns: dataset – the queried dataset
- Return type: Dataset

#### classmethod delete(dataset_id)

Soft deletes a dataset. A deleted dataset cannot be retrieved, listed, or otherwise used;
the only available action is un-deleting it.

- Parameters: dataset_id ( string ) – The id of the dataset to mark for deletion
- Return type: None

#### classmethod un_delete(dataset_id)

Un-deletes a previously deleted dataset.  If the dataset was not deleted, nothing happens.

- Parameters: dataset_id ( string ) – The id of the dataset to un-delete
- Return type: None

#### classmethod list(category=None, filter_failed=None, order_by=None, use_cases=None)

List all datasets a user can view.

- Parameters:
- Returns: a list of datasets the user can view
- Return type: list[Dataset]

#### classmethod iterate(offset=None, limit=None, category=None, order_by=None, filter_failed=None, use_cases=None)

Get an iterator for the requested datasets a user can view.
This lazily retrieves results. It does not get the next page from the server until the
current page is exhausted.

- Parameters:
- Yields: Dataset – An iterator of the datasets the user can view.
- Return type: Generator [ TypeVar ( TDataset , bound= Dataset), None , None ]
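
Because `iterate` is lazy, taking only the first few results avoids requesting later pages. The sketch below illustrates this with `itertools.islice`. `first_n` is a hypothetical helper, and `fake_paged_source` is a stand-in generator that only mimics page-by-page fetching so the snippet runs without a server; the dataset IDs are placeholders.

```python
from itertools import islice


def first_n(iterator, n):
    """Consume at most `n` items from a lazy iterator."""
    return list(islice(iterator, n))


def fake_paged_source(pages, fetch_log):
    """Stand-in for a paginated endpoint: logs each page as it is fetched."""
    for page_number, page in enumerate(pages):
        fetch_log.append(page_number)
        yield from page


fetch_log = []
pages = [["ds-1", "ds-2"], ["ds-3", "ds-4"], ["ds-5"]]
top_three = first_n(fake_paged_source(pages, fetch_log), 3)
# top_three holds the first three items; the third page was never fetched
```

With the real client, the same helper could wrap the generator, for example `first_n(Dataset.iterate(), 10)`.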

#### update()

Updates the Dataset attributes in place with the latest information from the server.

- Return type: None

#### modify(name=None, categories=None)

Modifies the Dataset name and/or categories.  Updates the object in place.

- Parameters:
- Return type: None

#### share(access_list, apply_grant_to_linked_objects=False)

Modify the ability of users to access this dataset

- Parameters:
- Return type: None
- Raises: datarobot.ClientError – If you do not have permission to share this dataset, if the user you’re sharing with
      doesn’t exist, if the same user appears multiple times in the access_list, or if these
      changes would leave the dataset without an owner.

> [!NOTE] Examples
> Transfer access to the dataset from [old_user@datarobot.com](mailto:old_user@datarobot.com) to [new_user@datarobot.com](mailto:new_user@datarobot.com)
> 
> ```
> from datarobot.enums import SHARING_ROLE
> from datarobot.models.dataset import Dataset
> from datarobot.models.sharing import SharingAccess
> 
> new_access = SharingAccess(
>     "new_user@datarobot.com",
>     SHARING_ROLE.OWNER,
>     can_share=True,
> )
> access_list = [
>     SharingAccess(
>         "old_user@datarobot.com",
>         SHARING_ROLE.OWNER,
>         can_share=True,
>         can_use_data=True,
>     ),
>     new_access,
> ]
> 
> Dataset.get('my-dataset-id').share(access_list)
> ```
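
Since `share` raises a `ClientError` when the same user appears more than once in `access_list`, a local check before the call can surface the problem earlier. The sketch below is one way to do that. `duplicated_usernames` is a hypothetical helper, and `Access` is a stand-in for `SharingAccess` so the snippet runs offline; it is an assumption here that the real class also exposes a `username` attribute.

```python
from collections import Counter, namedtuple

# Stand-in for datarobot.SharingAccess (assumed to expose `username`).
Access = namedtuple("Access", ["username", "role"])


def duplicated_usernames(access_list):
    """Return usernames that appear more than once, in sorted order."""
    counts = Counter(access.username for access in access_list)
    return sorted(name for name, count in counts.items() if count > 1)


access_list = [
    Access("a@example.com", "OWNER"),
    Access("b@example.com", "USER"),
    Access("a@example.com", "USER"),
]
duplicates = duplicated_usernames(access_list)
if duplicates:
    print(f"Remove duplicate entries before sharing: {duplicates}")
```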

#### get_details()

Gets the details for this Dataset

- Return type: DatasetDetails

#### get_all_features(order_by=None)

Get a list of all the features for this dataset.

- Parameters: order_by ( string , optional ) – How the features should be ordered. Can be ‘name’ or ‘featureType’.
  If unset, the server default (‘name’) is used.
- Return type: list[DatasetFeature]

#### iterate_all_features(offset=None, limit=None, order_by=None)

Get an iterator for the requested features of a dataset.
This lazily retrieves results. It does not get the next page from the server until the
current page is exhausted.

- Parameters:
- Yields: DatasetFeature
- Return type: Generator [ DatasetFeature , None , None ]

#### get_featurelists()

Get DatasetFeaturelists created on this Dataset

- Returns: feature_lists
- Return type: list[DatasetFeaturelist]

#### create_featurelist(name, features)

Create a new dataset featurelist

- Parameters:
- Returns: featurelist – the newly created featurelist
- Return type: DatasetFeaturelist

> [!NOTE] Examples
> ```
> dataset = Dataset.get('1234deadbeeffeeddead4321')
> dataset_features = dataset.get_all_features()
> selected_features = [feat.name for feat in dataset_features][:5]  # select first five
> new_flist = dataset.create_featurelist('Simple Features', selected_features)
> ```

#### get_file(file_path=None, filelike=None)

Retrieves all the originally uploaded data in CSV form.
Writes it to either the file or a filelike object that can write bytes.

Only one of file_path or filelike can be provided, and it must be provided as a
keyword argument (e.g., file_path=‘path-to-write-to’). If a file-like object is
provided, the caller is responsible for closing it when done.

The user must also have permission to download data.

- Parameters:
- Return type: None
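
As an illustration of the `filelike` option, the sketch below downloads into an in-memory `io.BytesIO` buffer and parses the CSV with the standard library. `read_dataset_rows` is a hypothetical helper, and decoding the bytes as UTF-8 is an assumption, not something this page states.

```python
import csv
import io


def read_dataset_rows(dataset):
    """Download a dataset's CSV into memory and parse it into dictionaries.

    `dataset` is expected to expose `get_file(filelike=...)`, as a
    `Dataset` instance does.
    """
    buffer = io.BytesIO()
    try:
        dataset.get_file(filelike=buffer)  # must be passed by keyword
        text = buffer.getvalue().decode("utf-8")  # assumed encoding
    finally:
        buffer.close()  # the caller owns the file-like object
    return list(csv.DictReader(io.StringIO(text)))
```

For large datasets, writing to disk with `file_path=` is likely the better fit, since this approach holds the whole file in memory.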

#### get_as_dataframe(low_memory=False)

Retrieves all the originally uploaded data in a pandas DataFrame.

Added in version v3.0.

- Parameters: low_memory ( Optional[bool] ) – If True, use local files to reduce memory usage, at the cost of slower retrieval.
- Return type: pd.DataFrame

#### get_projects()

Retrieves the Dataset’s projects as ProjectLocation named tuples.

- Returns: locations
- Return type: list[ProjectLocation]

#### get_raw_sample_data()

Retrieves the raw sample data for the dataset as a pandas DataFrame.
The raw sample dataset is a subset of the full dataset.

Added in version v3.10.

- Returns: A DataFrame with the dataset’s raw sample data.
- Return type: pd.DataFrame

#### create_project(project_name=None, user=None, password=None, credential_id=None, use_kerberos=None, credential_data=None, *, use_cases=None)

Create a [datarobot.models.Project](https://docs.datarobot.com/en/docs/api/reference/sdk/projects.html#datarobot.models.Project) from this dataset

- Parameters:
- Return type: Project

#### classmethod create_version_from_file(dataset_id, file_path=None, filelike=None, categories=None, read_timeout=600, max_wait=600)

A blocking call that creates a new Dataset version from a file. Returns when the new dataset
version has been successfully uploaded and processed.

Warning: This function does not clean up its open files. If you pass a filelike, you are
responsible for closing it. If you pass a file_path, this function creates a file object from
the file_path but does not close it.

Added in version v2.23.

- Parameters:
- Returns: response – A fully armed and operational Dataset version
- Return type: Dataset

#### classmethod create_version_from_in_memory_data(dataset_id, data_frame=None, records=None, categories=None, read_timeout=600, max_wait=600)

A blocking call that creates a new Dataset version for a dataset from in-memory data.
Returns when the dataset has been successfully uploaded and processed.

The data can be either a pandas DataFrame or a list of dictionaries with identical keys.

Added in version v2.23.

- Parameters:
  - dataset_id ( string ) – The ID of the dataset for which a new version is to be created
  - data_frame ( DataFrame , optional ) – The data frame to upload
  - records ( list[dict] , optional ) – A list of dictionaries with identical keys to upload
  - categories ( list[string] , optional ) – An array of strings describing the intended use of the dataset. The
    current supported options are “TRAINING” and “PREDICTION”.
  - read_timeout ( Optional[int] ) – The maximum number of seconds to wait for the server to respond indicating that the
    initial upload is complete
  - max_wait ( Optional[int] ) – Time in seconds after which project creation is considered unsuccessful
- Returns: response – The Dataset version created from the uploaded data
- Return type: Dataset
- Raises: InvalidUsageError – If neither a DataFrame nor a list of records is passed.

#### classmethod create_version_from_url(dataset_id, url, categories=None, max_wait=600)

A blocking call that creates a new Dataset version from data stored at a URL for a given dataset.
Returns when the dataset version has been successfully uploaded and processed.

Added in version v2.23.

- Parameters:
- Returns: response – The Dataset version created from the uploaded data
- Return type: Dataset

#### classmethod create_version_from_datastage(dataset_id, datastage_id, categories=None, max_wait=600)

A blocking call that creates a new Dataset version from data stored as a DataStage for a given dataset.
Returns when the dataset version has been successfully uploaded and processed.

- Parameters:
- Returns: response – The Dataset version created from the uploaded data
- Return type: Dataset

#### classmethod create_version_from_data_source(dataset_id, data_source_id, username=None, password=None, categories=None, credential_id=None, use_kerberos=None, credential_data=None, max_wait=600)

A blocking call that creates a new Dataset version from data stored at a DataSource.
Returns when the dataset version has been successfully uploaded and processed.

Added in version v2.23.

- Parameters:
- Returns: response – The Dataset version created from the uploaded data
- Return type: Dataset

#### classmethod create_version_from_recipe(dataset_id, recipe, credential=None, credential_id=None, use_kerberos=None, max_wait=600)

A blocking call that creates a new Dataset version from a recipe.
Returns when the dataset has been successfully uploaded and processed.

Added in version v3.8.

- Parameters:
- Returns: response – The Dataset version created from the uploaded data
- Return type: Dataset

#### classmethod from_data(data)

Instantiate an object of this class using a dict.

- Parameters: data ( dict ) – Correctly snake_cased keys and their values.
- Return type: TypeVar ( T , bound= APIObject)

#### classmethod from_server_data(data, keep_attrs=None)

Instantiate an object of this class using data received directly from the server,
meaning that the keys may still be camelCased rather than snake_cased.

- Parameters:
- Return type: TypeVar ( T , bound= APIObject)
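
To make the difference between `from_data` (snake_cased keys) and `from_server_data` (keys as sent by the server) concrete, the sketch below converts camelCase keys to snake_case. It is an illustration of the idea only, not the SDK's implementation; `camel_to_snake`, `snake_case_keys`, and the sample payload are all hypothetical.

```python
import re


def camel_to_snake(name):
    """Convert a camelCase key such as 'datasetId' to 'dataset_id'."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def snake_case_keys(payload):
    """Recursively convert dictionary keys; values are left unchanged."""
    if isinstance(payload, dict):
        return {
            camel_to_snake(key): snake_case_keys(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [snake_case_keys(item) for item in payload]
    return payload


server_data = {"datasetId": "abc", "versionInfo": {"isLatestVersion": True}}
converted = snake_case_keys(server_data)
```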

#### open_in_browser()

Opens the web browser location relevant to this class.
If the default browser is not available, the URL is logged.

Note:
If text-mode browsers are used, the calling process will block
until the user exits the browser.

- Return type: None

### class datarobot.DatasetDetails

Represents a detailed view of a Dataset. The to_dataset method creates a Dataset
from this details view.

- Variables:

#### classmethod get(dataset_id)

Get details for a Dataset from the server

- Parameters: dataset_id ( str ) – The id for the Dataset from which to get details
- Return type: DatasetDetails

#### to_dataset()

Build a Dataset object from the information in this object

- Return type: Dataset

### class datarobot.models.dataset.ProjectLocation

ProjectLocation(url, id)

#### id

Alias for field number 1

#### url

Alias for field number 0
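
The field order matters when a `ProjectLocation` is unpacked positionally: `url` comes first, then `id`. The sketch below shows this with a local named tuple that mirrors the documented field order; it is a stand-in so the snippet runs without the SDK, and the URL and ID values are placeholders.

```python
from collections import namedtuple

# Local stand-in with the same field order as documented above.
ProjectLocation = namedtuple("ProjectLocation", ["url", "id"])

location = ProjectLocation("https://app.example.com/projects/abc123", "abc123")

# Positional unpacking follows the field order: url first, then id.
url, project_id = location

# Access by name is usually clearer and does not depend on the order.
same_url = location.url
same_id = location.id
```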

## Secondary datasets

### class datarobot.helpers.feature_discovery.SecondaryDataset

A secondary dataset to be used for feature discovery

Added in version v2.25.

- Variables:

> [!NOTE] Examples
> ```
> import datarobot as dr
> dataset_definition = dr.SecondaryDataset(
>     identifier='profile',
>     catalog_id='5ec4aec1f072bc028e3471ae',
>     catalog_version_id='5ec4aec2f072bc028e3471b1',
> )
> ```

## Secondary dataset configurations

### class datarobot.models.SecondaryDatasetConfigurations

Create secondary dataset configurations for a given project

Added in version v2.20.

- Variables:

#### classmethod create(project_id, secondary_datasets, name, featurelist_id=None)

Create secondary dataset configurations.

Added in version v2.20.

- Parameters:
- Return type: an instance of SecondaryDatasetConfigurations
- Raises: ClientError – raised if incorrect configuration parameters are provided

> [!NOTE] Examples
> ```
> import datarobot as dr
> 
> # `project` is assumed to be an existing dr.Project instance
> profile_secondary_dataset = dr.SecondaryDataset(
>     identifier='profile',
>     catalog_id='5ec4aec1f072bc028e3471ae',
>     catalog_version_id='5ec4aec2f072bc028e3471b1',
>     snapshot_policy='latest'
> )
> 
> transaction_secondary_dataset = dr.SecondaryDataset(
>     identifier='transaction',
>     catalog_id='5ec4aec268f0f30289a03901',
>     catalog_version_id='5ec4aec268f0f30289a03900',
>     snapshot_policy='latest'
> )
> 
> secondary_datasets = [profile_secondary_dataset, transaction_secondary_dataset]
> new_secondary_dataset_config = dr.SecondaryDatasetConfigurations.create(
>     project_id=project.id,
>     name='My config',
>     secondary_datasets=secondary_datasets
> )
> 
> >>> new_secondary_dataset_config.id
> '5fd1e86c589238a4e635e93d'
> ```

#### delete()

Removes the secondary dataset configuration

Added in version v2.21.

- Raises: ClientError – Raised if an invalid or already deleted secondary dataset config id is provided
- Return type: None

> [!NOTE] Examples
> ```
> # Deleting with a valid secondary_dataset_config id
> dr.SecondaryDatasetConfigurations(id=some_config_id).delete()
> ```

#### get()

Retrieve a single secondary dataset configuration for a given id

Added in version v2.21.

- Returns: secondary_dataset_configurations – The requested secondary dataset configurations
- Return type: SecondaryDatasetConfigurations

> [!NOTE] Examples
> ```
> config_id = '5fd1e86c589238a4e635e93d'
> secondary_dataset_config = dr.SecondaryDatasetConfigurations(id=config_id).get()
> >>> secondary_dataset_config
> {
>      'created': datetime.datetime(2020, 12, 9, 6, 16, 22, tzinfo=tzutc()),
>      'creator_full_name': u'abc@datarobot.com',
>      'creator_user_id': u'asdf4af1gf4bdsd2fba1de0a',
>      'credential_ids': None,
>      'featurelist_id': None,
>      'id': u'5fd1e86c589238a4e635e93d',
>      'is_default': True,
>      'name': u'My config',
>      'project_id': u'5fd06afce2456ec1e9d20457',
>      'project_version': None,
>      'secondary_datasets': [
>             {
>                 'snapshot_policy': u'latest',
>                 'identifier': u'profile',
>                 'catalog_version_id': u'5fd06b4af24c641b68e4d88f',
>                 'catalog_id': u'5fd06b4af24c641b68e4d88e'
>             },
>             {
>                 'snapshot_policy': u'dynamic',
>                 'identifier': u'transaction',
>                 'catalog_version_id': u'5fd1e86c589238a4e635e98e',
>                 'catalog_id': u'5fd1e86c589238a4e635e98d'
>             }
>      ]
> }
> ```

#### classmethod list(project_id, featurelist_id=None, limit=None, offset=None)

Returns a list of secondary dataset configurations.

Added in version v2.23.

- Parameters:
- Returns: secondary_dataset_configurations – The requested list of secondary dataset configurations for a given project
- Return type: list of SecondaryDatasetConfigurations

> [!NOTE] Examples
> ```
> pid = '5fd06afce2456ec1e9d20457'
> secondary_dataset_configs = dr.SecondaryDatasetConfigurations.list(pid)
> >>> secondary_dataset_configs[0]
>     {
>          'created': datetime.datetime(2020, 12, 9, 6, 16, 22, tzinfo=tzutc()),
>          'creator_full_name': u'abc@datarobot.com',
>          'creator_user_id': u'asdf4af1gf4bdsd2fba1de0a',
>          'credential_ids': None,
>          'featurelist_id': None,
>          'id': u'5fd1e86c589238a4e635e93d',
>          'is_default': True,
>          'name': u'My config',
>          'project_id': u'5fd06afce2456ec1e9d20457',
>          'project_version': None,
>          'secondary_datasets': [
>                 {
>                     'snapshot_policy': u'latest',
>                     'identifier': u'profile',
>                     'catalog_version_id': u'5fd06b4af24c641b68e4d88f',
>                     'catalog_id': u'5fd06b4af24c641b68e4d88e'
>                 },
>                 {
>                     'snapshot_policy': u'dynamic',
>                     'identifier': u'transaction',
>                     'catalog_version_id': u'5fd1e86c589238a4e635e98e',
>                     'catalog_id': u'5fd1e86c589238a4e635e98d'
>                 }
>          ]
>     }
> ```

## Data engine query generator

### class datarobot.DataEngineQueryGenerator

DataEngineQueryGenerator is used to set up time series data prep.

Added in version v2.27.

- Variables:

#### classmethod create(generator_type, datasets, generator_settings)

Creates a query generator entity.

Added in version v2.27.

- Parameters:
- Returns: query_generator – The created generator
- Return type: DataEngineQueryGenerator

> [!NOTE] Examples
> ```
> import datarobot as dr
> from datarobot.models.data_engine_query_generator import (
>    QueryGeneratorDataset,
>    QueryGeneratorSettings,
> )
> dataset = QueryGeneratorDataset(
>    alias='My_Awesome_Dataset_csv',
>    dataset_id='61093144cabd630828bca321',
>    dataset_version_id=1,
> )
> settings = QueryGeneratorSettings(
>    datetime_partition_column='date',
>    time_unit='DAY',
>    time_step=1,
>    default_numeric_aggregation_method='sum',
>    default_categorical_aggregation_method='mostFrequent',
> )
> g = dr.DataEngineQueryGenerator.create(
>    generator_type='TimeSeries',
>    datasets=[dataset],
>    generator_settings=settings,
> )
> >>> g.id
> '54e639a18bd88f08078ca831'
> >>> g.generator_type
> 'TimeSeries'
> ```

#### classmethod get(generator_id)

Gets information about a query generator.

- Parameters: generator_id ( str ) – The identifier of the query generator you want to load.
- Returns: query_generator – The queried generator
- Return type: DataEngineQueryGenerator

> [!NOTE] Examples
> ```
> import datarobot as dr
> g = dr.DataEngineQueryGenerator.get(generator_id='54e639a18bd88f08078ca831')
> >>> g.id
> '54e639a18bd88f08078ca831'
> >>> g.generator_type
> 'TimeSeries'
> ```

#### create_dataset(dataset_id=None, dataset_version_id=None, max_wait=600)

A blocking call that creates a new Dataset from the query generator.
Returns when the dataset has been successfully processed. If the optional
parameters are not specified, the query is applied to the dataset_id
and dataset_version_id stored in the query generator. If specified, they
override the stored dataset_id and dataset_version_id, for example to prep a
prediction dataset.

- Parameters:
- Returns: response – The Dataset created from the query generator
- Return type: Dataset
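
The two calling styles described above can be sketched as follows. The wrapper name `create_training_and_scoring_datasets` and its parameter names are hypothetical; the SDK calls follow the signatures listed on this page, and running the function for real assumes a configured DataRobot client and an existing generator.

```python
def create_training_and_scoring_datasets(
    generator_id,
    scoring_dataset_id,
    scoring_dataset_version_id,
    max_wait=600,
):
    """Run one query generator on its stored dataset and on an override."""
    # Imported lazily so this sketch can be defined without the SDK installed.
    import datarobot as dr

    generator = dr.DataEngineQueryGenerator.get(generator_id=generator_id)

    # No dataset arguments: the query runs on the dataset stored in the generator.
    training_dataset = generator.create_dataset(max_wait=max_wait)

    # Explicit arguments override the stored dataset, e.g. to prep scoring data.
    scoring_dataset = generator.create_dataset(
        dataset_id=scoring_dataset_id,
        dataset_version_id=scoring_dataset_version_id,
        max_wait=max_wait,
    )
    return training_dataset, scoring_dataset
```

Both calls block until processing finishes or `max_wait` is exceeded, so a caller may want to raise `max_wait` for large datasets.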

#### prepare_prediction_dataset_from_catalog(project_id, dataset_id, dataset_version_id=None, max_wait=600, relax_known_in_advance_features_check=None)

Apply time series data prep to a catalog dataset and upload it to the project
as a PredictionDataset.

Added in version v3.1.

- Parameters:
- Returns: dataset – The newly uploaded dataset.
- Return type: PredictionDataset
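
As a rough usage sketch, the method is called on a generator that has already been retrieved. The wrapper name `prepare_catalog_dataset_for_predictions` is hypothetical; the SDK calls follow the signatures listed on this page and assume a configured DataRobot client, an existing generator, and a catalog dataset the caller can access.

```python
def prepare_catalog_dataset_for_predictions(
    generator_id,
    project_id,
    dataset_id,
    max_wait=600,
):
    """Apply a generator's time series data prep to a catalog dataset."""
    # Imported lazily so this sketch can be defined without the SDK installed.
    import datarobot as dr

    generator = dr.DataEngineQueryGenerator.get(generator_id=generator_id)

    # dataset_version_id is left at its default of None in this sketch.
    return generator.prepare_prediction_dataset_from_catalog(
        project_id=project_id,
        dataset_id=dataset_id,
        max_wait=max_wait,
    )
```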

#### prepare_prediction_dataset(sourcedata, project_id, max_wait=600, relax_known_in_advance_features_check=None)

Apply time series data prep and upload the PredictionDataset to the project.

Added in version v3.1.

- Parameters:
- Returns: dataset – The newly uploaded dataset.
- Return type: PredictionDataset
- Raises:

## Sharing access

### class datarobot.SharingAccess

Represents metadata about whom an entity (e.g., a data store) has been shared with

Added in version v2.14.

Currently, [DataStores](https://docs.datarobot.com/en/docs/api/reference/sdk/data-connectivity.html#datarobot.DataStore), [DataSources](https://docs.datarobot.com/en/docs/api/reference/sdk/data-connectivity.html#datarobot.DataSource), [Datasets](https://docs.datarobot.com/en/docs/api/reference/sdk/data-registry.html#datarobot.models.Dataset), [Projects](https://docs.datarobot.com/en/docs/api/reference/sdk/projects.html#datarobot.models.Project) (new in version v2.15), and [CalendarFiles](https://docs.datarobot.com/en/docs/api/reference/sdk/projects.html#datarobot.CalendarFile) (new in version v2.15) can be shared.

This class can represent either access that has already been granted, or be used to grant access
to additional users.

- Variables:

## Sharing role

### class datarobot.models.sharing.SharingRole

Represents metadata about a user who has been granted access to an entity.
At least one of id or username must be set.

- Variables: