Database connectivity¶

class datarobot.DataDriver¶

A data driver

Variables:
- id (str) – the id of the driver.
- class_name (str) – the Java class name for the driver.
- canonical_name (str) – the user-friendly name of the driver.
- creator (str) – the id of the user who created the driver.
- base_names (List[str]) – a list of the file name(s) of the jar files.

classmethod list(typ=None)¶

Returns list of available drivers.

Parameters: typ (DataDriverListTypes) – If specified, filters by specified driver type.
Returns: drivers – contains a list of available drivers.
Return type: list of DataDriver instances

Examples

>>> import datarobot as dr
>>> drivers = dr.DataDriver.list()
>>> drivers
[DataDriver('mysql'), DataDriver('RedShift'), DataDriver('PostgreSQL')]
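
Because list() returns DataDriver instances, picking one by its user-friendly name is a small filter over canonical_name. The following is a minimal sketch, not part of the client: find_driver is a hypothetical helper, and SimpleNamespace objects stand in for DataDriver instances so that it runs without a DataRobot server.

```python
from types import SimpleNamespace


def find_driver(drivers, canonical_name):
    """Return the first driver whose canonical_name matches (ignoring case), or None."""
    wanted = canonical_name.casefold()
    for driver in drivers:
        if driver.canonical_name.casefold() == wanted:
            return driver
    return None


# Stand-ins for the objects returned by dr.DataDriver.list() (illustrative ids).
drivers = [
    SimpleNamespace(id="a1", canonical_name="mysql"),
    SimpleNamespace(id="b2", canonical_name="PostgreSQL"),
]

match = find_driver(drivers, "postgresql")
missing = find_driver(drivers, "oracle")
```

With a live client, the same helper could be called as find_driver(dr.DataDriver.list(), 'PostgreSQL').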

classmethod get(driver_id)¶

Gets the driver.

Parameters: driver_id (str) – the identifier of the driver.
Returns: driver – the required driver.
Return type: DataDriver

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.get('5ad08a1889453d0001ea7c5c')
>>> driver
DataDriver('PostgreSQL')

classmethod create(class_name, canonical_name, files=None, typ=None, database_driver=None)¶

Creates the driver. Only available to admin users.

Parameters:
- class_name (str) – the Java class name for the driver. Specify None if typ is DataDriverTypes.DR_DATABASE_V1.
- canonical_name (str) – the user-friendly name of the driver.
- files (List[str]) – a list of paths on the local file system to the jar file(s) for the driver.
- typ (str) – Optional. Specify the type of the driver. Defaults to DataDriverTypes.JDBC, may also be DataDriverTypes.DR_DATABASE_V1.
- database_driver (str) – Optional. Specify when typ is DataDriverTypes.DR_DATABASE_V1 to create a native database driver. See DrDatabaseV1Types enum for some of the types, but that list may not be exhaustive.
Returns: driver – the created driver.
Return type: DataDriver
Raises: ClientError – raised if the user has not been granted the Can manage JDBC database drivers feature.

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.create(
...     class_name='org.postgresql.Driver',
...     canonical_name='PostgreSQL',
...     files=['/tmp/postgresql-42.2.2.jar']
... )
>>> driver
DataDriver('PostgreSQL')

update(class_name=None, canonical_name=None)¶

Updates the driver. Only available to admin users.

Parameters:
- class_name (str) – the Java class name for the driver.
- canonical_name (str) – the user-friendly name of the driver.
Raises: ClientError – raised if the user has not been granted the Can manage JDBC database drivers feature.
Return type: None

Examples

>>> import datarobot as dr
>>> driver = dr.DataDriver.get('5ad08a1889453d0001ea7c5c')
>>> driver.canonical_name
'PostgreSQL'
>>> driver.update(canonical_name='postgres')
>>> driver.canonical_name
'postgres'

delete()¶

Removes the driver. Only available to admin users.

Raises: ClientError – raised if the user has not been granted the Can manage JDBC database drivers feature.
Return type: None

class datarobot.Connector¶

A connector

Variables:
- id (str) – the id of the connector.
- creator_id (str) – the id of the user who created the connector.
- base_name (str) – the file name of the jar file.
- canonical_name (str) – the user-friendly name of the connector.
- configuration_id (str) – the id of the configuration of the connector.

classmethod list(data_type=None)¶

Returns list of available connectors.

Parameters: data_type (DataTypes) – If specified, returns the connectors that support the specified data type. If not specified, it will default to DataTypes.ALL
Returns: connectors – contains a list of available connectors.
Return type: list of Connector instances

Examples

>>> import datarobot as dr
>>> connectors = dr.Connector.list()
>>> connectors
[Connector('ADLS Gen2 Connector'), Connector('S3 Connector')]

classmethod get(connector_id)¶

Gets the connector.

Parameters: connector_id (str) – the identifier of the connector.
Returns: connector – the required connector.
Return type: Connector

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.get('5fe1063e1c075e0245071446')
>>> connector
Connector('ADLS Gen2 Connector')

classmethod create(file_path=None, connector_type=None)¶

Creates the connector from a jar file. Only available to admin users.

Parameters:
- file_path (str) – (Deprecated in version v3.6) the path on the local file system to the jar file for the Java-based connector.
- connector_type (str) – the type of the native connector to create.
Returns: connector – the created connector.
Return type: Connector
Raises: ClientError – raised if the user has not been granted the Can manage connectors feature.

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.create('/tmp/connector-adls-gen2.jar')
>>> connector
Connector('ADLS Gen2 Connector')

update(file_path)¶

Updates the connector with new jar file. Only available to admin users.

Parameters: file_path (str) – (Deprecated in version v3.6) the path on the local file system to the jar file for the Java-based connector.
Returns: connector – the updated connector.
Return type: Connector
Raises: ClientError – raised if the user has not been granted the Can manage connectors feature.

Examples

>>> import datarobot as dr
>>> connector = dr.Connector.get('5fe1063e1c075e0245071446')
>>> connector.base_name
'connector-adls-gen2.jar'
>>> connector.update('/tmp/connector-s3.jar')
>>> connector.base_name
'connector-s3.jar'

delete()¶

Removes the connector. Only available to admin users.

Raises: ClientError – raised if the user has not been granted the Can manage connectors feature.
Return type: None

class datarobot.DataStore¶

A data store. Represents a database.

Variables:
- id (str) – The id of the data store.
- data_store_type (str) – The type of data store.
- canonical_name (str) – The user-friendly name of the data store.
- creator (str) – The id of the user who created the data store.
- updated (datetime.datetime) – The time of the last update
- params (DataStoreParameters) – A list specifying data store parameters.
- role (str) – Your access role for this data store.

classmethod list(typ=None, name=None, substitute_url_parameters=False, data_type=None)¶

Returns list of available data stores.

Parameters:
- typ (str) – If specified, filters by specified data store type. If not specified, the default is DataStoreListTypes.JDBC.
- name (str) – If specified, filters by data store names that match or contain this name. The search is case-insensitive.
- substitute_url_parameters (bool) – If True, dynamic parameters in the URL will be substituted.
- data_type (DataTypes) – If specified, filters data stores which support the specified data type. If not specified it will default to DataTypes.ALL
Returns: data_stores – contains a list of available data stores.
Return type: list of DataStore instances

Examples

>>> import datarobot as dr
>>> data_stores = dr.DataStore.list()
>>> data_stores
[DataStore('Demo'), DataStore('Airlines')]
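
The name filter is described as matching names that equal or contain the given string, ignoring case. The sketch below reproduces that rule locally on plain strings to make the behavior concrete; name_matches is a hypothetical helper, 'Demo DB' is an illustrative name, and the server's exact matching may differ in details such as Unicode handling.

```python
def name_matches(store_name, query):
    """True when query equals or is contained in store_name, ignoring case."""
    return query.casefold() in store_name.casefold()


# Plain strings stand in for the canonical_name of each data store.
names = ["Demo", "Airlines", "Demo DB"]
matching = [name for name in names if name_matches(name, "demo")]
```

Under this rule, a call such as dr.DataStore.list(name='demo') would be expected to return both 'Demo' and 'Demo DB'.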

classmethod get(data_store_id, substitute_url_parameters=False)¶

Gets the data store.

Parameters:
- data_store_id (str) – the identifier of the data store.
- substitute_url_parameters (bool) – If True, dynamic parameters in the URL will be substituted.
Returns: data_store – the required data store.
Return type: DataStore

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5a8ac90b07a57a0001be501e')
>>> data_store
DataStore('Demo')

classmethod create(data_store_type, canonical_name, driver_id=None, jdbc_url=None, fields=None, connector_id=None)¶

Creates the data store.

Parameters:
- data_store_type (str or DataStoreTypes) – the type of data store.
- canonical_name (str) – the user-friendly name of the data store.
- driver_id (str) – Optional. The identifier of the DataDriver if data_store_type is DataStoreListTypes.JDBC or DataStoreListTypes.DR_DATABASE_V1.
- jdbc_url (str) – Optional. The full JDBC URL (for example: jdbc:postgresql://my.dbaddress.org:5432/my_db).
- fields (list) – Optional. If the type is dr-database-v1, then the fields specify the configuration.
- connector_id (str) – Optional. The identifier of the Connector if data_store_type is DataStoreListTypes.DR_CONNECTOR_V1
Returns: data_store – the created data store.
Return type: DataStore

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.create(
...     data_store_type='jdbc',
...     canonical_name='Demo DB',
...     driver_id='5a6af02eb15372000117c040',
...     jdbc_url='jdbc:postgresql://my.db.address.org:5432/perftest'
... )
>>> data_store
DataStore('Demo DB')
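
The jdbc_url argument must be a full JDBC URL. As a minimal sketch, assuming a PostgreSQL database and the URL shape used in the parameter description, the hypothetical helper below assembles the string from its parts; it is not part of the client, and other databases use different URL formats.

```python
def postgres_jdbc_url(host, database, port=5432):
    """Return a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database."""
    if not host or not database:
        raise ValueError("host and database are required")
    return f"jdbc:postgresql://{host}:{port}/{database}"


jdbc_url = postgres_jdbc_url("my.dbaddress.org", "my_db")
```

The resulting string can then be passed as jdbc_url to DataStore.create.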

update(canonical_name=None, driver_id=None, connector_id=None, jdbc_url=None, fields=None)¶

Updates the data store.

Parameters:
- canonical_name (str) – optional, the user-friendly name of the data store.
- driver_id (str) – Optional. The identifier of the DataDriver, if the type is one of DataStoreTypes.DR_DATABASE_V1 or DataStoreTypes.JDBC.
- connector_id (str) – Optional. The identifier of the Connector, if the type is DataStoreTypes.DR_CONNECTOR_V1.
- jdbc_url (str) – Optional. The full JDBC URL (for example: jdbc:postgresql://my.dbaddress.org:5432/my_db).
- fields (list) – Optional. If the type is dr-database-v1, then the fields specify the configuration.
Return type: None

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store
DataStore('Demo DB')
>>> data_store.update(canonical_name='Demo DB updated')
>>> data_store
DataStore('Demo DB updated')

delete()¶

Removes the DataStore

Return type: None

test(username=None, password=None, credential_id=None, use_kerberos=None, credential_data=None, set_default_credential=False)¶

Tests database connection.

Changed in version v3.2: Added credential_id, use_kerberos and credential_data optional params and made username and password optional.

Changed in version v3.9: If credential_id is provided and set_default_credential is True and the connection test is successful, the credential is set as the default for this data store.

Parameters:
- username (str) – optional, the username for database authentication.
- password (str) – optional, the password for database authentication. The password is encrypted on the server side and is never saved or stored.
- credential_id (str) – optional, the ID of the set of credentials to use instead of username and password.
- use_kerberos (bool) – optional, whether to use Kerberos for data store authentication.
- credential_data (dict) – optional, the credentials to authenticate with the database, to use instead of username/password or credential ID.
- set_default_credential (bool) – optional, if True and credential_id is provided, sets the credential as the default for this data store. Default is False.
Returns: message – message with status.
Return type: dict
Raises: CredentialsError – If unable to set the provided credential_id as default for this data store.

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.test(username='db_username', password='db_password')
{'message': 'Connection successful'}
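
Since test() accepts several credential mechanisms, it can help to assemble the keyword arguments in one place. The sketch below is a hypothetical helper, not part of the client. It assumes a local convention of passing exactly one mechanism (credential_id, or username with password); the client itself is not documented to enforce this.

```python
def build_test_kwargs(username=None, password=None, credential_id=None,
                      set_default_credential=False):
    """Build keyword arguments for DataStore.test() from one credential mechanism."""
    if credential_id is not None:
        if username is not None or password is not None:
            raise ValueError("Pass either credential_id or username/password, not both.")
        return {
            "credential_id": credential_id,
            "set_default_credential": set_default_credential,
        }
    if username is None or password is None:
        raise ValueError("username and password must be provided together.")
    if set_default_credential:
        # Per the parameter description, the default can only be set from a credential_id.
        raise ValueError("set_default_credential requires credential_id.")
    return {"username": username, "password": password}


kwargs = build_test_kwargs(credential_id="my-credential-id", set_default_credential=True)
```

The result would be used as data_store.test(**kwargs); 'my-credential-id' is a placeholder.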

schemas(username, password)¶

Returns list of available schemas.

Parameters:
- username (str) – the username for database authentication.
- password (str) – the password for database authentication. The password is encrypted on the server side and is never saved or stored.
Returns: response – dict with the catalog (database) name and a list of the available schema names (str).
Return type: dict

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.schemas(username='db_username', password='db_password')
{'catalog': 'perftest', 'schemas': ['demo', 'information_schema', 'public']}
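
The returned dict can be post-processed like any other. As a minimal sketch, the hypothetical helper below drops system schemas from the response shown above; the default exclusion set is an assumption suited to PostgreSQL and should be adjusted for other databases.

```python
# Assumed system schema names; not defined by the client.
SYSTEM_SCHEMAS = frozenset({"information_schema", "pg_catalog"})


def user_schemas(response, excluded=SYSTEM_SCHEMAS):
    """Return the schema names from a schemas() response, minus excluded ones."""
    return [name for name in response["schemas"] if name not in excluded]


# The response shape is copied from the example above.
response = {"catalog": "perftest", "schemas": ["demo", "information_schema", "public"]}
visible = user_schemas(response)  # → ['demo', 'public']
```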

tables(username, password, schema=None)¶

Returns list of available tables in schema.

Parameters:
- username (str) – the username for database authentication.
- password (str) – the password for database authentication. The password is encrypted on the server side and is never saved or stored.
- schema (str) – optional, the schema name.
Returns: response – dict with catalog name and tables info
Return type: dict

Examples

>>> import datarobot as dr
>>> data_store = dr.DataStore.get('5ad5d2afef5cd700014d3cae')
>>> data_store.tables(username='db_username', password='db_password', schema='demo')
{'tables': [{'type': 'TABLE', 'name': 'diagnosis', 'schema': 'demo'}, {'type': 'TABLE',
'name': 'kickcars', 'schema': 'demo'}, {'type': 'TABLE', 'name': 'patient',
'schema': 'demo'}, {'type': 'TABLE', 'name': 'transcript', 'schema': 'demo'}],
'catalog': 'perftest'}
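
Each entry under 'tables' carries its own schema and name, so schema-qualified names can be derived directly from the response. The sketch below uses a hypothetical helper, qualified_table_names, applied to the response shown above.

```python
def qualified_table_names(response):
    """Return sorted 'schema.name' strings for every entry in a tables() response."""
    return sorted(f"{table['schema']}.{table['name']}" for table in response["tables"])


# The response shape is copied from the example above.
response = {
    "tables": [
        {"type": "TABLE", "name": "diagnosis", "schema": "demo"},
        {"type": "TABLE", "name": "kickcars", "schema": "demo"},
        {"type": "TABLE", "name": "patient", "schema": "demo"},
        {"type": "TABLE", "name": "transcript", "schema": "demo"},
    ],
    "catalog": "perftest",
}
names = qualified_table_names(response)
```

Here names is ['demo.diagnosis', 'demo.kickcars', 'demo.patient', 'demo.transcript'].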

classmethod from_server_data(data, keep_attrs=None)¶

Instantiate an object of this class using the data directly from the server, meaning that the keys may have the wrong camel casing

Parameters:
- data (dict) – The directly translated dict of JSON from the server. No casing fixes have taken place
- keep_attrs (iterable) – List, set or tuple of the dotted namespace notations for attributes to keep within the object structure even if their values are None
Return type: DataStore

get_shared_roles()¶

Retrieve what users have access to this data store

Added in version v3.2.

Return type: list of SharingRole

share(access_list)¶

Modify the ability of users to access this data store

Added in version v2.14.

Parameters: access_list (list of SharingRole) – the modifications to make.
Return type: None
Raises: datarobot.ClientError – if you do not have permission to share this data store, if the user you’re sharing with doesn’t exist, if the same user appears multiple times in the access_list, or if these changes would leave the data store without an owner.

Examples

The SharingRole class is needed in order to share a Data Store with one or more users.

For example, suppose you had a list of user IDs you wanted to share this DataStore with. You could use a loop to generate a list of SharingRole objects for them, and bulk share this Data Store.

>>> import datarobot as dr
>>> from datarobot.models.sharing import SharingRole
>>> from datarobot.enums import SHARING_ROLE, SHARING_RECIPIENT_TYPE
>>>
>>> user_ids = ["60912e09fd1f04e832a575c1", "639ce542862e9b1b1bfa8f1b", "63e185e7cd3a5f8e190c6393"]
>>> sharing_roles = []
>>> for user_id in user_ids:
...     new_sharing_role = SharingRole(
...         role=SHARING_ROLE.CONSUMER,
...         share_recipient_type=SHARING_RECIPIENT_TYPE.USER,
...         id=user_id,
...         can_share=True,
...     )
...     sharing_roles.append(new_sharing_role)
>>> dr.DataStore.get('my-data-store-id').share(sharing_roles)

Similarly, a SharingRole instance can be used to remove a user’s access if the role is set to SHARING_ROLE.NO_ROLE, like in this example:

>>> import datarobot as dr
>>> from datarobot.models.sharing import SharingRole
>>> from datarobot.enums import SHARING_ROLE, SHARING_RECIPIENT_TYPE
>>>
>>> user_to_remove = "foo.bar@datarobot.com"
>>> remove_sharing_role = SharingRole(
...     role=SHARING_ROLE.NO_ROLE,
...     share_recipient_type=SHARING_RECIPIENT_TYPE.USER,
...     username=user_to_remove,
...     can_share=False,
... )
>>> dr.DataStore.get('my-data-store-id').share([remove_sharing_role])

class datarobot.DataSource¶

A data source. Represents a data request.

Variables:
- id (str) – the id of the data source.
- type (str) – the type of data source.
- canonical_name (str) – the user-friendly name of the data source.
- creator (str) – the id of the user who created the data source.
- updated (datetime.datetime) – the time of the last update.
- params (DataSourceParameters) – a list specifying data source parameters.
- role (str or None) – if a string, represents a particular level of access and should be one of datarobot.enums.SHARING_ROLE. For more information on the specific access levels, see the sharing documentation. If None, can be passed to a share function to revoke access for a specific user.

classmethod list(typ=None)¶

Returns list of available data sources.

Parameters: typ (DataStoreListTypes) – If specified, filters by specified datasource type. If not specified it will default to DataStoreListTypes.DATABASES
Returns: data_sources – contains a list of available data sources.
Return type: list of DataSource instances

Examples

>>> import datarobot as dr
>>> data_sources = dr.DataSource.list()
>>> data_sources
[DataSource('Diagnostics'), DataSource('Airlines 100mb'), DataSource('Airlines 10mb')]

classmethod get(data_source_id)¶

Gets the data source.

Parameters: data_source_id (str) – the identifier of the data source.
Returns: data_source – the requested data source.
Return type: DataSource

Examples

>>> import datarobot as dr
>>> data_source = dr.DataSource.get('5a8ac9ab07a57a0001be501f')
>>> data_source
DataSource('Diagnostics')

classmethod create(data_source_type, canonical_name, params)¶

Creates the data source.

Parameters:
- data_source_type (str or DataStoreTypes) – the type of data source.
- canonical_name (str) – the user-friendly name of the data source.
- params (DataSourceParameters) – a list specifying data source parameters.
Returns: data_source – the created data source.
Return type: DataSource

Examples

>>> import datarobot as dr
>>> params = dr.DataSourceParameters(
...     data_store_id='5a8ac90b07a57a0001be501e',
...     query='SELECT * FROM airlines10mb WHERE "Year" >= 1995;'
... )
>>> data_source = dr.DataSource.create(
...     data_source_type='jdbc',
...     canonical_name='airlines stats after 1995',
...     params=params
... )
>>> data_source
DataSource('airlines stats after 1995')

update(canonical_name=None, params=None)¶

Updates the data source.

Parameters:
- canonical_name (str) – optional, the user-friendly name of the data source.
- params (DataSourceParameters) – optional, the data source parameters.
Return type: None

Examples

>>> import datarobot as dr
>>> data_source = dr.DataSource.get('5ad840cc613b480001570953')
>>> data_source
DataSource('airlines stats after 1995')
>>> params = dr.DataSourceParameters(
...     query='SELECT * FROM airlines10mb WHERE "Year" >= 1990;'
... )
>>> data_source.update(
...     canonical_name='airlines stats after 1990',
...     params=params
... )
>>> data_source
DataSource('airlines stats after 1990')

delete()¶

Removes the DataSource

Return type: None

classmethod from_server_data(data, keep_attrs=None)¶

Instantiate an object of this class using the data directly from the server, meaning that the keys may have the wrong camel casing

Parameters:
- data (dict) – The directly translated dict of JSON from the server. No casing fixes have taken place
- keep_attrs (iterable) – List, set or tuple of the dotted namespace notations for attributes to keep within the object structure even if their values are None
Return type: TypeVar(TDataSource, bound= DataSource)

get_access_list()¶

Retrieve what users have access to this data source

Added in version v2.14.

Return type: list of SharingAccess

share(access_list)¶

Modify the ability of users to access this data source

Added in version v2.14.

Parameters: access_list (list of SharingAccess) – The modifications to make.
Return type: None
Raises: datarobot.ClientError – If you do not have permission to share this data source, if the user you’re sharing with doesn’t exist, if the same user appears multiple times in the access_list, or if these changes would leave the data source without an owner.

Examples

Transfer access to the data source from old_user@datarobot.com to new_user@datarobot.com

from datarobot.enums import SHARING_ROLE
from datarobot.models.data_source import DataSource
from datarobot.models.sharing import SharingAccess

new_access = SharingAccess(
    "new_user@datarobot.com",
    SHARING_ROLE.OWNER,
    can_share=True,
)
access_list = [
    SharingAccess("old_user@datarobot.com", SHARING_ROLE.OWNER, can_share=True),
    new_access,
]

DataSource.get('my-data-source-id').share(access_list)
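
Because sharing raises ClientError when the same user appears more than once in the access_list, a local check before the call can give a clearer error. The sketch below is a hypothetical helper, not part of the client; it assumes each entry exposes a username attribute, and SimpleNamespace objects stand in for SharingAccess instances.

```python
from collections import Counter
from types import SimpleNamespace


def duplicated_usernames(access_list):
    """Return the sorted usernames that occur more than once in access_list."""
    counts = Counter(entry.username for entry in access_list)
    return sorted(name for name, count in counts.items() if count > 1)


# Stand-ins for SharingAccess objects; only the assumed username attribute is modeled.
access_list = [
    SimpleNamespace(username="old_user@datarobot.com"),
    SimpleNamespace(username="new_user@datarobot.com"),
    SimpleNamespace(username="old_user@datarobot.com"),
]
duplicates = duplicated_usernames(access_list)
```

An empty result means that the list passes this one check; the other documented failure conditions are still validated by the server.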

create_dataset(username=None, password=None, do_snapshot=None, persist_data_after_ingestion=None, categories=None, credential_id=None, use_kerberos=None)¶

Create a Dataset from this data source.

Added in version v2.22.

Parameters:
- username (string, optional) – The username for database authentication.
- password (string, optional) – The password (in cleartext) for database authentication. The password will be encrypted on the server side within the scope of the HTTP request and never saved or stored.
- do_snapshot (Optional[bool]) – If unset, uses the server default: True. If true, creates a snapshot dataset; if false, creates a remote dataset. Creating snapshots from non-file sources requires an additional permission, Enable Create Snapshot Data Source.
- persist_data_after_ingestion (Optional[bool]) – If unset, uses the server default: True. If true, will enforce saving all data (for download and sampling) and will allow a user to view the extended data profile (which includes data statistics like min/max/median/mean, histogram, etc.). If false, will not enforce saving data. The data schema (feature names and types) will still be available. Setting this parameter to false while do_snapshot is true will result in an error.
- categories (list[string], optional) – An array of strings describing the intended use of the dataset. The current supported options are “TRAINING” and “PREDICTION”.
- credential_id (string, optional) – The ID of the set of credentials to use instead of username and password. When it is provided, username and password are not required.
- use_kerberos (Optional[bool]) – If unset, uses the server default: False. If true, use kerberos authentication for database authentication.
Returns: response – The Dataset created from the uploaded data
Return type: Dataset
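
The parameter descriptions rule out one combination: persist_data_after_ingestion set to false together with do_snapshot set to true. The sketch below is a hypothetical client-side check, not part of the client. It assumes that an unset value behaves like the documented server default of True, so leaving do_snapshot unset while disabling persistence is also rejected.

```python
def check_snapshot_options(do_snapshot=None, persist_data_after_ingestion=None):
    """Validate the two flags and return only the explicitly set ones as kwargs."""
    # Assumption: None means the documented server default, which is True for both.
    effective_snapshot = True if do_snapshot is None else do_snapshot
    effective_persist = (
        True if persist_data_after_ingestion is None else persist_data_after_ingestion
    )
    if effective_snapshot and not effective_persist:
        raise ValueError(
            "persist_data_after_ingestion=False cannot be combined with do_snapshot=True"
        )
    kwargs = {}
    if do_snapshot is not None:
        kwargs["do_snapshot"] = do_snapshot
    if persist_data_after_ingestion is not None:
        kwargs["persist_data_after_ingestion"] = persist_data_after_ingestion
    return kwargs


remote_kwargs = check_snapshot_options(do_snapshot=False, persist_data_after_ingestion=False)
```

The result would be used as data_source.create_dataset(**remote_kwargs), together with any credential arguments.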

class datarobot.DataSourceParameters¶

Data request configuration

Variables:
- data_store_id (str) – the id of the DataStore.
- table (str) – Optional. The name of specified database table.
- schema (str) – Optional. The name of the schema associated with the table.
- partition_column (str) – Optional. The name of the partition column.
- query (str) – Optional. The user-specified SQL query.
- fetch_size (int) – Optional. A user-specified fetch size in the range [1, 20000]. By default, a fetch size is assigned to balance throughput and memory usage.
- path (str) – Optional. The user-specified path for BLOB storage.
- filter (str) – Optional. A connector-specific filter string (e.g., JQL for Jira). Only supported for DataRobot Connector v1, where applicable.
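
The fetch_size range can be checked locally before constructing DataSourceParameters. The sketch below is a hypothetical helper, not part of the client; it reads [1, 20000] as an inclusive range and treats None as "let the default be assigned".

```python
FETCH_SIZE_MIN = 1
FETCH_SIZE_MAX = 20000


def validate_fetch_size(fetch_size):
    """Return fetch_size unchanged if it is None or an int in [1, 20000]."""
    if fetch_size is None:
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(fetch_size, bool) or not isinstance(fetch_size, int):
        raise TypeError("fetch_size must be an int or None")
    if not FETCH_SIZE_MIN <= fetch_size <= FETCH_SIZE_MAX:
        raise ValueError(
            f"fetch_size must be between {FETCH_SIZE_MIN} and {FETCH_SIZE_MAX}"
        )
    return fetch_size


checked = validate_fetch_size(5000)
```

The validated value could then be passed as fetch_size when building the parameters object.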

Data store¶

class datarobot.models.data_store.TestResponse¶

class datarobot.models.data_store.SchemasResponse¶

class datarobot.models.data_store.TablesResponse¶
