Trino¶
Supported authentication¶
- Basic (username/password)
Prerequisites¶
The following is required before connecting to Trino in DataRobot:
- Data stored in a Trino database
Required parameters¶
The table below lists the minimum required fields to establish a connection with Trino:
| Required field | Description | Documentation |
|---|---|---|
| Host | The hostname or IP address of your Trino coordinator. | Trino documentation |
Troubleshooting¶
| Problem | Solution | Instructions |
|---|---|---|
| When attempting to execute an operation in DataRobot, the firewall requests that you clear the IP address each time. | Add all allowed IPs for DataRobot. | See Allowed source IP addresses. If you've already added the allowed IPs, check the existing IPs for completeness. |
Code examples¶
The Python example below shows how to connect to Trino and import data from it into DataRobot.
Initialize the DataRobot client and define database details for later use:
import datarobot as dr
from datarobot.enums import DataStoreTypes

# DataRobot API token and endpoint
api_token = '<token>'
endpoint = 'https://app.datarobot.com/api/v2'
dr.Client(token=api_token, endpoint=endpoint)

# Trino connection details
TRINO_HOST = "datarobot.trino.galaxy.starburst.io"
TRINO_PORT = 443
USE_SSL = "true"  # passed to the data store as a string
CATALOG = "<catalog>"
SCHEMA = "<schema>"
TABLE = "<table>"
QUERY = None  # set to a SQL string to read from a query instead of a table
TRINO_USERNAME = "<username>"
TRINO_PASSWORD = "<password>"
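Hardcoding the password in a script is convenient for a walkthrough but risky in shared code. As a minimal sketch, the same settings could be read from environment variables instead. The helper name `load_trino_settings` and the variable names (`TRINO_HOST`, `TRINO_PORT`, `TRINO_USE_SSL`, `TRINO_USERNAME`, `TRINO_PASSWORD`) are assumptions made for illustration and are not part of the DataRobot client.

```python
import os
from typing import Mapping, Optional


def load_trino_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read Trino connection settings from environment-style variables.

    The variable names are hypothetical; rename them to fit your setup.
    """
    env = os.environ if env is None else env
    required = ("TRINO_HOST", "TRINO_USERNAME", "TRINO_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"Missing required settings: {', '.join(missing)}")
    return {
        "host": env["TRINO_HOST"],
        "port": int(env.get("TRINO_PORT", "443")),  # default mirrors the example above
        "use_ssl": env.get("TRINO_USE_SSL", "true").lower(),
        "username": env["TRINO_USERNAME"],
        "password": env["TRINO_PASSWORD"],
    }


# Example with an explicit mapping so the sketch runs anywhere;
# call load_trino_settings() with no argument to read os.environ.
settings = load_trino_settings(
    {"TRINO_HOST": "trino.example.com", "TRINO_USERNAME": "user", "TRINO_PASSWORD": "secret"}
)
print(settings["host"], settings["port"])
```

The returned values can then be assigned to `TRINO_HOST`, `TRINO_PORT`, and the other constants used in the rest of this example.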
Do one of the following to locate your Trino driver ID:
- Create the Trino driver ID:
trino_driver = dr.DataDriver.create(
    class_name=DataStoreTypes.DR_DATABASE_V1,
    canonical_name='Trino Driver',
    database_driver='trino-v1',
)
- Reference an existing Trino driver ID:
trino_driver = dr.DataDriver.get('<trino_driver_id>')
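If you do not know the ID of an existing driver, one possible approach is to list the registered drivers and match on the canonical name. The sketch below uses a hypothetical helper, `find_by_attr`, and stand-in objects so that it runs without a DataRobot connection. With a configured client, you would pass the result of `dr.DataDriver.list()` instead, assuming the driver was registered under the name `Trino Driver` as in the option above.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def find_by_attr(items: Iterable[Any], attr: str, value: str) -> Optional[Any]:
    """Return the first item whose attribute `attr` equals `value`, else None."""
    for item in items:
        if getattr(item, attr, None) == value:
            return item
    return None


# Stand-in objects for illustration only; replace with dr.DataDriver.list().
drivers = [
    SimpleNamespace(id="driver-1", canonical_name="PostgreSQL Driver"),
    SimpleNamespace(id="driver-2", canonical_name="Trino Driver"),
]

match = find_by_attr(drivers, "canonical_name", "Trino Driver")
print(match.id if match else "No Trino driver registered")
```

Returning `None` rather than raising lets the caller decide whether to fall back to creating a new driver.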
Create (or reuse) Trino credentials and securely save them in DataRobot:
trino_credentials = dr.Credential.create_basic(
    name='Trino Credentials',
    user=TRINO_USERNAME,
    password=TRINO_PASSWORD,
)
Define a connection to the external data store:
datastore_fields = [
    {"id": "host", "name": "Host Name", "value": TRINO_HOST},
    {"id": "port", "name": "port", "value": str(TRINO_PORT)},
    {"id": "ssl", "name": "ssl", "value": USE_SSL},
]
trino_datastore = dr.DataStore.create(
    data_store_type=DataStoreTypes.DR_DATABASE_V1,
    canonical_name='Trino Datastore',
    driver_id=trino_driver.id,
    fields=datastore_fields,
)
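The field list above passes every value as a string, which is easy to get wrong when the port or SSL flag comes from typed configuration. The following sketch wraps the same three fields in a hypothetical helper, `build_trino_fields`, that validates its inputs first. The checks (no URL scheme in the host, port within 1 to 65535) and the use of `"false"` as the disabled SSL value are assumptions, not documented DataRobot requirements.

```python
from typing import Dict, List


def build_trino_fields(host: str, port: int = 443, use_ssl: bool = True) -> List[Dict[str, str]]:
    """Build the data store field list with all values converted to strings."""
    if not host or "://" in host:
        # Assumption: the Host field expects a bare hostname or IP address.
        raise ValueError("host must be a hostname or IP address without a URL scheme")
    if not 1 <= int(port) <= 65535:
        raise ValueError(f"port out of range: {port}")
    return [
        {"id": "host", "name": "Host Name", "value": host},
        {"id": "port", "name": "port", "value": str(port)},
        # Assumption: "false" is accepted as the opposite of "true".
        {"id": "ssl", "name": "ssl", "value": "true" if use_ssl else "false"},
    ]


fields = build_trino_fields("trino.example.com", port=443, use_ssl=True)
print([field["value"] for field in fields])
```

The result has the same shape as `datastore_fields` and could be passed as the `fields` argument of `dr.DataStore.create`.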
Point to a specific data source (table or query):
data_source_params = dr.DataSourceParameters(
    data_store_id=trino_datastore.id,
    catalog=CATALOG,
    schema=SCHEMA,
    table=TABLE,
    query=QUERY,
)
trino_datasource = dr.DataSource.create(
    data_source_type=DataStoreTypes.DR_DATABASE_V1,
    canonical_name='Trino DataSource',
    params=data_source_params,
)
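Because the data source points to either a table or a query, it can help to check that exactly one of the two is set before calling the API. The sketch below shows one way to do that with a hypothetical helper, `source_selection`. The rule that the two options are mutually exclusive is an assumption inferred from the wording of this step.

```python
from typing import Dict, Optional


def source_selection(
    catalog: Optional[str],
    schema: Optional[str],
    table: Optional[str] = None,
    query: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Return keyword arguments for a table-based or query-based data source."""
    # Assumption: exactly one of `table` and `query` should be provided.
    if bool(table) == bool(query):
        raise ValueError("Provide exactly one of 'table' or 'query'")
    return {"catalog": catalog, "schema": schema, "table": table, "query": query}


selection = source_selection("<catalog>", "<schema>", table="<table>")
print(selection)
```

The returned dictionary could then be unpacked into the parameters object, for example `dr.DataSourceParameters(data_store_id=trino_datastore.id, **selection)`.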
Pull the data from Trino and import a snapshotted version into DataRobot:
trino_dataset = trino_datasource.create_dataset(
    do_snapshot=True,
    credential_id=trino_credentials.id,
)