Trino

Supported authentication

  • Basic (username/password)

Prerequisites

The following is required before connecting to Trino in DataRobot:

  • Data stored in a Trino database

Required parameters

The table below lists the minimum required fields to establish a connection with Trino:

| Required field | Description | Documentation |
|---|---|---|
| Host | The hostname or IP address of your Trino coordinator. | Trino documentation |

Troubleshooting

| Problem | Solution | Instructions |
|---|---|---|
| When attempting to execute an operation in DataRobot, the firewall requests that you clear the IP address each time. | Add all allowed IPs for DataRobot. | See Allowed source IP addresses. If you've already added the allowed IPs, check the existing IPs for completeness. |

Code examples

The Python example below shows how to connect to Trino and import data from it into DataRobot.

Initialize the DataRobot client and define database details for later use:

import datarobot as dr
from datarobot.enums import DataStoreTypes

api_token = '<token>'
endpoint = 'https://app.datarobot.com/api/v2'

dr.Client(token=api_token, endpoint=endpoint)

TRINO_HOST = "datarobot.trino.galaxy.starburst.io"
TRINO_PORT = 443
USE_SSL = "true"
CATALOG = "<catalog>"
SCHEMA = "<schema>"
TABLE = "<table>"
QUERY = None
TRINO_USERNAME = "<username>"
TRINO_PASSWORD = "<password>"
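Hard-coding a username and password in a script is easy to leak. As a minimal sketch, the same settings could instead be read from environment variables. The helper `load_trino_settings` and the variable names (`TRINO_HOST`, `TRINO_PORT`, `TRINO_USE_SSL`, `TRINO_USERNAME`, `TRINO_PASSWORD`) are hypothetical names chosen for this example; they are not part of the DataRobot client.

```python
import os


def load_trino_settings(env=None):
    """Collect Trino connection settings from environment variables.

    The variable names are hypothetical, chosen only for this sketch.
    Pass a dict as `env` to override os.environ (useful for testing).
    """
    env = os.environ if env is None else env

    # Host and credentials have no sensible default, so fail early if absent.
    required = ("TRINO_HOST", "TRINO_USERNAME", "TRINO_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))

    return {
        "host": env["TRINO_HOST"],
        # Defaults mirror the values used in this guide: port 443 with SSL on.
        "port": int(env.get("TRINO_PORT", "443")),
        "use_ssl": env.get("TRINO_USE_SSL", "true"),
        "username": env["TRINO_USERNAME"],
        "password": env["TRINO_PASSWORD"],
    }
```

The returned dictionary can then supply the values for `TRINO_HOST`, `TRINO_PORT`, `USE_SSL`, `TRINO_USERNAME`, and `TRINO_PASSWORD` in place of the literals above.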

Do one of the following to obtain a Trino driver:

  • Create a new Trino driver:

    trino_driver = dr.DataDriver.create(
        class_name=DataStoreTypes.DR_DATABASE_V1,
        canonical_name='Trino Driver',
        database_driver='trino-v1',
    )
    
  • Reference an existing Trino driver by its ID:

    trino_driver = dr.DataDriver.get('<trino_driver_id>')
    

Create (or reuse) Trino credentials and securely save them in DataRobot:

trino_credentials = dr.Credential.create_basic(
    name='Trino Credentials',
    user=TRINO_USERNAME,
    password=TRINO_PASSWORD,
)

Define a connection to the external data store:

datastore_fields = [
    {"id": "host", "name": "Host Name", "value": TRINO_HOST},
    {"id": "port", "name": "port", "value": str(TRINO_PORT)},
    {"id": "ssl", "name": "ssl", "value": USE_SSL},
]

trino_datastore = dr.DataStore.create(
    data_store_type=DataStoreTypes.DR_DATABASE_V1,
    canonical_name='Trino Datastore',
    driver_id=trino_driver.id,
    fields=datastore_fields,
)
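The `fields` list above passes every value as a string, including the port. One way to keep that consistent is a small builder that validates its inputs and does the string conversion in one place. This is a sketch only: `build_datastore_fields` is a hypothetical helper, and the `"false"` value for a disabled SSL flag is an assumption (the guide only shows `"true"`).

```python
def build_datastore_fields(host, port=443, use_ssl=True):
    """Build the `fields` list for a Trino data store.

    Hypothetical helper: the field ids ("host", "port", "ssl") are copied
    from the example in this guide; the validation rules are assumptions.
    """
    if not host:
        raise ValueError("host must be a non-empty string")

    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")

    return [
        {"id": "host", "name": "Host Name", "value": host},
        # Field values are passed as strings, so the port is converted.
        {"id": "port", "name": "port", "value": str(port)},
        # Assumption: "false" is the counterpart of the "true" shown above.
        {"id": "ssl", "name": "ssl", "value": "true" if use_ssl else "false"},
    ]
```

The result could be passed as `fields=build_datastore_fields(TRINO_HOST, TRINO_PORT)` in the `dr.DataStore.create` call.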

Point to a specific data source (table or query):

data_source_params = dr.DataSourceParameters(
    data_store_id=trino_datastore.id,
    catalog=CATALOG,
    schema=SCHEMA,
    table=TABLE,
    query=QUERY,
)

trino_datasource = dr.DataSource.create(
    data_source_type=DataStoreTypes.DR_DATABASE_V1,
    canonical_name='Trino DataSource',
    params=data_source_params,
)
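Because a data source points to either a table or a query, it can help to check that choice before building the parameters. The sketch below assumes exactly one of the two should be set; that rule is an assumption made for this example rather than documented behavior, and `select_table_or_query` is a hypothetical helper.

```python
def select_table_or_query(table=None, query=None):
    """Return the table/query keyword arguments for a data source.

    Hypothetical helper. Assumes exactly one of `table` or `query` should be
    provided; the unused one is returned as None.
    """
    has_table = bool(table)
    has_query = bool(query)
    if has_table == has_query:
        raise ValueError("Set exactly one of table or query")
    return {
        "table": table if has_table else None,
        "query": query if has_query else None,
    }
```

The returned dictionary could be unpacked into the parameters, for example `dr.DataSourceParameters(data_store_id=trino_datastore.id, catalog=CATALOG, schema=SCHEMA, **select_table_or_query(TABLE, QUERY))`.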

Pull the data from Trino and import a snapshotted version into DataRobot:

trino_dataset = trino_datasource.create_dataset(
    do_snapshot=True,
    credential_id=trino_credentials.id,
)