# Databricks

> Databricks - How to connect to the native Databricks connector.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-01T23:10:48.100040+00:00` (UTC).

## Primary page

- [Databricks](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html): Full documentation for this topic (HTML).

## Sections on this page

- [Supported authentication](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#supported-authentication): In-page section heading.
- [Prerequisites](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#prerequisites): In-page section heading.
- [Generate a personal access token](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#generate-a-personal-access-token): In-page section heading.
- [Create a service principal](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#create-a-service-principal): In-page section heading.
- [Set up a connection in DataRobot](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#set-up-a-connection-in-datarobot): In-page section heading.
- [Required parameters](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#required-parameters): In-page section heading.
- [Troubleshooting](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#troubleshooting): In-page section heading.
- [Feature considerations](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#feature-considerations): In-page section heading.

## Related documentation

- [Reference documentation](https://docs.datarobot.com/en/docs/reference/index.html): Linked from this page.
- [Data reference](https://docs.datarobot.com/en/docs/reference/data-ref/index.html): Linked from this page.
- [Supported data stores](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/index.html): Linked from this page.
- [connecting to a data source](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/add-data/connect.html#connect-to-a-data-source): Linked from this page.
- [Allowed source IP addresses](https://docs.datarobot.com/en/docs/reference/data-ref/allowed-ips.html): Linked from this page.

## Documentation content

# Databricks

The Databricks connector allows you to access data in Databricks on Azure or AWS. In addition to accessing Databricks tables, you can use the Databricks connector to access data stored in Delta Lake or Iceberg format, as long as it is registered as tables in the [Databricks Unity Catalog](https://docs.databricks.com/aws/en/tables/managed). The native Databricks connector also supports ingesting unstructured data from Databricks Unity Catalog volumes, allowing you to browse and ingest files stored in volumes when creating vector databases.

> [!NOTE] Note
> To ingest both structured and unstructured data from Databricks, you must add the connector twice.

## Supported authentication

- Personal access token
- Service principal

## Prerequisites

In addition to either a [personal access token](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#generate-a-personal-access-token) or [service principal](https://docs.datarobot.com/en/docs/reference/data-ref/data-sources/wb-databricks.html#create-a-service-principal) for authentication, the following is required before connecting to Databricks in DataRobot:

**Azure:**

- A Databricks workspace in the Azure Portal app
- Data stored in an Azure Databricks database

**AWS:**

- A Databricks workspace in AWS
- Data stored in an AWS Databricks database


To ingest unstructured data, you must also set up a volume in the [Databricks Unity Catalog](https://docs.databricks.com/aws/en/volumes).
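
Files in a volume are addressed by a fixed path convention, `/Volumes/<catalog>/<schema>/<volume>/<path>`, described in the Unity Catalog volumes documentation. The helper below is a hypothetical sketch for assembling and sanity-checking such paths before browsing them from DataRobot:

```python
def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a Unity Catalog volume path of the form
    /Volumes/<catalog>/<schema>/<volume>/<file...>.

    Hypothetical helper; only the /Volumes prefix and segment order
    follow the documented Unity Catalog convention.
    """
    segments = [catalog, schema, volume, *parts]
    if any("/" in s or not s for s in segments):
        raise ValueError("segments must be non-empty and contain no '/'")
    return "/Volumes/" + "/".join(segments)

print(volume_path("main", "default", "docs", "faq.pdf"))
# -> /Volumes/main/default/docs/faq.pdf
```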

### Generate a personal access token

**Azure:**
In the Azure Portal app, generate a personal access token for your Databricks workspace. This token will be used to authenticate your connection to Databricks in DataRobot.

See the [Azure Databricks documentation](https://learn.microsoft.com/en-us/azure/databricks/dev-tools/auth#--azure-databricks-personal-access-tokens-for-workspace-users).

**AWS:**
In AWS, generate a personal access token for your Databricks workspace. This token will be used to authenticate your connection to Databricks in DataRobot.

See the [Databricks on AWS documentation](https://docs.databricks.com/en/dev-tools/auth.html#databricks-personal-access-token-authentication).
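
Before entering the token in DataRobot, you may want to confirm that it authenticates at all. The snippet below is a minimal sketch, assuming the standard Databricks REST API SCIM `Me` endpoint; the workspace URL and token shown are placeholders:

```python
import urllib.error
import urllib.request

def bearer_header(token: str) -> dict:
    """Authorization header used by Databricks REST API calls."""
    return {"Authorization": f"Bearer {token}"}

def token_is_valid(workspace_url: str, token: str, timeout: int = 10) -> bool:
    """Return True if the personal access token authenticates against the
    workspace, using the SCIM "Me" endpoint (readable by any valid token)."""
    req = urllib.request.Request(
        f"{workspace_url}/api/2.0/preview/scim/v2/Me",
        headers=bearer_header(token),
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False

# Example call (hypothetical workspace URL and token):
# token_is_valid("https://adb-1234567890123456.7.azuredatabricks.net", "dapi-...")
```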


### Create a service principal

**Azure:**
In the Azure Portal app, create a service principal for your Databricks workspace. The resulting client ID and client secret will be used to authenticate your connection to Databricks in DataRobot.

See the [Azure Databricks documentation](https://learn.microsoft.com/en-us/azure/databricks/dev-tools/auth/oauth-m2m). In the linked instructions, copy the following information:

Application ID
: Entered in the client ID field during setup in DataRobot.

OAuth secret
: Entered in the client secret field during setup in DataRobot.

Make sure the service principal has permission to access the data you want to use.

**AWS:**
In AWS, create a service principal for your Databricks workspace. The resulting client ID and client secret will be used to authenticate your connection to Databricks in DataRobot.

See the [Databricks on AWS documentation](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html).

Make sure the service principal has permission to access the data you want to use.
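
For reference, a service principal authenticates by exchanging its client ID and client secret for a short-lived OAuth token (DataRobot handles this for you once the credentials are saved). The sketch below builds such a client-credentials request, assuming the `/oidc/v1/token` endpoint and `all-apis` scope described in the Databricks OAuth M2M documentation linked above:

```python
import base64
import urllib.parse

def m2m_token_request(workspace_url: str, client_id: str, client_secret: str):
    """Build the pieces of a Databricks OAuth machine-to-machine (M2M)
    client-credentials token request as (url, headers, body).

    The client ID and secret are sent as HTTP Basic auth; the body asks
    for a token covering all workspace APIs.
    """
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    url = f"{workspace_url}/oidc/v1/token"
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    )
    return url, headers, body
```

POSTing these parts to the workspace should return a JSON payload containing an `access_token`; a failure here usually means the service principal lacks workspace access.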


## Set up a connection in DataRobot

The example below shows how to connect to Databricks using Azure and an access token to ingest structured data.

To connect to Databricks in DataRobot:

1. Open Workbench and select a Use Case.
2. Follow the instructions for [connecting to a data source](https://docs.datarobot.com/en/docs/workbench/nxt-workbench/dataprep/add-data/connect.html#connect-to-a-data-source).
3. With the information retrieved in the previous section, fill in the required configuration parameters.
4. Under **Authentication**, click **New credentials**, then enter your access token and a unique display name. If you've previously added credentials for this data source, you can select them from your saved credentials. If you selected service principal as the authentication method, enter the client ID, client secret, and a unique display name instead.
5. Click **Save**.

### Required parameters

The table below lists the minimum required fields to establish a connection with Databricks:

**Azure:**

| Required field | Description | Documentation |
| --- | --- | --- |
| Server Hostname | The address of the server to connect to. | Azure Databricks documentation |
| HTTP Path | The compute resource's URL. | Azure Databricks documentation |

**AWS:**

| Required field | Description | Documentation |
| --- | --- | --- |
| Server Hostname | The address of the server to connect to. | Databricks on AWS documentation |
| HTTP Path | The compute resource's URL. | Databricks on AWS documentation |


SQL warehouses are dedicated to executing SQL and, as a result, have less overhead than clusters and often provide better performance. Use a SQL warehouse if possible.

> [!NOTE] Note
> If the `catalog` parameter is specified in a connection configuration, Workbench will only show a list of schemas in that catalog. If this parameter is not specified, Workbench lists all catalogs you have access to.
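
Both required fields have recognizable shapes, so a simple client-side check can catch copy-paste mistakes before you save the connection. The patterns below are assumptions based on typical workspace hostnames and SQL warehouse HTTP Paths, not an exhaustive rule:

```python
import re

# Typical workspace hostnames: adb-<id>.<n>.azuredatabricks.net (Azure)
# or <deployment>.cloud.databricks.com (AWS). Assumed formats, not exhaustive.
HOSTNAME_RE = re.compile(
    r"^(adb-\d+\.\d+\.azuredatabricks\.net|[\w.-]+\.cloud\.databricks\.com)$"
)

# SQL warehouse HTTP Paths typically look like /sql/1.0/warehouses/<hex id>.
HTTP_PATH_RE = re.compile(r"^/sql/1\.0/warehouses/[0-9a-f]+$")

def check_connection_fields(server_hostname: str, http_path: str) -> list:
    """Return a list of human-readable problems (empty means both fields
    look plausible). Purely syntactic - it does not contact Databricks."""
    problems = []
    if not HOSTNAME_RE.match(server_hostname):
        problems.append(f"unexpected Server Hostname: {server_hostname!r}")
    if not HTTP_PATH_RE.match(http_path):
        problems.append(f"unexpected HTTP Path: {http_path!r}")
    return problems

print(check_connection_fields(
    "adb-1234567890123456.7.azuredatabricks.net",
    "/sql/1.0/warehouses/abc123def4567890",
))
# -> []
```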

## Troubleshooting

| Problem | Solution | Instructions |
| --- | --- | --- |
| When attempting to execute an operation in DataRobot, the firewall requests that you clear the IP address each time. | Add all of DataRobot's allowed source IP addresses to your firewall's allowlist. | See [Allowed source IP addresses](https://docs.datarobot.com/en/docs/reference/data-ref/allowed-ips.html). If you've already added the allowed IPs, check the existing list for completeness. |

## Feature considerations

Review the feature considerations below before connecting to Databricks:

- Unstructured data is only available during vector database creation.
