
MS Azure Synapse Analytics Connector for Data Prep

User Persona: Data Prep User, Data Prep Admin, Data Source Admin, or IT/DevOps

Note

This document covers all configuration fields available during connector setup. Some fields may have already been filled out by your Administrator at an earlier step of configuration and may not be visible to you. For more information on Data Prep's connector framework, see Data Prep Connector setup. Also, your Admin may have named this connector something else in the list of Data Sources.

Configure Data Prep

This connector allows you to connect to an Azure Synapse Analytics instance for Library imports and exports. For export, the connector uploads data to the Azure Data Lake service and then exposes the data as a table in the SQL Data Warehouse.

The following sections describe the parameters used to configure the connector.

General

  • Name: Name of the data source as it will appear to users in the UI.

  • Description: Description of the data source as it will appear to users in the UI.

Tip

You can connect Data Prep to multiple Azure Synapse Analytics accounts. Using a descriptive name can be a big help to users in identifying the appropriate data source. If you are a Data Prep SaaS customer, inform Data Prep DevOps of how you would like this set.

Database URL

The Database URL is where you provide the Java Database Connectivity (JDBC) connection string. This string tells Data Prep where the database is for import and export. You can include the schema name in the URL.
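As an illustration only, a JDBC connection string for a Synapse dedicated SQL pool typically follows the SQL Server JDBC format; the server, database, and option values below are placeholders, and the exact options depend on your environment:

    jdbc:sqlserver://<your-server>.database.windows.net:1433;database=<your-database>;encrypt=true;loginTimeout=30;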

Visibility Settings

You can control which databases, schemas, and tables are shown to users when they browse a data source during import. For databases, schemas, and tables, you can choose to:

  • "Show only", which displays only the databases, schemas, or tables that you specify here.
  • "Hide", which hides the databases, schemas, or tables that you specify here.
  • "Show all", the default setting, which displays everything in the data source.

When you select the "Show only" or "Hide" option, a field is provided for specifying the databases, schemas, or tables on which you want the option enforced.

Note

These settings are not enforced when users query the data source; query results still return a complete list of matches. For example, if you choose to "Hide" a specific database, queries can still return results from that database. However, the database is not displayed to users when they browse the data source.

Import Configuration

The import configurations let you specify how data is imported into Data Prep.

  • Query Fetch Size: Number of rows per batch to use when fetching data on import.
  • Pre-import SQL: A SQL statement to execute before the import begins, after the table's schema is determined (see the example after this list).
  • Post-import SQL: A SQL statement to execute after the import completes.
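For example, you might use these hooks to record each run in an audit table. This is a minimal sketch in T-SQL, the dialect used by Synapse SQL pools; the table and column names are hypothetical:

    -- Pre-import SQL (hypothetical audit table)
    INSERT INTO etl_audit (event_name, event_time) VALUES ('dataprep_import_start', CURRENT_TIMESTAMP);

    -- Post-import SQL
    INSERT INTO etl_audit (event_name, event_time) VALUES ('dataprep_import_end', CURRENT_TIMESTAMP);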

Export Configuration

The export configurations let you specify how data is exported out of Data Prep.

  • Pre-export SQL: A SQL statement to execute before the export begins, after the table is created (if auto-create is enabled).
  • Post-export SQL: A SQL statement to execute after the export completes.
  • External Data Source Name: Name of the external data source defined in the SQL Data Warehouse for accessing data in Azure Data Lake (see the example after this list).
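The external data source is typically created ahead of time by a database administrator. As an illustration only (the name, location, and credential below are placeholders, and your setup may differ), a PolyBase external data source pointing at an Azure Data Lake Store account might be defined like this:

    CREATE EXTERNAL DATA SOURCE AzureDataLakeStore
    WITH (
        TYPE = HADOOP,
        LOCATION = 'adl://<your-account>.azuredatalakestore.net',
        CREDENTIAL = ADLSCredential  -- a database-scoped credential created separately
    );

You would then enter that name (here, AzureDataLakeStore) in the External Data Source Name field.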

Note

The maximum VARCHAR size is 8000 characters. Values exceeding 8000 characters are exported as an empty string.

Credentials

The credentials settings allow you to specify a single user account to authenticate with when connecting to the data source.

Azure Data Lake Configuration

The Azure Data Lake Configuration is where you provide the settings Data Prep needs to connect to your Azure Data Lake.

  • ADL URI: The URI of the Azure Data Lake Store account.
  • Root Directory: The top level of the directory structure from which import and export of data are enabled.
  • Application ID: The ID of the application registered for access to the ADL account.
  • OAuth 2.0 Token Endpoint: The OAuth 2.0 token endpoint for that application.
  • Application Access Key Value: The access key value for that application (see the example values after this list).
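As an illustration only, these values typically take the following forms; the placeholders below come from your Azure subscription and application registration, and your actual values will differ:

    ADL URI: adl://<your-account>.azuredatalakestore.net
    Root Directory: /dataprep/exports
    Application ID: <application (client) ID, a GUID>
    OAuth 2.0 Token Endpoint: https://login.microsoftonline.com/<tenant-id>/oauth2/token
    Application Access Key Value: <access key generated for the application>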

Data Import Information

Via Browsing

Browse to a table and "Select" the table for import.

  • Supported data formats:

    • Delimited datasets: comma, tab…
    • XML
    • JSON
    • Excel: XLS and XLSX
    • Avro
    • Parquet
    • Fixed format

  • Export:

    • Only one export format is available: JDBC.
    • (Optional) Edit the name used for the table in the SQL Data Warehouse.

Via SQL Query

Import data using a valid SQL SELECT query, as in the example below.
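The table and columns below are hypothetical; any valid SELECT statement against your Synapse database can be used:

    SELECT customer_id, order_date, order_total
    FROM dbo.orders
    WHERE order_date >= '2022-01-01';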


Updated April 12, 2022