

JDBC Connector for Data Prep

User Persona: Data Prep Admin or Data Source Admin

Note

This document covers all configuration fields available during connector setup. Some fields may have already been filled out by your Administrator at an earlier step of configuration and may not be visible to you. For more information on Data Prep's connector framework, see Data Prep Connector setup. Also, your Admin may have named this connector something else in the list of Data Sources.

Configure Data Prep

This connector enables importing and exporting data using JDBC drivers. It is typically used for import and export against relational databases, but many other applications also offer JDBC drivers. The following fields are used to define the connection parameters.

General

  • Name: Name of the data source as it will appear to users in the UI.
  • Description: Description of the data source as it will appear to users in the UI.

Tip

You can connect Data Prep to multiple JDBC sources, so a descriptive name helps users identify the appropriate data source.

Database URI

  • JDBC URI: The JDBC connection string as defined by the driver being used. For more details on connection string options, please see the documentation for the driver you intend to use. JDBC connection strings typically take the form:

    jdbc:<subprotocol>://<host>:<port>/<database>:<property1>;<property2>
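
For example, a PostgreSQL data source might use a connection string like the one below. The host, port, database name, and property shown here are placeholders for illustration; the exact options available depend on your driver, so consult its documentation:

    jdbc:postgresql://db.example.com:5432/analytics?ssl=true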

Database Visibility

You can control the database, schemas, and tables that are shown to users when they browse a data source during import. For databases, schemas, and tables, you can choose to:

  • "Show only" which returns only the databases, schemas or tables that you specify here.
  • "Hide" which hides the databases, schemas, and tables that you specify here.
  • "Show all" which is the default setting to display everything in the data source.

When you select the "Show only" or "Hide" options, a field is provided for specifying the databases, schemas or tables on which you want the option enforced.

Note

These settings are not enforced when users query against the data source; query results still return a complete list of matches. For example, if you choose to "hide" a specific database, users can still execute queries that pull data from tables within that database. However, that database will not be displayed to users when they browse the data source.
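
For example, if you hide the database "SAMPLE_DATA", a query such as the following still returns results, even though that database does not appear while browsing:

    SELECT * FROM "SAMPLE_DATA"."TPCH_SF1"."CUSTOMER"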

Import Configuration

  • Query Prefetch Size: The number of rows per batch during import.

Note

A larger batch size will increase throughput for large imports, but setting this value too high can cause the Connector process to run out of memory. The default is 10,000 rows per batch. The amount of memory granted to a Connector process varies greatly depending on the size of your Data Prep installation, so consult your Data Prep Administrator before setting this field to anything greater than the default.

  • Max Column Size: The maximum length, in Unicode characters, for any column. Values larger than this will be replaced with 'null'.
  • Pre-Import SQL: SQL executed before the import process. This SQL may execute multiple times (for preview and import) and may contain multiple newline-delimited SQL statements (see the illustrative example after the note below).
  • Post-Import SQL: SQL executed after the import process. This SQL may execute multiple times (for preview and import) and may contain multiple newline-delimited SQL statements.
  • Perform Count Query: This selector allows users to prevent the Connector from executing count queries upon import, which can be very slow on some database tables. To disable counting the number of rows while importing, set this to "False".

Note

Because the Pre- and Post-Import SQL may be executed multiple times throughout the import process, take care when specifying these values in the Connector/Data Source configuration; they will run for every import performed with this configuration.
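
As an illustration only, the statements below show the kind of newline-delimited SQL you might supply as Pre- and Post-Import SQL. The table names are hypothetical and the exact syntax depends on your database:

    -- Hypothetical Pre-Import SQL (runs for both preview and import)
    DELETE FROM STAGE.TEMP_IMPORT_ROWS WHERE LOAD_DATE < CURRENT_DATE - 30;
    UPDATE STAGE.IMPORT_LOG SET STATUS = 'IN_PROGRESS' WHERE JOB_NAME = 'dataprep_import';

    -- Hypothetical Post-Import SQL (also runs for both preview and import)
    INSERT INTO STAGE.IMPORT_LOG (JOB_NAME, LOAD_DATE, STATUS) VALUES ('dataprep_import', CURRENT_DATE, 'DONE');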

Export Configuration

  • Export Batch Size: The number of rows per batch during export.

Note

A larger batch size will increase throughput for large exports, but setting this value too high can cause the Connector process to run out of memory. The default is 10,000 rows per batch. The amount of memory granted to a Connector process varies greatly depending on the size of your Data Prep installation, so consult your Data Prep Administrator before setting this field to anything greater than the default.

  • Max VARCHAR Size: The maximum width for a VARCHAR column.

Note

This connector will attempt to export a column using the CLOB type if the maximum VARCHAR width is exceeded and the database supports the CLOB type.

  • Automatically Create Table: ENABLED | DISABLED

  • ENABLED: Data Prep will automatically create a new table when exporting a data set. If a table exists, Data Prep will drop the existing table before creating a new table with the same name.

  • DISABLED: Data Prep will not automatically create a new table when exporting a data set. The Connector assumes that a table exists that matches the exported dataset's name and format. The exported data will be appended to the existing table.

  • Pre-Export SQL: SQL executed before the export process. This SQL executes once and may contain multiple newline-delimited SQL statements (see the illustrative example after this list).

  • Post-Export SQL: SQL executed after the export process. This SQL executes once and may contain multiple newline-delimited SQL statements.
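
For illustration only, and assuming Automatically Create Table is DISABLED so data is appended to an existing table, a common pattern is to clear the target table before the export and grant access afterward. The table and role names here are hypothetical, and syntax varies by database:

    -- Hypothetical Pre-Export SQL (runs once per export)
    TRUNCATE TABLE ANALYTICS.PUBLIC.CUSTOMER_EXPORT;

    -- Hypothetical Post-Export SQL (runs once per export)
    GRANT SELECT ON ANALYTICS.PUBLIC.CUSTOMER_EXPORT TO REPORTING_ROLE;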

Credentials

  • User: Username used to access the data source.
  • Password: Password used to access the data source.
  • Role: Some applications allow you to specify a role when connecting. Enter your role value here if required, or this field may be left blank.

Data Import & Export Information

Import Via Browsing

  • Browse databases, schemas, and/or tables based on your configuration settings, then "Select" the table for import.

Import Via SQL Query

  • Requires a SQL SELECT query for your database.
  • Example: SELECT * FROM "SAMPLE_DATA"."TPCH_SF1"."CUSTOMER"

Export Information

  • Browse databases and/or schemas based on your configuration settings.
  • If the database supports catalogs, schemas, and tables, exporting directly under a catalog rather than under a schema may result in errors.
  • (Optional) Edit the name used for the exported table.

Tier 2 Support for JDBC

Non-SaaS customers of Data Prep are able to supply and install their own drivers for use with the JDBC Connector. This functionality is not currently available for SaaS customers.

Data Prep has a test kit that can help assess whether a given JDBC driver will be a good fit for use in Data Prep. The Data Prep Customer Success team will run this test kit on your behalf.

Note

Success with this tool does not certify the supplied JDBC driver for official support. Drivers that pass tests using this tool are not guaranteed to work with current or future versions of the Data Prep application. The JDBC Test Kit is not a comprehensive JDBC test suite tool.

What does “Tier 2” mean?

If the JDBC Connector Test Kit passes all tests, Data Prep supports usage of the combination of:

  • A specific Data Prep JDBC connector version and a specific Data Prep Core Server version. These numbers generally match the Test Kit version, but this is not always the case.
  • A specific database/application version.
  • A specific JDBC driver version.
  • The datatypes tested.
  • Only the successfully tested features; there are currently three: Import via query, Import via browse, and Export.

What else does this mean?

  • Data Prep does NOT certify the database/application for JDBC or other connector connectivity.
  • Data Prep does NOT explicitly or formally test against the database/application.
  • Data Prep does NOT test new versions nor Service Packs for potential regressions of Tier 2 data sources.
  • Data Prep does NOT make Tier 2 JDBC sources available in Data Prep Cloud. Only the certified (Tier 1) JDBC sources are available in Data Prep Cloud.

How do I set it up?

If you have a JDBC driver you would like to use with Data Prep, please follow these steps for using it under the JDBC Connector as a Tier 2 data source:

  1. Start by reaching out to your Customer Success representative and providing the driver you would like to use; they can test the driver for you using the test kit.
  2. If the test is successful, the Customer Success representative will help you install the driver in the correct directory on the Data Prep Core Server and add the driver to the registry.
  3. From there, you can refer to your driver documentation and the details above to configure the JDBC Connector to work with your new driver.

Updated October 28, 2021