Google BigQuery Connector for Data Prep¶
User Persona: Data Prep User, Data Prep Admin, Data Source Admin, or IT/DevOps
Note
This document covers all configuration fields available during connector setup. Some fields may have already been filled out by your Administrator at an earlier step of configuration and may not be visible to you. For more information on Data Prep's connector framework, see Data Prep Connector setup. Also, your Admin may have named this connector something else in the list of Data Sources.
Configure Data Prep¶
This connector lets you connect to BigQuery to import and export data. The fields you are required to set up here depend on how the connector was configured by your administrator.
General¶
- Name: Name of the data source as it will appear to users in the UI.
- Description: Description of the data source as it will appear to users in the UI.
Tip
You can connect Data Prep to multiple BigQuery accounts. Using a descriptive name can be a big help to users in identifying the appropriate data source.
BigQuery Configuration¶
- OAuth Verifier Key: The verifier key used to authenticate with BigQuery. To obtain the verifier key, click "Test Data Source" and follow the link to grant access to BigQuery. After allowing access, you will be redirected to a page that displays an access code. Copy the code into this field.
- Profile: The ID of the GCP Project to which you will connect.
- Automatically Create Table (optional): If enabled, Data Prep drops any existing table whose name matches the name of the exported dataset and recreates the table from the exported dataset. If disabled, Data Prep expects the table to already exist and tries to export to it.
Google Cloud Storage Configuration for Export¶
These fields are required for exporting to BigQuery. If you only intend to import, you can leave them blank.
Note
Either provide both fields or leave both blank.
- Google Cloud Storage Bucket Name: Google Cloud Storage bucket name to be used as a staging area for export.
- Google Cloud Storage JSON Web Token: Content of JSON Web Token (JWT) to be used to connect to Google Cloud Storage.
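The both-or-neither rule for these two fields can be sketched as a small check. This is only an illustration of the rule stated above, not Data Prep's actual validation; the function name `export_staging_configured` and its parameters are hypothetical.

```python
from typing import Optional


def export_staging_configured(bucket_name: Optional[str],
                              jwt_content: Optional[str]) -> bool:
    """Hypothetical helper: True if export staging is configured,
    False if both fields are blank (import-only setup).

    Raises ValueError when only one of the two fields is filled in.
    """
    has_bucket = bool(bucket_name and bucket_name.strip())
    has_jwt = bool(jwt_content and jwt_content.strip())
    if has_bucket != has_jwt:
        raise ValueError(
            "Provide both the bucket name and the JSON Web Token, "
            "or leave both blank."
        )
    return has_bucket
```

A configuration with both fields blank is treated as import-only, while a configuration with exactly one field filled in is rejected.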
Web Proxy¶
If you connect to BigQuery through a proxy server, these fields define the proxy details.
- Web Proxy: 'None' if no proxy is required, or 'Proxied' if the connection to BigQuery should be made via a proxy server. If you select 'Proxied', fill in the following fields to enable the proxied connection.
- Proxy Host: The hostname or IP address of the proxy server.
- Proxy Port: The port of the proxy server.
- Proxy Username and Proxy Password: User credentials for an authenticated proxy connection. Leave these blank for an unauthenticated proxy connection.
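As a rough sketch of how these fields relate, the check below reports which required proxy fields are missing for a given Web Proxy setting. The function name and parameters are hypothetical, and the port range check (1 to 65535) is an assumption based on valid TCP port numbers rather than anything Data Prep documents. Proxy Username and Proxy Password are not checked because they may be left blank.

```python
from typing import List, Optional


def proxy_config_problems(mode: str,
                          host: Optional[str] = None,
                          port: Optional[int] = None) -> List[str]:
    """Hypothetical helper: list problems with the proxy fields."""
    if mode == "None":
        # No proxy: the remaining fields are not needed.
        return []
    if mode != "Proxied":
        return [f"unknown Web Proxy value: {mode!r}"]

    problems = []
    if not (host and host.strip()):
        problems.append("Proxy Host is required")
    if port is None:
        problems.append("Proxy Port is required")
    elif not 1 <= port <= 65535:
        # Assumption: any valid TCP port is acceptable.
        problems.append("Proxy Port must be between 1 and 65535")
    return problems
```

An empty list means the proxy section is complete for the selected mode.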
Data Import Information¶
Via Browsing¶
- View datasets and tables within the project specified in your configuration. The project will appear as the top-level directory in the browsing view.
- Browse to a table within a dataset and "Select" the table for import.
Via SQL Query¶
- Import data using a SQL SELECT query.
Usage¶
Each part of a table name in a query (project, dataset, and table) must be enclosed in its own pair of backticks, with the separating dots placed outside the backticks.
Valid syntax
SELECT * FROM `my-project`.`paxata`.`test`
Invalid syntax
SELECT * FROM `my-project.paxata.test`
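If you generate queries programmatically, a small helper can produce the valid form shown above. This is a minimal sketch; the function name `quote_table` is hypothetical, and rejecting names that contain a backtick is an assumption made to keep the example simple.

```python
def quote_table(project: str, dataset: str, table: str) -> str:
    """Hypothetical helper: wrap each name part in its own backticks
    and join the parts with dots outside the backticks."""
    parts = (project, dataset, table)
    for part in parts:
        if not part or "`" in part:
            # Assumption: empty names and embedded backticks are rejected.
            raise ValueError(f"invalid name part: {part!r}")
    return ".".join(f"`{part}`" for part in parts)


query = f"SELECT * FROM {quote_table('my-project', 'paxata', 'test')}"
print(query)  # SELECT * FROM `my-project`.`paxata`.`test`
```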