Using Google Cloud SQL (PostgreSQL) with DataRobot

Introduction

DataRobot supports external databases, including Google Cloud SQL for PostgreSQL, for storing and managing data. This guide describes how to configure DataRobot to use Google Cloud SQL as its database.

Steps to Configure Google Cloud SQL (PostgreSQL) with DataRobot

1. Create a Google Cloud SQL (PostgreSQL) Instance

  • Log in to the Google Cloud Console.
  • Navigate to the Cloud SQL section.
  • Click "Create instance" and choose "PostgreSQL."
  • Configure the following settings:
    • Instance ID: Enter a unique instance ID.
    • Region: Choose the GCP region.
    • Database version: Select PostgreSQL 12.
    • Machine type: Choose an instance size appropriate for your database requirements.
    • Storage type: Choose "Standard" or "SSD" based on your requirements.
    • Disk size: Set the desired disk space for the database.

(Optional) Configure Private IP Address

  • Under Connectivity, select Private IP to enable private IP connectivity.
  • Choose the VPC network for private connectivity.
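
For teams that script instance creation instead of using the console, the settings from step 1 and the optional private IP configuration map roughly onto `gcloud sql instances create`. The following is a minimal sketch, not part of the DataRobot procedure. The helper name `build_create_command` and the example values (instance ID, region, tier, disk size, network) are assumptions for illustration. It only assembles the command so it can be reviewed before anyone runs it. Private IP additionally requires private services access to be set up on the chosen VPC.

```python
import shlex


def build_create_command(instance_id, region, tier, storage_type="SSD",
                         storage_size_gb=100, vpc_network=None):
    """Assemble (but do not run) a `gcloud sql instances create` command."""
    cmd = [
        "gcloud", "sql", "instances", "create", instance_id,
        "--database-version=POSTGRES_12",      # PostgreSQL 12, as in step 1
        f"--region={region}",
        f"--tier={tier}",                      # machine type
        f"--storage-type={storage_type}",
        f"--storage-size={storage_size_gb}GB",
    ]
    if vpc_network:
        # Optional private IP: attach to the VPC and skip the public address.
        cmd += [f"--network={vpc_network}", "--no-assign-ip"]
    return cmd


if __name__ == "__main__":
    # Hypothetical example values; replace them with your own.
    command = build_create_command(
        "datarobot-pg", "us-central1", "db-custom-2-7680",
        vpc_network="default",
    )
    print(shlex.join(command))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; paste the output into a shell once the values have been checked.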

2. Open the "Users" Tab

  • In the instance details page, click on the "USERS" tab.

  • Click the "Add User Account" button to create a new user.

  • Select "Built-in authentication."

  • Enter the following details for the new user:

    • Name: postgres

    • Password: Set a secure password for the user.
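
One way to produce a secure password for this user is to generate it rather than invent it by hand. The sketch below uses only Python's standard `secrets` module; the helper name `generate_db_password` and the chosen length are assumptions, not DataRobot or Google requirements.

```python
import secrets


def generate_db_password(num_bytes: int = 24) -> str:
    """Return a random URL-safe password built from num_bytes of entropy."""
    # token_urlsafe only emits letters, digits, "-" and "_", so the value
    # can be pasted into YAML without extra escaping.
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    print(generate_db_password())
```

Store the generated value in a secret manager; the same password is needed again for the YAML overrides in step 4.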

3. Obtain Google Cloud SQL Connection Details

  • Once the Cloud SQL instance is created, note the following detail:
    • Private IP Address: The private IP address assigned to the Cloud SQL instance.
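
Before moving on, it can help to confirm that the private IP is reachable from the network where DataRobot runs. The following is a minimal sketch using only the standard library. The helper name `can_reach` is hypothetical, and port 5432 is assumed because it is the PostgreSQL default. It checks TCP reachability only, not credentials.

```python
import socket


def can_reach(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False

# Example (replace the placeholder with the private IP noted above):
#   can_reach("YOUR_GCP_SQL_INSTANCE_IP")
```

Run the check from a host or pod inside the same VPC; a result of False usually points at networking (VPC peering, firewall rules) rather than at the database itself.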

4. Configure DataRobot to Use Google Cloud SQL (PostgreSQL)

When PostgreSQL is configured to utilize an external service, additional YAML override values must be provided.

postgresql-ha:
  postgresql:
    postgresPassword: YOUR_GCP_SQL_PASSWORD  # created in step 2

Then add the following to your values.yaml for the datarobot chart.

global:
  postgresql:
    internal: false
    hostname:  "YOUR_GCP_SQL_INSTANCE_IP"

core-integration-tasks:
  jobs:
    setup:
      config_env_vars:
        PGSQL_INIT_SCRIPT: /init-config/db

build-service:
  buildService:
    envApp:
      secret:
        POSTGRES_HOST: "YOUR_GCP_SQL_INSTANCE_IP" 
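
Because the instance IP appears in two places, a small script can fill in both and reject a placeholder that was left unreplaced. This is a sketch under stated assumptions: the helper name `render_values` is hypothetical, the template simply mirrors the override keys shown above, and the password override is deliberately left out so the secret is not handled here.

```python
import ipaddress

# Mirrors the values.yaml overrides shown above; {host} is filled in twice.
VALUES_TEMPLATE = """\
global:
  postgresql:
    internal: false
    hostname: "{host}"

core-integration-tasks:
  jobs:
    setup:
      config_env_vars:
        PGSQL_INIT_SCRIPT: /init-config/db

build-service:
  buildService:
    envApp:
      secret:
        POSTGRES_HOST: "{host}"
"""


def render_values(instance_ip: str) -> str:
    """Return the override YAML with the Cloud SQL IP filled in."""
    # ip_address raises ValueError for anything that is not a valid IP,
    # including a leftover "YOUR_GCP_SQL_INSTANCE_IP" placeholder.
    host = str(ipaddress.ip_address(instance_ip))
    return VALUES_TEMPLATE.format(host=host)


if __name__ == "__main__":
    print(render_values("10.20.0.3"))  # hypothetical private IP
```

Validating the address up front is a design choice: a typo fails immediately in the script instead of surfacing later as a connection error during installation.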

Built-in Backup Service

Google Cloud SQL includes a built-in backup service that automatically backs up your databases on a specified schedule. You can configure the backup retention period and frequency to meet your data retention requirements, and you can take an on-demand backup of an instance at any time.

Conclusion

By following these steps, you can use Google Cloud SQL (PostgreSQL) as a reliable and scalable external database for DataRobot. If issues arise, refer to the DataRobot documentation and the Google Cloud SQL documentation for troubleshooting guidance.

Note: Always ensure that you follow best practices for security and compliance when configuring external databases with DataRobot and Google Cloud SQL.

Fresh Install or Upgrade to 10.2.3

Google Cloud SQL

  • Modify the database flags on your Cloud SQL PostgreSQL instance and set password_encryption to scram-sha-256.

  • Once the database flag has been applied to the Cloud SQL instance:

  • Follow the guide to delete the existing secret.

  • Continue with the upgrade as described in the admin guide.
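
The database flag change in the first bullet can also be made with `gcloud sql instances patch`. Note that `--database-flags` replaces the instance's whole flag set, so any flags already configured must be passed again. The sketch below assembles such a command without running it. The helper name `build_flag_patch_command` and the example instance ID and existing flag are assumptions for illustration.

```python
import shlex


def build_flag_patch_command(instance_id, existing_flags=None):
    """Assemble a `gcloud sql instances patch` command that sets
    password_encryption while preserving flags that are already set."""
    flags = dict(existing_flags or {})
    flags["password_encryption"] = "scram-sha-256"
    # --database-flags takes a comma-separated list of name=value pairs.
    flag_arg = ",".join(f"{name}={value}" for name, value in sorted(flags.items()))
    return ["gcloud", "sql", "instances", "patch", instance_id,
            f"--database-flags={flag_arg}"]


if __name__ == "__main__":
    # Hypothetical instance ID and pre-existing flag.
    command = build_flag_patch_command("datarobot-pg", {"max_connections": "200"})
    print(shlex.join(command))
```

Changing database flags can restart the instance, so schedule the change accordingly before continuing with the remaining upgrade steps.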