
Restoring PostgreSQL

This section outlines prerequisites and steps for restoring an internal PostgreSQL database for DataRobot from a backup, particularly for installations using the pcs-ha charts.

Execute this operation from the macOS or GNU/Linux machine where the previously taken PostgreSQL backup is located.

Note

If your DataRobot application is configured to use managed services (external PCS) for PostgreSQL, do not follow this guide. Instead, refer to the backup and restore documentation provided by your cloud provider, for example, the AWS documentation on backing up and restoring Amazon RDS for PostgreSQL.

Note

When restoring backed-up PostgreSQL databases, the following databases must be skipped (see the "Restore procedure" section below for how to exclude them):

  • identityresourceservice

  • sushihydra

Additionally, if either the DataRobot version you are restoring to or the version the backup was taken from is 10.1.0 or later, also skip:

  • cnshydra
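
The version rule above can be expressed as a small shell helper. This is a minimal sketch, not part of the DataRobot tooling: the function name skip_list_for_version is hypothetical, and it assumes the version is passed as a dotted string such as 10.1.0.

```shell
# Hypothetical helper (not shipped with DataRobot): print the databases to skip
# for a given DataRobot version, one name per line.
skip_list_for_version() (
  version="$1"
  major=${version%%.*}
  case "$version" in
    *.*) rest=${version#*.}; minor=${rest%%.*} ;;
    *)   minor=0 ;;
  esac
  # These two databases are skipped for every version.
  printf '%s\n' identityresourceservice sushihydra
  # cnshydra is skipped as well from version 10.1.0 onward.
  if [ "$major" -gt 10 ] || { [ "$major" -eq 10 ] && [ "$minor" -ge 1 ]; }; then
    printf '%s\n' cnshydra
  fi
)
```

Because the rule applies to both the source and the target version, run the helper for each of them and skip cnshydra if either call prints it.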

Prerequisites

Ensure the following tools are installed on the host where the backup will be restored:

  • pg_restore:

    • DataRobot 11.0: Use version 12 of pg_restore.
    • DataRobot 11.1 or newer: Use version 14 of pg_restore.
    • See PostgreSQL Downloads.
  • kubectl: Version 1.23 or later.

    • See Kubernetes Tools Documentation.
    • kubectl must be configured to access the Kubernetes cluster where the DataRobot application is running. Verify this configuration with the kubectl cluster-info command.
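
The prerequisites above can be checked before starting the restore. The following is a rough sketch under stated assumptions: the helper names (pg_major_from_banner, require_tool, check_prereqs) are hypothetical, and the parsing assumes the usual pg_restore --version banner of the form "pg_restore (PostgreSQL) 14.11". It does not check the kubectl version; run kubectl version --client and compare it against 1.23 yourself.

```shell
# Hypothetical helper: extract the major version from a --version banner such as
# "pg_restore (PostgreSQL) 14.11".
pg_major_from_banner() {
  printf '%s\n' "$1" | sed -E 's/^[^)]*\) ([0-9]+).*/\1/'
}

# Hypothetical helper: succeed only if the named tool is on PATH.
require_tool() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing tool: $1" >&2; return 1; }
}

# Hypothetical wrapper: $1 is the pg_restore major version this guide asks for
# (12 for DataRobot 11.0, 14 for DataRobot 11.1 or newer).
check_prereqs() {
  expected_major="$1"
  require_tool pg_restore || return 1
  require_tool kubectl || return 1
  actual_major=$(pg_major_from_banner "$(pg_restore --version)")
  if [ "$actual_major" != "$expected_major" ]; then
    echo "pg_restore major version is $actual_major, expected $expected_major" >&2
    return 1
  fi
  # Fails if kubectl cannot reach the cluster with the current context.
  kubectl cluster-info >/dev/null
}
```

For example, check_prereqs 14 returns a non-zero status if pg_restore is missing, has a different major version, or if the cluster is unreachable.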

Restore procedure for internal PostgreSQL

Follow these steps to restore your internal PostgreSQL database.

  1. Set the DR_CORE_NAMESPACE environment variable to your DataRobot application's Kubernetes namespace. Replace <your-datarobot-namespace> with the actual namespace.

    export DR_CORE_NAMESPACE=<your-datarobot-namespace> 
    
  2. Define the location on the host where your PostgreSQL backup files are stored. This example assumes ~/datarobot-backups.

    export BACKUP_LOCATION=~/datarobot-backups 
    
  3. Define a local port for port-forwarding to the PostgreSQL service. This example uses port 54321.

    export LOCAL_PGSQL_PORT=54321 
    
  4. Obtain the PostgreSQL admin user password from the Kubernetes secret.

    export PGPASSWORD=$(kubectl -n "$DR_CORE_NAMESPACE" get secret pcs-postgresql -o jsonpath='{.data.postgres-password}' | base64 --decode)
    echo "PostgreSQL password retrieved for restore process." 
    
  5. Forward the local port to the remote PostgreSQL service running in Kubernetes. This command runs in the background.

    kubectl -n "$DR_CORE_NAMESPACE" port-forward svc/pcs-postgresql --address 127.0.0.1 "$LOCAL_PGSQL_PORT:5432" &
    

    Wait a few seconds for the port-forwarding to establish.

  6. Restore all databases from the $BACKUP_LOCATION/pgsql directory, excluding specified system databases.

    Note

    If restoring to DataRobot version 10.1.0 or later, ensure ! -name cnshydra is included in the find command below to omit the cnshydra database. Adjust the find command if your backup structure or exclusion list differs.

    # Add ! -name cnshydra to the find expression if the DataRobot version is 10.1.0 or later.
    find "$BACKUP_LOCATION/pgsql" -mindepth 1 -maxdepth 1 -type d \
      ! -name postgres ! -name sushihydra ! -name identityresourceservice |
    while IFS= read -r db_backup_path; do
      echo "Restoring database from path: $db_backup_path"
      pg_restore -v -Upostgres -hlocalhost -p"$LOCAL_PGSQL_PORT" -cC -j4 -d postgres "$db_backup_path"
    done
    
  7. Once the restore operation is complete, find the process ID (PID) of the kubectl port-forward command.

    ps aux | grep -E "port-forwar[d].*$LOCAL_PGSQL_PORT" 
    
  8. Finally, stop the port-forwarding process using its PID. Replace <pid_of_the_kubectl_port-forward> with the actual PID found in the previous step.

    kill <pid_of_the_kubectl_port-forward> 
    

    Confirm that the port-forwarding process has stopped by re-running the ps command from the previous step; it should return no matches.

  9. Restore PostgreSQL credentials by patching the existing pcs-postgresql secret with data from your backed-up secret file (e.g., ${BACKUP_LOCATION}/secrets/pcs/pcs-postgresql.json).

    kubectl -n $DR_CORE_NAMESPACE patch secret pcs-postgresql --patch-file="$BACKUP_LOCATION/secrets/pcs/pcs-postgresql.json"
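
Steps 5 through 8 can also be wrapped so that the port-forward PID is captured when the process starts and the process is stopped automatically. This is only a sketch, assuming the environment variables from steps 1 through 4 are already exported; the function names list_restore_dirs and restore_with_port_forward are hypothetical, and the fixed five-second wait is an arbitrary stand-in for "a few seconds".

```shell
# Hypothetical helper: print each database backup directory to restore, one per line.
# $1 is the pgsql backup directory; any further arguments are extra names to skip.
list_restore_dirs() (
  pgsql_dir="$1"
  shift
  # Always skip the postgres database and the two databases named in this guide.
  set -- postgres sushihydra identityresourceservice "$@"
  for dir in "$pgsql_dir"/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    skip=0
    for excluded in "$@"; do
      if [ "$name" = "$excluded" ]; then skip=1; fi
    done
    if [ "$skip" -eq 0 ]; then printf '%s\n' "${dir%/}"; fi
  done
)

# Hypothetical wrapper: arguments are extra database names to skip (for example cnshydra).
# Runs in a subshell so the EXIT trap fires as soon as the restore finishes or fails.
restore_with_port_forward() (
  kubectl -n "$DR_CORE_NAMESPACE" port-forward svc/pcs-postgresql \
    --address 127.0.0.1 "$LOCAL_PGSQL_PORT:5432" &
  pf_pid=$!                                  # PID of the background port-forward
  trap 'kill "$pf_pid" 2>/dev/null' EXIT     # stop it when this subshell exits
  sleep 5                                    # assumed wait for the tunnel to come up
  list_restore_dirs "$BACKUP_LOCATION/pgsql" "$@" |
  while IFS= read -r db_backup_path; do
    echo "Restoring database from path: $db_backup_path"
    # stdin is redirected so pg_restore cannot consume the directory list.
    pg_restore -v -Upostgres -hlocalhost -p"$LOCAL_PGSQL_PORT" -cC -j4 \
      -d postgres "$db_backup_path" < /dev/null
  done
)
```

Under these assumptions, restore_with_port_forward cnshydra would restore every backed-up database except postgres, sushihydra, identityresourceservice, and cnshydra, then stop the port-forward without a manual ps and kill. The secret patch in step 9 still has to be run afterwards.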