Restoring PostgreSQL
This section outlines prerequisites and steps for restoring an internal PostgreSQL database for DataRobot from a backup, particularly for installations using the pcs-ha charts.
Execute this operation from the macOS or GNU/Linux machine where the previously taken PostgreSQL backup is located.
Note
If your DataRobot application is configured to use managed services (external PCS) for PostgreSQL, do not follow this guide. Instead, refer to the backup and restore documentation provided by your cloud provider, for example, the AWS documentation on backing up and restoring Amazon RDS for PostgreSQL.
Note
When restoring backed-up PostgreSQL databases, the following databases must be skipped (see the "Restore procedure" section below for how to exclude them):
- identityresourceservice
- sushihydra
Additionally, if the DataRobot version being restored to (or from which the backup was taken) is >= 10.1.0, also skip:
- cnshydra
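
The skip rule above can be expressed as a small shell predicate. This is a minimal sketch and not part of the DataRobot tooling: the function names `version_ge` and `should_skip_db` are hypothetical, and the version argument is assumed to be a dotted string such as `10.1.0`. The restore loop in the procedure also leaves out the `postgres` directory, which this sketch does not cover.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the skip rule; the names are not from DataRobot.

# version_ge A B: succeed when dotted version A is >= B.
# Compares up to three numeric fields; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# should_skip_db NAME DR_VERSION: succeed when database NAME must not be restored.
should_skip_db() {
    case "$1" in
        identityresourceservice|sushihydra)
            return 0 ;;
        cnshydra)
            # Skipped only for DataRobot 10.1.0 and later.
            if version_ge "$2" 10.1.0; then
                return 0
            fi
            return 1 ;;
    esac
    return 1
}
```

For example, `should_skip_db cnshydra 10.1.0 && echo "skip"` reports that `cnshydra` is skipped, while the same call with `10.0.4` does not.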
Prerequisites
Ensure the following tools are installed on the host where the backup will be restored:
- pg_restore:
  - DataRobot 11.0: Use version 12 of pg_restore.
  - DataRobot 11.1 or newer: Use version 14 of pg_restore.
  - See PostgreSQL Downloads.
- kubectl: Version 1.23 or later.
  - See Kubernetes Tools Documentation.
  - kubectl must be configured to access the Kubernetes cluster where the DataRobot application is running. Verify this configuration with the kubectl cluster-info command.
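
The prerequisites above can be checked with a short script before starting. The following is a sketch under stated assumptions: the helper names are hypothetical, the DataRobot version is assumed to be given as MAJOR.MINOR or MAJOR.MINOR.PATCH, and the version banner is assumed to have the usual `pg_restore (PostgreSQL) 14.9` shape. Releases older than 11.0 are not listed above, so the sketch prints nothing for them.

```shell
#!/bin/sh
# Hypothetical prerequisite checks; function names are illustrative only.

# required_pg_restore_major DR_VERSION: print the pg_restore major version
# listed for that DataRobot release, or nothing if the release is not listed.
required_pg_restore_major() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    if [ "$major" -eq 11 ] && [ "$minor" -eq 0 ]; then
        echo 12
    elif [ "$major" -gt 11 ] || { [ "$major" -eq 11 ] && [ "$minor" -ge 1 ]; }; then
        echo 14
    fi
}

# pg_major_from_banner BANNER: extract the major version from the output
# of "pg_restore --version".
pg_major_from_banner() {
    echo "$1" | sed -n 's/^.*(PostgreSQL) \([0-9][0-9]*\).*$/\1/p'
}

# check_prerequisites DR_VERSION: compare the installed pg_restore against
# the expected major version and confirm that kubectl can reach the cluster.
check_prerequisites() {
    want=$(required_pg_restore_major "$1")
    have=$(pg_major_from_banner "$(pg_restore --version)")
    if [ -n "$want" ] && [ "$want" != "$have" ]; then
        echo "pg_restore major version is $have, expected $want" >&2
        return 1
    fi
    kubectl cluster-info >/dev/null || return 1
}
```

Calling `check_prerequisites 11.1` on the restore host would then fail early if the wrong pg_restore is on the PATH or the cluster is unreachable. The kubectl version itself is not verified here.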
Restore procedure for internal PostgreSQL
Follow these steps to restore your internal PostgreSQL database.
- Set the DR_CORE_NAMESPACE environment variable to your DataRobot application's Kubernetes namespace. Replace <your-datarobot-namespace> with the actual namespace.
    export DR_CORE_NAMESPACE=<your-datarobot-namespace>
- Define the location on the host where your PostgreSQL backup files are stored. This example assumes ~/datarobot-backups.
    export BACKUP_LOCATION=~/datarobot-backups
- Define a local port for port-forwarding to the PostgreSQL service. This example uses port 54321.
    export LOCAL_PGSQL_PORT=54321
- Obtain the PostgreSQL admin user password from the Kubernetes secret.
    export PGPASSWORD=$(kubectl -n $DR_CORE_NAMESPACE get secret pcs-postgresql -o jsonpath='{.data.postgres-password}' | base64 -d)
    echo "PostgreSQL password retrieved for restore process."
- Forward the local port to the remote PostgreSQL service running in Kubernetes. This command runs in the background.
    kubectl -n $DR_CORE_NAMESPACE port-forward svc/pcs-postgresql --address 127.0.0.1 $LOCAL_PGSQL_PORT:5432 &
  Wait a few seconds for the port-forwarding to establish.
- Restore all databases from the $BACKUP_LOCATION/pgsql directory, excluding specified system databases.
  Note
  If restoring to DataRobot version 10.1.0 or later, ensure ! -name cnshydra is included in the find command below to omit the cnshydra database. Adjust the find command if your backup structure or exclusion list differs.
    # Add ! -name cnshydra to the find expression if the DataRobot version is >= 10.1.0
    for db_backup_path in $(find "$BACKUP_LOCATION/pgsql" -mindepth 1 -maxdepth 1 -type d ! -name postgres ! -name sushihydra ! -name identityresourceservice); do
        echo "Restoring database from path: $db_backup_path"
        pg_restore -v -Upostgres -hlocalhost -p$LOCAL_PGSQL_PORT -cC -j4 -d postgres "$db_backup_path"
    done
- Once the restore operation is complete, find the process ID (PID) of the kubectl port-forward command.
    ps aux | grep -E "port-forwar[d].*$LOCAL_PGSQL_PORT"
- Stop the port-forwarding process using its PID. Replace <pid_of_the_kubectl_port-forward> with the actual PID found in the previous step.
    kill <pid_of_the_kubectl_port-forward>
  Confirm that the port-forwarding process has stopped.
- Restore PostgreSQL credentials by patching the existing pcs-postgresql secret with data from your backed-up secret file (e.g., ${BACKUP_LOCATION}/secrets/pcs/pcs-postgresql.json).
    kubectl -n $DR_CORE_NAMESPACE patch secret pcs-postgresql --patch-file="$BACKUP_LOCATION/secrets/pcs/pcs-postgresql.json"
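
When scripting the procedure, the port-forward steps can be made less manual. The sketch below is one possible approach, not the documented method: it records the PID from `$!` instead of searching with `ps`, and polls for readiness instead of waiting a fixed time. The function names are hypothetical, the 30-second limit is arbitrary, and `pg_isready` is assumed to be installed alongside `pg_restore`.

```shell
#!/bin/sh
# Hypothetical helpers for managing the background port-forward by PID.

# wait_until LIMIT COMMAND...: retry COMMAND about once per second and
# give up after LIMIT failed attempts.
wait_until() {
    limit="$1"
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        elapsed=$((elapsed + 1))
        if [ "$elapsed" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# stop_background PID: terminate the process if it is still running.
stop_background() {
    if kill -0 "$1" 2>/dev/null; then
        kill "$1"
        wait "$1" 2>/dev/null || true
    fi
}

# run_with_port_forward COMMAND...: start the port-forward, wait until
# PostgreSQL accepts connections, run COMMAND, then stop the port-forward.
# Expects DR_CORE_NAMESPACE and LOCAL_PGSQL_PORT to be exported already.
run_with_port_forward() {
    kubectl -n "$DR_CORE_NAMESPACE" port-forward svc/pcs-postgresql \
        --address 127.0.0.1 "$LOCAL_PGSQL_PORT:5432" &
    pf_pid=$!
    if wait_until 30 pg_isready -h 127.0.0.1 -p "$LOCAL_PGSQL_PORT"; then
        "$@"
        rc=$?
    else
        echo "port-forward did not become ready" >&2
        rc=1
    fi
    stop_background "$pf_pid"
    return "$rc"
}
```

For example, if the `pg_restore` loop from the procedure were wrapped in a shell function named `restore_all` (a hypothetical name), `run_with_port_forward restore_all` would run it and stop the port-forward afterwards, whether or not the restore succeeded.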