Back up PostgreSQL

This operation can be executed from any macOS or GNU/Linux machine that has enough space to store the backup.

Warning

If your DataRobot platform is configured to use managed services (external PCS) for PostgreSQL, you must instead refer to the backup and restore documentation provided by your cloud provider, for example, the AWS documentation on backing up and restoring Amazon RDS for PostgreSQL.

Considerations

The execution time of pg_dump grows with database size. For very large databases, a dump can take days, which is impractical. For production environments or large databases, DataRobot strongly recommends using managed services (external PCS) with their native backup solutions.

Create backup

Note

You must fulfill the prerequisites before proceeding.

If PostgreSQL is deployed within the Kubernetes cluster, you can use the commands below to create a backup.

The backup process requires forwarding a local port to the remote PostgreSQL service. First, choose the local port:

export LOCAL_PGSQL_PORT=54321

Note

Define which local port you use by setting the LOCAL_PGSQL_PORT value. This example uses port 54321.
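The commands in this procedure read NAMESPACE, BACKUP_LOCATION, and LOCAL_PGSQL_PORT from the environment. The following is a minimal sketch of a pre-flight check that stops early when one of them is unset or empty. The helper name require_vars is hypothetical and is not part of any DataRobot tooling.

```shell
# Sketch: fail early if a variable the backup commands rely on is unset or empty.
# require_vars is a hypothetical helper name used only for illustration.
require_vars() {
  for var_name in "$@"; do
    # Read the variable whose name is stored in var_name (POSIX-compatible).
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
      echo "Missing required variable: $var_name" >&2
      return 1
    fi
  done
}
```

For example, running `require_vars NAMESPACE BACKUP_LOCATION LOCAL_PGSQL_PORT` before the first kubectl command returns a non-zero status and names the first missing variable.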

Obtain the PostgreSQL admin user password:

export PGPASSWORD=$(kubectl -n $NAMESPACE get secret pcs-postgresql -o jsonpath='{.data.postgres-password}' | base64 -d)
echo $PGPASSWORD

Forward the local port to the remote PostgreSQL service deployed in Kubernetes:

kubectl -n $NAMESPACE port-forward svc/pcs-postgresql --address 127.0.0.1 $LOCAL_PGSQL_PORT:5432 &
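Because the port-forward runs in the background, the local port may not accept connections the instant the command returns. One way to handle this is a small retry helper, sketched below. The name wait_for and its argument order are assumptions made for illustration.

```shell
# wait_for MAX_TRIES CMD [ARGS...]
# Sketch: run CMD until it succeeds or MAX_TRIES attempts have failed,
# pausing one second after each failed attempt.
wait_for() {
  tries_left=$1
  shift
  while [ "$tries_left" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    tries_left=$((tries_left - 1))
    sleep 1
  done
  return 1
}
```

Assuming the PostgreSQL client tools are installed, a call such as `wait_for 10 pg_isready -h localhost -p "$LOCAL_PGSQL_PORT"` would block until the forwarded port answers or ten attempts have failed.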

Create the backup directory, then list the databases to back up:

mkdir -p ${BACKUP_LOCATION}/pgsql

dbs=$(psql -Upostgres -hlocalhost -p $LOCAL_PGSQL_PORT -t -c "SELECT datname FROM pg_database;" \
| grep -vE 'template|repmgr|postgres' \
| sed 's/\r//g')

cd ${BACKUP_LOCATION}/pgsql/; mkdir -p $dbs
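To see what the filter above keeps, the sketch below applies the same grep pattern to a made-up list of database names. The sample names are hypothetical, and `tr -d '\r'` stands in for the sed expression as a portable way to strip carriage returns. Note that the pattern matches substrings, so any database whose name merely contains "template", "repmgr", or "postgres" is excluded as well.

```shell
# filter_dbs reads database names, one per line, and drops every name
# that contains "template", "repmgr", or "postgres".
filter_dbs() {
  grep -vE 'template|repmgr|postgres' | tr -d '\r'
}

# Hypothetical sample in the shape psql -t prints; these names are made up.
sample=' postgres
 template0
 template1
 repmgr
 appdb
 my_postgres_copy
 modmon
'

# Only appdb and modmon pass the filter; my_postgres_copy is dropped
# because its name contains "postgres".
printf '%s' "$sample" | filter_dbs
```

The leading spaces that remain in the output are harmless in the procedure above, because the unquoted $dbs expansion splits on whitespace.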

Separately back up each database:

for db in $dbs; do
  pg_dump -Upostgres -hlocalhost -p$LOCAL_PGSQL_PORT -Fd -j4 "$db" -f "$BACKUP_LOCATION/pgsql/$db";
done
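After the loop finishes, it can be useful to confirm that every database produced a dump. A directory-format dump contains a toc.dat file, so a minimal check, sketched here with hypothetical helper names, is to look for a non-empty toc.dat in each target directory. This only detects missing dumps; it does not validate their contents.

```shell
# check_dump DIR: succeed only if DIR contains a non-empty toc.dat,
# the table of contents that pg_dump writes for directory-format dumps.
check_dump() {
  [ -s "$1/toc.dat" ]
}

# check_all_dumps ROOT DB...: report each database under ROOT and
# return non-zero if any dump is missing.
check_all_dumps() {
  dump_root=$1
  shift
  dump_failed=0
  for dump_db in "$@"; do
    if check_dump "$dump_root/$dump_db"; then
      echo "ok:      $dump_db"
    else
      echo "missing: $dump_db" >&2
      dump_failed=1
    fi
  done
  return "$dump_failed"
}
```

With the variables from this procedure, the call would be `check_all_dumps "$BACKUP_LOCATION/pgsql" $dbs`.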

After the backup completes, find the process ID of the port-forwarding process:

ps aux | grep -E "port-forwar[d].*$LOCAL_PGSQL_PORT"

Then, stop the port-forwarding process:

kill PID_OF_THE_KUBECTL_PORT_FORWARD

Note

Replace PID_OF_THE_KUBECTL_PORT_FORWARD with the process ID obtained from the previous command.
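As an alternative to searching the process list, the shell can record the process ID at the moment the background command starts, using the special parameter `$!`. The sketch below shows the pattern with two hypothetical helpers, start_bg and stop_bg. It is an illustration of the approach, not a replacement for the documented steps.

```shell
# start_bg CMD [ARGS...]: run CMD in the background and remember its PID.
start_bg() {
  "$@" &
  BG_PID=$!
}

# stop_bg: terminate the remembered process and wait for it to exit.
stop_bg() {
  kill "$BG_PID" 2>/dev/null || true
  wait "$BG_PID" 2>/dev/null || true
}
```

Under this approach, the port-forward would be started with `start_bg kubectl -n "$NAMESPACE" port-forward svc/pcs-postgresql --address 127.0.0.1 "$LOCAL_PGSQL_PORT:5432"`, and `trap stop_bg EXIT` would stop it when the shell exits, even if a dump fails partway through.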