Backing up MongoDB¶
This operation can be executed from any macOS or GNU/Linux machine that has enough space to store the backup.
Note
If your DataRobot application is configured to use a managed service (external PCS) like MongoDB Atlas, do not follow this guide. Instead, refer to the backup documentation provided by your service provider, for example, the MongoDB Atlas documentation on backup procedures.
Prerequisites¶
Ensure the following tools are installed on the host where the backup will be created:
-
mongodump: Version 100.6.0 or a compatible version.
-
kubectl: Version 1.23 or later.
  - See Kubernetes Tools Documentation.
  - kubectl must be configured to access the Kubernetes cluster where the DataRobot application is running. Verify this configuration with the kubectl cluster-info command.
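The prerequisite checks can be scripted. The following is a minimal sketch, assuming bash; the function names require_tool and check_prerequisites are hypothetical helpers introduced for illustration and are not part of DataRobot's tooling. It only reports whether the tools are present and prints their versions; comparing the printed versions against the minimums listed above is left to the reader.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a command is available on PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Hypothetical helper: check both tools, print their versions, and confirm
# that kubectl can reach the cluster it is configured for.
check_prerequisites() {
  require_tool mongodump || return 1
  require_tool kubectl || return 1
  mongodump --version
  kubectl version --client
  kubectl cluster-info
}
```

Run check_prerequisites on the backup host before starting. A non-zero exit status means a tool is missing or the cluster is unreachable.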
Considerations¶
As the database size increases, the execution time of mongodump also increases. This can reach impractical durations in certain scenarios, potentially spanning days. For production environments or large databases, DataRobot strongly recommends using managed services (external PCS) with their native backup solutions.
Backup procedure for internal MongoDB¶
If you are using internal MongoDB deployed via the pcs-ha charts, use the steps below to create a backup.
-
Set the DR_CORE_NAMESPACE environment variable to your DataRobot application's Kubernetes namespace. Replace <your-datarobot-namespace> with the actual namespace.

    export DR_CORE_NAMESPACE=<your-datarobot-namespace>

-
Define the backup location on the host where the backup files will be stored. This example uses ~/datarobot-backups/mongodb.

    export BACKUP_LOCATION=~/datarobot-backups/mongodb
    mkdir -p "${BACKUP_LOCATION}"

-
Define a local port for port-forwarding to the MongoDB service. This example uses port 27018.

    export LOCAL_MONGO_PORT=27018

-
Obtain the MongoDB root user password from the Kubernetes secret.

    export PCS_MONGO_PASSWD=$(kubectl -n "$DR_CORE_NAMESPACE" get secret pcs-mongo -o jsonpath="{.data.mongodb-root-password}" | base64 -d)
    echo "MongoDB password retrieved."

-
Forward the local port to the remote MongoDB service running in Kubernetes. This command runs in the background.

    kubectl -n "$DR_CORE_NAMESPACE" port-forward svc/pcs-mongo-headless --address 127.0.0.1 "$LOCAL_MONGO_PORT":27017 &

Wait a few seconds for the port-forwarding to establish.
-
Back up the MongoDB database using mongodump.

    mongodump -vv -u pcs-mongodb -p "$PCS_MONGO_PASSWD" -h 127.0.0.1 --port "$LOCAL_MONGO_PORT" -o "$BACKUP_LOCATION" --authenticationDatabase admin

-
Once the backup is complete, find the process ID (PID) of the kubectl port-forward command.

    ps aux | grep -E "port-forwar[d].*$LOCAL_MONGO_PORT"

-
Stop the port-forwarding process using its PID. Replace <pid_of_the_kubectl_port-forward> with the actual PID found in the previous step.

    kill <pid_of_the_kubectl_port-forward>

Confirm that the port-forwarding process has stopped.
-
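Create a compressed tar archive of the backed-up database files and remove the original backup directory after archiving. The --remove-files option is specific to GNU tar; if your tar does not support it (for example, the default bsdtar on macOS), omit it and delete the directory manually once the archive has been created.

    cd "$(dirname "$BACKUP_LOCATION")"  # cd to parent of mongodb directory
    tar -cvzf "datarobot-mongo-backup-$(date +%F).tar.gz" -C "$(dirname "$BACKUP_LOCATION")" "$(basename "$BACKUP_LOCATION")" --remove-files
    echo "MongoDB backup archived to $(dirname "$BACKUP_LOCATION")/datarobot-mongo-backup-$(date +%F).tar.gz"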
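The port-forward, dump, and cleanup steps above can also be combined so that the PID does not have to be looked up by hand. The following is a sketch under stated assumptions, not DataRobot's official script: it assumes bash (the readiness check relies on bash's /dev/tcp redirection) and the variables exported in the earlier steps. The function names wait_for_port, dump_with_port_forward, and verify_archive are hypothetical and introduced only for illustration. The shell variable $! holds the PID of the most recent background command, which replaces the ps and grep lookup.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash and the variables exported in the steps above
# (DR_CORE_NAMESPACE, LOCAL_MONGO_PORT, PCS_MONGO_PASSWD, BACKUP_LOCATION).

# Wait until 127.0.0.1:$1 accepts TCP connections, trying once per second
# for up to $2 seconds (default 30). Returns non-zero on timeout.
wait_for_port() {
  local port=$1 timeout=${2:-30} i
  for ((i = 0; i < timeout; i++)); do
    # The subshell opens a connection on fd 3 and closes it when it exits.
    if (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Start the port-forward, run mongodump, and stop the port-forward afterwards,
# whether or not the dump succeeded.
dump_with_port_forward() {
  kubectl -n "$DR_CORE_NAMESPACE" port-forward svc/pcs-mongo-headless \
    --address 127.0.0.1 "${LOCAL_MONGO_PORT}:27017" &
  local pf_pid=$!   # PID of the background kubectl process
  local status=0

  if wait_for_port "$LOCAL_MONGO_PORT" 30; then
    mongodump -u pcs-mongodb -p "$PCS_MONGO_PASSWD" -h 127.0.0.1 \
      --port "$LOCAL_MONGO_PORT" -o "$BACKUP_LOCATION" \
      --authenticationDatabase admin || status=$?
  else
    echo "Port-forward was not ready after 30 seconds." >&2
    status=1
  fi

  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}

# Return success if the gzip-compressed tar archive at $1 can be listed
# from start to end, which catches truncated or corrupt archives.
verify_archive() {
  tar -tzf "$1" >/dev/null
}
```

After exporting the variables, call dump_with_port_forward in place of the separate port-forward, mongodump, ps, and kill steps. Once the archive has been created, verify_archive with the archive path gives a quick integrity check before the backup is moved elsewhere.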