Backup Mongo
This operation can be executed from any macOS or GNU/Linux machine that has enough space to store the backup.
Note: If the DataRobot application is configured to use managed services (external PCS), refer to the Back Up Your Database Deployment guide for MongoDB Atlas instead of this guide.
Prerequisites
- Utility mongodump version 100.6.0 is installed on the host where the backup will be created
- Utility kubectl version 1.23 is installed on the host where the backup will be created
- Utility kubectl is configured to access the Kubernetes cluster where the DataRobot application is running; verify this with the kubectl cluster-info command
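Before starting, a quick preflight can confirm the required tools are actually on PATH; a minimal sketch (checking the exact versions listed above is left to the reader):

```shell
# Sketch: report any required tool that is not installed on this host.
# (Pinning to the exact versions listed above is left as an exercise.)
for tool in mongodump kubectl tar; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
echo "preflight done"
```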
Considerations
As the database size increases, the execution time of mongodump also increases; in some scenarios it can reach impractical durations, potentially spanning days. We recommend using managed services (external PCS).
Create backup
We recommend using managed services (external PCS) and scheduling backups simultaneously for managed Postgres, Redis, and Mongo.
If you are using pcs-ha charts, you can use the script below to create a backup.
Export the name of the DataRobot application's Kubernetes namespace in the DR_CORE_NAMESPACE variable:
export DR_CORE_NAMESPACE=<namespace>
Define where the backups will be stored on the host where the backup will be created. This example uses ~/datarobot-backups/, but feel free to choose a different location:
export BACKUP_LOCATION=~/datarobot-backups/
mkdir -p ${BACKUP_LOCATION}/mongodb
The backup process requires forwarding a local port to the remote MongoDB service, so define which local port you will use. The following example uses port 27018, but feel free to use another:
export LOCAL_MONGO_PORT=27018
Obtain the MongoDB root user password:
export PCS_MONGO_PASSWD=$(kubectl -n $DR_CORE_NAMESPACE get secret pcs-mongo -o jsonpath="{.data.mongodb-root-password}" | base64 -d)
echo ${PCS_MONGO_PASSWD}
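To avoid mongodump failing later with an authentication error, you can fail fast if the secret lookup returned nothing (for example, due to a wrong namespace or secret name). A minimal sketch, where the literal value stands in for the value fetched from the pcs-mongo secret:

```shell
# Guard: abort early if the password variable is empty.
# "example-password" stands in for the value fetched from the pcs-mongo secret.
PCS_MONGO_PASSWD="example-password"
if [ -z "$PCS_MONGO_PASSWD" ]; then
  echo "ERROR: pcs-mongo secret lookup returned an empty password" >&2
  exit 1
fi
echo "password present (${#PCS_MONGO_PASSWD} characters)"
```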
Forward the local port to the remote MongoDB service deployed in Kubernetes:
kubectl -n $DR_CORE_NAMESPACE port-forward svc/pcs-mongo-headless --address 127.0.0.1 $LOCAL_MONGO_PORT:27017 &
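kubectl port-forward takes a moment to become ready, so it can help to retry a plain TCP connect before launching mongodump. A sketch assuming bash (it uses bash's /dev/tcp feature); nothing is listening in this demo, so it times out and reports "no", whereas against a live port-forward it would report "yes":

```shell
# Sketch: retry a TCP connect to the forwarded port before starting the dump.
# /dev/tcp is a bash feature; on other shells this check simply reports "no".
LOCAL_MONGO_PORT=27018
READY=no
for attempt in 1 2 3; do
  if (exec 3<>"/dev/tcp/127.0.0.1/$LOCAL_MONGO_PORT") 2>/dev/null; then
    READY=yes
    break
  fi
  sleep 1
done
echo "port ready: $READY"
```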
Backup the Mongo database:
mongodump -vv -u pcs-mongodb -p $PCS_MONGO_PASSWD -h 127.0.0.1 --port $LOCAL_MONGO_PORT -o $BACKUP_LOCATION/mongodb
Once the backup completes, find the process ID of the port-forwarding process:
ps aux | grep -E "port-forwar[d].*$LOCAL_MONGO_PORT"
and stop it:
kill <pid_of_the_kubectl_port-forward>
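As an alternative to grepping ps, you can capture the port-forward's PID when you start it and kill it directly. A sketch, with sleep standing in for the kubectl port-forward command:

```shell
# 'sleep 300 &' stands in for: kubectl ... port-forward ... &
sleep 300 &
PF_PID=$!                         # remember the background process ID
# ... run mongodump here ...
kill "$PF_PID"                    # stop the port-forward once the backup completes
wait "$PF_PID" 2>/dev/null || true  # reap it; non-zero status is expected after kill
echo "stopped $PF_PID"
```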
Create a tar archive of the backed-up database files and delete the backup files after they are archived:
cd $BACKUP_LOCATION
tar -cf datarobot-mongo-backup-$(date +%F).tar -C ${BACKUP_LOCATION} mongodb --remove-files
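Because --remove-files deletes the dump after archiving, it is worth verifying that the archive lists the expected files before relying on it. A sketch using a scratch directory with a dummy .bson file standing in for the real dump:

```shell
# Sketch: build a small archive and confirm the expected entry is listed.
# The scratch directory and sample.bson stand in for the real mongodump output.
SCRATCH=$(mktemp -d)
mkdir -p "$SCRATCH/mongodb"
echo 'dummy' > "$SCRATCH/mongodb/sample.bson"
tar -cf "$SCRATCH/backup.tar" -C "$SCRATCH" mongodb
ARCHIVE_OK=no
tar -tf "$SCRATCH/backup.tar" | grep -q 'mongodb/sample.bson' && ARCHIVE_OK=yes
echo "archive OK: $ARCHIVE_OK"
rm -rf "$SCRATCH"
```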
Critical Collections for Feature-Specific Backup
The mongodump command backs up all MongoDB collections by default. However, for specific DataRobot features, ensure the following collections are included in your backup:
Custom Applications
- custom_applications - Application metadata and configuration
- longrunningservices - Kubernetes deployment configurations (critical for functional restoration)
- custom_application_images - Application Source metadata
- custom_application_image_versions - Source version information
- execute_docker_images - Docker image metadata
- workspace_items - File storage references
Note: The longrunningservices collection is shared across multiple DataRobot features (Custom Applications, Custom Models deployments, etc.) and is critical for restoring running workloads. Without this collection, applications and deployments will appear in the UI but will not be functional.
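The presence of critical collections in a finished dump can be checked mechanically, since mongodump writes one <db>/<collection>.bson file per collection. A sketch with a scratch directory standing in for the real dump location; appdb is a hypothetical database directory name, and only two of the collections above are checked for brevity:

```shell
# Sketch: scan the dump for critical collection files and report any that are
# missing. The scratch layout mimics mongodump's <db>/<coll>.bson output;
# only longrunningservices is present, so custom_applications is reported.
DUMP_DIR=$(mktemp -d)              # stands in for $BACKUP_LOCATION/mongodb
mkdir -p "$DUMP_DIR/appdb"         # 'appdb' is a hypothetical database name
touch "$DUMP_DIR/appdb/longrunningservices.bson"
MISSING=""
for coll in custom_applications longrunningservices; do
  if ! find "$DUMP_DIR" -name "${coll}.bson" | grep -q .; then
    MISSING="$MISSING $coll"
  fi
done
echo "missing:${MISSING:- none}"
rm -rf "$DUMP_DIR"
```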