Backing up DataRobot secrets¶
DataRobot encrypts sensitive data at rest and, by default, secures backend services with passwords. When a DataRobot cluster is backed up, you must also back up the secrets used to secure the DataRobot environment. These secrets must be backed up at the same time as the databases.
If these secrets are not backed up and restored as part of the DataRobot cluster recovery, you will lose access to data and analytics stored in the DataRobot environment.
Important: These secrets cannot be recovered by DataRobot. It is critical that they are secured as part of your data management policy.
Prerequisites¶
Ensure the following tools are installed on the host where the backup will be created:
- jq: Latest version. See the jq download page.
- kubectl: Version 1.23 or later. See the Kubernetes Tools documentation.
kubectl must be configured to access the Kubernetes cluster where the DataRobot application is running. Verify this configuration with the kubectl cluster-info command.
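The prerequisite check can be scripted so that a missing tool is caught before any backup step runs. The following is a minimal sketch; require_tool is a hypothetical helper name, not part of DataRobot or kubectl.

```shell
# Sketch: confirm that a prerequisite tool is available on PATH.
# require_tool is a hypothetical helper introduced for illustration.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Example usage (not executed here):
#   require_tool jq && require_tool kubectl && kubectl cluster-info
```

Chaining the calls with && means kubectl cluster-info only runs when both tools are present.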
Backup procedures¶
Follow the steps below to back up the critical secrets and keys for your DataRobot installation.
Define environment variables¶
- Export the name of your DataRobot application Kubernetes namespace. Replace <your-datarobot-namespace> with the actual namespace.
export DR_CORE_NAMESPACE=<your-datarobot-namespace>
- Define where the backups will be stored on the host. This example uses ~/datarobot-backups/, but you can choose a different location. Replace with your chosen path if different.
export BACKUP_LOCATION=~/datarobot-backups/
mkdir -p ${BACKUP_LOCATION}/secrets # Ensure base secrets directory exists
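Every later command depends on these two variables. If BACKUP_LOCATION is unset, for example, ${BACKUP_LOCATION}/secrets expands to /secrets. A small guard can stop the procedure early in that case. This is a sketch only; require_env is a hypothetical helper name.

```shell
# Sketch: fail early if a required environment variable is unset or empty.
# require_env is a hypothetical helper introduced for illustration.
require_env() {
  # Read the value of the variable whose name is passed in $1.
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

# Example usage (not executed here):
#   require_env DR_CORE_NAMESPACE && require_env BACKUP_LOCATION
```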
Encryption keys¶
Back up the encryption keys used to encrypt data in MongoDB.
mkdir -p ${BACKUP_LOCATION}/secrets
kubectl -n $DR_CORE_NAMESPACE get secret/core-credentials -o jsonpath="{.data.asymmetrickey}" | base64 --decode > ${BACKUP_LOCATION}/secrets/ASYMMETRIC_KEY_PAIR_MONGO_ENCRYPTION_KEY.txt
kubectl -n $DR_CORE_NAMESPACE get secret/core-credentials -o jsonpath="{.data.drsecurekey}" | base64 --decode > ${BACKUP_LOCATION}/secrets/DRSECURE_MONGO_ENCRYPTION_KEY.txt
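Because the shell creates the output file before the pipeline runs, a failed kubectl call can leave an empty key file behind. The sketch below checks that each backup file exists and is non-empty; check_nonempty is a hypothetical helper name, and an empty-file check does not prove the key itself is valid.

```shell
# Sketch: report any backup file that is missing or empty.
# check_nonempty is a hypothetical helper introduced for illustration.
check_nonempty() {
  status=0
  for f in "$@"; do
    if [ -s "$f" ]; then
      echo "ok: $f"
    else
      echo "empty or missing: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Example usage (not executed here):
#   check_nonempty \
#     "${BACKUP_LOCATION}/secrets/ASYMMETRIC_KEY_PAIR_MONGO_ENCRYPTION_KEY.txt" \
#     "${BACKUP_LOCATION}/secrets/DRSECURE_MONGO_ENCRYPTION_KEY.txt"
```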
Back up DataRobot secrets¶
DataRobot secrets include authentication and connection data used by various internal platform services. This includes connection details for Persistent Critical Services such as MongoDB, PostgreSQL, RabbitMQ, and Elasticsearch.
The following command retrieves all secrets with the label app.kubernetes.io/instance=dr and saves their data into JSON files:
mkdir -p ${BACKUP_LOCATION}/secrets/dr
for secret in $(kubectl -n $DR_CORE_NAMESPACE get secrets -l app.kubernetes.io/instance=dr -o name); do
  kubectl -n "$DR_CORE_NAMESPACE" get "$secret" -o json | jq '{data}' > "${BACKUP_LOCATION}/secrets/dr/${secret#*/}.json"
done
Persistent Critical Services (PCS) secrets¶
Persistent Critical Services are third-party services that DataRobot uses to store persistent data, such as MongoDB, PostgreSQL, and RabbitMQ.
Note: The following command is valid only for 10.X versions of DataRobot.
The following command retrieves all secrets with the label app.kubernetes.io/instance=pcs and saves their data into JSON files:
mkdir -p ${BACKUP_LOCATION}/secrets/pcs
for secret in $(kubectl -n $DR_CORE_NAMESPACE get secrets -l app.kubernetes.io/instance=pcs -o name); do
  kubectl -n "$DR_CORE_NAMESPACE" get "$secret" -o json | jq '{data}' > "${BACKUP_LOCATION}/secrets/pcs/${secret#*/}.json"
done
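The dr and pcs loops differ only in the label value and the output directory, so they could be expressed as one function. This is a sketch under that assumption, not the documented procedure; backup_secrets_by_label is a hypothetical name, and it expects DR_CORE_NAMESPACE and BACKUP_LOCATION to be set as described earlier.

```shell
# Sketch: save the data of every secret carrying a given instance label.
# backup_secrets_by_label is a hypothetical helper introduced for illustration.
backup_secrets_by_label() {
  instance="$1"   # label value, for example dr or pcs
  out_dir="${BACKUP_LOCATION}/secrets/${instance}"
  mkdir -p "$out_dir" || return 1
  kubectl -n "$DR_CORE_NAMESPACE" get secrets \
      -l "app.kubernetes.io/instance=${instance}" -o name |
  while IFS= read -r secret; do
    # </dev/null keeps the inner command from consuming the list of names.
    kubectl -n "$DR_CORE_NAMESPACE" get "$secret" -o json </dev/null |
      jq '{data}' > "${out_dir}/${secret#*/}.json"
  done
}

# Example usage (not executed here):
#   backup_secrets_by_label dr
#   backup_secrets_by_label pcs   # 10.X versions of DataRobot only
```

Reading the names line by line with while read avoids word splitting on the command substitution used in the for loops.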
Custom certificates¶
DataRobot application charts allow you to define custom certificates during installation. If your cluster configuration includes such definitions, back those up as well. To do so:
- Check whether your cluster configuration contains a globals.certs section. Run helm get values dr -n $DR_CORE_NAMESPACE to retrieve the configuration. If the section exists, the output looks similar to:
globals:
  certs:
    - secret: rabbit-cert # This is the Kubernetes Secret name
      path: rabbit/rabbit-cert.pem # This is the key within the Secret's data field
- In the example above, there is a Kubernetes secret named rabbit-cert. There could be more than one such secret. If your configuration includes any, back up each one individually.
Using the rabbit-cert secret name and assuming the data key is rabbit-cert.pem (derived from the path in the YAML), the backup command is as follows. Replace rabbit-cert with your actual secret name and rabbit-cert.pem with the actual data key for the certificate file. Choose a meaningful output filename.
mkdir -p ${BACKUP_LOCATION}/secrets/certs
kubectl -n $DR_CORE_NAMESPACE get secret rabbit-cert -o jsonpath="{.data.rabbit-cert\.pem}" | base64 --decode > ${BACKUP_LOCATION}/secrets/certs/rabbit-cert.pem
If your secret contains multiple data entries (for example, a key, a certificate, and a CA bundle) under different keys, extract each one separately or adjust the jsonpath. A jsonpath of {.data.*} would extract all data fields concatenated, which is not suitable for individual certificate files.
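When a certificate secret holds several data keys, extracting them one at a time can be wrapped in a function. The sketch below is an illustration only: backup_secret_key is a hypothetical helper, the key names in the usage comment are placeholders, and it assumes DR_CORE_NAMESPACE and BACKUP_LOCATION are set. It escapes dots in the key name because kubectl jsonpath treats an unescaped dot as a path separator.

```shell
# Sketch: extract one data key from a secret into its own file.
# backup_secret_key is a hypothetical helper introduced for illustration.
backup_secret_key() {
  secret="$1"
  key="$2"
  out_dir="${BACKUP_LOCATION}/secrets/certs"
  # Escape each dot in the key so jsonpath reads it as part of the name.
  escaped_key=$(printf '%s' "$key" | sed 's/\./\\./g')
  mkdir -p "$out_dir" || return 1
  kubectl -n "$DR_CORE_NAMESPACE" get secret "$secret" \
      -o jsonpath="{.data.${escaped_key}}" |
    base64 --decode > "${out_dir}/${secret}-${key}"
}

# Example usage (not executed here; key names are placeholders):
#   for key in tls.crt tls.key ca.crt; do
#     backup_secret_key my-cert-secret "$key"
#   done
```

Each key is written to a file named after both the secret and the key, which keeps files from different secrets apart.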