Backing up DataRobot secrets

DataRobot encrypts sensitive data at rest and, by default, secures backend services with passwords. When a DataRobot cluster is backed up, you must also back up the secrets used to secure the DataRobot environment. These secrets must be backed up at the same time as the databases.

If these secrets are not backed up and restored as part of the DataRobot cluster recovery, you will lose access to data and analytics stored in the DataRobot environment.

Important: These secrets cannot be recovered by DataRobot. It is critical that they are secured as part of your data management policy.

Prerequisites

Ensure the following tools are installed on the host where the backup will be created:

  • jq: Latest version.

  • kubectl: Version 1.23 or later.

    • See Kubernetes Tools Documentation.
    • kubectl must be configured to access the Kubernetes cluster where the DataRobot application is running. Verify this configuration with the kubectl cluster-info command.
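The prerequisite checks above can be scripted. The following is a minimal sketch: the helper name check_tool is hypothetical, and the script only confirms that both tools are on the PATH and that kubectl can reach a cluster. It does not compare the installed versions against the requirements listed above.

```shell
# check_tool is a hypothetical helper (not part of DataRobot or kubectl).
# It reports whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

missing=0
for tool in jq kubectl; do
  check_tool "$tool" || missing=1
done

if [ "$missing" -eq 0 ]; then
  # Print the client version so it can be compared with the 1.23 minimum.
  kubectl version --client || echo "could not read the kubectl client version" >&2
  # Confirm that kubectl can reach the cluster.
  kubectl cluster-info || echo "kubectl cannot reach the cluster" >&2
fi
```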

Backup procedures

Follow the steps below to back up the critical secrets and keys for your DataRobot installation.

Define environment variables

  1. Export the name of your DataRobot application Kubernetes namespace. Replace <your-datarobot-namespace> with the actual namespace.

    export DR_CORE_NAMESPACE=<your-datarobot-namespace> 
    
  2. Define where the backups will be stored on the host. This example uses ~/datarobot-backups/; replace it with a different path if needed.

    export BACKUP_LOCATION=~/datarobot-backups/
    mkdir -p ${BACKUP_LOCATION}/secrets # Ensure base secrets directory exists 
    

Encryption keys

Back up the encryption keys used to encrypt data in MongoDB.

mkdir -p ${BACKUP_LOCATION}/secrets
kubectl -n $DR_CORE_NAMESPACE get secret/core-credentials -o jsonpath="{.data.asymmetrickey}" | base64 --decode > ${BACKUP_LOCATION}/secrets/ASYMMETRIC_KEY_PAIR_MONGO_ENCRYPTION_KEY.txt
kubectl -n $DR_CORE_NAMESPACE get secret/core-credentials -o jsonpath="{.data.drsecurekey}" | base64 --decode > ${BACKUP_LOCATION}/secrets/DRSECURE_MONGO_ENCRYPTION_KEY.txt 
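Because the shell redirection creates the output file even when kubectl returns nothing (for example, when the requested key is absent from the secret), it may be worth confirming that both key files are non-empty. The following is a minimal sketch; check_nonempty is a hypothetical helper name, and the fallback path is the example location used earlier.

```shell
# Assumption: BACKUP_LOCATION was exported earlier; otherwise fall back to the
# example path used in this guide.
BACKUP_LOCATION="${BACKUP_LOCATION:-$HOME/datarobot-backups/}"

# check_nonempty is a hypothetical helper: it succeeds only if every file
# passed to it exists and has a size greater than zero.
check_nonempty() {
  rc=0
  for f in "$@"; do
    if [ -s "$f" ]; then
      echo "ok: $f"
    else
      echo "empty or missing: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}

check_nonempty \
  "${BACKUP_LOCATION}/secrets/ASYMMETRIC_KEY_PAIR_MONGO_ENCRYPTION_KEY.txt" \
  "${BACKUP_LOCATION}/secrets/DRSECURE_MONGO_ENCRYPTION_KEY.txt" \
  || echo "Re-run the backup commands before relying on this backup." >&2
```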

Back up DataRobot secrets

DataRobot secrets include authentication and connection data used by various internal platform services. This includes connection details to Persistent Critical Services such as MongoDB, PostgreSQL, RabbitMQ, and Elasticsearch.

The following command retrieves all secrets with the label app.kubernetes.io/instance=dr and saves their data into JSON files:

mkdir -p ${BACKUP_LOCATION}/secrets/dr
for secret in $(kubectl -n $DR_CORE_NAMESPACE get secrets -l app.kubernetes.io/instance=dr -o name); do
  kubectl -n "$DR_CORE_NAMESPACE" get "$secret" -o json | jq '{data}' > "${BACKUP_LOCATION}/secrets/dr/${secret#*/}.json"
done 

Persistent Critical Services (PCS) secrets

Persistent Critical Services are third-party services that DataRobot uses to store persistent data, such as MongoDB, PostgreSQL, and RabbitMQ.

Note

The following command is valid only for 10.X versions of DataRobot.

The following command retrieves all secrets with the label app.kubernetes.io/instance=pcs and saves their data into JSON files:

mkdir -p ${BACKUP_LOCATION}/secrets/pcs
for secret in $(kubectl -n $DR_CORE_NAMESPACE get secrets -l app.kubernetes.io/instance=pcs -o name); do
  kubectl -n "$DR_CORE_NAMESPACE" get "$secret" -o json | jq '{data}' > "${BACKUP_LOCATION}/secrets/pcs/${secret#*/}.json"
done 

Custom certificates

DataRobot application charts allow you to define custom certificates during installation. If your cluster configuration includes such definitions, back those up as well. Follow these steps:

  1. Check whether your cluster configuration contains a globals.certs section. Use the helm get values dr -n $DR_CORE_NAMESPACE command to retrieve the section's content. If the section exists, the output looks similar to:

    globals:
      certs:
        - secret: rabbit-cert # This is the Kubernetes Secret name
          path: rabbit/rabbit-cert.pem # This is the key within the Secret's data field 
    
  2. In the example above, there is a Kubernetes secret named rabbit-cert. There could be more than one such secret. If your configuration includes any, back up each one individually.

    The following command backs up the rabbit-cert secret, assuming the data key is rabbit-cert.pem (derived from the path in the YAML). Replace rabbit-cert with your actual secret name and rabbit-cert.pem with the actual data key for the certificate file, and choose a meaningful output filename.

    mkdir -p ${BACKUP_LOCATION}/secrets/certs
    kubectl -n $DR_CORE_NAMESPACE get secret rabbit-cert -o jsonpath="{.data.rabbit-cert\.pem}" | base64 --decode > ${BACKUP_LOCATION}/secrets/certs/rabbit-cert.pem 
    

    If your secret contains multiple data entries (e.g., key, cert, CA bundle) under different keys, extract each one with its own command or adjust the jsonpath expression. A wildcard expression such as {.data.*} returns all data fields in a single output, which is not suitable for producing individual certificate files.
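For secrets that hold several data entries, one way to avoid writing a separate command per key is to loop over the keys with jq. The following is a minimal sketch under stated assumptions: extract_secret_keys is a hypothetical helper, it expects the secret in JSON form on standard input (as produced by kubectl get secret with -o json), and it names each output file after its data key.

```shell
# extract_secret_keys is a hypothetical helper. It reads a Secret in JSON form
# from standard input and writes each entry under .data, base64-decoded, to a
# separate file named after its key in the given output directory.
extract_secret_keys() {
  out_dir=$1
  json=$(cat)            # keep the JSON so it can be queried once per key
  mkdir -p "$out_dir"
  printf '%s\n' "$json" | jq -r '.data // {} | keys[]' | while IFS= read -r key; do
    printf '%s\n' "$json" \
      | jq -r --arg k "$key" '.data[$k]' \
      | base64 --decode > "${out_dir}/${key}"
  done
}
```

For example, piping the output of kubectl -n "$DR_CORE_NAMESPACE" get secret rabbit-cert -o json into extract_secret_keys "${BACKUP_LOCATION}/secrets/certs/rabbit-cert" would write one decoded file per data key into that directory.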