Back up Elasticsearch¶
Note
You must fulfill the prerequisites before proceeding.
Due to specific security settings, DataRobot suggests running the curl commands for the following operations from within the Elasticsearch containers (for example, attach to a container in the pcs-elasticsearch-master statefulset):
kubectl -n $NAMESPACE exec -it sts/pcs-elasticsearch-master -- /bin/bash
Then retrieve the cluster information:
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/"
Note
The -k flag is mandatory to allow an insecure (self-signed TLS) connection, and -u elastic:$ELASTICSEARCH_PASSWORD is required because unauthenticated access isn't allowed.
Local file system backup¶
Starting with version 10.2, an additional mount point is configured to store backups locally. To take a snapshot under the /snapshots directory:

- Find the master node.

  kubectl exec -it pcs-elasticsearch-master-0 -n $NS -- bash -c 'export ELASTICSEARCH_PASSWORD=$(cat /opt/iamguarded/elasticsearch/secrets/elasticsearch-password); curl -X GET -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_cat/master?v"'

- Register the snapshot directory.

  curl -X PUT -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/dr_repository?pretty" \
    -H 'Content-Type: application/json' \
    -d '{
      "type": "fs",
      "settings": {
        "location": "/snapshots"
      }
    }'

  Note

  If the above command returns 500, disable repository verification by adding verify=false to the URL:

  curl -X PUT -k -u elastic:$ELASTICSEARCH_PASSWORD \
    "https://localhost:9200/_snapshot/dr_repository?verify=false&pretty" \
    -H 'Content-Type: application/json' \
    -d '{ "type": "fs", "settings": { "location": "/snapshots" } }'

- Take a snapshot, ignoring system indices that start with the . character.

  curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD \
    "https://localhost:9200/_snapshot/dr_repository/dr_snapshot?wait_for_completion=true&pretty" \
    -H "Content-Type: application/json" \
    -d '{
      "indices": ["*", "-.ds-*", "-.*"],
      "ignore_unavailable": true,
      "include_global_state": false,
      "partial": false
    }'

- Archive the snapshot directory on each Elasticsearch pod so that DataRobot has all indices from all shards that might be distributed.

  for pod in pcs-elasticsearch-master-0 pcs-elasticsearch-master-1 pcs-elasticsearch-master-2; do
    echo "Tarring on $pod..."
    kubectl exec -it $pod -n $NS -- bash -c "tar -czvf /tmp/snapshot_${pod}.tar.gz /snapshots/"
  done

- To allow restoring the snapshot onto your target Elasticsearch statefulset in a different Kubernetes namespace, copy the snapshot from each pod to a provisioner.

  for pod in pcs-elasticsearch-master-0 pcs-elasticsearch-master-1 pcs-elasticsearch-master-2; do
    echo "Copying from $pod..."
    kubectl cp $NS/$pod:/tmp/snapshot_${pod}.tar.gz ./snapshot_${pod}.tar.gz
  done

- Merge all snapshots locally.

  mkdir -p merged_snapshots
  for pod in pcs-elasticsearch-master-0 pcs-elasticsearch-master-1 pcs-elasticsearch-master-2; do
    echo "Extracting $pod tar..."
    tar -xzvf snapshot_${pod}.tar.gz -C merged_snapshots/ 2>/dev/null || true
  done
  tar -czvf es_snapshot_merged_final.tar.gz -C merged_snapshots snapshots/
The backup is now ready.
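Before archiving, you can also confirm that the snapshot completed by listing the snapshots in the repository; a SUCCESS state means the snapshot is consistent. This check uses the same curl conventions as above:

curl -k -X GET -u elastic:$ELASTICSEARCH_PASSWORD \
  "https://localhost:9200/_snapshot/dr_repository/_all?pretty"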
Register a snapshot repository¶
Elasticsearch can store snapshots in different external locations, such as an AWS S3 bucket, Azure Blob Storage, or a shared NFS volume.
Shared filesystem (NFS) repository¶
The Elasticsearch distribution delivered with the DataRobot application allows you to configure Elasticsearch to store snapshots on an NFS volume. See Snapshot and restore operations for more information on using this method. Note that this method requires an NFS server that is continuously available on your network.
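For illustration, registering such a repository looks like the local filesystem case. This sketch assumes the NFS share is mounted at /mnt/es-backups on every Elasticsearch node and that this path is listed under the path.repo setting in elasticsearch.yml; both the mount point and the repository name nfs_repository are placeholders:

curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD -H "Content-Type: application/json" \
  "https://localhost:9200/_snapshot/nfs_repository?pretty" -d '
{
  "type": "fs",
  "settings": {
    "location": "/mnt/es-backups"
  }
}'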
Other repository types¶
Elasticsearch can store snapshots on S3, Google Cloud, or Azure Blob Storage. If you prefer any of these methods, refer to the appropriate section of the Register a snapshot repository guide.
Example: adding an AWS S3 repository¶
Follow the official guide to configure either an S3 IAM role or service account that has access to store backups.
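For reference, a minimal permissions policy along the lines of the one recommended in the Elasticsearch S3 repository documentation looks like the following; dr_repository_bucket matches the example bucket used below and should be replaced with your own:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": ["arn:aws:s3:::dr_repository_bucket"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts"],
      "Resource": ["arn:aws:s3:::dr_repository_bucket/*"]
    }
  ]
}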
For IAM users¶
If you have an AWS IAM role assigned to an IAM user, follow the steps below to add the access key ID and secret access key to the Elasticsearch keystore so the S3 repository can authenticate:
kubectl -n $NAMESPACE exec -it pcs-elasticsearch-master-0 -- /bin/bash
/opt/bitnami/elasticsearch/bin/elasticsearch-keystore add s3.client.default.access_key
/opt/bitnami/elasticsearch/bin/elasticsearch-keystore add s3.client.default.secret_key
When prompted, input your AWS Access Key ID and Secret Access Key.
You can also display these values to confirm that the correct AWS Access Key ID and Secret Access Key are set:
/opt/bitnami/elasticsearch/bin/elasticsearch-keystore show s3.client.default.access_key
/opt/bitnami/elasticsearch/bin/elasticsearch-keystore show s3.client.default.secret_key
After adding the credentials, reload the secure settings across all Elasticsearch nodes to ensure they're applied:
curl -k -X POST -u elastic:$ELASTICSEARCH_PASSWORD \
-H "Content-Type: application/json" \
"https://localhost:9200/_nodes/reload_secure_settings" \
-d '{"secure_settings_password": ""}'
Assuming cluster nodes can access the S3 bucket "dr_repository_bucket" with the credentials configured above, the following command, run from any DataRobot container (see the Manage Elasticsearch section above), creates a snapshot repository in "dr_repository_bucket":
curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD -H "Content-Type: application/json" \
"https://localhost:9200/_snapshot/dr_repository?pretty" -d'
{
"type": "s3",
"settings": {
"bucket": "dr_repository_bucket"
}
}
'
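To confirm that all nodes can reach the bucket, you can optionally verify the repository:

curl -k -X POST -u elastic:$ELASTICSEARCH_PASSWORD \
  "https://localhost:9200/_snapshot/dr_repository/_verify?pretty"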
With a provisioned service account¶
If there is a provisioned service account that has access to S3, follow the steps below.
Note
The following manual edits are overwritten during upgrades. This is acceptable, as these steps are only necessary when preparing for backup and subsequent restoration.
Edit the PCS helm chart to add this init script under the Elasticsearch block:
initScripts:
setup_s3_access.sh: |
#!/bin/sh
mkdir -p /opt/bitnami/elasticsearch/config/repository-s3
chown {{ .Values.master.containerSecurityContext.runAsUser }}:{{ .Values.master.podSecurityContext.fsGroup }} /opt/bitnami/elasticsearch/config/repository-s3
ln -svf $AWS_WEB_IDENTITY_TOKEN_FILE /opt/bitnami/elasticsearch/config/repository-s3/aws-web-identity-token-file
export SIZE_OF_SECRETS_FILE=$(wc -c /opt/bitnami/elasticsearch/config/repository-s3/aws-web-identity-token-file | awk '{print $1}')
info "S3 access setup successfully for snapshot and restore. Size of secrets file $SIZE_OF_SECRETS_FILE"
Here is a sample Elasticsearch block with values to indicate where the init script needs to be placed; don't copy any other values from it:
elasticsearch:
coordinating:
replicaCount: 0
data:
replicaCount: 3
fullnameOverride: pcs-elasticsearch
image:
registry: docker.io
repository: bitnami/elasticsearch
tag: 8.12.2-debian-12-r1
ingest:
replicaCount: 0
initScripts:
setup_s3_access.sh: |
#!/bin/sh
mkdir -p /opt/bitnami/elasticsearch/config/repository-s3
chown {{ .Values.master.containerSecurityContext.runAsUser }}:{{ .Values.master.podSecurityContext.fsGroup }} /opt/bitnami/elasticsearch/config/repository-s3
ln -svf $AWS_WEB_IDENTITY_TOKEN_FILE /opt/bitnami/elasticsearch/config/repository-s3/aws-web-identity-token-file
export SIZE_OF_SECRETS_FILE=$(wc -c /opt/bitnami/elasticsearch/config/repository-s3/aws-web-identity-token-file | awk '{print $1}')
info "S3 access setup successfully for snapshot and restore. Size of secrets file $SIZE_OF_SECRETS_FILE"
master:
containerSecurityContext:
seccompProfile: null
masterOnly: false
persistence:
size: 20Gi
replicaCount: 3
resources:
limits:
cpu: 2000m
memory: 3Gi
requests:
cpu: 250m
memory: 512Mi
serviceAccount:
create: true
name: pcs-elasticsearch-sa
security:
enabled: true
existingSecret: pcs-elasticsearch
tls:
autoGenerated: true
sysctlImage:
enabled: true
tag: 12-debian-12-r18
extraObjects: []
Run helm upgrade on PCS:
helm upgrade pcs datarobot-pcs-ha-10.1.0.tgz -n $NAMESPACE -f <updated-values-with-initscript.yaml>
Update the pcs-elasticsearch-master statefulset to mount the service account token that allows access to S3:
- Under spec:containers:env::

    - name: AWS_ROLE_ARN
      value: arn:aws:iam::<account-number>:role/<irsa-role-defined-for-cluster>
    - name: AWS_WEB_IDENTITY_TOKEN_FILE
      value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token

- Under volumeMounts::

    - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
      name: aws-iam-token

- Under volumes::

    - name: aws-iam-token
      projected:
        defaultMode: 420
        sources:
          - serviceAccountToken:
              audience: sts.amazonaws.com
              expirationSeconds: 86400
              path: token
Apply the above modified values:
kubectl apply -f <above-updated-config.yaml> -n $NAMESPACE
This mounts the service account token on the correct path so that Elasticsearch can use it when registering the snapshot repository:
curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD -H "Content-Type: application/json" \
"https://localhost:9200/_snapshot/es_backup?pretty" -d '{
"type": "s3",
"settings": {
"bucket": "<bucket_name_in_s3>",
"region": "us-east-1",
"base_path": "<any_sub_folders_in_s3>"
}
}'
Example: adding a GCP repository¶
Follow the official guide to configure a service account that has access to store backups. Also, ensure that a new key for the service account has been created.
Your JSON credentials file follows the standard service account key format; a sketch is shown below. Set up the Elasticsearch keystore with this credentials file.
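Every value in this sketch is a placeholder rather than a real credential:

{
  "type": "service_account",
  "project_id": "<your-project-id>",
  "private_key_id": "<key-id>",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "<service-account-name>@<your-project-id>.iam.gserviceaccount.com",
  "client_id": "<client-id>",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
}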
Client settings are needed to establish connectivity between Elasticsearch and Google Cloud Storage. The default client name looked up by a GCS repository is called default.
- Copy the above credentials into the Elasticsearch primary pod.

  kubectl cp </path/to/local/service-account.json> $NAMESPACE/pcs-elasticsearch-master-0:/tmp/service-account.json

- Exec into the Elasticsearch master pod to set up the keystore.

  kubectl exec -it pcs-elasticsearch-master-0 -n $NAMESPACE -- bash
  I have no name!@pcs-elasticsearch-master-0:$ cd /opt/bitnami/elasticsearch/bin
  I have no name!@pcs-elasticsearch-master-0:/opt/bitnami/elasticsearch/bin$ elasticsearch-keystore add-file gcs.client.default.credentials_file /tmp/service-account.json

- After the keystore is set up, register the repository with the default client name.

  curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD -H "Content-Type: application/json" \
    "https://localhost:9200/_snapshot/es_backup?pretty" -d '{
      "type": "gcs",
      "settings": {
        "bucket": "<google_cloud_storage_name>",
        "client": "default",
        "base_path": "<any_sub_folders_in_gcs>"
      }
    }'
Example: adding an Azure repository¶
Follow the official guide to configure Azure credentials that have access to store backups.
Get the account and key for Azure blob storage. By default, Azure repositories use a client named default.
kubectl exec -it pcs-elasticsearch-master-0 -n $NAMESPACE -- bash
I have no name!@pcs-elasticsearch-master-0:$ cd /opt/bitnami/elasticsearch/bin
I have no name!@pcs-elasticsearch-master-0:/opt/bitnami/elasticsearch/bin$ elasticsearch-keystore add azure.client.default.account
I have no name!@pcs-elasticsearch-master-0:/opt/bitnami/elasticsearch/bin$ elasticsearch-keystore add azure.client.default.key
Once the keys are added, you can register the snapshot repository.
curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD -H "Content-Type: application/json" \
"https://localhost:9200/_snapshot/es_backup?pretty" -d '{
"type": "azure",
"settings": {
"container": "<azure_blob_storage_name>",
"client": "default",
"base_path": "<any_sub_folders_in_azure>"
}
}'
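With the repository registered, taking a snapshot works the same way as in the local filesystem example, for instance:

curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD \
  "https://localhost:9200/_snapshot/es_backup/dr_snapshot?wait_for_completion=true&pretty" \
  -H "Content-Type: application/json" \
  -d '{ "indices": ["*", "-.ds-*", "-.*"], "ignore_unavailable": true, "include_global_state": false }'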