Elasticsearch Restore¶
Note: If the DataRobot application is configured to use managed services (external PCS), refer to the Restoring snapshots guide from Amazon instead of this guide.
Prerequisites:¶
- The kubectl utility, version 1.23, is installed on the host where the restore will be performed
- The kubectl utility is configured to access the Kubernetes cluster where the DataRobot application is running; verify this with the kubectl cluster-info command.
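Both prerequisites can be checked in one step before starting. The following is a minimal sketch; check_kubectl is a hypothetical helper name, not part of kubectl or DataRobot.

```shell
# Hypothetical helper: confirm kubectl is installed and can reach the cluster.
check_kubectl() {
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found in PATH" >&2
    return 1
  fi
  # Print the client version so it can be compared with the required 1.23.
  kubectl version --client || return 1
  # Succeeds only when the current context can reach the cluster API server.
  kubectl cluster-info
}
```

Run check_kubectl on the host; a non-zero exit status means one of the prerequisites is not met.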
Manage Elasticsearch¶
Due to specific security settings, we suggest using the curl utility within the Elasticsearch containers for all of the following operations. For example, attach to a container in the pcs-elasticsearch-master stateful set:
kubectl -n $DR_CORE_NAMESPACE exec -it sts/pcs-elasticsearch-master -- /bin/bash
Then retrieve the cluster information:
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/"
- The -k option is mandatory to allow an insecure connection (certificate verification is skipped)
- The -u elastic:$ELASTICSEARCH_PASSWORD option is also required, since unauthenticated access is not allowed.
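Because every request needs the same two options, they can be wrapped in a small shell function inside the container. This is only a sketch: es_curl and the ES_DRY_RUN variable are hypothetical names introduced here, and the function assumes ELASTICSEARCH_PASSWORD is set in the container environment, as in the command above.

```shell
# Hypothetical wrapper (not part of Elasticsearch or DataRobot).
# Usage: es_curl METHOD PATH [extra curl arguments...]
es_curl() {
  es_method=$1
  es_path=$2
  shift 2
  # With ES_DRY_RUN=1 the request is printed (password masked) instead of sent.
  if [ "${ES_DRY_RUN:-0}" = "1" ]; then
    echo "curl -k -X $es_method -u elastic:*** https://localhost:9200$es_path"
    return 0
  fi
  # -k skips certificate verification, -u supplies the elastic credentials.
  curl -k -s -X "$es_method" -u "elastic:$ELASTICSEARCH_PASSWORD" \
    "https://localhost:9200$es_path" "$@"
}
```

For example, es_curl GET / returns the same cluster information as the curl command above.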
Restore from filesystem snapshot¶
Starting with version 10.2, an additional mount point is configured to store backups locally. If the backup is stored on the local filesystem in Elasticsearch, follow the steps below.
Look for the snapshot under the /snapshots directory on the pcs-elasticsearch-master node:
$ ls /snapshots/
index-0 index.latest meta-f7MUCz9BR5GFlNd0Voct0g.dat snap-f7MUCz9BR5GFlNd0Voct0g.dat
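The files in the directory appear to be named by internal identifiers rather than by snapshot name, so the name needed for the restore request may be easier to read from the Elasticsearch API. The sketch below assumes the filesystem repository is registered as snapshots, the name used in the restore URL of this guide; snapshots_url and list_snapshots are hypothetical helper names.

```shell
ES_URL="https://localhost:9200"
# Assumption: the repository name matches the one in the restore URL.
REPO="snapshots"

# Build the _cat/snapshots URL for a repository; ?v adds a header row.
snapshots_url() {
  echo "$ES_URL/_cat/snapshots/$1?v"
}

# Hypothetical helper: print the snapshots of a repository with status and times.
list_snapshots() {
  curl -k -s -u "elastic:$ELASTICSEARCH_PASSWORD" "$(snapshots_url "$1")"
}
```

Inside the container, list_snapshots "$REPO" prints one row per snapshot; the id column holds the value to use as the snapshot name.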
Run the restore:
curl -k -X POST -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/snapshots/<snapshot-name>/_restore" -H "Content-Type: application/json" -d '{
"indices": "*",
"ignore_unavailable": true,
"include_global_state": true
}'
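The restore request returns before all shards are recovered, so it can help to check progress afterwards. The following sketch uses the _cluster/health and _cat/recovery endpoints; the helper names are hypothetical, and the status extraction is a plain text match rather than a JSON parser.

```shell
ES_URL="https://localhost:9200"

# Extract the "status" value (green, yellow or red) from a _cluster/health
# JSON response read from stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Hypothetical helper: print the current cluster status.
cluster_status() {
  curl -k -s -u "elastic:$ELASTICSEARCH_PASSWORD" "$ES_URL/_cluster/health" | health_status
}

# Hypothetical helper: list shard recoveries that are still in progress.
active_recoveries() {
  curl -k -s -u "elastic:$ELASTICSEARCH_PASSWORD" "$ES_URL/_cat/recovery?v&active_only=true"
}
```

Inside the container, run active_recoveries until only the header row is printed, then confirm with cluster_status that the cluster is no longer red.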
Register a snapshot repository¶
You can configure Elasticsearch to read snapshots from different external locations: AWS S3 bucket, Azure Blob Storage, shared NFS volume, etc.
Shared filesystem (NFS) repository¶
The Elasticsearch distribution delivered with the DataRobot application can be configured to keep snapshots on an NFS volume. See Snapshot and restore operations for more information on this method. Note that this method requires an NFS server that is continuously available in your network.
Other repository types¶
Elasticsearch can also store snapshots on S3, Google Cloud Storage, or Azure Blob Storage. If your snapshots are stored in one of these, refer to the appropriate section of the Register a snapshot repository guide.
Registering snapshot repository¶
Refer to the backup guide for how to configure the snapshot repository. This restore guide assumes that the backup snapshot is accessible from the same repository.
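Before restoring, it may be worth confirming that the repository is actually registered on the cluster. The sketch below checks the response of GET /_snapshot/_all for a repository name; repo_registered is a hypothetical helper, and because it is a plain text match rather than a JSON parser its result should be treated as a hint.

```shell
# Hypothetical helper: succeed if the repository name given as $1 appears as an
# object key in a GET /_snapshot/_all JSON response read from stdin.
repo_registered() {
  grep -q "\"$1\"[[:space:]]*:[[:space:]]*{"
}

# Usage inside the Elasticsearch container, with the name used at backup time:
#   curl -k -s -u "elastic:$ELASTICSEARCH_PASSWORD" \
#     "https://localhost:9200/_snapshot/_all" | repo_registered "<repository-name>" \
#     && echo "repository is registered"
```

If the check fails, register the repository as described in the backup guide before continuing.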
Restore data from snapshot¶
Snapshots can be restored manually according to the Restore a snapshot guide. Follow the Restore Entire Cluster section.
If the snapshot is available in an S3 repository, you can follow the steps below:
- Get a list of all indices
export DR_CORE_NAMESPACE=dr-app
kubectl -n $DR_CORE_NAMESPACE exec -it pcs-elasticsearch-master-0 -- /bin/bash
curl -k -X GET -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_cat/indices"
- Remove all existing indices if restoring onto an existing cluster with data. Deleting all indices with a single command will not work and returns an error, so delete each index by name:
curl -k -X DELETE -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/<index-name-from-step-1>"
- Once deleted, run the restore using the backup snapshot in the S3 repository:
curl -k -X POST -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/repository_name/<snapshot-name>/_restore" -H "Content-Type: application/json" -d '{
"indices": "*",
"ignore_unavailable": true,
"include_global_state": true
}'
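When the cluster holds many indices, the per-index deletion in the second step above can be scripted. This is a minimal sketch, not a DataRobot-provided tool: delete_indices and the DRY_RUN variable are hypothetical names, and the index list comes from _cat/indices with the h=index column filter.

```shell
# Read index names (one per line) on stdin and delete each index by name.
# With DRY_RUN=1 the requests are printed instead of sent.
delete_indices() {
  while IFS= read -r index; do
    # Skip blank lines in the input.
    [ -z "$index" ] && continue
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "DELETE /$index"
    else
      # </dev/null keeps curl from consuming the list being read by the loop.
      curl -k -s -X DELETE -u "elastic:$ELASTICSEARCH_PASSWORD" \
        "https://localhost:9200/$index" </dev/null
      echo
    fi
  done
}

# Usage inside the Elasticsearch container (review the dry run first):
#   curl -k -s -u "elastic:$ELASTICSEARCH_PASSWORD" \
#     "https://localhost:9200/_cat/indices?h=index" | DRY_RUN=1 delete_indices
```

Running with DRY_RUN=1 first shows exactly which indices would be removed; drop the variable to perform the deletion before issuing the restore request.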