Elasticsearch Restore¶
Warning
If the DataRobot application is configured to use managed services (external PCS), refer to the Restoring snapshots guide from Amazon instead of this guide.
Prerequisites¶
- Utility kubectl version 1.23 is installed on the host where the backup is created
- Utility kubectl is configured to access the Kubernetes cluster where the DataRobot application is running. Verify this with the kubectl cluster-info command.
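For example, a quick sanity check before proceeding might look like the following; the dr-app namespace is only an example and should match your installation:
# Namespace where the DataRobot application is deployed
export DR_CORE_NAMESPACE=dr-app
# Confirm kubectl can reach the cluster
kubectl cluster-info
# Confirm the Elasticsearch StatefulSet is present
kubectl -n $DR_CORE_NAMESPACE get sts pcs-elasticsearch-master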
Manage Elasticsearch¶
Due to specific security settings, use the curl utility from within the Elasticsearch containers for all of the following operations. For example, attach to a container in the pcs-elasticsearch-master StatefulSet:
kubectl -n $DR_CORE_NAMESPACE exec -it sts/pcs-elasticsearch-master -- /bin/bash
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/"
The -k flag is mandatory to allow insecure connections (the certificate is not verified), and -u elastic:$ELASTICSEARCH_PASSWORD is mandatory because unauthenticated access is not allowed.
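Before restoring, it can be useful to confirm that the cluster responds and is healthy. A minimal check from inside the container, using standard Elasticsearch endpoints:
# Basic cluster health; expect "green" or "yellow" before starting a restore
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_cluster/health?pretty"
# List the snapshot repositories registered on this cluster
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot?pretty"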
Restore from filesystem snapshot¶
Starting with release 10.2, you can configure an additional mount point to store backups locally. If the backup is stored on the local filesystem of the Elasticsearch nodes, follow the steps below:
Look for the snapshot under the /snapshots directory on the pcs-elasticsearch-master node:
$ ls /snapshots/
index-0 index.latest meta-f7MUCz9BR5GFlNd0Voct0g.dat snap-f7MUCz9BR5GFlNd0Voct0g.dat
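If the repository holding these files is registered under the name snapshots (as assumed in the restore command below), you can list the snapshots it contains to find the <snapshot-name> to restore:
# Lists snapshot names, states, and timestamps in the "snapshots" repository
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_cat/snapshots/snapshots?v"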
Run the restore:
curl -k -X POST -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/snapshots/<snapshot-name>/_restore" -H "Content-Type: application/json" -d '{
"indices": "*",
"ignore_unavailable": true,
"include_global_state": true
}'
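To monitor the restore, you can watch shard recovery and index health with standard endpoints, for example:
# Shows shards currently being recovered from the snapshot
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_cat/recovery?v&active_only=true"
# Restored indices should eventually report green (or yellow on a single-node cluster)
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_cat/indices?v"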
Register a snapshot repository¶
You can configure Elasticsearch to read snapshots from different external locations, such as an AWS S3 bucket, Azure Blob Storage, or a shared NFS volume.
Shared filesystem (NFS) repository¶
The Elasticsearch distribution delivered with the DataRobot application allows you to keep snapshots on an NFS volume. See Snapshot and restore operations for more information on using this method. Note that this method requires an NFS server that is continuously available on your network.
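As an illustration only, registering a shared filesystem repository typically takes the form below. The repository name my_fs_repo is hypothetical, and the location must be mounted on every Elasticsearch node and listed in path.repo in the Elasticsearch configuration:
curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/my_fs_repo" -H "Content-Type: application/json" -d '{
  "type": "fs",
  "settings": {
    "location": "/snapshots"
  }
}'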
Other repository types¶
Elasticsearch can also store snapshots on S3, Google Cloud, or Azure Blob Storage. If your snapshots are stored using one of these methods, refer to the appropriate section of the Register a snapshot repository guide.
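For reference, registering an S3 repository generally looks like the following sketch. It assumes S3 repository support (the repository-s3 plugin) is available and that credentials are already configured in the Elasticsearch keystore; the repository and bucket names are placeholders:
curl -k -X PUT -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/my_s3_repo" -H "Content-Type: application/json" -d '{
  "type": "s3",
  "settings": {
    "bucket": "my-es-snapshots",
    "base_path": "elasticsearch"
  }
}'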
Registering a snapshot repository¶
Refer to the backup guide for instructions on registering a snapshot repository. This restore guide assumes that the backup snapshot is accessible from the same repository.
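Once the repository is registered, you can confirm that all nodes can reach it and that the expected snapshot is present; <repository-name> below is whatever name was used during backup:
# Verify that every node can access the repository
curl -k -X POST -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/<repository-name>/_verify"
# List the snapshots available in the repository
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/<repository-name>/_all?pretty"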
Restore data from snapshot¶
Snapshots can be manually restored according to the Restore a snapshot guide. Follow the Restore Entire Cluster section.
If the snapshot is available in the S3 repository, follow the steps below:
- Get a list of all indices.
export DR_CORE_NAMESPACE=dr-app
kubectl -n $DR_CORE_NAMESPACE exec -it pcs-elasticsearch-master-0 -- /bin/bash
curl -k -X GET -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_cat/indices"
- Remove all existing indices if you are restoring onto an existing cluster that contains data. Deleting all indices with a single command does not work and throws an error, so delete each index from step 1 individually:
curl -k -XDELETE -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/<index-name-from-step-1>"
- Once deleted, run the restore using the backup snapshot in the S3 repository:
curl -k -X POST -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/repository_name/<snapshot-name>/_restore" -H "Content-Type: application/json" -d '{
"indices": "*",
"ignore_unavailable": true,
"include_global_state": true
}'
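After the restore call returns, you can track its progress and confirm the indices are back; this sketch reuses the repository and snapshot names from the command above:
# Per-shard restore progress for the snapshot
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_snapshot/repository_name/<snapshot-name>/_status?pretty"
# The restored indices should appear here with green (or yellow) health
curl -k -u elastic:$ELASTICSEARCH_PASSWORD "https://localhost:9200/_cat/indices?v"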