Manage the cluster

This section presents administrative tasks and commands to ensure your cluster is configured and operating as expected.

Useful commands to know

Helm commands

The following code snippets can be used to manage Helm-related aspects of the cluster. For full documentation, see the Helm documentation.

List all deployments in a given namespace

helm list -n $NAMESPACE
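
If you do not know which namespace a release was installed into, helm list also accepts the -A flag to search all namespaces:

helm list -A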

Get the values supplied to a Helm chart when a deployment was created

helm get values $RELEASE_NAME -n $NAMESPACE 
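
To keep a record of those values for reuse in a later upgrade, you can write them out in YAML format; the file name below is only an example:

helm get values $RELEASE_NAME -n $NAMESPACE -o yaml > release-values.yaml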

Get all the values Helm computed when a deployment was created

helm get all $RELEASE_NAME -n $NAMESPACE 

Get the content of a Helm chart’s default values.yaml file:

helm show values $CHART_NAME
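
One way to see how a deployment differs from the chart defaults is to save both sets of values to local files and compare them; the file names below are only examples:

helm show values $CHART_NAME > default-values.yaml
helm get values $RELEASE_NAME -n $NAMESPACE -o yaml > release-values.yaml
diff default-values.yaml release-values.yaml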

Get the manifest of a deployed release

helm get manifest $RELEASE_NAME -n $NAMESPACE
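
To check whether the live cluster state has drifted from that stored manifest, one approach is to pipe it into kubectl diff; the output is empty when nothing has changed:

helm get manifest $RELEASE_NAME -n $NAMESPACE | kubectl diff -f -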

Kubectl commands

For full documentation, see the Kubectl Reference Docs.

Get all resources in a specific namespace:

kubectl get all -n $NAMESPACE 

Get a specific resource type across all namespaces

kubectl get $RESOURCE_TYPE -A 

Get all of a resource type in a specific namespace:

kubectl get $RESOURCE_TYPE -n $NAMESPACE 

Get logs from a pod:

kubectl logs $POD_NAME -n $NAMESPACE 
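
If the pod has restarted or runs more than one container, the standard --previous and -c flags narrow the output; $CONTAINER_NAME is a placeholder:

# Logs from the previous instance of a restarted container
kubectl logs $POD_NAME -n $NAMESPACE --previous

# Logs from a specific container in a multi-container pod
kubectl logs $POD_NAME -n $NAMESPACE -c $CONTAINER_NAME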

Follow logs from a pod in real time:

kubectl logs -f $POD_NAME -n $NAMESPACE 

Get the values used to build a resource and its status, displayed in YAML format:

# General command
kubectl get $RESOURCE_TYPE/$SPECIFIC_RESOURCE -n $NAMESPACE -o yaml

# Example
kubectl get pod/datarobot-nginx-ABESC -n datarobot-core -o yaml 

Get all events from the last hour for a namespace:

kubectl get events -n $NAMESPACE 
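
The default output is not ordered chronologically, so it can help to sort events by creation time:

kubectl get events -n $NAMESPACE --sort-by=.metadata.creationTimestamp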

Set the default namespace for kubectl commands. This allows you to run the commands above without the -n $NAMESPACE flag:

kubectl config set-context --current --namespace=$NAMESPACE 
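
To confirm which namespace the current context now points to, query the kubeconfig:

kubectl config view --minify --output 'jsonpath={..namespace}'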

Restart all deployments in the DataRobot core namespace:

for dep in $(kubectl get deployments.apps -n DR_CORE_NAMESPACE | tail -n +2 | awk '{print $1}'); do
    kubectl rollout restart deployment/$dep -n DR_CORE_NAMESPACE
done
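
To confirm the restarts have completed, you can check the rollout status of each deployment with the same loop pattern:

for dep in $(kubectl get deployments.apps -n DR_CORE_NAMESPACE | tail -n +2 | awk '{print $1}'); do
    kubectl rollout status deployment/$dep -n DR_CORE_NAMESPACE
done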

Scale cluster up and down

Sometimes you need to scale a cluster down temporarily, for example, to save resources over a weekend. Because there are no simple start/stop commands for Kubernetes applications, DataRobot suggests scaling the cluster down to zero replicas and then restoring its original size when it is needed again.

Note

If not all nodes of the pcs-rabbitmq stateful set come up after scaling up, you must apply the RabbitMQ cluster recovery procedure.

To scale the cluster down, the following command annotates each deployment and stateful set with its current number of replicas and then scales it down to zero replicas:

for obj in $(kubectl -n DR_CORE_NAMESPACE get deployments,statefulsets -o name); do
    r=$(kubectl -n DR_CORE_NAMESPACE get $obj -o jsonpath='{.spec.replicas}')
    kubectl -n DR_CORE_NAMESPACE annotate --overwrite $obj replicas=$r
    kubectl -n DR_CORE_NAMESPACE scale $obj --replicas=0
done 
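
Once the loop finishes, you can verify the scale-down by confirming that no pods remain in the namespace (pods may take a short while to terminate):

kubectl get pods -n DR_CORE_NAMESPACE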

To restore the cluster to its original size, the following command reads the replica count from the annotation, scales the resources up, and removes the annotation upon completion:

for obj in $(kubectl -n DR_CORE_NAMESPACE get statefulsets,deployments -o name); do
    r=$(kubectl -n DR_CORE_NAMESPACE get $obj -o jsonpath='{.metadata.annotations.replicas}')
    kubectl -n DR_CORE_NAMESPACE scale $obj --replicas=$r
    kubectl -n DR_CORE_NAMESPACE annotate $obj replicas-
done 
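
After scaling back up, verify that each deployment and stateful set reports its expected number of ready replicas:

kubectl get deployments,statefulsets -n DR_CORE_NAMESPACE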

Collecting a cluster profile

During troubleshooting, your DataRobot support representative may ask for a cluster information dump. The following command exports the cluster's currently running configuration, plus logs and events for the various services in use.

Replace /path/to/a/folder/on/disk/ with a valid local path.

kubectl cluster-info dump -n DR_CORE_NAMESPACE --output-directory=/path/to/a/folder/on/disk/cluster-state 

This command creates a folder named cluster-state. You can then create a compressed tarball of that folder and its contents to provide to your support representative for detailed analysis.

tar -cvzf cluster-state-$(date +%F).tar.gz /path/to/a/folder/on/disk/cluster-state/