Description:
Nodes in a degraded state are an unknown quantity and so may pose a security risk.
Rationale:
Kubernetes Engine's node auto-repair feature helps you keep the nodes in your cluster in a healthy, running state. When enabled, Kubernetes Engine makes periodic checks on the health state of each node in your cluster. If a node fails consecutive health checks over an extended time period, Kubernetes Engine initiates a repair process for that node.
If multiple nodes require repair, Kubernetes Engine might repair them in parallel. Kubernetes Engine limits the number of concurrent repairs based on the size of the cluster (bigger clusters have a higher limit) and the number of broken nodes in the cluster (the limit decreases if many nodes are broken).
Node auto-repair is not available on Alpha Clusters.
Using Google Cloud Console
Using Command Line
To enable node auto-repair for an existing node pool in a cluster, run the following command:
gcloud container node-pools update $POOL_NAME --cluster $CLUSTER_NAME --zone $COMPUTE_ZONE --enable-autorepair
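To confirm that the setting took effect, the node pool can be described and its management settings inspected. The following is a minimal sketch, not an official audit procedure: the helper name autorepair_enabled is hypothetical, and it assumes the node pool resource exposes the setting as management.autoRepair and that gcloud prints an enabled boolean as True.

```shell
# Hypothetical helper: succeeds (exit status 0) if auto-repair appears enabled.
# Arguments: node pool name, cluster name, compute zone.
autorepair_enabled() {
  local value
  # Assumption: the setting is exposed as management.autoRepair on the node pool.
  value="$(gcloud container node-pools describe "$1" \
    --cluster "$2" --zone "$3" \
    --format="value(management.autoRepair)")"
  # Compare case-insensitively; an empty value is treated as "not enabled".
  [ "$(printf '%s' "$value" | tr '[:upper:]' '[:lower:]')" = "true" ]
}
```

For example, autorepair_enabled "$POOL_NAME" "$CLUSTER_NAME" "$COMPUTE_ZONE" && echo "auto-repair enabled" reports the state of a single node pool. Each node pool in the cluster would need to be checked separately, since the setting is applied per node pool.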