How to Shut Down a Pod Gracefully and Quickly in Kubernetes

adil
3 min read · Sep 26, 2023

Removing a pod from a Kubernetes cluster can take longer than you expect.

By default, removing a pod can take up to 30 seconds, because Kubernetes gives the containers a 30-second grace period to exit on their own:

➜  ~ time kubectl delete -f 01-pod-create.yaml
pod "web1-pod" deleted
kubectl delete -f 01-pod-create.yaml 0.10s user 0.07s system 0% cpu 31.277 total

You can force-delete your pods, which skips the grace period entirely:

kubectl delete -f 01-pod-create.yaml --force --grace-period=0

If you want the pod to shut down cleanly instead, handle the SIGTERM signal in your application.

When you run kubectl delete, Kubernetes sends a SIGTERM signal to your containers. If a container is still running when the grace period ends, Kubernetes kills it with SIGKILL.

I’ve created a container image: ailhan/signal:v1
Here’s the source code

When Kubernetes sends a SIGTERM signal, the program in this image detects it and prints the “SIGTERM detected” message. However, it does not exit, so the container keeps running.
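The linked source is not reproduced here, so the following is only a minimal sketch of that behavior, written in Python as an assumption about how such a handler could look. The names `on_sigterm`, `install`, `main`, and `received` are hypothetical and are not taken from the author's image.

```python
import signal
import time

received = []  # hypothetical log of the signals seen so far


def on_sigterm(signum, frame):
    # Report the signal but deliberately do NOT exit (the v1 behavior).
    print("SIGTERM detected", flush=True)
    received.append(signum)


def install():
    # Register the handler; this must run in the main thread.
    signal.signal(signal.SIGTERM, on_sigterm)


def main():
    # Keep running forever; only SIGKILL will stop this process.
    while True:
        time.sleep(1)
```

Calling `install()` and then `main()` as the container entry point would give a process that logs SIGTERM and carries on until it is killed.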

00-signal-handler-v1.yaml

---
apiVersion: v1
kind: Pod
metadata:
  name: signal-pod
spec:
  containers:
    - image: ailhan/signal:v1
      name: signal-container

I split my terminal. I created the pod in Terminal 1 and examined the logs of signal-container in Terminal 2.

signal-container detected the SIGTERM signal from Kubernetes but did not exit. As a result, the kubectl delete command took the full 30 seconds to finish: it had to wait until the grace period expired and Kubernetes killed the container.

I will deploy ailhan/signal:v2
Here’s the source code

When SIGTERM is received, V2 will exit immediately.
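As a hedged illustration (again in Python, and not the author's linked source), a handler that exits as soon as SIGTERM arrives could look like this. `on_sigterm` and `install` are hypothetical names.

```python
import signal
import sys


def on_sigterm(signum, frame):
    # Report the signal, then exit right away with a success status.
    print("SIGTERM detected, exiting", flush=True)
    sys.exit(0)  # raises SystemExit, which ends the main program


def install():
    # Register the handler; this must run in the main thread.
    signal.signal(signal.SIGTERM, on_sigterm)
```

Because the process exits on its own, nothing has to wait for the grace period to run out.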

01-signal-handler-v2.yaml

---
apiVersion: v1
kind: Pod
metadata:
  name: signal-pod
spec:
  containers:
    - image: ailhan/signal:v2
      name: signal-container

After adding the exit command to the script, the kubectl delete command finished in 1 second.

This is how you can quickly detect the removal request and terminate the container.

Some graceful shutdown requests take longer than 30 seconds

In this application, we were able to terminate the container quickly.

However, in some cases the program needs to do extra work before it can exit cleanly, such as notifying an external resource, cleaning up the disk, or completing an active job.

I will deploy ailhan/signal:v3
Here’s the source code

02-signal-handler-v3.yaml

---
apiVersion: v1
kind: Pod
metadata:
  name: signal-pod
spec:
  containers:
    - image: ailhan/signal:v3
      name: signal-container

When V3 receives SIGTERM, it runs a simulated long-running disk cleanup job that takes 45 seconds, and only then exits.
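A minimal Python sketch of such a handler, assuming the cleanup is simulated with a plain sleep, might look like the following. The function names and the injectable `sleep` parameter are hypothetical; the author's actual source is linked above and may differ.

```python
import signal
import sys
import time

CLEANUP_SECONDS = 45  # the simulated cleanup time described in the article


def cleanup(seconds=CLEANUP_SECONDS, sleep=time.sleep):
    # Simulate a long-running disk cleanup job.
    print("Disk cleanup started", flush=True)
    sleep(seconds)
    print("Disk cleanup is complete", flush=True)


def on_sigterm(signum, frame):
    # Finish the cleanup first, then exit.
    cleanup()
    sys.exit(0)


def install():
    # Register the handler; this must run in the main thread.
    signal.signal(signal.SIGTERM, on_sigterm)
```

The second message is printed only if the process survives for the whole cleanup, which is what the pod's grace period has to allow for.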

The “Disk cleanup started” message was shown. However, because the default grace period is 30 seconds, Kubernetes killed the container before the “Disk cleanup is complete” message could be printed.

The grace period needs to be longer. I will add the terminationGracePeriodSeconds field to the pod spec:

03-signal-handler-v3.yaml

---
apiVersion: v1
kind: Pod
metadata:
  name: signal-pod
spec:
  terminationGracePeriodSeconds: 50
  containers:
    - image: ailhan/signal:v3
      name: signal-container

I raised the grace period from the default 30 seconds to 50 seconds, which leaves enough room for the 45-second cleanup.
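The arithmetic behind that choice can be written down as a tiny check. This is only an illustrative Python sketch; `shutdown_outcome` is a hypothetical helper, not part of Kubernetes or kubectl.

```python
DEFAULT_GRACE_SECONDS = 30  # default value of terminationGracePeriodSeconds


def shutdown_outcome(cleanup_seconds, grace_seconds=DEFAULT_GRACE_SECONDS):
    """Return 'completed' if the cleanup fits in the grace period, else 'killed'."""
    return "completed" if cleanup_seconds < grace_seconds else "killed"


print(shutdown_outcome(45))      # default 30-second grace period
print(shutdown_outcome(45, 50))  # terminationGracePeriodSeconds: 50
```

With the default grace period the 45-second job is cut off, while a 50-second grace period lets it finish.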

This time, the long-running job completed successfully.
