
CrashLoopBackOff

1. Introduction

If you run applications on Kubernetes in production, you will eventually see this status:

CrashLoopBackOff

Many engineers assume this is a Kubernetes failure. It is not.

CrashLoopBackOff means Kubernetes is behaving correctly, and your container cannot stay alive.

This document explains:

  • What CrashLoopBackOff really means
  • Why it happens in production
  • How to identify the root cause
  • How to fix it correctly

2. What CrashLoopBackOff Actually Means

CrashLoopBackOff is not a crash. It is a restart pattern.

What Happens Internally

  1. Container starts
  2. Container exits unexpectedly
  3. Kubernetes restarts the container
  4. Restart delay doubles each time (10s, 20s, 40s, … capped at ~5 minutes)
  5. Kubernetes protects the cluster from endless restarts

Kubernetes is healthy. Your workload is broken.
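The doubling backoff can be sketched as a quick shell loop; the 10-second base and 300-second cap used here reflect the kubelet's default backoff behavior:

```shell
# Sketch of the kubelet restart backoff: the delay doubles per restart, capped at 300s
# prints 10s, 20s, 40s, 80s, 160s, then holds at 300s
delay=10
for attempt in 1 2 3 4 5 6 7; do
  echo "restart $attempt: wait ${delay}s"
  delay=$((delay * 2))
  if [ "$delay" -gt 300 ]; then delay=300; fi
done
```

This is why a pod that has been crash-looping for a while appears to "do nothing" for up to five minutes between attempts.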


3. Start Simple (Never Guess)

kubectl get pods

Look for:

  • STATUS: CrashLoopBackOff
  • Rapidly increasing RESTARTS

For all namespaces:

kubectl get pods -A | grep CrashLoopBackOff

A high restart count means the container is failing fast. Do not restart blindly; gather evidence first.
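If you want to script this triage, the same filter works over captured output. The listing below is a made-up sample for illustration; in practice, pipe `kubectl get pods -A` output directly into the `awk` filter:

```shell
# Filter a saved `kubectl get pods -A` listing for crash-looping pods
# (sample listing inlined here; pod names are illustrative only)
listing='NAMESPACE   NAME        READY   STATUS             RESTARTS   AGE
default     web-7d9f    0/1     CrashLoopBackOff   12         10m
default     db-0        1/1     Running            0          2d'
result=$(echo "$listing" | awk '$4 == "CrashLoopBackOff" { print $2 " restarts=" $5 }')
echo "$result"
```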


4. Identify the Broken Pod (Most Important Step)

kubectl describe pod <pod-name> -n <namespace>

Focus on Events.

Common Event Clues

  Event Message                          Meaning
  OOMKilled                              Memory limit exceeded
  Liveness probe failed                  Kubernetes killed the container
  Back-off restarting failed container   Repeated crash
  Permission denied                      File or user issue
  Secret not found                       Missing configuration

If you don't read events, you are debugging blind.


5. Logs: Current & Previous (Mandatory)

Many engineers miss the most important command:

kubectl logs <pod-name> --previous

Use both:

kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous

For multi-container pods:

kubectl logs <pod-name> -c <container-name> -n <namespace>

Previous logs often contain the real failure.


Hands-On Examples

The following examples are for demonstration and learning purposes only. They are intentionally designed to show common reasons why a pod enters CrashLoopBackOff, such as:

  • Application startup failures
  • Incorrect health probes
  • Insufficient resource limits
  • Missing configuration or secrets

These examples deliberately introduce failures to help you understand how Kubernetes behaves when an application cannot start or stay healthy.


Example 1: Container That Always Exits

Broken YAML (CrashLoopBackOff)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashloop
spec:
  replicas: 1
  selector:
    matchLabels:
      app: crashloop
  template:
    metadata:
      labels:
        app: crashloop
    spec:
      containers:
      - name: app
        image: busybox
        command: ["sh", "-c", "echo App started; exit 1"]

Apply:

kubectl apply -f crashloop-exit.yaml
kubectl get pods

Result: STATUS shows CrashLoopBackOff with rapidly increasing RESTARTS.
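For contrast, a minimal fixed variant keeps a long-running process in the foreground so the container stays alive (the sleep loop stands in for a real server process):

```yaml
# Fixed: the container runs a long-lived process instead of exiting
apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashloop
spec:
  replicas: 1
  selector:
    matchLabels:
      app: crashloop
  template:
    metadata:
      labels:
        app: crashloop
    spec:
      containers:
      - name: app
        image: busybox
        command: ["sh", "-c", "echo App started; while true; do sleep 3600; done"]
```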


Example 2: Liveness Probe Killing a Healthy App

Broken YAML (Probe Misconfiguration)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashloop-liveness-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: liveness-demo
  template:
    metadata:
      labels:
        app: liveness-demo
    spec:
      containers:
      - name: app
        image: nginx
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 1
          periodSeconds: 5

Why This Causes CrashLoopBackOff

1. Invalid Health Check Path

NGINX does not expose a /health endpoint by default. As a result:

  • The liveness probe receives a non-200 response
  • Kubernetes assumes the container is unhealthy

2. Probe Starts Too Early

initialDelaySeconds: 1 means Kubernetes starts health checks 1 second after container startup. The application may not be fully ready yet — even a healthy container can fail at this stage.

3. Kubernetes Forcefully Restarts the Container

When the liveness probe fails:

  • Kubernetes kills the container
  • The container is restarted
  • The same probe fails again
  • This loop repeats

After several failures, Kubernetes applies a restart backoff and the pod enters CrashLoopBackOff.
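A corrected probe for this demo targets a path NGINX actually serves and gives the process time to start; the exact delay and threshold values below are reasonable starting points, not universal answers:

```yaml
# Fixed probe: a real path plus a realistic startup delay
livenessProbe:
  httpGet:
    path: /          # NGINX serves its default page here
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3
```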


Example 3: Out-of-Memory (OOMKilled)

Broken YAML

apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashloop-oom-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: oom-demo
  template:
    metadata:
      labels:
        app: oom-demo
    spec:
      containers:
      - name: app
        image: polinux/stress
        command: ["stress"]
        args: ["--vm", "1", "--vm-bytes", "200M", "--vm-hang", "1"]
        resources:
          limits:
            memory: "64Mi"

Result:

  • Container exceeds memory
  • Kernel kills it
  • Pod enters CrashLoopBackOff

Why CrashLoopBackOff Happens in This Case

Your container:

  • Is limited to 64Mi memory
  • Tries to allocate 200MB
  • Linux kernel kills it to protect the node
  • Kubernetes restarts it
  • Same thing happens again

After several restarts: STATUS: CrashLoopBackOff

Kubernetes is working correctly. Memory limits are enforced. This is expected behavior.
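The fix is to size the limit above what the process actually allocates (or shrink the workload). For this demo, a limit comfortably above the 200M allocation stops the OOM kills:

```yaml
# Fixed: the limit exceeds the ~200M the stress process allocates
resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "256Mi"
```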


Example 4: CrashLoopBackOff Due to Missing Secret or ConfigMap

This section demonstrates how a pod can enter CrashLoopBackOff when required configuration objects (Secret or ConfigMap) are missing.

Case 1: Missing Secret

The application expects a Secret at startup. Since the Secret does not exist, the container fails immediately, and Kubernetes repeatedly restarts it.

Deployment YAML (Missing Secret)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-crashloop-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql-demo
  template:
    metadata:
      labels:
        app: mysql-demo
    spec:
      containers:
      - name: app
        image: busybox
        command:
        - sh
        - -c
        - |
          echo "Starting app"
          if [ -z "$MYSQL_PASSWORD" ]; then
            echo "MySQL password missing"
            exit 1
          fi
          echo "Connected to MySQL"
        env:
        - name: MYSQL_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: MYSQL_PASSWORD
              optional: true  # without this, a missing Secret yields CreateContainerConfigError, not CrashLoopBackOff

Note: The Secret mysql-secret does not exist. A strictly required secretKeyRef would make the pod fail with CreateContainerConfigError before the container ever runs; marking the reference optional lets the container start with the variable unset, so the startup script fails instead and the pod reaches CrashLoopBackOff.

What happens:

  1. Pod starts
  2. Script runs
  3. Password is empty
  4. App exits
  5. Kubernetes restarts
  6. Status becomes CrashLoopBackOff
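The fix for Case 1 is to create the Secret the deployment references. A minimal manifest looks like this (the password value is a placeholder):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
stringData:
  MYSQL_PASSWORD: changeme  # placeholder; substitute a real value
```

Equivalently, kubectl create secret generic mysql-secret --from-literal=MYSQL_PASSWORD=changeme creates the same object from the command line.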

Case 2: Missing ConfigMap

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo-crashloop-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongo-demo
  template:
    metadata:
      labels:
        app: mongo-demo
    spec:
      containers:
      - name: app
        image: busybox
        command:
        - sh
        - -c
        - |
          echo "Starting app"
          if [ -z "$MONGO_URL" ]; then
            echo "MongoDB URL missing"
            exit 1
          fi
          echo "Connected to MongoDB"
        env:
        - name: MONGO_URL
          valueFrom:
            configMapKeyRef:
              name: mongo-config
              key: MONGO_URL
              optional: true  # without this, a missing ConfigMap yields CreateContainerConfigError, not CrashLoopBackOff

What Happens:

  1. Pod is scheduled
  2. Container starts, but MONGO_URL is unset because the ConfigMap is missing
  3. The startup script detects the empty variable and exits with code 1
  4. Kubernetes restarts the container
  5. The same failure repeats and the pod enters CrashLoopBackOff

(If the configMapKeyRef were strictly required, Kubernetes would instead report CreateContainerConfigError and the container would never start.)
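Likewise, creating the ConfigMap resolves Case 2. The connection string below is a placeholder:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mongo-config
data:
  MONGO_URL: mongodb://mongo.default.svc:27017  # placeholder connection string
```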

CrashLoopBackOff can occur when an application depends on a Secret or ConfigMap that does not exist. Kubernetes retries starting the container, but since the required configuration is missing, the container fails repeatedly.


Freeze Pod for Live Debugging

When the pod crashes too fast, override the command to keep it alive:

Debug Override

command: ["/bin/sh"]
args: ["-c", "while true; do sleep 3600; done"]

Exec Into Pod

kubectl exec -it <pod-name> -n <namespace> -- sh

You can now:

  • Inspect environment variables
  • Test DB connectivity
  • Run startup commands manually
  • Check file permissions

6. Fix → Verify → Rollback

Verify

kubectl rollout status deployment/<deployment-name>

Rollback

kubectl rollout undo deployment/<deployment-name>

Stability first. Debug second.


7. Final Truth

CrashLoopBackOff is not a Kubernetes problem. It is almost always caused by:

  • Bad configuration
  • Bad probes
  • Bad resource limits
  • Bad assumptions

Kubernetes does not break applications. It exposes mistakes early and aggressively.