CrashLoopBackOff
1. Introduction
If you run applications on Kubernetes in production, you will eventually see this status:
```
CrashLoopBackOff
```
Many engineers assume this is a Kubernetes failure. It is not.
CrashLoopBackOff means Kubernetes is behaving correctly, and your container cannot stay alive.
This document explains:
- What CrashLoopBackOff really means
- Why it happens in production
- How to identify the root cause
- How to fix it correctly
2. What CrashLoopBackOff Actually Means
CrashLoopBackOff is not a crash. It is a restart pattern.
What Happens Internally
- Container starts
- Container exits unexpectedly
- Kubernetes restarts the container
- Restart delay increases (backoff up to ~5 minutes)
- Kubernetes protects the cluster from endless restarts
Kubernetes is healthy. Your workload is broken.
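The steps above follow a schedule: the kubelet roughly doubles the delay after each failed restart, starting near 10 seconds and capping at 5 minutes. A sketch of that schedule (the exact timings are kubelet implementation details, not a guaranteed API):

```shell
# Approximate CrashLoopBackOff restart delays: double each time, cap at 300s
delay=10
for restart in 1 2 3 4 5 6 7; do
  echo "restart $restart: wait ${delay}s"
  delay=$((delay * 2))
  [ "$delay" -gt 300 ] && delay=300
done
```

This is why a pod can sit in CrashLoopBackOff for minutes with no visible activity: it is simply waiting out the current backoff window.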
3. Start Simple (Never Guess)
```
kubectl get pods
```
Look for:
- STATUS: CrashLoopBackOff
- Rapidly increasing RESTARTS
For all namespaces:
```
kubectl get pods -A | grep CrashLoopBackOff
```
High restart count means fast failure. Do not restart blindly.
4. Identify the Broken Pod (Most Important Step)
```
kubectl describe pod <pod-name> -n <namespace>
```
Focus on Events.
Common Event Clues
| Event Message | Meaning |
|---|---|
| OOMKilled | Memory limit exceeded |
| Liveness probe failed | Kubernetes killed the container |
| Back-off restarting failed container | Repeated crash |
| Permission denied | File / user issue |
| Secret not found | Missing configuration |
If you don't read events, you are debugging blind.
5. Logs: Current & Previous (Mandatory)
Many engineers miss the most important command:
```
kubectl logs <pod-name> --previous
```
Use both:
```
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
```
For multi-container pods:
```
kubectl logs <pod-name> -c <container-name> -n <namespace>
```
Previous logs often contain the real failure.
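When logs are empty or unhelpful, the container's last exit code (shown under `Last State` in `kubectl describe pod`) is the next clue. Codes above 128 mean the process died from a signal (code minus 128). A small decoding sketch; the 137/139/143 mappings follow the standard 128+signal convention, and the explanations are common causes, not guarantees:

```shell
# Translate a container exit code into a likely cause
decode_exit_code() {
  case "$1" in
    0)   echo "clean exit: the process finished on its own" ;;
    1)   echo "generic application error: read the app logs" ;;
    137) echo "SIGKILL (128+9): often OOMKilled or a liveness-probe kill" ;;
    139) echo "SIGSEGV (128+11): the process crashed" ;;
    143) echo "SIGTERM (128+15): a graceful shutdown was requested" ;;
    *)   echo "application-specific exit code: check the app's docs" ;;
  esac
}

decode_exit_code 137
```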
Hands-On Examples
The following examples are for demonstration and learning purposes only. They are intentionally designed to show common reasons why a pod enters CrashLoopBackOff, such as:
- Application startup failures
- Incorrect health probes
- Insufficient resource limits
- Missing configuration or secrets
These examples deliberately introduce failures to help you understand how Kubernetes behaves when an application cannot start or stay healthy.
Example 1: Container That Always Exits
Broken YAML (CrashLoopBackOff)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashloop
spec:
  replicas: 1
  selector:
    matchLabels:
      app: crashloop
  template:
    metadata:
      labels:
        app: crashloop
    spec:
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "echo App started; exit 1"]
```
Apply:
```
kubectl apply -f crashloop-exit.yaml
kubectl get pods
```
Result: STATUS shows CrashLoopBackOff with rapidly increasing RESTARTS.
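The fix is to run a long-lived foreground process instead of exiting. A minimal corrected container spec, keeping the same busybox image as a stand-in for a real server process:

```yaml
containers:
  - name: app
    image: busybox
    # keep a foreground process running instead of exiting immediately
    command: ["sh", "-c", "echo App started; while true; do sleep 3600; done"]
```

In a real application, the equivalent fix is making sure the entrypoint process does not exit (and exits non-zero only on genuine failure).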
Example 2: Liveness Probe Killing a Healthy App
Broken YAML (Probe Misconfiguration)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashloop-liveness-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: liveness-demo
  template:
    metadata:
      labels:
        app: liveness-demo
    spec:
      containers:
        - name: app
          image: nginx
          livenessProbe:
            httpGet:
              path: /health
              port: 80
            initialDelaySeconds: 1
            periodSeconds: 5
```
Why This Causes CrashLoopBackOff
1. Invalid Health Check Path
NGINX does not expose a /health endpoint by default. As a result:
- The liveness probe receives a non-200 response
- Kubernetes assumes the container is unhealthy
2. Probe Starts Too Early
`initialDelaySeconds: 1` means Kubernetes starts health checks 1 second after container startup. The application may not be fully ready yet, so even a healthy container can fail at this stage.
3. Kubernetes Forcefully Restarts the Container
When the liveness probe fails:
- Kubernetes kills the container
- The container is restarted
- The same probe fails again
- This loop repeats
After several failures, Kubernetes applies a restart backoff and the pod enters CrashLoopBackOff.
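A corrected probe for stock nginx points at a path that actually returns 200 and gives the container time to start. The values below are reasonable examples, not universal defaults:

```yaml
livenessProbe:
  httpGet:
    path: /                # nginx serves its default page at / with HTTP 200
    port: 80
  initialDelaySeconds: 10  # give the server time to start
  periodSeconds: 5
  failureThreshold: 3      # tolerate transient blips before restarting
```

For applications with genuinely slow startup, a separate `startupProbe` is usually a better tool than a large `initialDelaySeconds`.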
Example 3: Out-of-Memory (OOMKilled)
Broken YAML
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashloop-oom-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: oom-demo
  template:
    metadata:
      labels:
        app: oom-demo
    spec:
      containers:
        - name: app
          image: polinux/stress
          command: ["stress"]
          args: ["--vm", "1", "--vm-bytes", "200M", "--vm-hang", "1"]
          resources:
            limits:
              memory: "64Mi"
```
Result:
- Container exceeds memory
- Kernel kills it
- Pod enters CrashLoopBackOff
Why CrashLoopBackOff Happens in This Case
Your container:
- Is limited to 64Mi memory
- Tries to allocate 200MB
- Linux kernel kills it to protect the node
- Kubernetes restarts it
- Same thing happens again
After several restarts: STATUS: CrashLoopBackOff
Kubernetes is working correctly. Memory limits are enforced. This is expected behavior.
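The fix is to size the limit above what the workload actually allocates (or reduce the allocation). For this demo, where the stress worker maps about 200M:

```yaml
resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "256Mi"   # comfortably above the ~200M the worker allocates
```

In production, base these numbers on observed usage (for example via `kubectl top pod`) rather than guesses.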
Example 4: CrashLoopBackOff Due to Missing Secret or ConfigMap
This section demonstrates how a pod can enter CrashLoopBackOff when required configuration objects (Secret or ConfigMap) are missing.
Case 1: Missing Secret
The application expects a database password from a Secret at startup. The Secret does not exist, so the variable is empty, the container exits immediately, and Kubernetes repeatedly restarts it.
Deployment YAML (Missing Secret)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-crashloop-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql-demo
  template:
    metadata:
      labels:
        app: mysql-demo
    spec:
      containers:
        - name: app
          image: busybox
          command:
            - sh
            - -c
            - |
              echo "Starting app"
              if [ -z "$MYSQL_PASSWORD" ]; then
                echo "MySQL password missing"
                exit 1
              fi
              echo "Connected to MySQL"
          env:
            - name: MYSQL_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: MYSQL_PASSWORD
                  # without optional, a missing Secret holds the container in
                  # CreateContainerConfigError instead of letting it start
                  optional: true
```
Note: the Secret `mysql-secret` does not exist.
What happens:
- Pod starts
- Script runs
- Password is empty
- App exits with code 1
- Kubernetes restarts the container
- Status becomes CrashLoopBackOff
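The fix is to create the Secret the Deployment references. The password below is a placeholder, not a recommended value:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
stringData:
  MYSQL_PASSWORD: change-me   # placeholder value
```

Equivalently: `kubectl create secret generic mysql-secret --from-literal=MYSQL_PASSWORD=change-me`. Once the Secret exists, the next restart succeeds.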
Case 2: Missing ConfigMap
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo-crashloop-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongo-demo
  template:
    metadata:
      labels:
        app: mongo-demo
    spec:
      containers:
        - name: app
          image: busybox
          command:
            - sh
            - -c
            - |
              echo "Starting app"
              if [ -z "$MONGO_URL" ]; then
                echo "MongoDB URL missing"
                exit 1
              fi
              echo "Connected to MongoDB"
          env:
            - name: MONGO_URL
              valueFrom:
                configMapKeyRef:
                  name: mongo-config
                  key: MONGO_URL
                  # without optional, a missing ConfigMap holds the container in
                  # CreateContainerConfigError instead of letting it start
                  optional: true
```
What happens:
- Pod is scheduled
- Container starts with MONGO_URL unset
- The startup script detects the empty variable and exits 1
- Kubernetes restarts the container
- Pod enters CrashLoopBackOff
CrashLoopBackOff can occur when an application depends on a value from a Secret or ConfigMap that does not exist. Kubernetes keeps restarting the container, but because the required configuration is missing, it fails every time. Note that if the env reference is mandatory (no `optional: true`), the pod is typically held in `CreateContainerConfigError` rather than crash looping.
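As with the Secret case, the fix is to create the missing object. The connection string below is an example value, assuming a `mongo` Service in the `default` namespace:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mongo-config
data:
  MONGO_URL: "mongodb://mongo.default.svc.cluster.local:27017"   # example value
```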
Freeze Pod for Live Debugging
When the pod crashes too fast, override the command to keep it alive:
Debug Override
```yaml
command: ["/bin/sh"]
args: ["-c", "while true; do sleep 3600; done"]
```
Exec Into Pod
```
kubectl exec -it <pod-name> -n <namespace> -- sh
```
You can now:
- Inspect environment variables
- Test DB connectivity
- Run startup commands manually
- Check file permissions
6. Fix → Verify → Rollback
Verify
```
kubectl rollout status deployment/<deployment-name>
```
Rollback
```
kubectl rollout undo deployment/<deployment-name>
```
Stability first. Debug second.
7. Final Truth
CrashLoopBackOff is not a Kubernetes problem. It is almost always caused by:
- Bad configuration
- Bad probes
- Bad resource limits
- Bad assumptions
Kubernetes does not break applications. It exposes mistakes early and aggressively.