K8s by Example: Production Checklist
| A Deployment that works in dev often breaks in prod. This checklist covers what you need: resource constraints, health checks, graceful shutdown, and high availability. |
| 1-resources.yaml | |
| Set resource requests and limits. Without requests, the scheduler can’t make good decisions. Your Pod might land on an overloaded node. Without limits, a memory leak takes down the entire node. | |
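| Set memory limit equal to request. Memory can't be throttled: a container that exceeds its limit is OOM-killed, and a Pod using more memory than it requested is an early eviction candidate when the node runs short. Matching the two makes scheduling predictable and avoids OOMKill surprises. CPU can burst (set limit higher than request) because excess CPU use is throttled, not killed. Note that the Guaranteed QoS class requires requests to equal limits for both CPU and memory in every container; with a higher CPU limit the Pod is Burstable. | |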
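The advice above might look like the following fragment. It is a minimal sketch: the container name, image, and all numbers are assumptions chosen for illustration, not values from the original example.

```yaml
# Fragment: belongs under spec.template.spec of a Deployment.
# Name, image, and sizes are hypothetical.
containers:
  - name: web
    image: registry.example.com/web:1.4.2
    resources:
      requests:
        cpu: 250m        # what the scheduler reserves on the node
        memory: 256Mi
      limits:
        cpu: "1"         # CPU may burst up to one core, then is throttled
        memory: 256Mi    # equal to the request; exceeding it means OOMKill
```

Because the CPU limit is higher than the CPU request, this Pod lands in the Burstable QoS class. Setting the CPU limit to `250m` as well would make it Guaranteed.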
| 2-probes.yaml | |
| Configure readiness probe. Without it, traffic hits Pods before they’re ready. During rolling updates, users see errors. The probe should check if your app can actually serve requests, not just if the process is alive. | |
| Configure liveness probe. Restarts stuck processes. Be careful: if your liveness check is too aggressive, you’ll restart healthy Pods under load. Make it simpler than readiness - just “is the process responding at all?” | |
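A sketch of both probes on one container follows. The `/ready` and `/healthz` paths, port 8080, and the timing values are assumptions; your app has to expose whatever endpoints you point the probes at.

```yaml
# Fragment: belongs under spec.template.spec.containers[] of a Deployment.
# Paths, port, and timings are hypothetical.
readinessProbe:
  httpGet:
    path: /ready         # should verify the app can serve real requests
    port: 8080
  periodSeconds: 5
  failureThreshold: 3    # removed from Service endpoints after 3 failures
livenessProbe:
  httpGet:
    path: /healthz       # cheap check: is the process responding at all?
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3    # container restarted after 3 consecutive failures
```

Keeping the liveness thresholds looser than the readiness ones means a Pod under load is taken out of rotation before it is ever restarted.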
| 3-shutdown.yaml | |
| Handle SIGTERM. Kubernetes sends SIGTERM, waits for the termination grace period, then sends SIGKILL to anything still running. | |
| The preStop sleep trick: Load balancers take a few seconds to remove endpoints. Without the sleep, requests hit terminating Pods. The sleep gives the LB time to update, then your app handles SIGTERM cleanly. | |
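One way to wire up the preStop sleep is sketched below. The 10-second sleep and 45-second grace period are assumed values, and the `exec` form assumes the image ships a `sleep` binary.

```yaml
# Fragment: belongs under spec.template.spec of a Deployment.
# Name, image, and durations are hypothetical.
terminationGracePeriodSeconds: 45   # default is 30; covers preStop + shutdown
containers:
  - name: web
    image: registry.example.com/web:1.4.2
    lifecycle:
      preStop:
        exec:
          # Runs before SIGTERM is delivered to the container.
          command: ["sleep", "10"]
```

The grace period clock starts when termination begins, so the preStop sleep counts against it. Leave enough headroom for the app to drain in-flight requests after the sleep ends.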
| 4-replicas.yaml | |
| Run multiple replicas. Single replica = single point of failure. During node maintenance, deploys, or crashes - you’re down. Minimum 2 for any service that matters. 3+ for critical paths. | |
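For reference, the replica count sits at the top of the Deployment spec. This is a minimal sketch with a hypothetical name, label, and image.

```yaml
# Minimal Deployment with three replicas; name, label, and image are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: web
  template:
    metadata:
      labels:
        app.kubernetes.io/name: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.4.2
```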
| 5-anti-affinity.yaml | |
| Spread Pods across nodes. 3 replicas on 1 node = still a single point of failure. Anti-affinity ensures Pods land on different nodes. Use | |
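A hard anti-affinity rule could be sketched as follows. The label is an assumption and must match the labels on your Pod template.

```yaml
# Fragment: belongs under spec.template.spec of a Deployment.
# The label key and value are hypothetical.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: web
        # One Pod with this label per node.
        topologyKey: kubernetes.io/hostname
```

The `required...` form is strict: if there are fewer schedulable nodes than replicas, the extra Pods stay Pending. The `preferredDuringSchedulingIgnoredDuringExecution` form treats spreading as a weighted preference instead.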
| 6-pdb.yaml | |
| Set Pod Disruption Budget. Without PDB, | |
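A PodDisruptionBudget for a three-replica service might look like this sketch. The name, label, and `minAvailable` value are assumptions.

```yaml
# Name and label are hypothetical; the selector must match the Pods' labels.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  minAvailable: 2        # voluntary evictions must leave at least 2 Pods
  selector:
    matchLabels:
      app.kubernetes.io/name: web
```

A PDB only limits voluntary disruptions such as node drains. It does not protect against node crashes.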
| 7-security.yaml | |
| Drop root privileges. Running as root inside the container is unnecessary for most apps and dangerous if compromised. Set | |
| Make filesystem read-only. Prevents attackers from dropping malware. Use | |
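The two security settings above can be combined as in the sketch below. The UID, the `/tmp` mount, and the extra hardening fields (`allowPrivilegeEscalation`, dropped capabilities) are assumptions added for illustration, and the image must be able to run as a non-root user.

```yaml
# Fragment: belongs under spec.template.spec of a Deployment.
# Name, image, UID, and mount path are hypothetical.
securityContext:
  runAsNonRoot: true     # kubelet refuses to start a container running as UID 0
  runAsUser: 10001
containers:
  - name: web
    image: registry.example.com/web:1.4.2
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
    volumeMounts:
      - name: tmp
        mountPath: /tmp  # writable scratch space on an otherwise read-only root
volumes:
  - name: tmp
    emptyDir: {}
```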
| 8-rolling-update.yaml | |
| Configure rolling update strategy. Defaults are conservative. | |
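An explicit strategy might look like the following sketch. The specific values are assumptions, not recommendations from the original example.

```yaml
# Fragment: belongs under spec of a Deployment. Values are hypothetical.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1          # at most one extra Pod above the desired count
    maxUnavailable: 0    # never drop below the desired count mid-rollout
minReadySeconds: 10      # a new Pod must stay Ready this long to count as available
```

With `maxUnavailable: 0`, the rollout only proceeds as new Pods pass their readiness probe, so this setting depends on a meaningful readiness check.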
| 9-labels.yaml | |
| Add operational labels. You’ll need these when everything’s on fire at 3am. | |
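As one possible starting point, the sketch below uses the `app.kubernetes.io/*` recommended label keys. Every value, and the custom `team` label, is hypothetical.

```yaml
# Fragment: belongs under metadata of the Deployment and of its Pod template.
# All values are hypothetical.
labels:
  app.kubernetes.io/name: web
  app.kubernetes.io/instance: web-prod
  app.kubernetes.io/version: "1.4.2"
  app.kubernetes.io/component: api
  app.kubernetes.io/part-of: storefront
  app.kubernetes.io/managed-by: helm
  team: payments         # custom label, e.g. for finding the owning team
```

Labels like these let you filter quickly, for example `kubectl get pods -l app.kubernetes.io/name=web`.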
| 10-complete.yaml | |
| Full production-ready Deployment. All the pieces together. This survives node failures, deploys cleanly, shuts down gracefully, and doesn’t run as root. | |
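Putting the checklist together, a complete manifest could look like the sketch below. It is an illustration under assumptions, not the original file: the name, image, port, probe paths, UID, and every number are hypothetical. It uses the soft (`preferred`) form of anti-affinity so that the surge Pod created during a rollout can still be scheduled on a cluster with exactly three nodes.

```yaml
# Sketch only: names, image, port, paths, and numbers are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app.kubernetes.io/name: web
spec:
  replicas: 3
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app.kubernetes.io/name: web
  template:
    metadata:
      labels:
        app.kubernetes.io/name: web
    spec:
      terminationGracePeriodSeconds: 45
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: web
                topologyKey: kubernetes.io/hostname
      containers:
        - name: web
          image: registry.example.com/web:1.4.2
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 256Mi
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "10"]   # assumes the image has `sleep`
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: tmp
          emptyDir: {}
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: web
```

The PodDisruptionBudget is a separate object, included here as a second YAML document so that both can be applied in one step.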