☸️ Complete Engineering Reference

Kubernetes Mastery Guide

A production-grade reference covering container orchestration, from fundamentals to advanced operations, security, and scaling.

01🎯

Introduction to Kubernetes

Container Orchestration Fundamentals

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It solves the challenges of running containers in production: monitoring health, scaling on demand, and ensuring zero-downtime deployments.
Problems Kubernetes Solves
💀
Container Failures
What if a container crashes? Kubernetes monitors health and automatically restarts failed containers. This is called self-healing.
📈
Traffic Spikes
What if load increases 10x? Kubernetes scales your app by creating more replicas and load balancing traffic across them.
🔄
Deployment Downtime
What if a deployment breaks the app? Kubernetes performs rolling updates, replacing pods gradually so the app stays available.
⏪
Rollback Needed
What if the new version has bugs? Kubernetes can roll back to a previous version with a single command.
Key Features
Feature | Description
Self-Healing | Automatically restarts failed containers and replaces pods when nodes die.
Auto Scaling | Scales applications up/down based on CPU, memory, or custom metrics.
Load Balancing | Distributes traffic across multiple pod replicas automatically.
Rolling Updates | Deploys new versions gradually without downtime.
Secret Management | Stores and manages sensitive data like passwords and API keys.
Automatic Bin Packing | Places containers on nodes based on resource requirements.
Kubernetes vs Other Orchestrators
Docker Swarm, Kubernetes (K8s), Apache Mesos, Amazon ECS, Nomad
Kubernetes (K8s) is the most popular choice, with advanced features and support from all major cloud providers. The name comes from the Greek word for "helmsman": the person who steers a ship carrying containers.
02🏗️

Kubernetes Architecture

Control Plane & Worker Nodes

A Kubernetes cluster consists of a Control Plane (master) that manages the cluster, and Worker Nodes that run your applications. The control plane makes global decisions while worker nodes provide the runtime environment.
Control Plane Components
🌐
API Server
The front door to Kubernetes. All commands (kubectl, UI, SDK) go through the API server. It validates and processes requests.
💾
etcd
Distributed key-value store holding all cluster data: nodes, pods, configs, secrets. The single source of truth.
📋
Scheduler
Watches for new pods and assigns them to nodes based on resource requirements, affinity rules, and constraints.
🔄
Controller Manager
Runs controllers that maintain desired state: ReplicaSet controller, Node controller, Endpoint controller, etc.
Worker Node Components
🐳
Container Runtime
Software that runs containers (containerd, CRI-O, or Docker Engine via the cri-dockerd adapter, since built-in Docker support was removed in Kubernetes 1.24). Pulls images and manages container lifecycle.
📡
Kubelet
Agent on each node that ensures containers are running in pods. Reports node and pod status to the control plane.
🔀
Kube-proxy
Network proxy maintaining network rules on nodes. Routes Service traffic to the correct pods and load balances across them.
↳ Request Flow
kubectl (user command) → API Server (validates request) → Scheduler (picks node) → Kubelet (creates pod)
⚡ Pro Tip

In production, run multiple control plane nodes for high availability. If etcd loses data, you lose your entire cluster state, so always keep backups!

03⚙️

Setup Kubernetes

Local Development & kubectl

For local development, use Minikube, Kind, or k3s to spin up a cluster on your machine. For production, use managed services like EKS, GKE, or AKS. kubectl is the CLI for interacting with any Kubernetes cluster.
Local Cluster Options
Tool | Best For | Notes
Minikube | Learning, development | Multi-node support, add-ons, VM or Docker driver
Kind | CI/CD, testing | Kubernetes in Docker, fast startup, multi-node
k3s | Edge, IoT, low resources | Lightweight (~50MB), uses SQLite by default instead of etcd
Docker Desktop | Mac/Windows dev | Built-in K8s option, single node
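The multi-node support listed for Kind is driven by a cluster config file. A minimal sketch, assuming the hypothetical file name kind-config.yaml and one control-plane node with two workers:

```yaml
# kind-config.yaml (hypothetical file name)
# Declares a three-node Kind cluster: one control plane, two workers.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
```

Passing the file with kind create cluster --config kind-config.yaml should start each node as a Docker container on the local machine.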
Install kubectl
$ curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"    # Download kubectl binary
$ chmod +x kubectl && sudo mv kubectl /usr/local/bin/    # Make executable and move to PATH
$ kubectl version --client    # Verify installation
Create Cluster with Minikube
$ minikube start --nodes=2 --driver=docker    # Create 2-node cluster with Docker
$ minikube status    # Check cluster status
$ kubectl get nodes    # List cluster nodes
$ minikube dashboard --url    # Print the dashboard URL instead of opening a browser
kubectl Syntax
BASH
kubectl <command> <resource-type> <name> <flags>

# Examples:
kubectl get pods                     # List all pods
kubectl get pods -o wide             # More details
kubectl describe pod inventory-pod   # Detailed info
kubectl delete pod inventory-pod     # Delete pod
kubectl apply -f manifest.yaml       # Apply config file
kubectl commands follow a consistent pattern: verb (get, create, delete, apply) + resource type (pods, services, deployments) + name + flags (-o yaml, -n namespace).
04📦

Kubernetes Pods

The Smallest Deployable Unit

A Pod is the smallest deployable unit in Kubernetes: a wrapper around one or more containers that share network and storage. Containers in a pod can communicate via localhost and share volumes.
Why Pods?
🔗
Shared Network
Containers in a pod share an IP address and port space. They communicate via localhost.
💾
Shared Storage
Containers can mount the same volumes for data sharing.
♻️
Co-scheduling
Related containers (app + sidecar) are always scheduled together on the same node.
🎯
Single IP
Each pod gets one IP. Multiple containers share it via different ports.
Pod Manifest
POD.YAML
apiVersion: v1
kind: Pod
metadata:
  name: inventory-api
  labels:
    app: inventory
    tier: backend
spec:
  containers:
    - name: api
      image: myregistry/inventory-api:v1
      ports:
        - containerPort: 8080
      resources:
        requests:
          memory: "128Mi"
          cpu: "100m"
Essential Pod Commands
$ kubectl apply -f pod.yaml    # Create pod from manifest
$ kubectl get pods    # List pods in current namespace
$ kubectl get pods -l app=inventory    # Filter pods by label
$ kubectl describe pod inventory-api    # Detailed pod info + events
$ kubectl logs inventory-api    # View container logs
$ kubectl logs inventory-api -f    # Stream logs in real-time
$ kubectl exec -it inventory-api -- /bin/sh    # Shell into container
$ kubectl port-forward pod/inventory-api 8080:8080    # Forward local port to pod
$ kubectl delete pod inventory-api    # Delete pod
Multi-Container Pods (Sidecars)
SIDECAR.YAML
apiVersion: v1
kind: Pod
metadata:
  name: web-with-logger
spec:
  containers:
    - name: web
      image: myregistry/web-app:v1
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-shipper
      image: fluent/fluentd:v1.14
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
  volumes:
    - name: logs
      emptyDir: {}
⚠️ Important

Pods are ephemeral: they can be killed and recreated at any time. Avoid deploying bare pods for production workloads, because nothing recreates them if they are deleted or their node fails. Use Deployments or StatefulSets instead.

05♾️

Deployments & ReplicaSets

Self-Healing & Rolling Updates

ReplicaSets ensure a specified number of pod replicas are always running. Deployments wrap ReplicaSets and add rollout management, enabling zero-downtime updates and easy rollbacks.
Why Not Just Pods?
🔄
Self-Healing
If a pod dies, the ReplicaSet creates a replacement automatically.
📈
Scaling
Increase replicas from 2 to 10 with one command.
🔀
Load Distribution
Traffic spreads across replicas via Services.
⏪
Rollbacks
The Deployment keeps a revision history (10 revisions by default), so you can roll back to any retained version.
↳ Deployment → ReplicaSet → Pods
Deployment (manages rollouts) → ReplicaSet (maintains replicas) → Pods (run containers)
Deployment Manifest
DEPLOYMENT.YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: catalog-service
  annotations:
    kubernetes.io/change-cause: "Initial release v1.0"
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: catalog
  template:
    metadata:
      labels:
        app: catalog
    spec:
      containers:
        - name: catalog
          image: myregistry/catalog:v1.0
          ports:
            - containerPort: 8080
Rollout Commands
$ kubectl rollout status deployment/catalog-service    # Watch rollout progress
$ kubectl rollout history deployment/catalog-service    # View revision history
$ kubectl rollout undo deployment/catalog-service    # Rollback to previous
$ kubectl rollout undo deployment/catalog-service --to-revision=2    # Rollback to specific revision
$ kubectl scale deployment catalog-service --replicas=5    # Scale replicas
⚡ Pro Tip

Use kubernetes.io/change-cause annotation to document changes. It appears in rollout history and helps teams understand what changed in each revision.
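Manual scaling with kubectl scale can be complemented by the CPU-based auto scaling listed under Key Features. The following is a minimal sketch of a HorizontalPodAutoscaler for the catalog-service Deployment; the object name, replica bounds, and 70% target are illustrative assumptions. It also assumes metrics-server is installed and that the containers declare CPU requests, which the Deployment manifest above does not yet do.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: catalog-service-hpa        # hypothetical name
spec:
  scaleTargetRef:                  # the workload whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: catalog-service
  minReplicas: 3                   # illustrative bounds, not tuned values
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the CPU request, averaged over pods
```

Utilization is measured against the container CPU request, so without requests the autoscaler has nothing to compute a percentage from.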

06🌐

Kubernetes Services

Stable Networking & Load Balancing

Pod IPs are ephemeral: a pod gets a new IP whenever it is recreated. Services provide stable endpoints that abstract away pod IPs, enabling reliable communication and automatic load balancing across replicas.
Service Types
Type | Scope | Use Case
ClusterIP | Internal only | Default. Inter-service communication within cluster.
NodePort | External via node | Opens one port from the 30000-32767 range (default) on all nodes. Development use.
LoadBalancer | External via cloud LB | Provisions cloud load balancer. Production external access.
ExternalName | DNS alias | Maps service to external DNS name.
ClusterIP Service
SERVICE.YAML
apiVersion: v1
kind: Service
metadata:
  name: catalog-service
spec:
  type: ClusterIP
  selector:
    app: catalog
  ports:
    - port: 80
      targetPort: 8080
port is what clients connect to (catalog-service:80). targetPort is the container port (8080). The selector matches pod labels to know which pods to route to.
Service Commands
$ kubectl get svc    # List services
$ kubectl describe svc catalog-service    # Service details + endpoints
$ kubectl get endpoints catalog-service    # Pod IPs backing the service
$ kubectl port-forward svc/catalog-service 8080:80    # Local access for debugging
⚠️ Important

NodePort opens the port on every node, which is a security concern in production. Use LoadBalancer or Ingress instead.
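For comparison with the ClusterIP manifest, here is a minimal sketch of the LoadBalancer alternative. The Service name catalog-public is hypothetical, and the manifest assumes a cluster whose cloud provider integration can provision an external load balancer.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: catalog-public       # hypothetical name
spec:
  type: LoadBalancer         # asks the cloud provider for an external load balancer
  selector:
    app: catalog             # same pod labels as the ClusterIP example
  ports:
    - port: 80               # port exposed by the load balancer
      targetPort: 8080       # container port on the pods
```

On a local cluster without such an integration, the external IP typically stays pending.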

07🚪

Kubernetes Ingress

HTTP Routing & TLS

Each LoadBalancer service creates a separate cloud load balancer with its own cost. Ingress consolidates multiple services behind one load balancer with URL-based routing and TLS termination.
Why Ingress?
💰
Cost Efficient
One load balancer for all services instead of one per service.
🔀
Path Routing
Route /api to backend, / to frontend.
🏠
Host Routing
Route shop.example.com to shop, api.example.com to API.
🔒
TLS Termination
Handle HTTPS certificates in one place.
Ingress Manifest
INGRESS.YAML
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shop-ingress
  annotations:
    # Regex paths with capture groups need use-regex on the NGINX controller
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$1
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - shop.example.com
      secretName: shop-tls
  rules:
    - host: shop.example.com
      http:
        paths:
          - path: /api/(.*)
            pathType: ImplementationSpecific   # regex is not a valid Prefix path
            backend:
              service:
                name: shop-api
                port:
                  number: 8080
          - path: /(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: shop-frontend
                port:
                  number: 80
Popular Ingress Controllers
NGINX Ingress, Traefik, HAProxy, AWS ALB, Kong, Istio
⚡ Pro Tip

Use cert-manager to automatically provision and renew TLS certificates from Let's Encrypt.
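As a rough sketch of that setup, assuming cert-manager is already installed in the cluster: the issuer name letsencrypt-prod, the contact email, and the account key Secret name are hypothetical, and older cert-manager releases use class instead of ingressClassName in the solver.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod                     # hypothetical issuer name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com                   # hypothetical contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key     # Secret holding the ACME account key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx          # assumes the NGINX ingress controller
---
# Fragment: annotation to add to the Ingress metadata so cert-manager
# issues a certificate into the Secret named under spec.tls (shop-tls).
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
```

With the annotation in place, cert-manager is expected to create and renew the certificate stored in the Ingress TLS Secret.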

08

Kubernetes Namespaces

Cluster Organization & Isolation

Namespaces are virtual clusters within a physical cluster. They organize resources, isolate teams, enable resource quotas, and allow the same resource names in different namespaces.
Default Namespaces
Namespace | Purpose
default | Resources created without specifying namespace
kube-system | Kubernetes system components (API server, scheduler, etc.)
kube-public | Publicly readable resources, cluster info
kube-node-lease | Node heartbeat leases for health detection
Working with Namespaces
$ kubectl create namespace development    # Create namespace
$ kubectl get namespaces    # List all namespaces
$ kubectl get pods -n development    # List pods in namespace
$ kubectl get pods --all-namespaces    # List pods across all namespaces
$ kubectl config set-context --current --namespace=development    # Set default namespace
Namespace Manifest
NAMESPACE.YAML
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    env: prod
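The resource quotas mentioned above are attached per namespace. A minimal sketch for the production namespace follows; the quota name and all numbers are illustrative assumptions, not recommendations.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota     # hypothetical name
  namespace: production
spec:
  hard:
    pods: "50"               # maximum number of pods in the namespace
    requests.cpu: "10"       # sum of CPU requests across all pods
    requests.memory: 20Gi
    limits.cpu: "20"         # sum of CPU limits across all pods
    limits.memory: 40Gi
```

Once a quota covers CPU or memory, pods that omit the corresponding requests or limits are rejected in that namespace.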
Cross-Namespace Communication
DNS
# Full DNS name for services:
<service-name>.<namespace>.svc.cluster.local

# Examples:
catalog-service.production.svc.cluster.local
payment-api.payments.svc.cluster.local

# Short form (within same cluster):
catalog-service.production
⚠️ Important

Deleting a namespace deletes ALL resources in it. Be extremely careful with kubectl delete namespace in production!

09💾

Kubernetes Volumes

Persistent Storage

Container filesystems are ephemeral: data written inside a container is lost when the container restarts. Volumes outlive container restarts and can be shared between containers in a pod; PersistentVolumes also outlive the pod itself.
Volume Types
Type | Lifetime | Use Case
emptyDir | Pod lifetime | Scratch space, cache shared between containers
hostPath | Node lifetime | Access node filesystem (avoid in production)
configMap/secret | Pod lifetime | Mount config files or secrets
persistentVolumeClaim | Independent | Databases, stateful apps; survives pod/node restarts
PersistentVolumeClaim
PVC.YAML
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi
Using PVC in Pod
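POD-PVC.YAML
apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  containers:
    - name: postgres
      image: postgres:15
      env:
        # The postgres image will not start without a superuser password;
        # db-credentials is the Secret from the ConfigMaps & Secrets section
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: postgres-data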
Access Modes
Mode | Description
ReadWriteOnce (RWO) | Single node read-write
ReadOnlyMany (ROX) | Many nodes read-only
ReadWriteMany (RWX) | Many nodes read-write
⚡ Pro Tip

Use StorageClass with volumeBindingMode: WaitForFirstConsumer to create PVs in the same zone as the pod that needs them.
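A minimal sketch of such a StorageClass follows. The class name zonal-ssd is hypothetical, and the provisioner and parameters assume the AWS EBS CSI driver; other clusters need the provisioner name of their own storage driver.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zonal-ssd                          # hypothetical class name
provisioner: ebs.csi.aws.com               # assumption: AWS EBS CSI driver is installed
parameters:
  type: gp3                                # driver-specific volume type
volumeBindingMode: WaitForFirstConsumer    # delay provisioning until a pod is scheduled
reclaimPolicy: Delete
allowVolumeExpansion: true
```

A PVC selects this class through storageClassName, and the volume is created only after the scheduler has picked a node for the consuming pod.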

10🗄️

StatefulSets

Stateful Application Management

Deployments are for stateless apps where pods are interchangeable. StatefulSets are for stateful apps (databases) that need stable network IDs, ordered deployment, and persistent storage per pod.
StatefulSet vs Deployment
Aspect | Deployment | StatefulSet
Pod Names | Random (catalog-7d9f8b-x2k4j) | Ordered (postgres-0, postgres-1)
Creation Order | Parallel | Sequential (0 → 1 → 2)
Deletion Order | Random | Reverse (2 → 1 → 0)
Storage | Shared PVC | Per-pod PVC
Network ID | Changes on restart | Stable DNS per pod
StatefulSet Manifest
STATEFULSET.YAML
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          env:
            # The postgres image will not start without a superuser password;
            # db-credentials is the Secret from the ConfigMaps & Secrets section
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
Headless Service
HEADLESS.YAML
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - port: 5432
Headless service (clusterIP: None) creates DNS entries for each pod: postgres-0.postgres-headless.default.svc.cluster.local
11🔧

ConfigMaps & Secrets

Configuration Management

ConfigMaps store non-sensitive configuration data. Secrets store sensitive data like passwords and API keys. Both decouple configuration from container images, enabling the same image across environments.
ConfigMap
CONFIGMAP.YAML
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_HOST: "postgres-headless"
  LOG_LEVEL: "info"
  config.yaml: |
    server:
      port: 8080
    cache:
      ttl: 3600
Secret
SECRET.YAML
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: YWRtaW4=            # base64 encoded
  password: cGFzc3dvcmQxMjM=    # base64 encoded
Using in Pods
POD-CONFIG.YAML
apiVersion: v1
kind: Pod
metadata:
  name: api-server
spec:
  containers:
    - name: api
      image: myregistry/api:v1
      envFrom:
        - configMapRef:
            name: app-config
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
      volumeMounts:
        - name: config-volume
          mountPath: /etc/config
  volumes:
    - name: config-volume
      configMap:
        name: app-config
$ echo -n "mypassword" | base64    # Encode value for Secret
$ kubectl create secret generic db-creds --from-literal=password=mypassword    # Create Secret from literal
$ kubectl create configmap app-cfg --from-file=config.yaml    # Create ConfigMap from file
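Manual base64 encoding can be skipped by writing plain values under stringData, which the API server encodes into data when the object is stored. A minimal sketch using the same db-credentials Secret; the values are the decoded forms of the base64 strings in the Secret manifest above and are placeholders only.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:                  # plain text here; stored base64 encoded under data
  username: admin            # decodes from YWRtaW4=
  password: password123      # decodes from cGFzc3dvcmQxMjM= (placeholder, never commit real values)
```

The convenience cuts both ways: a manifest like this holds the password in clear text, so it should not be committed to version control.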
⚠️ Important

Secrets are base64 encoded, not encrypted! Anyone who can read the Secret object (or etcd, unless encryption at rest is enabled) can decode the values. Use external secret managers (Vault, AWS Secrets Manager) for production.

12🩺

Health Probes

Liveness, Readiness & Startup

By default, Kubernetes only checks if the main process is running. Probes let you define custom health checks: detect deadlocks, verify dependencies, and handle slow startups.
Three Probe Types
💓
Liveness
Is the container alive? Failure → restart. Detects deadlocks.
✅
Readiness
Ready for traffic? Failure → removed from Service endpoints. Warmup checks.
🚀
Startup
Has startup completed? Disables other probes until success.
Probe Example
PROBES.YAML
apiVersion: v1
kind: Pod
metadata:
  name: order-service
spec:
  containers:
    - name: api
      image: myregistry/order-api:v2
      ports:
        - containerPort: 8080
      startupProbe:
        httpGet:
          path: /health/startup
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /health/live
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /health/ready
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
Probe Mechanisms
Mechanism | Success Condition | Best For
httpGet | HTTP 2xx-3xx response | Web services
exec | Command exit code 0 | Custom scripts
tcpSocket | TCP connection succeeds | Databases
grpc | gRPC health check | gRPC services
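The probe example above only uses httpGet. As a sketch of the exec and tcpSocket mechanisms, here is a PostgreSQL pod; the pod name and timing values are assumptions, pg_isready is the client utility shipped in the postgres image, and db-credentials is the Secret defined in the ConfigMaps & Secrets section.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: postgres-probed              # hypothetical name
spec:
  containers:
    - name: postgres
      image: postgres:15
      env:
        - name: POSTGRES_PASSWORD    # required by the image to start
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
      ports:
        - containerPort: 5432
      readinessProbe:
        exec:                        # ready when the command exits with code 0
          command: ["pg_isready", "-U", "postgres"]
        periodSeconds: 5
      livenessProbe:
        tcpSocket:                   # alive while the port accepts TCP connections
          port: 5432
        initialDelaySeconds: 15      # illustrative timings, not tuned values
        periodSeconds: 10
```

The cheaper TCP check drives restarts, while the stricter pg_isready check only gates traffic.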
⚠️ Important

Liveness probes should NEVER check external dependencies. A database outage shouldn't restart your app; it should stop the app from receiving traffic (readiness).

13📊

Resource Management

Requests, Limits & QoS

Without resource specs, pods can consume unlimited CPU/memory, starving other workloads. Requests guarantee minimum resources for scheduling. Limits cap maximum usage.
Requests vs Limits
Aspect | Requests | Limits
Purpose | Minimum guaranteed | Maximum allowed
Scheduling | Used by scheduler | Not considered
CPU Exceeded | N/A | Throttled
Memory Exceeded | N/A | OOMKilled
Resource Specification
RESOURCES.YAML
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"      # 0.25 CPU cores
  limits:
    memory: "512Mi"
    cpu: "500m"      # 0.5 CPU cores
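Pods that omit a resource spec can still be given one at the namespace level. The following is a minimal sketch of a LimitRange that applies the values from the specification above as defaults; the object name and the development namespace are illustrative assumptions.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults     # hypothetical name
  namespace: development
spec:
  limits:
    - type: Container
      defaultRequest:          # applied when a container sets no requests
        cpu: 250m
        memory: 256Mi
      default:                 # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
```

The defaults are injected at admission time, so they affect only pods created after the LimitRange exists.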
QoS Classes
Class | Condition | Eviction Priority
Guaranteed | Every container sets CPU and memory with requests == limits | Last
Burstable | At least one request or limit set | Middle
BestEffort | No requests or limits | First
$ kubectl top pods    # View pod resource usage
$ kubectl describe node    # View node allocatable resources
⚡ Pro Tip

Start with requests = typical usage, limits = 2-3x requests. Monitor actual usage with kubectl top and adjust based on real data.
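Where eviction protection matters more than burst headroom, requests and limits can be set equal so the pod lands in the Guaranteed class. A minimal sketch, with a hypothetical pod name and illustrative sizes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-example     # hypothetical name
spec:
  containers:
    - name: api
      image: myregistry/api:v1
      resources:
        requests:              # requests equal limits for both CPU and memory
          cpu: 500m
          memory: 512Mi
        limits:
          cpu: 500m
          memory: 512Mi
```

The assigned class can be checked with kubectl get pod guaranteed-example -o jsonpath='{.status.qosClass}'.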

14

Advanced Scheduling

Affinity, Taints & Tolerations

By default, the scheduler picks any node with enough free resources. For specific requirements such as SSD nodes, co-location, or dedicated node pools, use node selectors, affinity rules, and taints/tolerations.
Scheduling Mechanisms
Mechanism | Purpose
nodeSelector | Simple label matching
Node Affinity | Advanced selection with operators
Pod Affinity | Co-locate with other pods
Pod Anti-Affinity | Spread away from pods
Taints/Tolerations | Repel pods from nodes
Node Selector
NODESELECTOR.YAML
spec:
  nodeSelector:
    disk: ssd
    gpu: "true"
Node Affinity
AFFINITY.YAML
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: zone
              operator: In
              values: [us-east-1a, us-east-1b]
Taints & Tolerations
$ kubectl taint nodes gpu-node dedicated=gpu:NoSchedule    # Add taint
$ kubectl taint nodes gpu-node dedicated=gpu:NoSchedule-    # Remove taint
TOLERATION.YAML
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
⚡ Pro Tip

Use pod anti-affinity with topologyKey: kubernetes.io/hostname to spread replicas across nodes for high availability.
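A minimal sketch of that tip as a Deployment fragment, assuming the pods carry the app: catalog label used in the Deployment section. It uses the preferred (soft) form so pods can still schedule when there are fewer nodes than replicas; the weight is an illustrative value.

```yaml
# Fragment of a Deployment spec, not a complete manifest
spec:
  template:
    metadata:
      labels:
        app: catalog
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100                            # strongest preference
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: catalog                     # avoid nodes already running these pods
                topologyKey: kubernetes.io/hostname  # one topology domain per node
```

Switching to requiredDuringSchedulingIgnoredDuringExecution makes the rule hard, at the cost of leaving replicas pending when no eligible node is free.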

15

RBAC Security

Role-Based Access Control

Without authorization rules, anyone with cluster credentials could do anything. RBAC defines who (subjects) can perform what actions (verbs) on which resources. Essential for multi-tenant clusters and compliance.
RBAC Components
👀
Subjects
Who: Users, Groups, ServiceAccounts
📋
Roles
What: Permissions (verbs + resources)
🔗
Bindings
Connect subjects to roles
Role vs ClusterRole
Type | Scope
Role + RoleBinding | Single namespace
ClusterRole + ClusterRoleBinding | Entire cluster
Role Example
ROLE.YAML
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: development
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
RoleBinding
ROLEBINDING.YAML
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-pod-reader
  namespace: development
subjects:
  - kind: User
    name: alice
    apiGroup: rbac.authorization.k8s.io
  - kind: ServiceAccount
    name: monitoring
    namespace: monitoring
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
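The cluster-wide pair from the Role vs ClusterRole table follows the same shape. A minimal sketch that grants the monitoring ServiceAccount read access to nodes, which are cluster-scoped and so cannot be covered by a namespaced Role; the names node-reader and monitoring-node-reader are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader                # hypothetical name; no namespace, cluster-scoped
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: monitoring-node-reader     # hypothetical name
subjects:
  - kind: ServiceAccount
    name: monitoring
    namespace: monitoring
roleRef:
  kind: ClusterRole
  name: node-reader
  apiGroup: rbac.authorization.k8s.io
```

The read-only verbs keep the grant within the least-privilege guidance at the end of this section.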
Testing Permissions
$ kubectl auth can-i create pods    # Check current user permission
$ kubectl auth can-i delete pods --as=alice    # Check as another user
$ kubectl auth can-i --list    # List all permissions
✓ Follow least privilege: grant minimum permissions needed
✓ Never use cluster-admin for applications
✓ Avoid wildcards (*) in verbs and resources
✓ Use namespaces to isolate teams