Kubernetes Tutorial: Step-by-Step Production Deployment

Follow this Kubernetes tutorial step by step to deploy, manage, and scale applications reliably in production.

Ahmet Çelik


Most teams begin their Kubernetes journey with `kubectl apply` on basic YAML manifests. But this foundational approach often overlooks critical considerations for resilience and scalability, leading to deployment inconsistencies, prolonged downtime, and operational fatigue in dynamic production environments.


TL;DR

  • Achieve idempotent, zero-downtime application deployments by leveraging native Kubernetes `Deployment` resources with proper rolling update strategies.
  • Externalize configuration using `ConfigMaps` and `Secrets` and manage their lifecycle independently of application code.
  • Expose services reliably using `Service` objects and efficiently route external traffic with `Ingress` controllers.
  • Implement robust `liveness` and `readiness` probes to ensure application health and prevent traffic from being directed to unhealthy instances.
  • Prioritize comprehensive monitoring, alerting, cost optimization, and strong security practices for stable production Kubernetes operations.


The Problem: Beyond Basic `kubectl apply`


Relying solely on `kubectl apply -f` for every change, without a deep understanding of Kubernetes primitives, frequently results in brittle deployments. An application update might fail silently, leaving users with a mixed-version experience or, worse, a complete outage. This impacts critical business functions and degrades key metrics such as SLIs and SLOs. Teams commonly report a 15-20% increase in Mean Time To Recovery (MTTR) when deployments lack proper health checks and rollback mechanisms. A robust, step-by-step Kubernetes guide is essential for engineers moving applications into a resilient production environment.


Consider a scenario where a critical microservice experiences a memory leak. Without proper resource requests and limits, this rogue service can exhaust node resources, causing cascading failures across other applications on the same node. Similarly, a poorly configured readiness probe might direct traffic to an application instance still initializing, leading to user-facing errors.
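Namespace-level guardrails contain the resource-exhaustion scenario above. Below is a minimal sketch of a `LimitRange` that assigns default requests and limits to any container that omits them; the object name and the specific values are illustrative assumptions, not prescriptions:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-limits   # hypothetical name
  namespace: backend-app-2026
spec:
  limits:
    - type: Container
      defaultRequest:   # applied when a container omits resources.requests
        memory: "128Mi"
        cpu: "250m"
      default:          # applied when a container omits resources.limits
        memory: "512Mi"
        cpu: "500m"
```

With this in place, a leaking container is OOM-killed at its own limit rather than starving neighboring workloads on the node.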


How It Works: Declarative Deployments and Service Exposure


Kubernetes operates on a declarative model. You describe the desired state of your application and the cluster works to achieve and maintain that state. This is fundamental for reliable, repeatable deployments.


Kubernetes Deployment Strategies for Resilience


The `Deployment` resource is the cornerstone for managing stateless applications in Kubernetes. It provides declarative updates for Pods and ReplicaSets. When you define a `Deployment`, you specify details like the container image, replica count, resource requirements, and importantly, strategies for updating instances.


Rolling updates are the default and preferred method for `Deployment` updates. This strategy ensures zero-downtime by incrementally updating Pods. New Pods are brought up with the updated configuration, and traffic is gradually shifted. Only once new Pods are healthy are old Pods terminated. This process is managed by the `Deployment` controller and relies heavily on `liveness` and `readiness` probes.


  • Liveness Probe: Determines if the application inside the container is running correctly. If this probe fails, Kubernetes restarts the container.

  • Readiness Probe: Determines if the application is ready to serve traffic. If this probe fails, Kubernetes removes the Pod from the Service's endpoints, preventing traffic from being routed to it until it becomes ready. This is crucial for applications with long startup times or dependencies that need to initialize.
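For applications whose initialization time varies widely, a `startupProbe` can be layered on top of these two checks: liveness and readiness probes are suspended until the startup probe succeeds, so a slow boot is not mistaken for a crash. A sketch, to be placed under a container spec (the `/healthz` path and thresholds are assumptions):

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30 # allow up to 30 * 5s = 150s for startup
  periodSeconds: 5
```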


This manifest defines a Deployment for a backend API service in 2026.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service-2026
  namespace: backend-app-2026
  labels:
    app: api-service
spec:
  replicas: 3 # Maintain 3 instances of the API service
  selector:
    matchLabels:
      app: api-service
  strategy:
    type: RollingUpdate # Default strategy for zero-downtime updates
    rollingUpdate:
      maxUnavailable: 25% # Max percentage of Pods that can be unavailable during the update
      maxSurge: 25% # Max percentage of Pods that can be created above the desired number
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
        - name: api-container
          image: backendstack/api-service:1.0.0-20260415
          ports:
            - containerPort: 8080
          envFrom:
            - configMapRef:
                name: api-config-2026 # Reference environment variables from a ConfigMap
          resources:
            requests: # Define minimum resources required
              memory: "128Mi"
              cpu: "250m"
            limits: # Define maximum resources allowed
              memory: "512Mi"
              cpu: "1000m"
          livenessProbe: # Check if the application is still running and responsive
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15 # Give the application time to start
            periodSeconds: 10
          readinessProbe: # Check if the application is ready to accept traffic
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3 # Mark as unready after 3 consecutive failures
```
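The Deployment above pulls environment variables from `api-config-2026` via `envFrom`. A sketch of that ConfigMap follows; the keys shown are assumptions for illustration, and anything sensitive (credentials, tokens) belongs in a `Secret` instead:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config-2026
  namespace: backend-app-2026
data:
  LOG_LEVEL: "info"          # hypothetical key; each entry becomes an env var
  REQUEST_TIMEOUT_MS: "5000" # hypothetical key
```

Because the ConfigMap is a separate object, its values can be changed and rolled out independently of the application image.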


Service and Ingress for Application Exposure


Once your application Pods are running, you need a way to expose them, both internally within the cluster and externally to clients.


  • Service: An abstract way to expose an application running on a set of Pods as a network service. Services use labels to identify which Pods they should route traffic to. Common types include `ClusterIP` (internal only), `NodePort` (exposes the service on each Node's IP at a static port), and `LoadBalancer` (provisions an external load balancer in cloud environments).

  • Ingress: Manages external access to the services in a cluster, typically HTTP/S. Ingress can provide load balancing, SSL termination, and name-based virtual hosting. It requires an Ingress Controller (e.g., Nginx Ingress, AWS ALB Ingress Controller) to be running in the cluster.


The interaction between `Service` and `Ingress` is critical. An `Ingress` resource defines the routing rules (host, path) and which `Service` it should forward traffic to. The `Service` then load-balances that traffic across the healthy `Pods` it targets. This decoupled architecture allows for flexible traffic management without modifying individual application Pods.


This manifest defines a ClusterIP Service for the API service.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-service-2026
  namespace: backend-app-2026
spec:
  selector:
    app: api-service # Selects Pods with the label app: api-service
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080 # Routes traffic from Service port 80 to container port 8080
  type: ClusterIP # Exposes the service internally within the cluster
```
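To route external HTTP traffic to this `ClusterIP` Service, an `Ingress` rule maps a host and path to it. A sketch assuming an NGINX Ingress Controller is installed in the cluster; the object name and hostname are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress-2026 # hypothetical name
  namespace: backend-app-2026
spec:
  ingressClassName: nginx # assumes an NGINX Ingress Controller is running
  rules:
    - host: api.example.com # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service-2026
                port:
                  number: 80 # matches the Service port, not the container port
```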

WRITTEN BY

Ahmet Çelik

Former AWS Solutions Architect, 8 years in cloud and infrastructure. Computer Engineering graduate, Bilkent University. Lead writer for AWS, Terraform and Kubernetes content.
