Kubernetes Tutorial: Step-by-Step Production Deployment

Follow this Kubernetes tutorial step by step to deploy, manage, and scale applications reliably in production.

Ahmet Çelik


Most teams begin their Kubernetes journey with `kubectl apply` on basic YAML manifests. But this foundational approach often overlooks critical considerations for resilience and scalability, leading to deployment inconsistencies, prolonged downtime, and operational fatigue in dynamic production environments.


TL;DR

  • Achieve idempotent, zero-downtime application deployments by leveraging native Kubernetes `Deployment` resources with proper rolling update strategies.
  • Externalize configuration using `ConfigMaps` and `Secrets` and manage their lifecycle independently of application code.
  • Expose services reliably using `Service` objects and efficiently route external traffic with `Ingress` controllers.
  • Implement robust `liveness` and `readiness` probes to ensure application health and prevent traffic from being directed to unhealthy instances.
  • Prioritize comprehensive monitoring, alerting, cost optimization, and strong security practices for stable production Kubernetes operations.


The Problem: Beyond Basic `kubectl apply`


Relying solely on `kubectl apply -f` for every change, without a deep understanding of Kubernetes primitives, frequently results in brittle deployments. An application update might fail silently, leaving users with a mixed-version experience or, worse, a complete outage. This impacts critical business functions and degrades key metrics such as SLIs and SLOs. Teams commonly report a 15-20% increase in Mean Time To Recovery (MTTR) when deployments lack proper health checks and rollback mechanisms. A robust, step-by-step Kubernetes guide is essential for engineers moving applications into a resilient production environment.


Consider a scenario where a critical microservice experiences a memory leak. Without proper resource requests and limits, this rogue service can exhaust node resources, causing cascading failures across other applications on the same node. Similarly, a poorly configured readiness probe might direct traffic to an application instance still initializing, leading to user-facing errors.
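Namespace-level guardrails contain the resource-exhaustion scenario above. Below is a minimal sketch of a `LimitRange` that assigns default requests and limits to any container that omits them; the object name and the specific values are illustrative assumptions, not prescriptions:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-limits   # hypothetical name
  namespace: backend-app-2026
spec:
  limits:
    - type: Container
      defaultRequest:   # applied when a container omits resources.requests
        memory: "128Mi"
        cpu: "250m"
      default:          # applied when a container omits resources.limits
        memory: "512Mi"
        cpu: "500m"
```

With this in place, a leaking container is OOM-killed at its own limit rather than starving neighboring workloads on the node.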


How It Works: Declarative Deployments and Service Exposure


Kubernetes operates on a declarative model. You describe the desired state of your application and the cluster works to achieve and maintain that state. This is fundamental for reliable, repeatable deployments.


Kubernetes Deployment Strategies for Resilience


The `Deployment` resource is the cornerstone for managing stateless applications in Kubernetes. It provides declarative updates for Pods and ReplicaSets. When you define a `Deployment`, you specify details like the container image, replica count, resource requirements, and importantly, strategies for updating instances.


Rolling updates are the default and preferred method for `Deployment` updates. This strategy ensures zero-downtime by incrementally updating Pods. New Pods are brought up with the updated configuration, and traffic is gradually shifted. Only once new Pods are healthy are old Pods terminated. This process is managed by the `Deployment` controller and relies heavily on `liveness` and `readiness` probes.


  • Liveness Probe: Determines if the application inside the container is running correctly. If this probe fails, Kubernetes restarts the container.

  • Readiness Probe: Determines if the application is ready to serve traffic. If this probe fails, Kubernetes removes the Pod from the Service's endpoints, preventing traffic from being routed to it until it becomes ready. This is crucial for applications with long startup times or dependencies that need to initialize.
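For applications whose initialization time varies widely, a `startupProbe` can be layered on top of these two checks: liveness and readiness probes are suspended until the startup probe succeeds, so a slow boot is not mistaken for a crash. A sketch, to be placed under a container spec (the `/healthz` path and thresholds are assumptions):

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30 # allow up to 30 * 5s = 150s for startup
  periodSeconds: 5
```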


This manifest defines a Deployment for a backend API service in 2026.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service-2026
  namespace: backend-app-2026
  labels:
    app: api-service
spec:
  replicas: 3 # Maintain 3 instances of the API service
  selector:
    matchLabels:
      app: api-service
  strategy:
    type: RollingUpdate # Default strategy for zero-downtime updates
    rollingUpdate:
      maxUnavailable: 25% # Max percentage of Pods that can be unavailable during the update
      maxSurge: 25% # Max percentage of Pods that can be created above the desired number
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
        - name: api-container
          image: backendstack/api-service:1.0.0-20260415
          ports:
            - containerPort: 8080
          envFrom:
            - configMapRef:
                name: api-config-2026 # Reference environment variables from a ConfigMap
          resources:
            requests: # Define minimum resources required
              memory: "128Mi"
              cpu: "250m"
            limits: # Define maximum resources allowed
              memory: "512Mi"
              cpu: "1000m"
          livenessProbe: # Check if the application is still running and responsive
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15 # Give the application time to start
            periodSeconds: 10
          readinessProbe: # Check if the application is ready to accept traffic
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3 # Mark as unready after 3 consecutive failures
```
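The Deployment above pulls environment variables from `api-config-2026` via `envFrom`. A sketch of that ConfigMap follows; the keys shown are assumptions for illustration, and anything sensitive (credentials, tokens) belongs in a `Secret` instead:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config-2026
  namespace: backend-app-2026
data:
  LOG_LEVEL: "info"          # hypothetical key; each entry becomes an env var
  REQUEST_TIMEOUT_MS: "5000" # hypothetical key
```

Because the ConfigMap is a separate object, its values can be changed and rolled out independently of the application image.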


Service and Ingress for Application Exposure


Once your application Pods are running, you need a way to expose them, both internally within the cluster and externally to clients.


  • Service: An abstract way to expose an application running on a set of Pods as a network service. Services use labels to identify which Pods they should route traffic to. Common types include `ClusterIP` (internal only), `NodePort` (exposes the service on each Node's IP at a static port), and `LoadBalancer` (provisions an external load balancer in cloud environments).

  • Ingress: Manages external access to the services in a cluster, typically HTTP/S. Ingress can provide load balancing, SSL termination, and name-based virtual hosting. It requires an Ingress Controller (e.g., Nginx Ingress, AWS ALB Ingress Controller) to be running in the cluster.


The interaction between `Service` and `Ingress` is critical. An `Ingress` resource defines the routing rules (host, path) and which `Service` it should forward traffic to. The `Service` then load-balances that traffic across the healthy `Pods` it targets. This decoupled architecture allows for flexible traffic management without modifying individual application Pods.


This manifest defines a ClusterIP Service for the API service.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-service-2026
  namespace: backend-app-2026
spec:
  selector:
    app: api-service # Selects Pods with the label app: api-service
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080 # Routes traffic from Service port 80 to container port 8080
  type: ClusterIP # Exposes the service internally within the cluster
```
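To route external HTTP traffic to this `ClusterIP` Service, an `Ingress` rule maps a host and path to it. A sketch assuming an NGINX Ingress Controller is installed in the cluster; the object name and hostname are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress-2026 # hypothetical name
  namespace: backend-app-2026
spec:
  ingressClassName: nginx # assumes an NGINX Ingress Controller is running
  rules:
    - host: api.example.com # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service-2026
                port:
                  number: 80 # matches the Service port, not the container port
```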

WRITTEN BY

Ahmet Çelik

Former AWS Solutions Architect, 8 years in cloud and infrastructure. Computer Engineering graduate, Bilkent University. Lead writer for AWS, Terraform and Kubernetes content.
