Kubernetes

The industry-standard container orchestration platform for automating deployment, scaling, and management of containerized workloads across clusters of machines.

Overview

Kubernetes (K8s) is a production-grade container orchestration system originally developed by Google and now maintained by the Cloud Native Computing Foundation (CNCF). It implements a desired-state model where controllers continuously reconcile actual state with declared intent. Kubernetes manages the full lifecycle of containerized applications: scheduling, scaling, networking, storage, and self-healing.
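
The desired-state model can be illustrated with a toy reconciliation pass. This is a minimal sketch, not how any real controller is implemented: `ClusterState`, `reconcile`, and the pod naming scheme are hypothetical names chosen for illustration, and real controllers react to watch events from the API server rather than editing an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Toy stand-in for actual cluster state: pod names per workload."""
    pods: dict = field(default_factory=dict)


def reconcile(desired: dict, actual: ClusterState) -> list:
    """Run one reconciliation pass and return the actions taken.

    `desired` maps a workload name to its declared replica count.
    """
    actions = []
    for name, want in desired.items():
        running = actual.pods.setdefault(name, [])
        # Too few replicas: create pods until the count matches.
        while len(running) < want:
            pod = f"{name}-{len(running)}"
            running.append(pod)
            actions.append(f"create {pod}")
        # Too many replicas: delete the newest pods first.
        while len(running) > want:
            actions.append(f"delete {running.pop()}")
    return actions


desired = {"web": 3}
state = ClusterState(pods={"web": ["web-0"]})

print(reconcile(desired, state))  # first pass creates the missing pods
state.pods["web"].pop()           # simulate a pod crash
print(reconcile(desired, state))  # next pass replaces the lost pod
print(reconcile(desired, state))  # converged: no actions needed
```

The point of the sketch is that the caller never issues "create" or "delete" commands directly. It only declares the target count, and repeated passes converge on it, which is also what makes self-healing fall out of the same loop.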

Repository & Community

Attribute	Detail
Repository	github.com/kubernetes/kubernetes
Stars	115k+ ⭐
Latest Stable	v1.35.3 (April 2026); v1.36 due April 22, 2026
Language	Go
License	Apache 2.0
Governance	CNCF (Linux Foundation)
Contributors	9,000+

Evaluation

Why it's better: Kubernetes is cloud-agnostic and configured declaratively, and it provides self-healing, horizontal pod autoscaling, service discovery, and rolling updates out of the box. It is backed by a very large ecosystem (the CNCF landscape) and is the dominant industry standard for container orchestration.
When it fits (Applicability):
Microservices at scale across multiple nodes
CI/CD with automated rollouts and rollbacks
Multi-cloud / hybrid cloud portability
Stateful workloads with persistent volumes
AI/ML training and inference pipelines
Edge deployments (K3s, MicroK8s)
Pros and Cons:

Pros	Cons
Cloud-agnostic, runs anywhere	Steep learning curve
Self-healing, auto-scaling	Complex networking (CNI plugins)
Massive CNCF ecosystem	Control plane overhead for small workloads
Declarative desired-state model	YAML verbosity
Service mesh, Ingress, Gateway API	Security hardening requires expertise
GPU/DRA scheduling for AI/ML	etcd operational complexity
Every major cloud offers managed K8s	Not ideal for traditional VM workloads

Architecture

flowchart TB
    subgraph ControlPlane["Control Plane"]
        API["kube-apiserver\n(REST API)"]
        ETCD["etcd\n(distributed KV store)"]
        Sched["kube-scheduler\n(pod placement)"]
        CM["kube-controller-manager\n(reconciliation loops)"]
        CCM["cloud-controller-manager\n(cloud API integration)"]
    end

    subgraph WorkerNode["Worker Node"]
        Kubelet["kubelet\n(pod lifecycle)"]
        KProxy["kube-proxy\n(Service networking)"]
        CRI["Container Runtime\n(containerd / CRI-O)"]
        Pods["Pods\n(application containers)"]
    end

    API <-->|"read/write (gRPC)"| ETCD
    Sched -->|"bind pod"| API
    CM -->|"reconcile"| API
    CCM -->|"cloud ops"| API
    Kubelet -->|"status"| API
    API -->|"spec"| Kubelet
    Kubelet -->|"CRI"| CRI
    CRI --> Pods
    KProxy -->|"iptables/IPVS"| Pods

    style ControlPlane fill:#326ce5,color:#fff
    style WorkerNode fill:#1565c0,color:#fff
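
The "pod placement" step in the diagram can be sketched as a filter phase followed by a score phase. This is a simplified illustration under stated assumptions: `Node`, `Pod`, `feasible`, and `pick_node` are hypothetical names, taints are reduced to bare keys, only CPU is considered, and the scoring rule (most CPU left over) is just one possible policy, not the scheduler's actual default configuration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    name: str
    free_cpu_m: int                                 # unallocated CPU in millicores
    taints: Set[str] = field(default_factory=set)   # simplified: taint keys only


@dataclass
class Pod:
    name: str
    cpu_request_m: int
    tolerations: Set[str] = field(default_factory=set)


def feasible(pod: Pod, node: Node) -> bool:
    """Filter phase: the node needs enough CPU and no untolerated taints."""
    return node.free_cpu_m >= pod.cpu_request_m and node.taints <= pod.tolerations


def pick_node(pod: Pod, nodes: List[Node]) -> Optional[str]:
    """Score phase: among feasible nodes, prefer the most CPU left over."""
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None  # no feasible node: the pod would remain Pending
    best = max(candidates, key=lambda n: n.free_cpu_m - pod.cpu_request_m)
    return best.name


nodes = [
    Node("node-a", free_cpu_m=500),
    Node("node-b", free_cpu_m=2000, taints={"gpu"}),
    Node("node-c", free_cpu_m=1500),
]
print(pick_node(Pod("web", cpu_request_m=1000), nodes))
print(pick_node(Pod("train", cpu_request_m=1000, tolerations={"gpu"}), nodes))
```

In the real system the chosen node is not contacted directly. The scheduler records the decision by binding the pod through the API server, and the kubelet on that node picks it up from there, as the arrows in the diagram show.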

Key Features

Feature	Detail
Pod Scheduling	Affinity, anti-affinity, taints, tolerations, topology spread
Auto-Scaling	HPA (horizontal), VPA (vertical), Cluster Autoscaler
Service Discovery	ClusterIP, NodePort, LoadBalancer, ExternalName
Ingress / Gateway API	L7 traffic routing, TLS termination
Storage	PV, PVC, CSI drivers, StorageClasses
ConfigMaps / Secrets	Externalized configuration and credentials
RBAC	Fine-grained role-based access control
Namespaces	Logical cluster partitioning
DRA (v1.36)	Dynamic Resource Allocation for GPUs, FPGAs
Custom Resources	Extend API with CRDs + Operators
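
For the HPA row above, the core scaling rule documented for the Horizontal Pod Autoscaler is `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below applies that rule. The function name and the `min_replicas` / `max_replicas` defaults are illustrative assumptions, and it leaves out what the real controller also handles, such as missing metrics, unready pods, and stabilization windows.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric,
                         min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the replica count suggested by the basic HPA ratio rule."""
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas at 90% CPU against a 60% target: scale out.
print(hpa_desired_replicas(4, current_metric=90, target_metric=60))
# 4 replicas at 20% CPU against a 60% target: scale in.
print(hpa_desired_replicas(4, current_metric=20, target_metric=60))
# 4 replicas at 63% CPU against a 60% target: within tolerance, no change.
print(hpa_desired_replicas(4, current_metric=63, target_metric=60))
```

Rounding up with `ceil` biases the rule toward having slightly too much capacity rather than too little, which is why the scale-in case lands on 2 replicas rather than 1.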

Key Ecosystem

Category	Tools
Managed K8s	EKS, GKE, AKS, DOKS, OKE, Linode LKE
Lightweight	K3s, MicroK8s, Kind, Minikube
Networking	Calico, Cilium, Flannel, Antrea
Service Mesh	Istio, Linkerd, Consul Connect
GitOps	ArgoCD, Flux
Observability	Prometheus, Grafana, OpenTelemetry
Security	Falco, OPA/Gatekeeper, Trivy, Kyverno

Pricing

Offering	Cost	Notes
Self-hosted	Free (Apache 2.0)	You manage everything
AWS EKS	$0.10/hr/cluster + node costs	Managed control plane
GKE Autopilot	$0.10/hr/cluster + pod costs	Fully managed
Azure AKS	Free tier: no control plane fee + node costs	Managed; paid tiers add an uptime SLA
Enterprise	Various (Rancher, OpenShift, Tanzu)	Support + add-ons
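
To put the hourly control plane fees in the table into monthly terms, a small worked calculation helps. This is a rough sketch: the 730-hour month is an averaging assumption (8,760 hours / 12), the function name is hypothetical, and node or pod costs, which usually dominate the bill, are deliberately excluded.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8,760 hours / 12


def monthly_control_plane_fee(hourly_fee_usd, clusters=1, hours=HOURS_PER_MONTH):
    """Return the control plane fee in USD per month, excluding node costs."""
    return round(hourly_fee_usd * clusters * hours, 2)


# One cluster at $0.10/hr (the EKS and GKE Autopilot rate in the table).
print(monthly_control_plane_fee(0.10))
# Three clusters (for example dev, staging, prod) at the same rate.
print(monthly_control_plane_fee(0.10, clusters=3))
# Three clusters on a tier with no control plane fee.
print(monthly_control_plane_fee(0.0, clusters=3))
```

At $0.10 per hour a single cluster costs about $73 per month in control plane fees, so the fee matters mostly when many small clusters are run rather than a few large ones.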

Compatibility

Dimension	Support
Container runtimes	containerd (default), CRI-O
Node OS	Linux (primary), Windows (worker nodes only)
CPU architecture	amd64, arm64, arm/v7, s390x, ppc64le
Storage	CSI (Ceph, EBS, GCE PD, Azure Disk, NFS, etc.)
Networking	CNI plugins (Calico, Cilium, Flannel, etc.)
Infrastructure	Bare metal, VMs, any cloud, edge

Scale Limits (Upstream)

Dimension	Limit
Nodes per cluster	5,000
Pods per node	110 (default)
Pods per cluster	150,000
Services per cluster	10,000
Namespaces per cluster	10,000
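
The limits in the table interact: 5,000 nodes at 110 pods each would allow 550,000 pods, but the cluster-wide cap is 150,000, which works out to an average of 30 pods per node at maximum node count. A capacity plan can be checked against the table with a short script. This is an illustrative sketch: `LIMITS` simply copies the values above, and `check_plan` is a hypothetical helper, not part of any Kubernetes tooling.

```python
# Values copied from the table above.
LIMITS = {
    "nodes": 5_000,
    "pods_per_node": 110,
    "pods": 150_000,
    "services": 10_000,
    "namespaces": 10_000,
}


def check_plan(nodes, pods, services=0, namespaces=0):
    """Return a list of human-readable problems; empty means within limits."""
    problems = []
    planned = {"nodes": nodes, "pods": pods,
               "services": services, "namespaces": namespaces}
    for key, value in planned.items():
        if value > LIMITS[key]:
            problems.append(f"{key}: {value} exceeds {LIMITS[key]}")
    # Even under the cluster-wide cap, pods must fit on the available nodes.
    capacity = nodes * LIMITS["pods_per_node"]
    if pods > capacity:
        problems.append(f"pods: {pods} exceeds per-node capacity of {capacity}")
    return problems


print(LIMITS["pods"] // LIMITS["nodes"])   # average pods per node at full scale
print(check_plan(nodes=5_000, pods=150_000))
print(check_plan(nodes=100, pods=12_000))
```

The second check matters for small clusters: 100 nodes hold at most 11,000 pods at the default density, so a 12,000-pod plan fails long before any cluster-wide limit is reached.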