K8s by Example: Horizontal Pod Autoscaler

HPA scales Pods based on CPU, memory, or custom metrics, adjusting replicas automatically to maintain a target utilization. Resource metrics (CPU, memory) require metrics-server in the cluster; custom and external metrics require a metrics adapter.

hpa.yaml

HPA uses the autoscaling/v2 API. scaleTargetRef points to the Deployment to scale. HPA adjusts replicas between min and max based on metrics.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
pod-resources.yaml

HPA calculates utilization as current/request. Requires resource requests to be defined. Without requests, HPA cannot determine utilization percentage.

spec:
  containers:
    - name: app
      resources:
        requests:
          cpu: 200m
          memory: 256Mi
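The utilization math above follows the documented HPA algorithm: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). A minimal sketch (the function name is illustrative, not part of any Kubernetes API):

```python
import math

def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Sketch of the documented HPA formula:
    desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# 3 Pods averaging 210m CPU against a 200m request = 105% utilization,
# against a 70% target: 3 * 105/70 = 4.5, rounded up.
print(desired_replicas(3, 105, 70))  # → 5
```

When current utilization equals the target, the ratio is 1 and the replica count is unchanged; the real controller also applies a small tolerance band around the target before acting.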
hpa-multi-metric.yaml

Combine CPU and memory metrics. HPA takes the max of calculated replicas from all metrics. Good for workloads that are both CPU and memory bound.

spec:
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
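With multiple metrics, HPA computes a desired replica count per metric and uses the highest. A sketch of that selection, assuming the per-metric formula above (function names are illustrative):

```python
import math

def desired_replicas(current, current_value, target_value):
    """Per-metric HPA formula: ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current * current_value / target_value)

def combined_desired(current, metrics):
    """HPA evaluates each metric independently and takes the max
    of the resulting desired replica counts (sketch)."""
    return max(desired_replicas(current, cur, tgt) for cur, tgt in metrics)

# CPU at 90% vs. a 70% target, memory at 60% vs. an 80% target:
# CPU wants ceil(4 * 90/70) = 6, memory wants ceil(4 * 60/80) = 3.
print(combined_desired(4, [(90, 70), (60, 80)]))  # → 6
```

Taking the max means any single saturated resource can trigger a scale-up, which is why this pattern suits workloads bound by either CPU or memory.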
hpa-behavior.yaml

Control scaling behavior: scale up fast, scale down slowly. stabilizationWindowSeconds prevents flapping during traffic spikes. Each policy caps the change allowed per periodSeconds; selectPolicy: Max applies whichever policy permits the larger change.

spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
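The policies above can be worked through numerically. A sketch of the rate limits they impose, assuming the config shown (rounding details are illustrative, not the controller's exact code):

```python
import math

def max_after_scale_up(current, percent=100, pods=4):
    """selectPolicy: Max picks the more permissive policy per 15s window:
    add 100% of current replicas, or add 4 Pods, whichever is larger."""
    return current + max(math.ceil(current * percent / 100), pods)

def min_after_scale_down(current, percent=10):
    """scaleDown: remove at most 10% of current replicas per 60s window."""
    return current - math.floor(current * percent / 100)

# From 2 replicas, scale-up allows max(2, 4) = 4 new Pods in one window.
print(max_after_scale_up(2))     # → 6
# From 10 replicas, scale-down removes at most 1 Pod per minute.
print(min_after_scale_down(10))  # → 9
```

At small replica counts the Pods policy dominates scale-up (doubling 2 replicas only adds 2); at large counts the Percent policy takes over.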
hpa-custom.yaml

Custom metrics from Prometheus or other sources. Requires metrics adapter (prometheus-adapter, KEDA). Scale on queue depth, requests per second, or any application metric.

spec:
  metrics:
    - type: Pods
      pods:
        metric:
          name: requests_per_second
        target:
          type: AverageValue
          averageValue: 1000
    - type: Object
      object:
        metric:
          name: queue_depth
        describedObject:
          apiVersion: v1
          kind: Service
          name: rabbitmq
        target:
          type: Value
          value: 100
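For type: AverageValue targets like requests_per_second above, the documented math divides the metric total across Pods by the per-Pod target. A sketch (function name is illustrative):

```python
import math

def desired_from_average_value(pod_values, target_average):
    """type: AverageValue sketch: desired = ceil(sum(metric) / target),
    i.e. enough Pods that the per-Pod average meets the target."""
    return math.ceil(sum(pod_values) / target_average)

# Three Pods each serving ~1500 req/s against a 1000 req/s target:
# 4500 total / 1000 per Pod = 4.5, rounded up.
print(desired_from_average_value([1500, 1400, 1600], 1000))  # → 5
```

A type: Value target (like queue_depth above) instead compares the single metric value to the target directly, without dividing by the Pod count.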
hpa-external.yaml

External metrics come from outside the cluster (cloud monitoring, SaaS). Scale based on SQS queue length, Pub/Sub backlog, or any external data source. Requires external metrics provider like KEDA.

spec:
  metrics:
    - type: External
      external:
        metric:
          name: sqs_queue_messages
          selector:
            matchLabels:
              queue: orders
        target:
          type: AverageValue
          averageValue: 10
hpa-targets.yaml

HPA works with Deployments, StatefulSets, and ReplicaSets (any target that implements the scale subresource). It cannot scale DaemonSets, which run one Pod per node by design. Don't mix HPA with manual scaling: HPA overwrites manual replica changes.

spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
---
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: postgres
terminal

Debug HPA by checking events, current metrics, and conditions. Common issues: metrics-server not running, missing resource requests, invalid metric names. TARGETS shows current/target utilization.

$ kubectl get hpa
NAME     REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-app   Deployment/my-app   65%/70%   2         10        3          5m

$ kubectl describe hpa my-app
Conditions:
  Type            Status  Reason
  AbleToScale     True    ReadyForNewScale
  ScalingActive   True    ValidMetricFound

$ kubectl get pods -n kube-system | grep metrics-server
metrics-server-abc12   1/1     Running

$ kubectl top pods
NAME        CPU(cores)   MEMORY(bytes)
my-app-1    150m         200Mi
my-app-2    140m         195Mi

$ kubectl autoscale deployment my-app \
    --min=2 --max=10 --cpu-percent=70
horizontalpodautoscaler.autoscaling/my-app created
