Autoscaling in Kubernetes
Understanding Kubernetes Autoscaling
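Kubernetes supports three types of autoscaling:
- Horizontal Pod Autoscaling (HPA): changes the number of pod replicas
- Vertical Pod Autoscaling (VPA): adjusts the CPU and memory requests of pods
- Cluster Autoscaling (CA): adds or removes nodes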
Horizontal Pod Autoscaling (HPA)
Basic HPA Configuration
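A CPU utilization target is calculated as a percentage of each container's CPU request, so the workload an HPA scales has to declare requests. The sketch below shows a Deployment that the HPA in this section could target. The image tag, labels, and request values are assumptions chosen for illustration, not values from the original setup.

```yaml
# Hypothetical target Deployment for the nginx HPA.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  # replicas is left unset so that re-applying this manifest
  # does not overwrite the replica count chosen by the HPA.
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25        # assumed tag
        resources:
          requests:
            cpu: 200m            # utilization is measured against this value
            memory: 128Mi
```

With a 200m request and a 50% utilization target, the HPA aims for roughly 100m of average CPU usage per pod.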
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      minReplicas: 2
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50

Advanced HPA Configurations
Multiple Metrics
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: multi-metric-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: app
      minReplicas: 2
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50
      - type: Resource
        resource:
          name: memory
          target:
            type: Utilization
            averageUtilization: 70
      - type: Pods
        pods:
          metric:
            name: packets-per-second
          target:
            type: AverageValue
            averageValue: 1k

Custom Metrics
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: custom-metric-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: app
      minReplicas: 2
      maxReplicas: 10
      metrics:
      - type: Object
        object:
          metric:
            name: requests-per-second
          describedObject:
            apiVersion: networking.k8s.io/v1
            kind: Ingress
            name: main-ingress
          target:
            type: Value
            value: 100

Vertical Pod Autoscaling (VPA)
VPA Installation
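VPA is installed separately from core Kubernetes (its components live in the kubernetes/autoscaler repository), and its recommender reads pod usage from the Metrics API. The excerpt below shows the metrics-server Deployment that provides that API:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: metrics-server
      namespace: kube-system
    spec:
      selector:
        matchLabels:
          k8s-app: metrics-server
      template:
        metadata:
          labels:
            k8s-app: metrics-server
        spec:
          containers:
          - name: metrics-server
            # k8s.gcr.io is frozen; images are now served from registry.k8s.io
            image: registry.k8s.io/metrics-server/metrics-server:v0.6.1

Basic VPA Configuration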
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-app-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: my-app
      updatePolicy:
        updateMode: "Auto"
      resourcePolicy:
        containerPolicies:
        - containerName: '*'
          minAllowed:
            cpu: "100m"
            memory: "50Mi"
          maxAllowed:
            cpu: "1"
            memory: "500Mi"

VPA Modes
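- Auto Mode: VPA sets requests when pods are created and updates running pods by evicting and recreating them
    updatePolicy:
      updateMode: "Auto"
- Initial Mode: VPA sets requests only when pods are created
    updatePolicy:
      updateMode: "Initial"
- Off Mode: VPA only publishes recommendations and never changes pods
    updatePolicy:
      updateMode: "Off"

Cluster Autoscaling (CA)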
AWS EKS Configuration
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-autoscaler-config
      namespace: kube-system
    data:
      config.yaml: |
        ---
        autoDiscovery:
          clusterName: my-eks-cluster
        awsRegion: us-west-2
        extraArgs:
          scale-down-delay-after-add: 10m
          scale-down-unneeded-time: 10m
          skip-nodes-with-system-pods: false

GKE Configuration
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-autoscaler-config
      namespace: kube-system
    data:
      config.yaml: |
        ---
        clusterName: my-gke-cluster
        cloudProvider: gce
        nodeGroups:
        - minSize: 1
          maxSize: 5
          name: default-pool

AKS Configuration
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-autoscaler-config
      namespace: kube-system
    data:
      config.yaml: |
        ---
        autoscalingGroups:
        - name: nodepool1
          minSize: 1
          maxSize: 5

Best Practices
1. HPA Best Practices
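- Set minReplicas high enough to absorb baseline traffic and maxReplicas within what the cluster can schedule
- Choose metrics that track real load; utilization targets require resource requests on every container
- Tune scaling behavior (stabilization windows and rate policies) to avoid flapping
- Monitor scaling events, for example with kubectl describe hpa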
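Scaling behavior is set through the `behavior` field of the autoscaling/v2 API. The sketch below extends the basic nginx HPA with a slow scale-down and a capped scale-up; the window lengths and policy values are illustrative assumptions and should be tuned per workload.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      # Act on the highest recommendation seen in the last 5 minutes,
      # so short dips in load do not remove pods.
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50            # remove at most 50% of replicas...
        periodSeconds: 60    # ...per minute
    scaleUp:
      stabilizationWindowSeconds: 0   # react to load increases immediately
      policies:
      - type: Pods
        value: 4             # add at most 4 pods...
        periodSeconds: 60    # ...per minute
```

A longer scale-down window trades some idle capacity for fewer replica changes when traffic is bursty.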
2. VPA Best Practices
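- Choose the update mode deliberately: "Auto" evicts pods to apply new requests, while "Off" only reports recommendations
- Set minAllowed and maxAllowed so that recommendations stay within what your nodes can provide
- Test with non-critical workloads first
- Review the recommendations in the VPA status before enabling automatic updates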
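One way to test safely is to start in recommendation-only mode and compare the suggested requests with current settings before allowing updates. The sketch below assumes the same my-app Deployment as the basic VPA example; the object name my-app-vpa-recommend is hypothetical.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa-recommend   # hypothetical name
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"          # compute recommendations, never evict or mutate pods
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      controlledResources: ["cpu", "memory"]
      minAllowed:
        cpu: "100m"
        memory: "50Mi"
      maxAllowed:
        cpu: "1"
        memory: "500Mi"
```

The recommender writes its results to the object's status (a lower bound, a target, and an upper bound per container), which kubectl describe vpa displays.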
3. CA Best Practices
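- Configure scale-down delays (for example scale-down-delay-after-add and scale-down-unneeded-time) to avoid node churn
- Set node group minimum and maximum sizes to match expected capacity and budget
- Monitor cluster utilization
- Consider cost implications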
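The Cluster Autoscaler binary takes its settings as command-line flags. The sketch below shows how the scale-down delays from the EKS example could appear as container arguments in a self-managed Deployment. The image tag, service account name, and discovery tags are assumptions, and the RBAC and cloud IAM setup the autoscaler needs is omitted.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler   # assumed to exist with suitable RBAC
      containers:
      - name: cluster-autoscaler
        # Assumed tag; use the release matching the cluster's Kubernetes minor version.
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.30.0
        command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        # Discover Auto Scaling groups by tag instead of listing them by name.
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-eks-cluster
        - --scale-down-delay-after-add=10m      # wait after a scale-up before considering scale-down
        - --scale-down-unneeded-time=10m        # a node must be unneeded this long before removal
        - --skip-nodes-with-system-pods=false
```

On managed offerings such as GKE and AKS, the provider runs the autoscaler, and node pool limits are set through the provider's own tooling instead of a Deployment like this one.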
Monitoring Autoscaling
Prometheus Metrics
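Once autoscaler state is in Prometheus, an alert can flag workloads that have run out of scaling headroom. The sketch below assumes the Prometheus Operator and kube-state-metrics are installed; the rule and alert names are hypothetical, and the operator may also require labels on the object that match its rule selector.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: hpa-at-max-replicas        # hypothetical name
spec:
  groups:
  - name: autoscaling.rules
    rules:
    - alert: HPAAtMaxReplicas
      # Both series are exported by kube-state-metrics, one per HPA.
      expr: |
        kube_horizontalpodautoscaler_status_current_replicas
          >= on (namespace, horizontalpodautoscaler)
        kube_horizontalpodautoscaler_spec_max_replicas
      for: 15m                     # fire only if the condition holds for 15 minutes
      labels:
        severity: warning
      annotations:
        summary: "HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} has been at maxReplicas for 15 minutes"
```

An HPA pinned at its maximum usually means either maxReplicas is too low or the metric target cannot be reached by adding pods.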
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: autoscaling-monitor
    spec:
      selector:
        matchLabels:
          app: autoscaling-metrics
      endpoints:
      - port: metrics

Grafana Dashboard
    apiVersion: integreatly.org/v1alpha1
    kind: GrafanaDashboard
    metadata:
      name: autoscaling-dashboard
    spec:
      json: |
        {
          "title": "Autoscaling Metrics",
          "panels": [
            {
              "title": "HPA Metrics",
              "type": "graph"
            }
          ]
        }

Troubleshooting
Common Issues
- HPA Not Scaling
    kubectl describe hpa <hpa-name>
    kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods"
- VPA Issues
    kubectl describe vpa <vpa-name>
    kubectl logs -n kube-system deployment/vpa-recommender
- CA Issues
    kubectl logs -n kube-system deployment/cluster-autoscaler
    kubectl describe nodes

Conclusion
Implementing autoscaling in Kubernetes requires careful consideration of your application's requirements and behavior. Well-configured HPA, VPA, and CA help match resource usage to demand and keep costs under control.