Implement Kubernetes cluster autoscaler for automatic node scaling

Intermediate · 45 min · Apr 15, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure Kubernetes cluster autoscaler to automatically add and remove worker nodes based on pod resource demands. This tutorial covers cloud provider integration, scaling policies, and monitoring for production-grade horizontal scaling.

Prerequisites

  • Running Kubernetes cluster with cloud provider integration
  • kubectl access with cluster admin privileges
  • Cloud provider IAM permissions for managing compute instances
  • Basic understanding of Kubernetes concepts
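A quick pre-flight check can confirm the kubectl access and admin privileges listed above. A minimal sketch, assuming kubectl is already installed and configured (it exits quietly if kubectl is not yet on the PATH, since installation is covered in a later step):

```shell
# Pre-flight: confirm cluster-admin-level access before proceeding.
# Skips silently if kubectl is not installed yet.
if command -v kubectl >/dev/null 2>&1; then
  kubectl auth can-i create clusterrolebindings
  kubectl auth can-i create deployments -n kube-system
fi
```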

What this solves

Kubernetes cluster autoscaler automatically adjusts the number of worker nodes in your cluster based on pod scheduling demands. When pods cannot be scheduled due to insufficient resources, the autoscaler provisions new nodes, and when nodes are underutilized, it safely removes them to optimize costs.
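The scale-up trigger is visible directly in the API: pods the scheduler cannot place emit FailedScheduling events, and those unschedulable pods are what prompt the autoscaler to provision a node. A quick way to list them (assumes a configured kubectl; prints nothing on an idle cluster):

```shell
# List events for pods the scheduler could not place -- the condition
# that causes the cluster autoscaler to add a node.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get events --all-namespaces \
    --field-selector reason=FailedScheduling \
    --sort-by=.lastTimestamp
fi
```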

Step-by-step installation

Update system packages

Start by updating your system packages so you have current versions of the tools used below. Use the command set that matches your distribution.

# Debian/Ubuntu:
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl wget

# AlmaLinux/Rocky Linux:
sudo dnf update -y
sudo dnf install -y curl wget

Install kubectl if not present

You will use kubectl to deploy and manage the cluster autoscaler. Install the latest stable version if it is not already present.

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client
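Optionally, verify the downloaded binary against its published checksum before installing:

```shell
# Download the matching checksum file and validate the kubectl binary.
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256)  kubectl" | sha256sum --check
```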

Create service account and RBAC

The cluster autoscaler needs specific permissions to manage nodes and monitor pod scheduling.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources: ["pods", "services", "replicationcontrollers", "persistentvolumeclaims", "persistentvolumes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

Apply RBAC configuration

Create the service account and apply the necessary permissions for cluster autoscaler operation.

kubectl apply -f cluster-autoscaler-rbac.yaml
kubectl get serviceaccount cluster-autoscaler -n kube-system

Configure cloud provider credentials

Create cloud provider specific credentials. This example shows AWS configuration, but adapt for your provider.

apiVersion: v1
kind: Secret
metadata:
  name: cluster-autoscaler-aws-credentials
  namespace: kube-system
type: Opaque
data:
  aws_access_key_id: ""     # base64-encoded access key ID goes here
  aws_secret_access_key: "" # base64-encoded secret access key goes here

Note: For production, use IAM roles for service accounts (IRSA) or similar cloud-native identity mechanisms instead of static credentials.
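The values in the Secret's data fields must be base64-encoded. One way to produce them (the key values here are placeholders, not real credentials):

```shell
# Base64-encode placeholder credentials for the Secret's data fields.
printf '%s' 'EXAMPLE_KEY_ID' | base64
printf '%s' 'EXAMPLE_SECRET' | base64

# Or skip manual encoding entirely and let kubectl build the Secret:
# kubectl create secret generic cluster-autoscaler-aws-credentials \
#   --namespace kube-system \
#   --from-literal=aws_access_key_id=EXAMPLE_KEY_ID \
#   --from-literal=aws_secret_access_key=EXAMPLE_SECRET
```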

Create cluster autoscaler deployment

Deploy the cluster autoscaler with cloud provider integration and scaling configuration.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
      - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.2
        name: cluster-autoscaler
        resources:
          limits:
            cpu: 100m
            memory: 600Mi
          requests:
            cpu: 100m
            memory: 600Mi
        command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/example-cluster
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false
        - --scale-down-delay-after-add=10m
        - --scale-down-unneeded-time=10m
        - --scale-down-delay-after-delete=10s
        - --scale-down-delay-after-failure=3m
        - --max-node-provision-time=15m
        env:
        - name: AWS_REGION
          value: us-west-2
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: cluster-autoscaler-aws-credentials
              key: aws_access_key_id
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: cluster-autoscaler-aws-credentials
              key: aws_secret_access_key
        ports:
        - name: http
          containerPort: 8085
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health-check
            port: 8085
          initialDelaySeconds: 60
          periodSeconds: 60
        imagePullPolicy: Always
      nodeSelector:
        kubernetes.io/os: linux

Apply cluster autoscaler deployment

Deploy the cluster autoscaler to your cluster and verify it starts successfully.

kubectl apply -f cluster-autoscaler-deployment.yaml
kubectl get deployment cluster-autoscaler -n kube-system
kubectl get pods -n kube-system -l app=cluster-autoscaler

Configure node group scaling policies

Set up auto scaling group tags for node discovery and scaling boundaries.

# For AWS Auto Scaling Groups, add these tags:
k8s.io/cluster-autoscaler/enabled = true
k8s.io/cluster-autoscaler/example-cluster = owned
kubernetes.io/cluster/example-cluster = owned

Example AWS CLI command to tag existing ASG:

aws autoscaling create-or-update-tags \
  --tags ResourceId=my-node-group-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=false \
         ResourceId=my-node-group-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/example-cluster,Value=owned,PropagateAtLaunch=false

Create monitoring service

Expose cluster autoscaler metrics for monitoring and alerting integration.

apiVersion: v1
kind: Service
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8085'
spec:
  selector:
    app: cluster-autoscaler
  ports:
  - name: http
    port: 8085
    targetPort: 8085
    protocol: TCP
  type: ClusterIP

kubectl apply -f cluster-autoscaler-service.yaml

Configure scaling policies and limits

The autoscaler reads its tuning from the command-line flags on its deployment, not from a ConfigMap, so the ConfigMap below simply records your intended limits in one place for operators to reference. Avoid naming it cluster-autoscaler-status: the autoscaler writes its own status to a ConfigMap with that name.

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-config
  namespace: kube-system
data:
  nodes.max: "100"
  nodes.min: "3"
  scale-down-delay-after-add: "10m"
  scale-down-unneeded-time: "10m"
  scale-down-utilization-threshold: "0.5"
  skip-nodes-with-local-storage: "false"
  skip-nodes-with-system-pods: "false"

kubectl apply -f cluster-autoscaler-config.yaml

Configure cloud provider integration

AWS integration setup

Configure AWS-specific settings for Auto Scaling Group integration and IAM permissions.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "ec2:DescribeLaunchTemplateVersions"
            ],
            "Resource": "*"
        }
    ]
}

Note: For GCP, use --cloud-provider=gce and configure a service account with Compute Engine permissions. For Azure, use --cloud-provider=azure with appropriate RBAC roles.
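As a sketch of the GCE variant, only the provider flag and the node-group specification change relative to the AWS deployment above; the node pool name and size bounds here are placeholders:

```yaml
# Hypothetical GCE configuration for the cluster-autoscaler container.
command:
  - ./cluster-autoscaler
  - --v=4
  - --stderrthreshold=info
  - --cloud-provider=gce
  - --expander=least-waste
  # min:max:name of a managed instance group (placeholder values)
  - --nodes=1:10:my-node-pool-mig
```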

Update deployment for multiple node groups

Modify the deployment to handle multiple node groups with different instance types.

kubectl patch deployment cluster-autoscaler -n kube-system -p '{
  "spec": {
    "template": {
      "spec": {
        "containers": [{
          "name": "cluster-autoscaler",
          "command": [
            "./cluster-autoscaler",
            "--v=4",
            "--stderrthreshold=info",
            "--cloud-provider=aws",
            "--skip-nodes-with-local-storage=false",
            "--expander=priority",
            "--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/example-cluster",
            "--balance-similar-node-groups",
            "--scale-down-delay-after-add=10m",
            "--scale-down-unneeded-time=10m",
            "--max-nodes-total=100",
            "--cores-total=0:320",
            "--memory-total=0:1280"
          ]
        }]
      }
    }
  }
}'
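Note that --expander=priority reads its ordering from a ConfigMap named cluster-autoscaler-priority-expander in kube-system; without it, the expander cannot rank node groups. A minimal sketch, with placeholder node-group name patterns (higher numbers are preferred):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    10:
      - .*spot.*
    50:
      - .*on-demand.*
```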

Set up monitoring and troubleshooting

Create ServiceMonitor for Prometheus

Configure Prometheus monitoring for cluster autoscaler metrics and alerting. This integrates with your existing Kubernetes monitoring setup.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  selector:
    matchLabels:
      app: cluster-autoscaler
  endpoints:
  - port: http
    interval: 30s
    path: /metrics
    honorLabels: true

kubectl apply -f cluster-autoscaler-servicemonitor.yaml

Configure alerting rules

Set up Prometheus alerting rules for autoscaler health and scaling events.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cluster-autoscaler-alerts
  namespace: kube-system
spec:
  groups:
  - name: cluster-autoscaler
    rules:
    - alert: ClusterAutoscalerDown
      expr: up{job="cluster-autoscaler"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Cluster Autoscaler is down"
        description: "Cluster Autoscaler has been down for more than 5 minutes."
    
    - alert: ClusterAutoscalerScaleUpFailed
      expr: increase(cluster_autoscaler_failed_scale_ups_total[10m]) > 0
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Cluster Autoscaler scale up failures detected"
        description: "Cluster Autoscaler failed to scale up {{ $value }} times in the last 10 minutes."
    
    - alert: ClusterAutoscalerNodesNotReady
      expr: cluster_autoscaler_nodes_count{state="notReady"} > 0
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Cluster has unready nodes"
        description: "{{ $value }} nodes are in NotReady state for more than 10 minutes."

kubectl apply -f cluster-autoscaler-alerts.yaml

Create test workload for scaling

Deploy a test application to verify autoscaling behavior with resource requests.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-scaling-app
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-scaling-app
  template:
    metadata:
      labels:
        app: test-scaling-app
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
        ports:
        - containerPort: 80

kubectl apply -f test-scaling-deployment.yaml

Verify your setup

# Check cluster autoscaler status
kubectl get pods -n kube-system -l app=cluster-autoscaler
kubectl logs -n kube-system deployment/cluster-autoscaler

Verify metrics endpoint

kubectl port-forward -n kube-system svc/cluster-autoscaler 8085:8085 &
sleep 2  # give the port-forward a moment to establish
curl -s http://localhost:8085/metrics | grep cluster_autoscaler

Test scaling by increasing replicas

kubectl scale deployment test-scaling-app --replicas=20
kubectl get pods -o wide

Watch node scaling events

kubectl get events --sort-by=.metadata.creationTimestamp | grep -i "scale"

Check cluster autoscaler configmap

kubectl get configmap cluster-autoscaler-status -n kube-system -o yaml

Common issues

| Symptom | Cause | Fix |
| --- | --- | --- |
| Autoscaler not discovering node groups | Missing or incorrect ASG tags | Add k8s.io/cluster-autoscaler/enabled=true and cluster name tags to the ASG |
| Pods remain pending despite autoscaler | Resource constraints or taints | Check node taints, resource requests, and instance limits with kubectl describe nodes |
| Nodes not scaling down | System pods or local storage blocking eviction | Configure --skip-nodes-with-system-pods=false and check DaemonSets |
| Scale up taking too long | Cloud provider API limits | Increase --max-node-provision-time and check cloud provider quotas |
| Autoscaler pod CrashLoopBackOff | Incorrect cloud provider config | Verify credentials, region settings, and RBAC permissions |
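Related to the scale-down symptoms above: individual pods can also block node removal. The cluster autoscaler honors the safe-to-evict annotation, which you can set explicitly on pods that must not be evicted (the pod spec below is an illustrative placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: critical-batch-job
  annotations:
    # "false" prevents the autoscaler from evicting this pod,
    # which in turn keeps its node from being scaled down.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
  - name: worker
    image: busybox:1.36
    command: ["sleep", "3600"]
```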
