Configure Kubernetes vertical pod autoscaler for resource optimization and cost management

Intermediate 25 min Apr 26, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up VPA to automatically adjust CPU and memory requests for your Kubernetes workloads. Reduce resource waste and optimize costs by letting VPA analyze actual usage patterns and rightsize containers.

Prerequisites

  • Running Kubernetes cluster with admin access
  • kubectl configured and working
  • Basic understanding of Kubernetes resources

What this solves

Kubernetes Vertical Pod Autoscaler (VPA) automatically adjusts CPU and memory requests for your containers based on actual usage patterns. This eliminates resource waste from over-provisioned pods and prevents performance issues from under-provisioned workloads, helping you optimize both costs and reliability.

Step-by-step installation

Verify cluster prerequisites

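Check that metrics-server is running, that the nodes are Ready, and which client and server versions you are on. Installing the VPA components also requires cluster-admin permissions. The --short flag has been removed from recent kubectl releases, so use plain kubectl version.

kubectl get deployment metrics-server -n kube-system
kubectl get nodes
kubectl version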

Install metrics-server if missing

VPA requires metrics-server to collect resource usage data. Install it if not already present.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Wait for metrics-server to be ready:

kubectl wait --for=condition=available --timeout=300s deployment/metrics-server -n kube-system

Clone VPA repository

Download the official VPA installation manifests from the autoscaler repository.

git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler

Install VPA components

Deploy the VPA admission controller, recommender, and updater components to your cluster.

./hack/vpa-install.sh

Verify all VPA components are running:

kubectl get pods -n kube-system | grep vpa
kubectl get deployment -n kube-system | grep vpa
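The grep above prints whatever matches, so a missing component is easy to overlook. As a minimal sketch, the helper below reads the output of kubectl get deployments -n kube-system -o name on stdin and reports which of the three deployments named in this step are absent. The function name missing_vpa_components is invented for illustration, and the deployment names assume the defaults created by vpa-install.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of VPA or kubectl): reads `kubectl get deployments -o name`
# output on stdin and lists the VPA deployments that are absent.
missing_vpa_components() {
  local found name missing=0
  found=$(cat)
  for name in vpa-admission-controller vpa-recommender vpa-updater; do
    # -o name prints one "deployment.apps/<name>" entry per line
    if ! grep -q "^deployment\.apps/${name}\$" <<<"$found"; then
      echo "missing: ${name}"
      missing=1
    fi
  done
  return "$missing"
}

# Demo with canned input; against a real cluster, pipe in
#   kubectl get deployments -n kube-system -o name
printf '%s\n' \
  deployment.apps/vpa-admission-controller \
  deployment.apps/vpa-recommender \
  deployment.apps/vpa-updater | missing_vpa_components && echo "all VPA components present"
```

The function returns a non-zero status when anything is missing, so it can gate later steps in a script.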

Verify VPA custom resource definitions

The install script creates the VPA CRDs. Confirm that they are registered in your cluster.

kubectl get crd | grep verticalpodautoscaler
kubectl api-resources | grep verticalpodautoscaler

Step-by-step configuration

Deploy sample application

Create a test deployment to demonstrate VPA functionality with realistic resource patterns.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-vpa-demo
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-vpa-demo
  template:
    metadata:
      labels:
        app: nginx-vpa-demo
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "200m"
            memory: "256Mi"
        ports:
        - containerPort: 80
Save the manifest above as nginx-deployment.yaml and apply it:

kubectl apply -f nginx-deployment.yaml

Configure VPA in recommendation mode

Start with recommendation-only mode to observe VPA suggestions without automatic changes.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa-recommender
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-vpa-demo
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: nginx
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 500m
        memory: 512Mi
      controlledResources: ["cpu", "memory"]
Save the manifest above as nginx-vpa-rec.yaml and apply it:

kubectl apply -f nginx-vpa-rec.yaml
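The minAllowed and maxAllowed bounds cap the recommendation: the VPA status reports the raw value as uncappedTarget and the bounded value as target. The sketch below only illustrates that capping with plain integers (millicores here). The clamp function is a hypothetical helper written for this example, not a VPA or kubectl command.

```shell
#!/usr/bin/env bash
# Illustration only: bound a raw recommendation by minAllowed/maxAllowed.
# All three arguments are integers in the same unit (e.g. CPU millicores).
clamp() {
  local value=$1 min=$2 max=$3
  if (( value < min )); then
    echo "$min"
  elif (( value > max )); then
    echo "$max"
  else
    echo "$value"
  fi
}

# Bounds from the manifest above: cpu 50m to 500m
clamp 20 50 500    # below minAllowed, raised to the minimum
clamp 750 50 500   # above maxAllowed, capped at the maximum
clamp 120 50 500   # inside the range, left unchanged
```

If target and uncappedTarget differ for long periods, the bounds are probably too tight for the workload.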

Generate load for meaningful recommendations

Create some CPU and memory usage to help VPA generate realistic recommendations.

apiVersion: v1
kind: Pod
metadata:
  name: load-generator
spec:
  containers:
  - name: busybox
    image: busybox:1.36
    command:
    - /bin/sh
    - -c
    - |
      while true; do
        wget -q -O- http://nginx-vpa-demo.default.svc.cluster.local/
        sleep 0.1
      done
  restartPolicy: Never
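Save the manifest above as load-generator.yaml. It targets a Service named nginx-vpa-demo, which the Deployment alone does not create, so expose the deployment before starting the load:

kubectl expose deployment nginx-vpa-demo --port=80
kubectl apply -f load-generator.yaml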

Configure VPA for automatic updates

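Enable automatic resource adjustments once the recommendations look sensible. In "Auto" mode the updater evicts pods so that they restart with the recommended requests. Multiple VPA objects that match the same pods have undefined behavior, so keep only one VPA pointed at nginx-vpa-demo at a time: remove the previous one with kubectl delete vpa <name> before applying the next.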

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa-auto
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-vpa-demo
  updatePolicy:
    updateMode: "Auto"
    minReplicas: 1
  resourcePolicy:
    containerPolicies:
    - containerName: nginx
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 500m
        memory: 512Mi
      controlledResources: ["cpu", "memory"]
      controlledValues: "RequestsAndLimits"
Save the manifest above as nginx-vpa-auto.yaml and apply it:

kubectl apply -f nginx-vpa-auto.yaml
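With controlledValues set to "RequestsAndLimits", VPA changes the limit together with the request so that the original ratio between them is preserved. The sketch below shows that arithmetic on integers. scaled_limit is a hypothetical helper for illustration, and VPA's own rounding may differ slightly from the integer division used here.

```shell
#!/usr/bin/env bash
# Illustration only: how a limit follows a request when the limit:request ratio is kept.
# Arguments are integers in one unit (millicores for CPU, Mi for memory).
scaled_limit() {
  local old_request=$1 old_limit=$2 new_request=$3
  # Integer division truncates any remainder
  echo $(( new_request * old_limit / old_request ))
}

# The demo container starts with cpu request 100m and limit 200m (ratio 2:1).
# A new request of 150m would therefore carry a 300m limit.
scaled_limit 100 200 150

# Memory: request 128Mi, limit 256Mi, new request 200Mi
scaled_limit 128 256 200
```

Use "RequestsOnly" instead when the limits should stay exactly as written in the pod spec.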

Configure VPA resource policies

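Set up a resource policy with scaling bounds and controlled resource types that applies to every container ('*'). "Initial" mode assigns the recommended requests only when a pod is created and never evicts running pods. This manifest also targets nginx-vpa-demo, so apply it in place of, not alongside, another VPA for the same deployment.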

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: production-app-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-vpa-demo
  updatePolicy:
    updateMode: "Initial"
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2
        memory: 2Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: "RequestsOnly"
      mode: Auto
Save the manifest above as advanced-vpa-policy.yaml and apply it:

kubectl apply -f advanced-vpa-policy.yaml

Monitor and tune VPA recommendations

View VPA recommendations

Check current resource recommendations and compare them with actual usage patterns.

kubectl describe vpa nginx-vpa-recommender
kubectl get vpa nginx-vpa-recommender -o yaml

Monitor resource usage trends

Compare VPA recommendations with actual pod resource consumption over time.

kubectl top pods -l app=nginx-vpa-demo
kubectl get pods -l app=nginx-vpa-demo -o jsonpath='{.items[*].spec.containers[*].resources}'
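Comparing usage with requests by eye is awkward because CPU quantities appear both as millicores (25m) and as core counts (0.5). The following sketch normalizes both forms and prints usage as a percentage of the request. cpu_to_millicores and usage_pct are hypothetical helpers for this example; they assume only the "m" suffix or a plain number and do not handle memory units.

```shell
#!/usr/bin/env bash
# Illustration only: normalize Kubernetes CPU quantities to integer millicores.
cpu_to_millicores() {
  local q=$1
  case "$q" in
    *m) echo "${q%m}" ;;                                            # "250m" -> 250
    *)  awk -v v="$q" 'BEGIN { printf "%d\n", v * 1000 + 0.5 }' ;;  # "0.5" -> 500
  esac
}

# Usage as an integer percentage of the request (the request must be non-zero).
usage_pct() {
  local used requested
  used=$(cpu_to_millicores "$1")
  requested=$(cpu_to_millicores "$2")
  echo $(( used * 100 / requested ))
}

# Example: a pod using 25m against the demo request of 100m
echo "usage: $(usage_pct 25m 100m)% of request"
```

A pod that stays far below its request under realistic load is a candidate for a lower minAllowed. One that hovers near 100% suggests the request is too small.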

Verify VPA admission webhook

Confirm that the VPA admission controller has registered its webhook so it can intercept pod creation. VPA uses a mutating webhook, so no VPA entry is expected under validatingwebhookconfigurations.

kubectl get mutatingwebhookconfigurations | grep vpa

Set up VPA monitoring dashboard

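Create a small status script that prints every VPA, the current recommendations, and live pod usage as a lightweight command-line view. It requires jq. Save it as vpa-monitor.sh.

#!/bin/bash
# Print VPA objects, their recommendations, and current pod usage.

echo "=== VPA Status ==="
kubectl get vpa --all-namespaces

# Without -n, this loop only covers VPAs in the current namespace
echo -e "\n=== VPA Recommendations ==="
for vpa in $(kubectl get vpa -o name); do
  echo "$vpa:"
  # One JSON object is printed per container; jq pretty-prints them
  kubectl get "$vpa" -o jsonpath='{.status.recommendation.containerRecommendations[*]}' | jq .
done

echo -e "\n=== Resource Utilization ==="
kubectl top pods --all-namespaces

Make the script executable and run it:

chmod +x vpa-monitor.sh
./vpa-monitor.sh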

Deploy VPA policies for workload optimization

Configure namespace-wide VPA policies

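VPA has no namespace-wide policy object: each workload needs its own VerticalPodAutoscaler. To keep resource boundaries consistent across deployments, store a template, here in a ConfigMap, and render one VPA per deployment from it. The ConfigMap itself is inert. Nothing happens until you substitute DEPLOYMENT_NAME and NAMESPACE and apply the result. The production namespace must already exist.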

apiVersion: v1
kind: ConfigMap
metadata:
  name: vpa-policy-template
  namespace: production
data:
  vpa-template.yaml: |
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: DEPLOYMENT_NAME-vpa
      namespace: NAMESPACE
    spec:
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: DEPLOYMENT_NAME
      updatePolicy:
        updateMode: "Auto"
        minReplicas: 2
      resourcePolicy:
        containerPolicies:
        - containerName: '*'
          minAllowed:
            cpu: 100m
            memory: 128Mi
          maxAllowed:
            cpu: 4
            memory: 8Gi
          controlledResources: ["cpu", "memory"]
          controlledValues: "RequestsAndLimits"
Save the manifest above as namespace-vpa-policy.yaml and apply it:

kubectl apply -f namespace-vpa-policy.yaml
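One way to turn the stored template into real VPA objects is a plain placeholder substitution. The sketch below assumes the template uses exactly the DEPLOYMENT_NAME and NAMESPACE placeholders shown above. render_vpa is a hypothetical helper, and the deployment name web in the demo is invented for illustration.

```shell
#!/usr/bin/env bash
# Illustration only: fill the template placeholders for one deployment.
# Reads the template on stdin and writes the rendered manifest to stdout.
render_vpa() {
  local deployment=$1 namespace=$2
  sed -e "s/DEPLOYMENT_NAME/${deployment}/g" \
      -e "s/NAMESPACE/${namespace}/g"
}

# Demo with two lines of the template
printf 'name: DEPLOYMENT_NAME-vpa\nnamespace: NAMESPACE\n' | render_vpa web production

# Against a cluster, the template can be read back from the ConfigMap and applied, e.g.:
#   kubectl get configmap vpa-policy-template -n production \
#     -o jsonpath='{.data.vpa-template\.yaml}' | render_vpa web production | kubectl apply -f -
```

Kubernetes object names are lowercase, so a substituted name cannot collide with the uppercase placeholders.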

Configure VPA for StatefulSets

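Apply VPA policies to StatefulSets with persistent storage and restart cost in mind. This example assumes a StatefulSet named postgresql already exists. "Initial" mode applies recommendations only when a pod is created, so VPA never evicts a running database pod.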

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: database-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: postgresql
  updatePolicy:
    updateMode: "Initial"
  resourcePolicy:
    containerPolicies:
    - containerName: postgresql
      minAllowed:
        cpu: 500m
        memory: 1Gi
      maxAllowed:
        cpu: 8
        memory: 32Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: "RequestsOnly"
Save the manifest above as statefulset-vpa.yaml and apply it:

kubectl apply -f statefulset-vpa.yaml

Implement VPA exclusion policies

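Use per-container policies to exclude certain containers or resources from VPA management. In this example app-container is fully managed, sidecar-container is left untouched (mode "Off"), and only memory is managed for monitoring-agent.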

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: selective-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: multi-container-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: app-container
      controlledResources: ["cpu", "memory"]
      mode: Auto
    - containerName: sidecar-container
      mode: "Off"
    - containerName: monitoring-agent
      controlledResources: ["memory"]
      minAllowed:
        memory: 64Mi
      maxAllowed:
        memory: 512Mi
Save the manifest above as selective-vpa.yaml and apply it:

kubectl apply -f selective-vpa.yaml

Verify your setup

# Check VPA components are running
kubectl get pods -n kube-system | grep vpa

# Verify VPA CRDs are installed
kubectl get crd | grep verticalpodautoscaler

# Check VPA recommendations
kubectl get vpa --all-namespaces
kubectl describe vpa nginx-vpa-recommender

# Monitor resource changes
kubectl get pods -l app=nginx-vpa-demo -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].resources.requests}{"\n"}{end}'

# Check VPA admission webhook
kubectl get mutatingwebhookconfigurations | grep vpa

Common issues

| Symptom | Cause | Fix |
| --- | --- | --- |
| VPA shows no recommendations | Insufficient metrics data | Ensure metrics-server is running; wait 24-48 hours for data collection |
| Pods not getting updated automatically | Update policy set to "Off" or "Initial" | Change updateMode to "Auto" in the VPA spec |
| VPA recommendations too high/low | Insufficient load or incorrect resource policies | Adjust minAllowed/maxAllowed bounds; generate realistic load patterns |
| Admission controller webhook errors | Certificate issues or RBAC permissions | Inspect the logs: kubectl logs -n kube-system deployment/vpa-admission-controller |
| VPA conflicts with HPA | Both autoscalers targeting the same resource | Use HPA for CPU-based scaling and VPA for memory, or separate the workloads |
| Resource updates causing downtime | Insufficient replicas during updates | Set minReplicas in updatePolicy; use "Initial" mode for critical services |
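The minReplicas setting in the last row is a threshold: the updater only attempts an eviction when at least that many replicas are alive. The sketch below models just that check. updater_may_evict is a hypothetical helper, and the fallback of 2 assumes the updater's default when minReplicas is not set in the VPA. Other safeguards such as PodDisruptionBudgets are not modeled.

```shell
#!/usr/bin/env bash
# Illustration only: the replica-count check applied before an eviction is attempted.
# $1 = live replicas, $2 = minReplicas from updatePolicy (assumed default: 2)
updater_may_evict() {
  local live_replicas=$1 min_replicas=${2:-2}
  (( live_replicas >= min_replicas ))
}

# A single-replica deployment with no minReplicas set is skipped
if updater_may_evict 1; then echo "1 replica, default: evict"; else echo "1 replica, default: skip"; fi

# The demo deployment (2 replicas) with minReplicas: 1 is eligible
if updater_may_evict 2 1; then echo "2 replicas, minReplicas 1: evict"; else echo "2 replicas, minReplicas 1: skip"; fi
```

This is also why a single-replica workload in "Auto" mode can keep its old requests indefinitely unless minReplicas is lowered to 1.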
