Vertical Pod Autoscaler (VPA) Configuration

This example demonstrates how to configure Vertical Pod Autoscaler (VPA) in Simple Container for automatic resource optimization in Kubernetes deployments.

Overview

VPA automatically adjusts CPU and memory requests for your containers based on actual usage patterns, providing:

Cost Optimization: Prevents over-provisioning of resources
Performance Optimization: Ensures adequate resources during load spikes
Operational Efficiency: Reduces manual resource tuning

Example Structure

kubernetes-vpa/
├── README.md                    # This documentation
├── .sc/
│   └── stacks/
│       ├── infrastructure/      # Parent stack (DevOps managed)
│       │   ├── server.yaml      # GKE Autopilot with Caddy VPA
│       │   └── secrets.yaml     # Authentication and encrypted secrets
│       └── vpa-demo/            # Client stack (Developer managed)
│           └── client.yaml      # Application with VPA configuration
├── docker-compose.yaml          # Local development environment
└── Dockerfile                   # Application container

Quick Start

Deploy the parent stack (GKE Autopilot with Caddy VPA):
```
sc provision -s infrastructure -e production
```
Deploy the application with VPA:
```
sc deploy -s vpa-demo -e production
```

Monitor VPA recommendations:

kubectl get vpa
kubectl describe vpa vpa-demo-vpa

Configuration Examples

Application VPA Configuration (`client.yaml`)

# File: .sc/stacks/vpa-demo/client.yaml
schemaVersion: 1.0

stacks:
  production:
    type: cloud-compose
    parent: infrastructure
    config:
      dockerComposeFile: ./docker-compose.yaml
      uses: [mongodb]
      runs: [web-app]

      # VPA Configuration for automatic resource optimization
      cloudExtras:
        vpa:
          enabled: true
          updateMode: "Auto"  # Off, Initial, Auto, InPlaceOrRecreate
          minAllowed:
            cpu: "100m"
            memory: "128Mi"
          maxAllowed:
            cpu: "2"
            memory: "4Gi"
          controlledResources: ["cpu", "memory"]

      env:
        DATABASE_URL: "${resource:mongodb.uri}"
        NODE_ENV: "production"

      scale:
        min: 2
        max: 10

Infrastructure with Caddy VPA (`server.yaml`)

# File: .sc/stacks/infrastructure/server.yaml
schemaVersion: 1.0

# Deployment templates
templates:
  gke-autopilot:
    type: gcp-gke-autopilot
    config:
      projectId: "${auth:gcloud.projectId}"
      credentials: "${auth:gcloud}"
      gkeClusterResource: gke-cluster
      artifactRegistryResource: artifact-registry

# Infrastructure resources
resources:
  production:
    resources:
      # GKE Autopilot cluster with Caddy VPA
      gke-cluster:
        type: gcp-gke-autopilot-cluster
        config:
          projectId: "${auth:gcloud.projectId}"
          credentials: "${auth:gcloud}"
          location: "us-central1"
          gkeMinVersion: "1.33.4-gke.1245000"
          caddy:
            enable: true
            namespace: caddy
            replicas: 2
            # VPA Configuration for Caddy ingress controller
            vpa:
              enabled: true
              updateMode: "Auto"  # Recommended for ingress controllers (recreates pods)
              minAllowed:
                cpu: "50m"
                memory: "64Mi"
              maxAllowed:
                cpu: "1"
                memory: "1Gi"
              controlledResources: ["cpu", "memory"]
            # Optional: Manual resource limits alongside VPA
            resources:
              limits:
                cpu: "500m"
                memory: "512Mi"
              requests:
                cpu: "100m"
                memory: "128Mi"

      # Artifact Registry
      artifact-registry:
        type: gcp-artifact-registry
        config:
          projectId: "${auth:gcloud.projectId}"
          credentials: "${auth:gcloud}"
          location: "us-central1"

      # MongoDB Atlas
      mongodb:
        type: mongodb-atlas-cluster
        config:
          projectId: "${secret:mongodb-atlas-project-id}"
          publicKey: "${secret:mongodb-atlas-public-key}"
          privateKey: "${secret:mongodb-atlas-private-key}"
          clusterName: "vpa-demo-cluster"
          instanceSize: "M10"
          region: "US_CENTRAL_1"

VPA Update Modes

Mode	Description	Use Case	Behavior
Off	Only provides recommendations	Testing and analysis	No automatic changes
Initial	Sets resources only at pod creation	Conservative approach	One-time resource setting
Auto	Updates by recreating pods	Recommended for stateless apps	Pod restart required
InPlaceOrRecreate	Updates resources in-place or recreates	Advanced use (preview)	May cause brief interruptions

VPA Best Practices

1. Resource Boundaries

cloudExtras:
  vpa:
    enabled: true
    # Prevent resource starvation
    minAllowed:
      cpu: "100m"      # Minimum for basic functionality
      memory: "128Mi"   # Minimum memory requirement
    # Control maximum costs
    maxAllowed:
      cpu: "4"          # Reasonable upper limit
      memory: "8Gi"     # Prevent memory exhaustion

2. Update Mode Selection

# For stateless applications
cloudExtras:
  vpa:
    updateMode: "Auto"        # Fast resource updates

# For critical services (like ingress controllers)
caddy:
  vpa:
    updateMode: "Auto"  # Safer pod recreation

3. Combining VPA with Manual Limits

# Manual resource specifications
resources:
  limits:
    cpu: "2"
    memory: "4Gi"
  requests:
    cpu: "500m"
    memory: "1Gi"

# VPA will adjust requests within these bounds
vpa:
  enabled: true
  maxAllowed:
    cpu: "2"      # Matches manual limit
    memory: "4Gi" # Matches manual limit

Monitoring VPA

Check VPA Status

# List all VPAs
kubectl get vpa

# Get detailed VPA information
kubectl describe vpa <vpa-name>

# View VPA recommendations
kubectl get vpa <vpa-name> -o yaml

VPA Metrics

# Check current resource usage
kubectl top pods

# Compare with VPA recommendations
kubectl get vpa <vpa-name> -o jsonpath='{.status.recommendation}'

Troubleshooting

VPA Not Creating Recommendations

Ensure VPA controller is installed in the cluster
Check if pods have been running long enough (usually 24 hours)
Verify resource usage patterns exist

Pods Not Being Updated

Check VPA update mode is not "Off"
Verify VPA has sufficient permissions
Ensure resource boundaries allow for changes

Resource Recommendations Too High/Low

Adjust minAllowed and maxAllowed boundaries
Check if workload patterns are representative
Consider using controlledResources to limit scope

GKE Autopilot Setup - Basic GKE Autopilot configuration
Kubernetes Affinity - Advanced pod scheduling
Resource Management - Manual resource configuration