Here we need to update our cluster and instance groups, adding a lifecycle label to the on-demand instances in each availability zone. Let's start by checking the instance groups:
kops get instancegroups --state s3://eu-north-1-training-dx-book-kops-state
NAME                ROLE    MACHINETYPE  MIN  MAX  ZONES
master-eu-north-1a  Master  t3.large     1    1    eu-north-1a
master-eu-north-1b  Master  t3.large     1    1    eu-north-1b
master-eu-north-1c  Master  t3.large     1    1    eu-north-1c
nodes-eu-north-1a   Node    t3.large     1    1    eu-north-1a
nodes-eu-north-1b   Node    t3.large     1    1    eu-north-1b
nodes-eu-north-1c   Node    t3.large     1    1    eu-north-1c
And investigate one in detail:
kops get instancegroups nodes-eu-north-1a --state s3://eu-north-1-training-dx-book-kops-state -o yaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: eu-north-1.training.dx-book.com
  name: nodes-eu-north-1a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20230616
  instanceMetadata:
    httpPutResponseHopLimit: 1
    httpTokens: required
  machineType: t3.large
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - eu-north-1a
Here is a script that fetches the current instance groups from our kOps S3 state, writes each one to a YAML file, and patches the node groups. Copy and paste it into the terminal of your preference:
state="s3://eu-north-1-training-dx-book-kops-state"
for ig in $(kops get ig --state "$state" -o json | jq -r '.[].metadata.name'); do
  kops get ig "$ig" --state "$state" -o yaml > "$ig.yaml"
  # Patch only the worker groups (names starting with nodes*)
  if [[ $ig = nodes* ]]; then
    yq eval '.spec.nodeLabels += {"kops.k8s.io/lifecycle": "OnDemand"}' -i "$ig.yaml"
    yq eval '.spec.minSize = 1' -i "$ig.yaml"
    yq eval '.spec.maxSize = 3' -i "$ig.yaml"
    yq eval '.spec.manager = "Karpenter"' -i "$ig.yaml"
    kops replace --state "$state" -f "$ig.yaml"
  fi
done
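Before running the loop, it can help to see what a patched manifest should contain. The sanity check below uses a stub file (an assumption standing in for one of the real `<instance-group>.yaml` files the loop writes) and confirms every field the yq edits set is present:

```shell
# Stub manifest standing in for a generated nodes-*.yaml file (illustrative;
# the real files are written per instance group by the loop above).
cat > nodes-example.yaml <<'EOF'
spec:
  manager: Karpenter
  maxSize: 3
  minSize: 1
  nodeLabels:
    kops.k8s.io/lifecycle: OnDemand
EOF

# Confirm every field the yq edits should have set is present.
for field in minSize maxSize manager lifecycle; do
  grep -q "$field" nodes-example.yaml && echo "$field: ok"
done
```

Running the same grep loop against the real files before `kops replace` catches a typo'd yq expression early.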
Let's now validate the cluster and confirm that maxSize was increased:
kops validate cluster --wait 10m --state s3://eu-north-1-training-dx-book-kops-state
Spot Instances
Because Spot Instances can be reclaimed by AWS at any time, we also activate the node termination handler in the cluster spec:
spec:
  nodeTerminationHandler:
    enabled: true
    enableSQSTerminationDraining: true
    managedASGTag: "aws-node-termination-handler/managed"
Now we need to create a new instance group with the desired node counts (minimum and maximum) and a desired instance type, which in this case will be m5.large.
The following examples find instance types similar to m5.large and t3.medium:
state="s3://eu-north-1-training-dx-book-kops-state"
kops toolbox instance-selector "spot-group-base-m5-large" \
  --usage-class spot --cluster-autoscaler \
  --base-instance-type "m5.large" --burst-support=false \
  --deny-list '^?[1-3].*\..*' --gpus 0 \
  --node-count-max 3 --node-count-min 1 \
  --name ${NAME} --state ${state}
Here is another example, using t3.medium as the base for more cost-efficient autoscaling:
kops toolbox instance-selector "spot-group-base-t3-medium" \
  --usage-class spot --cluster-autoscaler \
  --base-instance-type "t3.medium" --burst-support=false \
  --deny-list '^?[1-3].*\..*' --gpus 0 \
  --node-count-max 2 --node-count-min 1 \
  --name ${NAME} --state ${state}
These commands will return instance types that match the base instance type, but have different specs. This is helpful when looking for Spot Instances, which are available at up to a 90% discount compared to On-Demand Instances.
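The instance-selector writes a new InstanceGroup into the state store. The result looks roughly like the following sketch; the exact `instances` list is an assumption here, since it depends on what the selector discovers in your region:

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: eu-north-1.training.dx-book.com
  name: spot-group-base-m5-large
spec:
  machineType: m5.large
  maxSize: 3
  minSize: 1
  mixedInstancesPolicy:
    # Similar instance types discovered by the selector (illustrative list)
    instances:
    - m5.large
    - m5a.large
    - m5d.large
    onDemandAboveBase: 0
    onDemandBase: 0
    spotAllocationStrategy: capacity-optimized
  role: Node
  subnets:
  - eu-north-1a
  - eu-north-1b
  - eu-north-1c
```

The mixedInstancesPolicy is what lets the Auto Scaling group pull capacity from several Spot pools instead of depending on a single instance type.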
Let’s see again our instance groups now:
kops get instancegroups --state s3://eu-north-1-training-dx-book-kops-state
NAME                      ROLE  MACHINETYPE  MIN  MAX  ZONES
spot-group-base-m5-large  Node  m5.large     1    3    eu-north-1a,eu-north-1b,eu-north-1c
Let’s now validate our cluster and see what’s the current state after the instance groups modifications:
kops validate cluster --wait 10m --state s3://eu-north-1-training-dx-book-kops-state
VALIDATION ERRORS
KIND           NAME                      MESSAGE
InstanceGroup  spot-group-base-m5-large  InstanceGroup "spot-group-base-m5-large" is missing from the cloud provider
Time to roll out the modifications to the cloud:
kops update cluster --state s3://eu-north-1-training-dx-book-kops-state --yes
kops rolling-update cluster --state s3://eu-north-1-training-dx-book-kops-state --yes
While kOps updates the cluster and launches new EC2 instances, we can watch progress using the AWS CLI, querying PublicDnsName, State.Name, and LaunchTime:
aws ec2 describe-instances --query 'Reservations[*].Instances[*].[PublicDnsName, State.Name, LaunchTime]' --output text --region eu-north-1
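To surface the most recently launched instances first, the text output can be piped through sort on the LaunchTime column. Shown here on stub output (an assumption, since the real command needs live EC2 instances):

```shell
# Stub of the describe-instances text output (PublicDnsName, state, LaunchTime);
# sort -r -k3 orders by the third column, so the newest launch comes first.
cat <<'EOF' | sort -r -k3
ec2-13-48-1-10.eu-north-1.compute.amazonaws.com running 2023-07-01T10:00:00+00:00
ec2-13-48-1-12.eu-north-1.compute.amazonaws.com pending 2023-07-01T12:45:00+00:00
ec2-13-48-1-11.eu-north-1.compute.amazonaws.com running 2023-07-01T12:30:00+00:00
EOF
```

The ISO 8601 timestamps sort correctly as plain strings, which is what makes this one-liner work.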
And also watch the nodes:
kubectl get nodes -w
NAME                  STATUS  ROLES             AGE    VERSION
i-089e8c4a860dcd203   Ready   node,spot-worker  4m14s  v1.25.11
i-014ec53c7396ae56b   Ready   node              3m11s  v1.25.11
i-04fa865118cfd5758   Ready   node              45h    v1.25.11
i-077af190f94301180   Ready   node              45h    v1.25.11
i-06e72faf0b0c2e6c4   Ready   control-plane     45h    v1.25.11
i-0a0cf6a563bc29a5e   Ready   control-plane     45h    v1.25.11
i-0d4cc13d84c394245   Ready   control-plane     45h    v1.25.11
Now we have a new spot instance group, labeled with the spot-worker role and ready to auto scale.
Load test
And buckle up, we're going to do some load testing now. We'll create a dummy CPU-burning deployment and then scale it up to exercise the autoscaler, using kubectl.
First, let's create a deployment based on the polinux/stress image, requesting 500m of CPU per replica so the scheduler has to find real capacity:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-stress
spec:
  replicas: 10
  selector:
    matchLabels:
      app: cpu-stress
  template:
    metadata:
      labels:
        app: cpu-stress
    spec:
      containers:
      - name: cpu-stress
        image: polinux/stress
        command: ["/bin/sh"]
        args: ["-c", "stress --cpu 1 --vm 1"]
        resources:
          requests:
            cpu: "500m"
            memory: "500Mi"
          limits:
            cpu: "500m"
            memory: "500Mi"
EOF
This creates a deployment named cpu-stress running the polinux/stress image with 10 replicas. Next, we can scale it to 20 replicas using the kubectl scale command:
kubectl scale deployment cpu-stress --replicas=20
After running these commands, you should have 20 cpu-stress pods. Kubernetes will start creating them, and this should trigger the autoscaler if the current nodes don't have enough capacity to run all the pods.
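As a back-of-the-envelope check on why this triggers a scale-up (assuming each stress pod requests roughly 500m of CPU and the workers are t3.large nodes with 2 vCPUs each):

```shell
# 20 replicas x 500m CPU = 10 full cores of requests; a t3.large exposes
# 2 vCPUs, so at least 5 nodes' worth of CPU is needed, before counting
# system pods and kubelet overhead.
replicas=20
cpu_per_pod_m=500          # millicores requested per pod (assumption)
vcpus_per_node=2           # t3.large
total_m=$((replicas * cpu_per_pod_m))
nodes_needed=$(( (total_m + vcpus_per_node * 1000 - 1) / (vcpus_per_node * 1000) ))
echo "$nodes_needed"       # prints 5
```

Since the node groups were capped at maxSize 3, a good share of these pods will sit Pending until the spot group scales out to absorb them.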
You have now successfully created a Kubernetes cluster using kOps on AWS!
Remember to periodically check your cluster's health with the kops validate cluster command. If you need to change the cluster, use the kops edit command. If you need to delete the cluster, use the kops delete cluster --name ${NAME} command.
Please note that costs will accrue for as long as the cluster is running. Be sure to clean up and delete any resources you no longer need to avoid unexpected charges.
Activating the Metrics Server
The Metrics Server is an important tool for managing and scaling workloads in a Kubernetes cluster. Here are several reasons why it’s commonly used:
Resource Metrics Pipeline: Metrics Server is a critical part of the resource metrics pipeline in Kubernetes, which is the primary avenue through which CPU and memory usage data (for nodes and pods) gets exposed to the Kubernetes scheduler and other system components.
Auto-Scaling: The Metrics Server is a prerequisite for the Kubernetes Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA), which automatically scales the number of pods or resources based on observed metrics. Without Metrics Server, these autoscaling features will not work.
Node Resource Management: Metrics Server helps the Kubernetes scheduler in making better decisions while scheduling pods. It provides current resource usage metrics of nodes and pods, enabling the scheduler to avoid placing pods on nodes that are running out of resources.
Visibility and Monitoring: While Metrics Server itself doesn’t store metrics data, it’s used as an in-cluster API for fetching the latest relevant metrics, which can then be displayed in CLI tools like kubectl top or GUI dashboards.
Cluster Health: It provides necessary data to observe and ensure the health of applications running on the cluster and the cluster itself.
Overall, the Metrics Server provides valuable insights into how workloads are performing in a cluster, and it’s fundamental to the effective scaling and management of applications in Kubernetes.
You can enable the Metrics Server by editing your cluster configuration.
state=s3://eu-north-1-training-dx-book-kops-state
kops get cluster --state $state -o yaml > cluster.yaml
yq e '.spec.metricsServer.enabled = true' -i cluster.yaml
yq e '.spec.metricsServer.insecure = true' -i cluster.yaml
kops replace -f cluster.yaml --state $state
kops update cluster --state $state --yes
kops rolling-update cluster --state $state --yes
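With the Metrics Server running, a HorizontalPodAutoscaler can drive a deployment such as the earlier load-test workload from observed CPU usage. A minimal sketch (the target name and the 70% threshold are illustrative choices, not values from this setup):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cpu-stress
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cpu-stress
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

The HPA adjusts replica counts, while the cluster autoscaler (or Karpenter) adds nodes when those replicas no longer fit: the two mechanisms work in tandem.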
Next Steps
Now that you have a Kubernetes cluster running, you can install workloads, explore other AWS services, or set up a CI/CD pipeline.
Please see the official kOps documentation for more information and usage examples.
Node Termination Handler
The termination handler supports additional tuning in cluster.yaml; apply these with the same get/replace/update cycle used for the Metrics Server:
yq e '.spec.nodeTerminationHandler.cpuRequest = "200m"' -i cluster.yaml
yq e '.spec.nodeTerminationHandler.enabled = true' -i cluster.yaml
yq e '.spec.nodeTerminationHandler.enableRebalanceMonitoring = true' -i cluster.yaml
yq e '.spec.nodeTerminationHandler.enableSQSTerminationDraining = true' -i cluster.yaml
yq e '.spec.nodeTerminationHandler.managedASGTag = "aws-node-termination-handler/managed"' -i cluster.yaml
yq e '.spec.nodeTerminationHandler.prometheusEnable = true' -i cluster.yaml
Enabling Karpenter
export KOPS_FEATURE_FLAGS="Karpenter"
state=s3://eu-north-1-training-dx-book-kops-state
kops get cluster --state $state -o yaml > cluster.yaml
yq e '.spec.karpenter.enabled = true' -i cluster.yaml
kops replace -f cluster.yaml --state $state
kops update cluster --state $state --yes
kops rolling-update cluster --state $state --yes
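Once the feature flag and `spec.karpenter.enabled` are in place, any instance group carrying `spec.manager: Karpenter` (as set by the earlier patching script) is provisioned by Karpenter instead of an Auto Scaling group. A minimal sketch of such a group:

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: eu-north-1.training.dx-book.com
  name: nodes-eu-north-1a
spec:
  manager: Karpenter
  machineType: t3.large
  maxSize: 3
  minSize: 1
  role: Node
  subnets:
  - eu-north-1a
```

Groups without the manager field keep their existing Auto Scaling group behavior, so the two provisioning models can coexist in one cluster.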