How To AutoScale Pods in Kubernetes Using HPA (Horizontal Pod AutoScaler) – Minikube Demo

April 1, 2020, by Ajeet
Tags: ca, cluster autoscaling, cluster autoscaling in kubernetes, containers, docker, horizontal pod autoscaler, hpa, k8s, kubernetes, minikube, vertical pod autoscaler, vpa

To handle heavy or fluctuating traffic, an application needs to scale up automatically under load and scale back down when the load drops. Kubernetes supports this with the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA).

VPA (Vertical Pod AutoScaler)
The Vertical Pod Autoscaler (VPA) allocates more (or less) CPU or memory to existing pods. It generally does the following:

VPA periodically checks the metric values you configured during setup (the upstream recommender component polls once per minute by default)
When the threshold is met, VPA attempts to change the allocated memory and/or CPU
VPA mainly updates the CPU/memory resource requests of the pods managed by a Deployment or ReplicationController
The new memory/CPU allocation takes effect when pods are restarted, so it applies to the newly created pods

HPA (Horizontal Pod AutoScaler)
HPA scales the number of pod replicas up or down. It does the following:

HPA continuously checks the metric values you configured during setup, at a default interval of 15 seconds (older Kubernetes releases used 30 seconds)
It increases the number of pods when the specified threshold is exceeded, and reduces it again when the load falls
HPA mainly updates the number of replicas in the Deployment or ReplicationController
The Deployment or ReplicationController then rolls out any additional pods that are needed
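The scaling decision in the steps above can be sketched with a little arithmetic. The Kubernetes documentation gives the core rule as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped to the configured minimum and maximum. The function below is only an illustration of that rule: the name desired_replicas and its argument order are made up for this post, and it deliberately ignores the controller's tolerance band and scale-down stabilization.

```shell
# Hypothetical helper, not part of kubectl or Kubernetes.
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Utilization values are whole percentages of the CPU request.
desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5

  # ceil(current * util / target) using integer arithmetic only
  desired=$(( (current * util + target - 1) / target ))

  # Clamp to the minReplicas / maxReplicas bounds
  if [ "$desired" -lt "$min" ]; then
    desired=$min
  fi
  if [ "$desired" -gt "$max" ]; then
    desired=$max
  fi

  echo "$desired"
}

# One pod at 45% utilization with a 10% target, bounds 1..10
desired_replicas 1 45 10 1 10
```

With these example numbers the function prints 5. Raising the load further only increases the count until it reaches the maximum.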

HPA Example/Demo
In this post, we will specifically cover the HPA example. For this you need to have minikube up and running. You can set up minikube and kubectl by following this tutorial: https://www.youtube.com/watch?v=6fEzipDi2g8

Enable metrics-server

minikube addons enable metrics-server
minikube addons list

Create a file named nginx.yaml to define the deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        resources:
          # You must specify requests for CPU to autoscale
          # based on CPU utilization
          requests:
            cpu: 100m

Create another file nginx-hpa.yaml for HPA

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 10
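It helps to work out what a 10% target means for this demo. CPU utilization for the HPA is measured as a percentage of the container's CPU request, averaged across pods, so with a request of 100m the target corresponds to roughly 10 millicores per pod. The sketch below just does that arithmetic. The two function names are invented for illustration, and the 45m usage figure is an assumed example, not a measurement.

```shell
# Hypothetical helpers for illustration only.

# target_millicores REQUEST_MILLICORES TARGET_PERCENT
# Average per-pod CPU usage at which the HPA target is reached.
target_millicores() {
  echo $(( $1 * $2 / 100 ))
}

# cpu_utilization_percent USAGE_MILLICORES REQUEST_MILLICORES
# Utilization as the HPA sees it: usage relative to the request.
cpu_utilization_percent() {
  echo $(( $1 * 100 / $2 ))
}

# Values from nginx.yaml and nginx-hpa.yaml: request 100m, target 10%
target_millicores 100 10

# Assumed example: a pod averaging 45m of CPU under load
cpu_utilization_percent 45 100
```

The first call prints 10 and the second prints 45, so a pod using 45m against a 100m request is well above the 10% target and the HPA would add replicas. Such a low target is convenient for a demo because even light load triggers scaling.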

Now apply these YAML files to create the deployment and the HPA for nginx. Run the following commands:

kubectl apply -f nginx.yaml
kubectl apply -f nginx-hpa.yaml

Now check that the pods and the HPA have been created. Run the following commands:

kubectl get pods
kubectl get hpa

Now expose the nginx deployment by creating a service

kubectl expose deployment nginx --type=LoadBalancer --name=nginx-service

This exposes the nginx deployment through a service, so that nginx can be reached at an IP address and port. To get the URL, run the command below:

minikube service nginx-service --url

Open this URL in a browser and you should see the nginx home page. This shows that the nginx pod has been deployed successfully.

Now that everything is in place, we will generate some load on nginx and watch the autoscaling in action. To generate load, run the following kubectl command.

kubectl run -it --rm load-generator --image=busybox --restart=Never -- /bin/sh

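After running the kubectl command you will see a shell prompt inside the busybox container. Run the wget command in a loop there, replacing the URL with the one returned by minikube service nginx-service --url: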

/ # while true; do wget -q -O- http://192.168.99.101:30726; done

This runs the wget command in a loop, generating load on the pod. Now check whether the HPA reports the load and the number of pods is increasing, using the following commands:

kubectl get hpa
kubectl get pods

As you can see, the number of pods increases as the load increases. This is how the HPA works.
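The endless wget loop used above has to be stopped by hand with Ctrl+C. If you would rather send a fixed number of requests and then stop, a bounded loop is one option. The helper below is a sketch only: generate_load and the FETCH override are names made up for this post, and the URL in the usage comment is the example address from this demo, which will differ on your machine.

```shell
# Hypothetical helper for illustration only.
# Usage: generate_load URL COUNT
# Sends COUNT requests to URL one after another and reports how many succeeded.
# FETCH can be set to a different command; by default wget is used and the
# response body is discarded.
generate_load() {
  url=$1
  total=$2
  i=0
  ok=0
  while [ "$i" -lt "$total" ]; do
    # FETCH is intentionally unquoted so that its options are split into words
    if ${FETCH:-wget -q -O /dev/null} "$url"; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "$ok/$total requests succeeded"
}

# Example (inside the load-generator pod, with your own service URL):
#   generate_load http://192.168.99.101:30726 5000
```

Once the requests stop, kubectl get hpa should show utilization falling, and the replica count drops back toward the minimum after the HPA's scale-down delay.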