ITNEXT

ITNEXT is a platform for IT developers & software engineers to share knowledge, connect…

Autoscaling apps on Kubernetes with the Horizontal Pod Autoscaler

This article gives a high-level overview of how the Horizontal Pod Autoscaler (HPA) in Kubernetes works and how to use it.

A previous version of this article was published on learnk8s.io.

Contents

  1. Introduction
  2. Different types of autoscaling in Kubernetes
  3. What is the Horizontal Pod Autoscaler?
  4. How is the Horizontal Pod Autoscaler configured?
  5. How are application metrics obtained?
  6. Putting everything together

Introduction

Deploying a stateless app with a statically configured number of replicas is not optimal.

Traffic patterns can change quickly, and the app should be able to adapt to them:

  • When the rate of requests increases, the app should scale up (i.e. increase the number of replicas) to stay responsive.
  • When the rate of requests decreases, the app should scale down (i.e. decrease the number of replicas) to avoid wasting resources.

In the context of horizontal scaling, scaling up is also called “scaling out”, and scaling down is also called “scaling in”. This contrasts with vertical scaling (see below), which uses only the terms “scaling up” and “scaling down”.
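
The scale-up and scale-down behaviour listed above can be sketched as a small function. This is only an illustration under a simplifying assumption, namely that each replica can serve a fixed rate of requests; the function name, its parameters, and the replica bounds are hypothetical and are not taken from Kubernetes or from the Horizontal Pod Autoscaler itself.

```python
import math


def desired_replicas(request_rate: float,
                     capacity_per_replica: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Return how many replicas are needed to serve `request_rate`.

    Assumption (hypothetical): every replica handles at most
    `capacity_per_replica` requests per second, and the result is
    clamped to the range [min_replicas, max_replicas].
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    # Round up so that the replicas together cover the full request rate.
    needed = math.ceil(request_rate / capacity_per_replica)
    # Never go below the minimum or above the maximum replica count.
    return max(min_replicas, min(max_replicas, needed))


# Traffic rises from 50 to 250 requests/s: scale up (out) from 1 to 3 replicas.
print(desired_replicas(50, 100), desired_replicas(250, 100))
```

When the request rate falls again, the same function returns a smaller number, which corresponds to scaling down (in). The lower bound keeps at least one replica running even when there is no traffic.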

Published in ITNEXT

Written by Daniel Weibel

Systems Programming | Software Development | Cloud Engineering | UNIX/Linux | Go | Kubernetes | AWS
