Kubernetes Resource Use and Management in Production
Requests, Limits, Overcommitment, Slack/Waste, Throttling
Before we take a Kubernetes application into production we should understand K8s resource management. At its core this means understanding how the Kubernetes scheduler handles resource requests and limits; once that is clear, everything else falls into place. So let’s dive into it!
After reading this you will:
- understand Requests and Limits
- know how the k8s Scheduler works with resources
- have ideas on how to improve cluster usage and stability
TL;DR
- set memory requests=limits
- set no CPU limits, or disable CPU limit enforcement in the kubelet (see the sketch after this list)
- usage should be below requests
- use scaling with HPA / VPA
- monitor/alert on pod resource usage
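As a rough sketch (values are illustrative, not taken from this article), these recommendations translate into a container spec roughly like this:

```yaml
# Illustrative sketch of the TL;DR: memory requests == limits, CPU request set, no CPU limit.
resources:
  requests:
    cpu: "250m"      # reserved CPU share, used by the scheduler
    memory: "512Mi"  # reserved memory
  limits:
    memory: "512Mi"  # same as the request, so memory is never overcommitted
    # intentionally no cpu limit, so spare CPU can be used instead of throttling
```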
Pod Resources
In k8s a pod can have one or more containers, which are usually run by Docker.

A pod can be seen as a wrapper around containers that work closely together and therefore should run on the same machine (node). This means that the pod’s total resources are the sum of the resources of all its containers.
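For illustration (a hypothetical pod, not from this article), a two-container pod whose total request is the sum of its containers’ requests:

```yaml
# Hypothetical two-container pod: the scheduler sees a total request of
# 100m + 200m = 300m CPU and 128Mi + 256Mi = 384Mi memory.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar          # illustrative name
spec:
  containers:
    - name: app
      image: example/app:1.0      # placeholder image
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
    - name: sidecar
      image: example/sidecar:1.0  # placeholder image
      resources:
        requests:
          cpu: "200m"
          memory: "256Mi"
```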
Resource Requests and Limits
Resource requests and limits are specified per container, for example:
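A minimal sketch of such a spec (placeholder values; as noted in the TL;DR, you may want to omit the CPU limit in practice):

```yaml
resources:
  requests:          # reserved for the container and used for scheduling decisions
    cpu: "250m"
    memory: "256Mi"
  limits:            # the container may use up to this much beyond its request
    cpu: "500m"
    memory: "512Mi"
```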

Requests are guaranteed, reserved resources: the scheduler will not place other pods against capacity that is already requested.
Limits are an allowance to use more resources than requested. If a container hits its CPU limit it is throttled; if it exceeds its memory limit it is OOM-killed.
Scheduler
The k8s scheduler is responsible for deciding which pod can run on which node. It does so by looking at various configuration…