Monitoring Kubernetes workloads with Prometheus and Thanos

Idan Levin
Published in ITNEXT · 7 min read · May 15, 2019


Introduction

Congratulations! You’ve managed to convince your engineering manager, VP of R&D, or CTO to migrate your company’s workload to microservices, using containers on top of Kubernetes.

You are very happy, and everything is going according to plan. You create your first Kubernetes cluster (all three major cloud providers, Azure, AWS and GCP, have an easy way to provision a managed or unmanaged Kubernetes platform), develop your first containerized application, and deploy it to the cluster. That was easy, wasn’t it? :)

After some time, you realize things are becoming a bit more complicated. You have multiple applications to deploy to the cluster, so you need an Ingress Controller. Then, before going to production, you want visibility into how your workloads are performing, so you start looking for a monitoring solution. Luckily, you find Prometheus, deploy it, add Grafana, and you’re done!

Later on, you start wondering: why is Prometheus running with just one replica? What happens if there is a container restart, or just a version update? For how long can Prometheus store my metrics? What will happen when the cluster goes down? Do I need another cluster for HA and DR? How will I get a centralized view with Prometheus?

Well, keep reading, smart people have already figured it out.

Typical Kubernetes Cluster

Below is a diagram illustrating a typical deployment on top of Kubernetes –

Typical Kubernetes Cluster

The deployment consists of three layers:

  1. Underlying virtual machines — master nodes and worker nodes
  2. Kubernetes infrastructure applications
  3. User applications

The different components communicate with each other internally, usually using HTTP(S) (REST or gRPC), and some of them expose APIs outside the cluster (Ingress). Those APIs are mainly used for —

  1. Cluster management via Kubernetes API Server
  2. User application interaction exposed via an Ingress Controller

In some scenarios, applications might send traffic outside of the cluster (Egress) for consuming other services, such as Azure SQL, Azure Blob, or any 3rd party service.

What to Monitor?

Monitoring Kubernetes should take into account all three layers mentioned above.

Underlying VMs: to make sure the underlying virtual machines are healthy, the following metrics should be collected —

  • Number of nodes
  • Resource utilization per node (CPU, Memory, Disk, Network bandwidth)
  • Node status (ready, not ready, etc.)
  • Number of pods running per node

Kubernetes Infrastructure: to make sure the Kubernetes infrastructure is healthy, the following metrics should be collected —

  • Pods health — instances ready, status, restarts, age
  • Deployments status — desired, current, up-to-date, available, age
  • StatefulSets status
  • CronJobs execution stats
  • Pod resource utilization (CPU and Memory)
  • Health checks
  • Kubernetes Events
  • API Server requests
  • Etcd stats
  • Mounted volumes stats

User Applications: each application should expose its own metrics based on its core functionality; however, some metrics are common to most applications, such as:

  • HTTP requests (Total number, Latency, Response Code, etc.)
  • Number of outgoing connections (e.g. database connections)
  • Number of threads

Collecting the metrics mentioned above will allow you to build meaningful Alerts and Dashboards; we’ll cover that briefly below.
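As an illustration of how these metrics turn into alerts, here is a minimal sketch of a Prometheus alerting rule; the metric name (http_requests_total) and the threshold are hypothetical and depend on what your application actually exposes:

groups:
  - name: example-app
    rules:
      - alert: HighErrorRate
        # fire when more than 5% of requests return a 5xx code for 10 minutes
        # (http_requests_total is a hypothetical application metric)
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: More than 5% of HTTP requests are failing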

Thanos

Thanos is an open-source project, built as a set of components that can be composed into a highly available metric system with unlimited storage capacity, which can be added seamlessly on top of existing Prometheus deployments.

Thanos leverages the Prometheus storage format to cost-efficiently store historical metric data in any object storage while retaining fast query latency. Additionally, it provides a global query view across all Prometheus installations.

Thanos’ main components are:

  • Sidecar: connects to Prometheus and exposes it for real-time queries by the Query Gateway, and/or uploads its data to cloud storage for longer-term retention
  • Query Gateway: implements Prometheus’ API to aggregate data from the underlying components (such as the Sidecar or the Store Gateway)
  • Store Gateway: exposes the content of a cloud storage bucket
  • Compactor: compacts and down-samples data stored in cloud storage
  • Receiver: receives data from Prometheus’ remote-write WAL, exposes it and/or uploads it to cloud storage
  • Ruler: evaluates recording and alerting rules against data in Thanos for exposition and/or upload

In this article, we will focus on the first three components.

Thanos deployment diagram

Deploying Thanos

We’ll start by deploying the Thanos Sidecar into our Kubernetes clusters: the same clusters we use for running our workloads, along with the Prometheus and Grafana deployments.

While there are many ways to install Prometheus, I prefer using the Prometheus-Operator, which gives you easy monitoring definitions for Kubernetes services, as well as deployment and management of Prometheus instances.
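Those monitoring definitions are CRDs such as ServiceMonitor, which tell Prometheus which Services to scrape. A minimal sketch (the Service labels and port name here are hypothetical):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
  labels:
    release: prometheus-operator   # must match the serviceMonitorSelector your setup uses (assumption)
spec:
  selector:
    matchLabels:
      app: example-app             # labels of the Service exposing your application's metrics
  namespaceSelector:
    matchNames:
      - default
  endpoints:
    - port: http-metrics           # name of the Service port serving /metrics
      interval: 30s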

The easiest way to install the Prometheus-Operator is via its Helm chart, which has built-in support for high availability, Thanos Sidecar injection, and lots of pre-configured alerts for monitoring the cluster VMs, Kubernetes infrastructure and your applications.

Before we deploy the Thanos Sidecar, we need a Kubernetes Secret with details on how to connect to the cloud storage. For this demo, I will be using Microsoft Azure.

Create a blob storage account—

az storage account create --name <storage_name> --resource-group <resource_group> --location <location> --sku Standard_LRS --encryption blob

Then, create a folder (aka container) for the metrics —

az storage container create --account-name <storage_name> --name thanos

Grab the storage keys —

az storage account keys list -g <resource_group> -n <storage_name>

Create a file for the storage settings (thanos-storage-config.yaml) —
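For Azure Blob Storage, the Thanos object storage configuration looks roughly like this (fill in your own account name and one of the keys from the previous command):

type: AZURE
config:
  storage_account: "<storage_name>"
  storage_account_key: "<storage_key>"
  container: "thanos"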

Create a Kubernetes Secret from this file —

kubectl -n monitoring create secret generic thanos-objstore-config --from-file=thanos.yaml=thanos-storage-config.yaml

Create a values file (prometheus-operator-values.yaml) to override the default Prometheus-Operator settings —
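The exact keys depend on the chart version, but a minimal sketch that enables two Prometheus replicas and injects the Thanos Sidecar, pointing it at the Secret created above, could look like this:

prometheus:
  prometheusSpec:
    replicas: 2
    retention: 12h                          # keep only short-term data locally; Thanos handles the rest
    thanos:
      image: quay.io/thanos/thanos:<version>   # Sidecar image; registry and tag depend on the Thanos release you use
      objectStorageConfig:
        key: thanos.yaml                    # the key inside the Secret
        name: thanos-objstore-config        # the Secret created above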

Then deploy:

helm install --namespace monitoring --name prometheus-operator stable/prometheus-operator -f prometheus-operator-values.yaml

Now you should have a highly-available Prometheus running in your cluster, along with a Thanos Sidecar that uploads your metrics to Azure Blob Storage with infinite retention.

To allow the Thanos Query Gateway access to those Thanos Sidecars, we will need to expose them via an Ingress. I’m using the Nginx Ingress Controller, but you can use any other Ingress Controller that supports gRPC (Envoy is probably the best option).

For secured communication between the Thanos Query Gateway and the Thanos Sidecars we will use mutual TLS, meaning the client authenticates the server and vice versa.

Assuming you have a .pfx file, you can extract its private key, public certificate and CA using openssl —

# private key
openssl pkcs12 -in cert.pfx -nocerts -nodes | sed -ne '/-BEGIN PRIVATE KEY-/,/-END PRIVATE KEY-/p' > cert.key
# public certificate
openssl pkcs12 -in cert.pfx -clcerts -nokeys | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > cert.cer
# certificate authority (CA)
openssl pkcs12 -in cert.pfx -cacerts -nokeys -chain | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > cacerts.cer

Create two Kubernetes Secrets out of these —

# a secret to be used for TLS termination
kubectl create secret tls -n monitoring thanos-ingress-secret --key ./cert.key --cert ./cert.cer
# a secret to be used for client authentication using the same CA
kubectl create secret generic -n monitoring thanos-ca-secret --from-file=ca.crt=./cacerts.cer

Make sure you have a domain that resolves to your Kubernetes cluster and create two sub-domains to be used for routing to each Thanos Sidecar:

thanos-0.your.domain
thanos-1.your.domain

Now we can create the Ingress rules (replace the host value)—
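I’m not reproducing the exact manifests from the original gist, but an Ingress for one of the Sidecars, assuming the Nginx Ingress Controller annotations and a hypothetical per-Sidecar Service exposing the Sidecar’s gRPC port (10901), would look roughly like this:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: thanos-sidecar-0
  namespace: monitoring
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # verify client certificates against the CA stored in thanos-ca-secret
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-secret: "monitoring/thanos-ca-secret"
spec:
  tls:
    - hosts:
        - thanos-0.your.domain
      secretName: thanos-ingress-secret                          # TLS termination with the certificate created above
  rules:
    - host: thanos-0.your.domain
      http:
        paths:
          - backend:
              serviceName: prometheus-operator-prometheus-0-grpc   # hypothetical Service selecting the first Sidecar
              servicePort: 10901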

Now we have a secure way to access the Thanos Sidecars from outside of the clusters!

Thanos Cluster

In the Thanos diagram above, you can see I chose to deploy Thanos in a separate cluster. That’s because I wanted a dedicated cluster that can easily be re-created if needed, and one that engineers can access without needing access to the real production clusters.

To deploy Thanos components I chose to use this Helm chart (not official yet, but stay tuned, and wait for my PR to be merged).

Create a thanos-values.yaml file to override the default chart settings —
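The chart is not official yet, so I won’t guess its exact values schema here; conceptually, the file tells the Query Gateway which StoreAPI endpoints to fan out to (the two Sidecar sub-domains plus the local Store Gateway) and points the Store Gateway at the object storage Secret. A hypothetical sketch, with key names that may differ from the actual chart:

query:
  replicaCount: 2
  # StoreAPI endpoints the Query Gateway fans out to: the two Sidecars exposed via Ingress
  # (key names here are hypothetical; check the chart's values.yaml)
  stores:
    - thanos-0.your.domain:443
    - thanos-1.your.domain:443
store:
  # the Secret holding thanos-storage-config.yaml, created above
  objStoreSecret: thanos-objstore-config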

Since the Thanos Store Gateway needs read access to the blob storage, we re-create the storage Secret in this cluster as well —

kubectl -n thanos create secret generic thanos-objstore-config --from-file=thanos.yaml=thanos-storage-config.yaml

To deploy this chart, we will use the same certificates we created earlier and inject them as values during installation —

helm install --name thanos --namespace thanos ./thanos -f thanos-values.yaml \
  --set-file query.tlsClient.cert=cert.cer \
  --set-file query.tlsClient.key=cert.key \
  --set-file query.tlsClient.ca=cacerts.cer \
  --set-file store.tlsServer.cert=cert.cer \
  --set-file store.tlsServer.key=cert.key \
  --set-file store.tlsServer.ca=cacerts.cer

This will install both the Thanos Query Gateway and the Thanos Store Gateway, configuring them to use a secure channel.

Validation

To validate that everything is working properly, you can port-forward to the Thanos Query Gateway HTTP service using —

kubectl -n thanos port-forward svc/thanos-query-http 8080:10902

Then open your browser at http://localhost:8080 and you should see the Thanos UI —

Grafana

To add dashboards you can simply install Grafana using its Helm chart.

Create a grafana-values.yaml with the following content —
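The key part is a Prometheus-type datasource that points at the Thanos Query Gateway; a minimal sketch, assuming Grafana is installed in the same namespace as the Query Gateway service:

datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Thanos
        type: prometheus
        url: http://thanos-query-http:10902   # the Query Gateway HTTP service
        access: proxy
        isDefault: true
# the original file also adds three default dashboards; the easiest way is via
# the chart's dashboards/dashboardProviders values or a ConfigMap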

Note that I added three default dashboards to it; you can add your own dashboards as well (the easiest way is using a ConfigMap).

Then deploy —

helm install --name grafana --namespace thanos stable/grafana -f grafana-values.yaml

Then again, port-forward —

kubectl -n thanos port-forward svc/grafana 8080:80

And… voila! You have completed the deployment of a highly available monitoring solution, based on Prometheus, with long-term storage and a centralized view across multiple clusters!

Other Options

This article is focused on Prometheus and Thanos, but if you don’t need a global view across multiple clusters, you can still use just Prometheus, with persistent storage defined.

Another option is to deploy Cortex, another open-source platform that is a bit more complex than Thanos and has taken a different approach.
