Using Prometheus in Azure Kubernetes Service (AKS)

Understanding the Prometheus operator, custom resource definitions and kube-prometheus

Sander Aernouts
Published in ITNEXT · 8 min read · Jun 8, 2018

During our innovation day at Xpirit last Friday, I dove into Prometheus and how it can be used to gather telemetry from applications running on Kubernetes.

I used Helm to install the Prometheus operator and kube-prometheus, which was a simple and quick way to get Prometheus up and running. But then I was a bit confused about how to use my new Prometheus installation. What really happened when I ran helm install? Which resources were deployed, and how do I use them? So I started taking apart the Helm templates and experimenting with the resources they deploy, to understand how the Prometheus operator works and what additional resources you get by installing kube-prometheus. Here’s what I learned.

Operators and Custom Resource Definitions

When you install the Prometheus operator with Helm you get an operator and a set of custom resource definitions. So first, let’s look at what an operator is:

An Operator is a method of packaging, deploying and managing a Kubernetes application. A Kubernetes application is an application that is both deployed on Kubernetes and managed using the Kubernetes APIs and kubectl tooling.

To be able to make the most of Kubernetes, you need a set of cohesive APIs to extend in order to service and manage your applications that run on Kubernetes. You can think of Operators as the runtime that manages this type of application on Kubernetes.

In Kubernetes 1.7 CoreOS also added Custom Resource Definitions (CRDs). With CRDs the Kubernetes API can be extended with additional resource types to simplify the configuration required to run a Kubernetes application.

Prometheus operator

The Prometheus operator uses three CRDs to greatly simplify the configuration required to run Prometheus in your Kubernetes clusters. These three types are:

  • Prometheus, which defines a desired Prometheus deployment. The Operator ensures at all times that a deployment matching the resource definition is running.
  • ServiceMonitor, which declaratively specifies how groups of services should be monitored. The Operator automatically generates Prometheus scrape configuration based on the definition.
  • Alertmanager, which defines a desired Alertmanager deployment. The Operator ensures at all times that a deployment matching the resource definition is running.
Operator workflow and relationships (source)

When you deploy a Prometheus resource, the Prometheus operator ensures a new instance of Prometheus server is made available in your cluster. A Prometheus resource definition has a serviceMonitorSelector that specifies which ServiceMonitor resources should be used by this instance of Prometheus server. A ServiceMonitor specifies how the Prometheus server should monitor a service or a group of services. The Prometheus operator generates and applies the required configuration for the Prometheus server.
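As a sketch, a minimal Prometheus resource might look like this (the resource name and label value are hypothetical; apiVersion, kind, replicas, and serviceMonitorSelector are actual fields of the Prometheus CRD):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: my-prometheus        # hypothetical name
  namespace: monitoring
spec:
  replicas: 1
  # Select every ServiceMonitor carrying this label; the operator
  # turns the matching ServiceMonitors into scrape configuration.
  serviceMonitorSelector:
    matchLabels:
      team: my-team          # hypothetical label
```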

When you install the Prometheus operator in your cluster you get the operator and the above-mentioned CRDs, but you will not get any Prometheus server or ServiceMonitor instances by default. To start monitoring, however, all you need to do is deploy a Prometheus resource with the right serviceMonitorSelector and deploy a ServiceMonitor resource.
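A matching ServiceMonitor could then look like this (again a hedged sketch: my-app, my-team, and the port name are placeholders, while selector and endpoints are actual fields of the ServiceMonitor CRD):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app               # hypothetical name
  namespace: monitoring
  labels:
    team: my-team            # matched by the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-app            # services carrying this label are scraped
  endpoints:
  - port: metrics            # name of the service port that exposes /metrics
    interval: 15s
```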

There is, however, a community project called kube-prometheus, which is now part of the Prometheus operator repository on GitHub, that provides a Prometheus server configured to monitor your Kubernetes cluster itself, including a set of Grafana dashboards.

So let’s go ahead and install the Prometheus operator and kube-prometheus in an Azure Kubernetes Service (AKS) cluster.

Connect and set up Helm

Connect to your cluster by running:

az login

List your subscriptions by running:

az account list

Select the subscription your AKS cluster is in by running:

az account set --subscription <subscription-id>

Get the credentials required for kubectl to connect to your AKS cluster by running:

az aks get-credentials --name <aks-cluster-name> --resource-group <aks-cluster-resource-group>

If you do not have Helm installed locally, download the appropriate binaries from the Helm GitHub repository.

If you do not have Helm installed in your cluster yet, run:

helm init

Next, add the CoreOS repo to Helm:

helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/

Install Prometheus operator and kube-prometheus

RBAC support was recently added to AKS, but if you still have a cluster without RBAC you can tell Helm to install these charts without using it:

helm install coreos/prometheus-operator --name prometheus-operator --namespace monitoring --set rbacEnable=false
helm install coreos/kube-prometheus --name kube-prometheus --set global.rbacEnable=false --namespace monitoring

Or with RBAC:

helm install coreos/prometheus-operator --name prometheus-operator --namespace monitoring
helm install coreos/kube-prometheus --name kube-prometheus --namespace monitoring

You now have the Prometheus operator and kube-prometheus installed in your cluster. So let’s have a look at what we’ve got:

Prometheus resources

kubectl get prometheus --all-namespaces -l release=kube-prometheus
NAMESPACE    NAME              AGE
monitoring   kube-prometheus   1h

Service monitor resources

kubectl get servicemonitor --all-namespaces -l release=kube-prometheus
NAMESPACE    NAME                                               AGE
monitoring   kube-prometheus                                    1h
monitoring   kube-prometheus-alertmanager                       1h
monitoring   kube-prometheus-exporter-kube-controller-manager   1h
monitoring   kube-prometheus-exporter-kube-dns                  1h
monitoring   kube-prometheus-exporter-kube-etcd                 1h
monitoring   kube-prometheus-exporter-kube-scheduler            1h
monitoring   kube-prometheus-exporter-kube-state                1h
monitoring   kube-prometheus-exporter-kubelets                  1h
monitoring   kube-prometheus-exporter-kubernetes                1h
monitoring   kube-prometheus-exporter-node                      1h
monitoring   kube-prometheus-grafana                            1h
monitoring   prometheus-operator                                1h

Service resources

kubectl get service --all-namespaces -l release=kube-prometheus -o=custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name
NAMESPACE     NAME
kube-system   kube-prometheus-exporter-kube-controller-manager
kube-system   kube-prometheus-exporter-kube-dns
kube-system   kube-prometheus-exporter-kube-etcd
kube-system   kube-prometheus-exporter-kube-scheduler
monitoring    kube-prometheus
monitoring    kube-prometheus-alertmanager
monitoring    kube-prometheus-exporter-kube-state
monitoring    kube-prometheus-exporter-node
monitoring    kube-prometheus-grafana

By installing kube-prometheus you got one Prometheus server (a Prometheus resource) and a set of ServiceMonitor resources that allow you to monitor the cluster itself. Installing kube-prometheus also gave you a Grafana instance that is connected to the Prometheus server and has a set of dashboards pre-configured.

You also have a number of services called kube-prometheus-exporter-* and corresponding ServiceMonitors. These services expose pods that make metrics from your nodes and other Kubernetes components available to Prometheus. Let’s, for example, have a look at kube-prometheus-exporter-node.

There is a service called kube-prometheus-exporter-node:

kubectl get service --all-namespaces -l component=node-exporter -o=custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name
NAMESPACE    NAME
monitoring   kube-prometheus-exporter-node

This service provides access to a set of pods named kube-prometheus-exporter-node-*:

kubectl get pod --all-namespaces -l component=node-exporter -o=custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name
NAMESPACE    NAME
monitoring   kube-prometheus-exporter-node-b7v4g
monitoring   kube-prometheus-exporter-node-xz9xw
monitoring   kube-prometheus-exporter-node-zxl64

These pods are deployed as a DaemonSet, which means that each node in the cluster will have an instance of this pod:

kubectl get daemonset --all-namespaces  -l component=node-exporter -o=custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name
NAMESPACE    NAME
monitoring   kube-prometheus-exporter-node

A ServiceMonitor is also deployed that is configured to monitor the metrics port of services in the monitoring namespace that have the labels app=exporter-node and component=node-exporter:

kubectl describe servicemonitor kube-prometheus-exporter-node --namespace monitoring
Name:         kube-prometheus-exporter-node
Namespace:    monitoring
Labels:       app=exporter-node
              chart=exporter-node-0.3.2
              component=node-exporter
              heritage=Tiller
              prometheus=kube-prometheus
              release=kube-prometheus
...
Spec:
  Endpoints:
    Interval:  15s
    Port:      metrics
  Job Label:   component
  Namespace Selector:
    Match Names:
      monitoring
  Selector:
    Match Labels:
      App:        exporter-node
      Component:  node-exporter
Events: <none>
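Rendered as YAML (reconstructed here from the describe output above), the same ServiceMonitor shows how its spec maps to the fields discussed earlier:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-prometheus-exporter-node
  namespace: monitoring
spec:
  jobLabel: component
  endpoints:
  - port: metrics          # scrape the service port named "metrics"
    interval: 15s          # every 15 seconds
  namespaceSelector:
    matchNames:
    - monitoring
  selector:
    matchLabels:           # match services with both of these labels
      app: exporter-node
      component: node-exporter
```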

The kube-prometheus-exporter-node service has these labels, so its metrics port will be monitored by Prometheus:

kubectl describe service kube-prometheus-exporter-node --namespace monitoring
Name:              kube-prometheus-exporter-node
Namespace:         monitoring
Labels:            app=exporter-node
                   chart=exporter-node-0.3.2
                   component=node-exporter
                   heritage=Tiller
                   release=kube-prometheus
Annotations:       <none>
Selector:          app=kube-prometheus-exporter-node,component=node-exporter,release=kube-prometheus
Type:              ClusterIP
IP:                10.0.120.251
Port:              metrics 9100/TCP
TargetPort:        metrics/TCP
Endpoints:         10.240.0.4:9100,10.240.0.5:9100,10.240.0.6:9100
Session Affinity:  None
Events:            <none>

This allows Prometheus server to scrape metrics from each node:

kube-prometheus-exporter-node overview

Viewing the metrics

To access Prometheus we have to connect to the pod that is running the Prometheus server in our cluster. We can do this by using kubectl port-forward to forward a port on localhost to a specific pod in our cluster. Run the command below:

kubectl --namespace monitoring port-forward $(kubectl get pod --namespace monitoring -l prometheus=kube-prometheus -l app=prometheus -o template --template "{{(index .items 0).metadata.name}}") 9090:9090

Open http://localhost:9090/targets in your browser. Here you can see which endpoints Prometheus is scraping metrics from. If we take a closer look at the node-exporter target, we see Prometheus is scraping the /metrics endpoint on port 9100 on each of our three nodes.

node-exporter endpoints scraped by Prometheus
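The exporters serve their metrics in the Prometheus text exposition format. A fragment of what node-exporter returns on /metrics typically looks like this (metric names are real node-exporter metrics of that era; the values are illustrative):

```
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.52
# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="idle"} 362812.7
```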

Grafana

Kube-prometheus includes Grafana. To view it, you first need to get the username and password by running these commands in a bash prompt:

echo username:$(kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.user}"|base64 --decode;echo)
echo password:$(kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.password}"|base64 --decode;echo)

Next run the command below to forward port 3000 to the pod that hosts Grafana:

kubectl --namespace monitoring port-forward $(kubectl get pod --namespace monitoring -l app=kube-prometheus-grafana -o template --template "{{(index .items 0).metadata.name}}") 3000:3000

Open http://localhost:3000/login and use the username and password you retrieved. After you are logged in, click the dashboards drop down in the top left of the page:

dashboard dropdown

And select the Nodes dashboard:

Nodes dashboard

On the top left of the dashboard you can select the server for which you want to view the metrics. The server is the node in your cluster that the metrics are scraped from.

Each of the ServiceMonitors installed as part of kube-prometheus points Prometheus at specific metrics to scrape. Grafana comes with a set of default dashboards that use these metrics to show graphs.

Now that you know what happened when you installed the Prometheus operator and kube-prometheus using Helm, and how the operator and Custom Resource Definitions allow you to monitor Kubernetes itself, you can start customizing and expanding Prometheus with your own ServiceMonitors.
