
Eduardo Trujillo

Metrics in Grafana

I’ve been playing with Deis, a Docker orchestration platform, on AWS for the past few days, and I already have some services running on it. I have found it very useful because it automates many processes that would be a pain to set up manually, such as building Docker images, setting up load balancing across containers and servers, and scaling different parts of an application.

However, one key missing piece is monitoring. Deis’ documentation points you to cAdvisor and Heapster, two applications built by Google to collect and store statistics.

cAdvisor is very simple to set up. All you need to do is load the service file using fleetctl. Once it’s running, you should be able to see stats about the containers running on each server. This is nice, but it does not give you a complete picture, because each instance only shows stats for the host it is running on. If you have 5 servers in your cluster, you need to open 5 tabs to see the stats of each server.
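
To make that limitation concrete, here is a minimal Python sketch of what a cluster-wide view has to do: take the stats each host reports separately and merge them into a single table. The host addresses, container names, and the `cpu` and `memory_mb` fields are made up for illustration; they are not cAdvisor's actual response format.

```python
from typing import Dict, List


def merge_host_stats(per_host: Dict[str, List[dict]]) -> List[dict]:
    """Flatten per-host container stats into one list, tagging each row with its host."""
    merged = []
    for host, containers in sorted(per_host.items()):
        for stats in containers:
            # Build a new dict so the caller's data is not mutated.
            merged.append({"host": host, **stats})
    return merged


# Hypothetical sample data: what you would otherwise read from separate tabs.
sample = {
    "10.0.0.2": [{"container": "web.1", "cpu": 0.40, "memory_mb": 210}],
    "10.0.0.1": [
        {"container": "web.2", "cpu": 0.10, "memory_mb": 180},
        {"container": "worker.1", "cpu": 0.75, "memory_mb": 512},
    ],
}

if __name__ == "__main__":
    for row in merge_host_stats(sample):
        print(row["host"], row["container"], row["cpu"], row["memory_mb"])
```

Doing this merge by hand for every host is exactly the work that a collector such as Heapster takes over.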

Heapster is the next step. It connects to instances of cAdvisor, collects metrics, and then pushes them to a sink, such as an InfluxDB server, every 10 seconds.

Heapster is built with Kubernetes in mind, but luckily it also supports CoreOS, which is what Deis runs on. It uses fleet to discover the other nodes in the cluster and connects to the cAdvisor container running on each of them.
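
The flow described above (discover nodes, poll each node's cAdvisor, push the batch to a sink on a fixed interval) can be sketched roughly as follows. This is not Heapster's code, only an illustration of the loop: `discover`, `fetch`, and `push` are hypothetical stand-ins for fleet discovery, the cAdvisor HTTP call, and the InfluxDB sink, and the port is the one published by the cAdvisor unit in this post.

```python
import time
from typing import Callable, Dict, Iterable, List

CADVISOR_PORT = 4194  # port published by the cAdvisor unit in this post


def cadvisor_urls(node_ips: Iterable[str]) -> List[str]:
    """Build one cAdvisor base URL per discovered node."""
    return [f"http://{ip}:{CADVISOR_PORT}" for ip in node_ips]


def run_collector(
    discover: Callable[[], Iterable[str]],
    fetch: Callable[[str], Dict],
    push: Callable[[List[Dict]], None],
    interval: float = 10.0,
    rounds: int = 1,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll every node's cAdvisor and push the batch to a sink, once per round."""
    for i in range(rounds):
        # Re-run discovery each round so nodes that join or leave are picked up.
        batch = [fetch(url) for url in cadvisor_urls(discover())]
        push(batch)
        if i < rounds - 1:
            sleep(interval)


if __name__ == "__main__":
    # Stand-ins only: fixed node list, fake stats, and print() as the sink.
    run_collector(
        discover=lambda: ["10.0.0.1", "10.0.0.2"],
        fetch=lambda url: {"source": url, "cpu": 0.5},
        push=print,
        interval=0.0,
        rounds=2,
    )
```

Passing `sleep` in as a parameter keeps the loop easy to exercise without actually waiting 10 seconds between rounds.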

In the past, I had managed to get this working, but only for a few minutes. There seemed to be a bug in Heapster that caused it to stop flushing data to InfluxDB after a while. However, I recently tried the newest version (v0.10.0) and the problem seems to be fixed.

Below I’m including the two unit files that I’m using per CoreOS cluster to get this service working. With a few modifications, you should be able to get it running on a Deis cluster or any generic CoreOS cluster.


Note: The following services do not set up InfluxDB or Grafana. You will need to install them on a server or run them in containers, and create a database for Heapster to write to. You can set up the Grafana dashboard by importing this template.
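
As a rough illustration of the "create a database" step, the sketch below builds the HTTP request for it in Python. It assumes the InfluxDB 0.8-style HTTP API (a `POST` to `/db` with a JSON body and `u`/`p` query parameters); newer InfluxDB releases use a different query-based API, so check the documentation for the version you run. The host and credentials are placeholders, and `create_db_request` is a helper invented for this example.

```python
import json
import urllib.parse
import urllib.request


def create_db_request(host: str, name: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a database-creation request.

    Assumes the InfluxDB 0.8-style HTTP API: POST /db with a JSON body.
    """
    query = urllib.parse.urlencode({"u": user, "p": password})
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}/db?{query}", data=body, method="POST"
    )


if __name__ == "__main__":
    # Placeholder host and credentials; "k8s" matches INFLUXDB_NAME in the Heapster unit.
    req = create_db_request("influxdb.example.com:8086", "k8s", "root", "root")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

The request is only constructed here, not sent, so you can inspect the URL and body before pointing it at a real server.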


The cAdvisor unit (cadvisor.service) needs no modifications.

[Unit]
Description=cAdvisor Service

[Service]
ExecStartPre=-/usr/bin/docker kill cadvisor
ExecStartPre=-/usr/bin/docker rm -f cadvisor
ExecStartPre=/usr/bin/docker pull google/cadvisor
ExecStart=/usr/bin/docker run --volume=/:/rootfs:ro \
    --volume=/var/run:/var/run:rw \
    --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:ro \
    --publish=4194:4194 --name=cadvisor \
    --net=host google/cadvisor:latest \
    --logtostderr
ExecStop=/usr/bin/docker stop -t 2 cadvisor

# Run one instance on every machine in the cluster.
[X-Fleet]
Global=true

For the Heapster unit, modify the environment variables to match your setup. Also note that this is not a global service: it is intended to run on only one server, otherwise you would get duplicate stats.

[Unit]
After=docker.service cadvisor.service
Requires=docker.service cadvisor.service

[Service]
# Provides COREOS_PRIVATE_IPV4 on CoreOS hosts.
EnvironmentFile=/etc/environment
# All values except INFLUXDB_NAME are placeholders: replace them to match your setup.
Environment="INFLUXDB_HOST=influxdb.example.com:8086" \
    "INFLUXDB_NAME=k8s" \
    "INFLUXDB_USERNAME=CHANGE_ME" \
    "INFLUXDB_PASSWORD=CHANGE_ME"

ExecStartPre=-/usr/bin/docker kill heapster1
ExecStartPre=-/usr/bin/docker rm heapster1
ExecStartPre=/usr/bin/docker pull kubernetes/heapster:v0.10.0
ExecStart=/usr/bin/docker run --name heapster1 kubernetes/heapster:v0.10.0 \
    /usr/bin/heapster \
    --sink influxdb \
    --sink_influxdb_host=${INFLUXDB_HOST} \
    --sink_influxdb_name=${INFLUXDB_NAME} \
    --sink_influxdb_username=${INFLUXDB_USERNAME} \
    --sink_influxdb_password=${INFLUXDB_PASSWORD} \
    --coreos \
    --fleet_endpoints=http://${COREOS_PRIVATE_IPV4}:4001
ExecStop=/usr/bin/docker stop heapster1


Copyright © 2015-2021 - Eduardo Trujillo
Except where otherwise noted, content on this site is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.