Creating Monitored Environments with Docker Compose


I have already talked about several ways to monitor Docker containers, and also about using Prometheus to monitor Rancher deployments. However, until now it has been a manual process of launching monitoring agents on our various hosts. With the release of the Rancher beta with scheduling and support for Docker Compose, we can begin to make monitoring a lot more automated. In today's post we will look at using Rancher's new "Rancher Compose" tool to bring up our deployment with a single command, using scheduling to make sure we have a monitoring agent running on every host, and using labels to isolate and present our metrics. I will again be using Prometheus to present our container metrics and the container-exporter agent to capture metrics on our Docker hosts. If you have not already, please take a look at the earlier articles (here and here) describing how to set up and use Prometheus. I am writing this article assuming you already have a Rancher environment stood up.

Installing Rancher Compose


To install Rancher Compose we just need to download the binary from the Rancher Server UI. Browse to the Services tab in the Rancher Server UI and select the Rancher Compose CLI button in the top right corner of the screen, then select your OS to download the respective binary. Next, add the binary to your path by copying (or soft-linking) it to a directory in your path. You can then verify that it's working by running rancher-compose with the -h switch. On any *nix operating system you can add the rancher-compose binary to your path as follows:

ln -s /DOWNLOAD_LOCATION/rancher-compose-v0.1.2/rancher-compose /usr/bin
rancher-compose -h

Creating a Sample Project Template

Now that we have Rancher Compose installed, we need to create a set of templates for it to use in launching our deployment. We need two template files: docker-compose.yml to define the images and parameters required to bring up the containers which comprise our services, and rancher-compose.yml to define how our containers are orchestrated into services.

To test our monitoring out, I will be creating a sample project that comprises three services: database, cache and web. We will use the MySQL image for our database service, memcached for our cache service and nginx for our web service. In front of the web service we will place a load balancer to spread traffic across our web containers. This is not meant to be a functional application, and the specific images are not important, as we are concerned with monitoring and automation.

Docker Compose

The first step in defining our compose templates is to create the docker-compose file which defines the containers that go into our services. For this purpose, create a file called docker-compose.yml and add an element called database with the following settings.

database:
  image: mysql
  environment:
    MYSQL_ROOT_PASSWORD: **********
  stdin_open: true
  tty: true

The name of the root element will become the name of the service within your project. We specify that we wish to use the mysql image, and that we want the container to run with an interactive stdin (stdin_open) and a TTY attached (tty). The details of what each of these settings does can be found in the Docker Compose documentation. Lastly, as per the requirements of the MySQL image, we need to specify the MYSQL_ROOT_PASSWORD environment variable.

Next we will add the details of our caching service as follows. We specify that we will use the memcached container image for caching. The memcached container does not require much configuration beyond the image name and the standard shell directives.

cache:
  image: memcached
  stdin_open: true
  tty: true

Now we can define our web service, which consists of nginx containers and links to the database and cache services we defined earlier. Note that we label our web service with the key foo and value bar. We can use such arbitrary custom labels later to split our metrics into logical groups.

web:
  image: nginx
  links:
  - database
  - cache
  labels:
    foo: 'bar'
  stdin_open: true
  tty: true

With our web service defined we can now create a load balancer to send traffic to the web service. We define that the load balancer will be accepting connections on the public host network on port 80. We also define that the load balancer will be linked to the service called web.

WebLB:
  image: rancher/load-balancer-service
  ports:
  - 80:80
  links:
  - web
  tty: true
  stdin_open: true

Rancher Compose

Now that we have the docker-compose template, we need to create the rancher-compose template to define how our services are orchestrated. The two simplest services are database and cache, as they do not use any Rancher-specific features. Create a new file called rancher-compose.yml and add the following to it. Here we define that the database and cache services will have three containers each.

database:
  scale: 3
cache:
  scale: 3

Next we will configure the web service, which takes the same scale parameter as the database and cache services. However, we also specify a health check for the web service to make sure our containers are responding to HTTP requests. The health check requests are sent to port 80 at the root URI every 2000 ms. If three health check requests in a row fail, the container is marked as unhealthy and eventually terminated. If two requests in a row succeed, the container is marked as healthy again.

web:
  scale: 3
  health_check:
    port: 80
    interval: 2000
    unhealthy_threshold: 3
    request_line: GET / HTTP/1.0
    healthy_threshold: 2
    response_timeout: 2000
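To make the threshold semantics concrete, here is a toy model in Python (illustrative only, not Rancher's actual implementation) of how consecutive failures and successes flip a container between healthy and unhealthy:

```python
def health_state(results, unhealthy_threshold=3, healthy_threshold=2):
    """Toy model of threshold-based health checking: walk a sequence of
    check results (True = check passed) and return the final state.
    Illustrative only, not Rancher's actual implementation."""
    state = "healthy"   # containers start out healthy
    streak = 0          # length of the current run of identical results
    last = None
    for ok in results:
        if ok is not last:          # run broken: start counting afresh
            streak = 0
            last = ok
        streak += 1
        if ok and streak >= healthy_threshold:
            state = "healthy"
        elif not ok and streak >= unhealthy_threshold:
            state = "unhealthy"
    return state

print(health_state([False, False, False]))              # → unhealthy
print(health_state([False, False, False, True, True]))  # → healthy
```

Note that isolated failures do not change the state; only an unbroken run of three failures (or, once unhealthy, two successes) does.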

With the web service configured we can now create a load balancer to send traffic to the web service. The load balancer takes the same scale parameter as the earlier services to define how many containers we need. In addition, it has a load_balancer_config setting which defines a health check used to determine which containers are able to serve traffic, and settings for session stickiness.

WebLB:
  scale: 3
  load_balancer_config:
    lb_cookie_stickiness_policy: null
    description: null
    name: WebLB config
    app_cookie_stickiness_policy: null
    health_check:
      port: null
      interval: 2000
      unhealthy_threshold: 3
      request_line: GET / HTTP/1.0
      healthy_threshold: 2
      response_timeout: 2000

The complete files are available at these links: docker-compose and rancher-compose.



Creating Rancher Project

Now that we have our docker-compose and rancher-compose templates ready, we can create our project with a single command. In order to do so, you first need to create an access key and secret key for the Rancher API. To create access keys, click the settings menu in the top right of your screen and select API & Keys in the settings section. In the resulting screen, click Add API Key to create a new key/secret pair. Optionally give your key a logical name to make it easier to manage multiple keys, and make sure you note the secret, as it will not be visible again.

Use the key pair you generated above, along with your Rancher server host name (or IP), in the following command to generate a complete project configuration with our three services and a load balancer. Note that docker-compose.yml and rancher-compose.yml must be in the current directory when you run this command.

rancher-compose \
    --url http://RANCHER_SERVER_HOST:8080/v1/ \
    --access-key EF3599AF954334AE18C1 \
    --secret-key GQu2U4oZYfCV1o5jnN93XNuD7KXGAq7H \
    --project-name PROJECT_NAME create

To see your project, browse to http://RANCHER_SERVER_HOST:8080/static/services/projects. Initially all the services will be inactive, as we have no hosts to launch the containers on. Once you launch hosts and start the services, you can click View Graph in the project section to see all your services. Note how the load balancer is linked to the web service, which in turn is linked to the cache and database services. These links define discoverability: from any linked service you can do a DNS lookup for the linked service name and get a set of DNS A records which return the private IPs of the containers in the service.
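A quick sketch of how a client could use this discoverability: resolve the linked service name and collect the A records. The helper below is hypothetical (the service name and IPs are made up), and the resolver is injectable so the sketch also runs outside a Rancher network:

```python
import socket

def service_ips(service_name, resolver=socket.gethostbyname_ex):
    """Return the private IPs behind a linked service name. Inside a
    linked container the default resolver would hit Rancher's embedded
    DNS; the resolver argument lets us substitute a stub elsewhere."""
    _, _, ips = resolver(service_name)
    return sorted(ips)

# Stub resolver standing in for Rancher's DNS (IPs are illustrative):
fake_dns = lambda name: (name, [], ["10.42.203.77", "10.42.171.12"])
print(service_ips("database", resolver=fake_dns))
# → ['10.42.171.12', '10.42.203.77']
```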


Creating Monitoring Service Template

Now that we have our test app built, we will launch a monitoring service which ensures that there are monitoring agents running on all hosts. We will use Rancher's global scheduling feature to make sure there is an instance of this container running on every host. To view our metrics we will launch a service with one container running the Prometheus server.

docker-compose.yml

To configure the monitoring service we will use the container-exporter image that I created for this purpose. We need to mount the cgroup and Docker socket files into the container. In addition, we need to specify the labels that the exporter should look for on Docker containers. We will configure the service to report the Rancher project name and Rancher service name labels, as well as the custom label foo. These labels are checked against all containers running on the host, and the value of each Docker label is added to the Prometheus metrics. We also add two Rancher labels to our service: io.rancher.scheduler.global specifies that we want one instance of this container on every node, and io.rancher.scheduler.affinity:host_label limits that to only those hosts which have been tagged with the monitored=true label. If you want a monitoring agent on all hosts, you can skip the second label. Note that the service must be called monitoring for service discovery by the Prometheus server; more on this shortly.

monitoring:
  image: usman/container-exporter:v1
  command: "-labels=io.rancher.project.name,io.rancher.project_service.name,io.rancher.container.system,foo"
  labels:
    io.rancher.scheduler.global: 'true'
    io.rancher.scheduler.affinity:host_label: monitored=true
  volumes:
  - /sys/fs/cgroup:/cgroup
  - /var/run/docker.sock:/var/run/docker.sock
  stdin_open: true
  tty: true

We can now configure the Prometheus server to pull metrics from the various container exporters. We do this using the container image that I defined for this purpose. We link the monitoring service into the Prometheus server, which allows us to query the embedded DNS server for the monitoring containers' IPs. Using these IPs we dynamically generate the Prometheus target configuration. This configuration is regenerated and loaded every 30 seconds, hence if you add or remove hosts they will be picked up by the Prometheus server.

prometheus:
  image: usman/prometheus
  ports:
  - 9090:9090/tcp
  links:
  - monitoring
  stdin_open: true
  tty: true
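For reference, the regenerated target configuration could look roughly like the sketch below. This is an assumption for illustration only: the exact file layout depends on the Prometheus version inside the image, and the job name, exporter port and IPs are made up.

```yaml
# Illustrative sketch of a generated Prometheus scrape configuration.
# The monitoring-container IPs come from the DNS lookup of the linked
# service; field names and the exporter port (9104) are assumptions.
scrape_configs:
  - job_name: 'container-exporter'
    scrape_interval: 30s
    static_configs:
      - targets:
          - '10.42.171.12:9104'
          - '10.42.203.77:9104'
```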

rancher-compose.yml

The rancher-compose.yml file for the monitoring services is fairly simple; we just need to define that we need only one container for each of the monitoring and prometheus services.

monitoring:
 scale: 1
prometheus:
 scale: 1


Creating Monitoring Services

As with the earlier project, we use rancher-compose to create the monitoring project. The graph view of the monitoring project with its two services is shown below. Note that this monitoring service and the associated compose templates can be used with any project, as there will be one monitoring container running on each host which monitors all running containers regardless of which service they belong to.

rancher-compose \
    --url http://RANCHER_SERVER_HOST:8080/v1/ \
    --access-key EF3599AF954334AE18C1 \
    --secret-key GQu2U4oZYfCV1o5jnN93XNuD7KXGAq7H \
    --project-name monitoring create


Launching Tagged Hosts

With our projects and their services defined, we now need to launch hosts so that we can start our service containers. To launch hosts, browse to Infrastructure > Hosts > Add Host and select one of the cloud providers, or Custom to register an existing host. Regardless of where you launch the host, make sure to add the label monitored=true. This ensures a monitoring agent is launched on the host. For example, the custom docker run command to register a host is shown below.

sudo docker run -e CATTLE_HOST_LABELS='monitored=true' -d --privileged -v /var/run/docker.sock:/var/run/docker.sock rancher/agent:v0.7.9 http://RANCHER_SERVER_HOST:8080/v1/scripts/456FC30268A2806931DA:1434506400000:mg7x0OvhK7wVjUyZ74hZAPcaNIo

Once you have a few hosts up, go back to the services page at http://RANCHER_SERVER_HOST:8080/static/services/projects and start all the services by clicking the respective links. The started services should scale to the required number of containers in a few minutes. Once you start the monitoring service, browse to Infrastructure > Hosts and note that there is exactly one instance of the monitoring container on each host. If you delete any of the monitoring containers, Rancher will automatically launch one to replace it.

Generating Prometheus Graphs

In a few moments, when all the services are up, you can click the Prometheus container to bring up its details page and look up its host IP. Browse to http://PROMETHEUS_HOST_IP:9090/ to bring up the Prometheus UI. Once there, click the Graph tab to generate graphs for your container metrics. You can now enter queries into the expression text box and click Execute to draw the graph. We can split results based on project, service, system service and also custom labels. Examples of the various queries are shown below.

To split the metrics into your various projects, use the io_rancher_project_name label. For example, the following query shows the CPU usage for all containers in the MyApp1 project.

container_cpu_usage_seconds_total{io_rancher_project_name="MyApp1"}


Similarly, to split metrics by service name you can use the io_rancher_project_service_name label. For example, the following query shows the maximum memory used by the three containers of the monitoring service.

container_memory_max_usage_bytes{io_rancher_project_service_name="MyApp1/Monitoring"}

You may also split data by system and user containers using the io_rancher_container_system label. An empty value for this label means the container is a user container. Hence we can use the query below to see the total number of user containers in our deployment.

count(container_last_seen{io_rancher_container_system=""})

Lastly, we can split metrics based on custom labels as well as the Rancher labels. If you remember, earlier we labeled our web service with the custom label foo=bar. We can now use that label to query metrics; for example, the query below shows us page faults in all containers tagged with the foo=bar label.

container_memory_paging_total{foo="bar",type="pgpgout"}
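All of the queries above share the same shape: a metric name plus a set of label matchers in braces. A tiny hypothetical helper makes that structure explicit (simple quoting only; it assumes equality matchers and label values without quotes):

```python
def promql_selector(metric, **labels):
    """Compose a PromQL instant-vector selector from equality label
    matchers. Hypothetical helper for illustration; labels are sorted
    so the output is deterministic."""
    matchers = ",".join('%s="%s"' % (k, v) for k, v in sorted(labels.items()))
    return "%s{%s}" % (metric, matchers)

print(promql_selector("container_memory_paging_total", foo="bar", type="pgpgout"))
# → container_memory_paging_total{foo="bar",type="pgpgout"}
```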

Conclusion

Today we defined an entire deployment using Docker Compose and Rancher Compose such that we can bring up a complete copy of the deployment with a single command. This is ideal for bringing up duplicate copies of our services for testing purposes, as well as for redundancy or updates. Furthermore, as part of the deployment we used global scheduling to ensure that we have an instance of our monitoring agent on all hosts. Finally, using the service discovery features of Rancher, we enabled the Prometheus server to dynamically discover monitoring agents and pull metrics from them. Using this we can create dashboards to present important information about our deployment, or create alerts if containers fail or experience load. (Details of how to create dashboards and alerts can be found in an earlier article.) Managing micro-service deployments is a complicated undertaking, as you must keep track of many different services with different life cycles, scaling profiles and availability requirements. In such a scenario detailed monitoring is essential, and the ability to programmatically define such deployments and automate the monitoring of their containers makes it significantly easier to set up detailed monitoring for your Docker-based deployments.

I hope this demo helps. To get started with Rancher, visit GitHub for info on setting up your own Rancher server, or register for the Rancher beta to get some help. You can also schedule a demo with one of our engineers.
