From Cattle to K8s - Application Healthchecks in Rancher 2.0



When your application is user-facing, ensuring continuous availability and minimal downtime is a challenge, so monitoring the health of the application is essential for avoiding outages.

Healthchecks in Rancher 1.6

In Rancher 1.6, Cattle provides the ability to add HTTP or TCP healthchecks to deployed services. Healthcheck support comes from Rancher’s own healthcheck microservice; you can read more about it here.

In brief, a Cattle user can add a TCP healthcheck to a service. Rancher’s healthcheck containers, which are launched on a different host, test whether a TCP connection can be opened at the specified port for the service containers. Note that as of the latest release (v1.6.20), healthcheck containers are also scheduled on the same host as the service containers, in addition to other hosts.

HTTP healthchecks can also be added while deploying services. You can ask Rancher to make an HTTP request at a specified path and specify what response is expected.

These healthchecks run periodically at a configurable interval, and retries and timeouts are also configurable. You can also instruct Rancher whether and when to recreate a container that fails its healthcheck.

Consider a service running an Nginx image on Cattle, with an HTTP healthcheck configured as below.

[Screenshot: HTTP healthcheck configured for an Nginx service in the Rancher 1.6 UI]

The healthcheck parameters appear in the rancher-compose.yml file and not the docker-compose.yml because healthcheck functionality is implemented by Rancher.

[Screenshot: the corresponding healthcheck section in rancher-compose.yml]
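Since the screenshot is not reproduced here, below is a minimal sketch of what such a rancher-compose.yml healthcheck block can look like. The service name and all values are illustrative; the field names follow the Rancher 1.6 compose schema.

```yaml
version: '2'
services:
  nginx:
    scale: 1
    health_check:
      # HTTP healthcheck: GET /index.html on port 80
      port: 80
      request_line: GET "/index.html" "HTTP/1.0"
      interval: 2000              # milliseconds between checks
      response_timeout: 2000      # milliseconds to wait for a response
      healthy_threshold: 2        # consecutive successes before marked healthy
      unhealthy_threshold: 3      # consecutive failures before marked unhealthy
      initializing_timeout: 60000 # grace period while the container starts
      strategy: recreate          # recreate the container on failure
```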

Let’s see if we can configure corresponding healthchecks in Rancher 2.0.

Healthchecks in Rancher 2.0

In 2.0, Rancher uses the native Kubernetes healthcheck mechanisms: livenessProbe and readinessProbe.

As documented here, probes are diagnostics performed periodically by the Kubelet on a container. In Rancher 2.0, healthchecks are performed by the locally running Kubelet, in contrast to the cross-host healthchecks in Rancher 1.6.

A Quick Kubernetes Healthcheck Summary

  • livenessProbe

    A livenessProbe is an action performed on a container to check whether the container is running. If the probe reports failure, Kubernetes kills the container, and it is restarted per the restart policy specified in the pod spec.

  • readinessProbe

    A readinessProbe is used to check whether a container is ready to accept and serve requests. When a readinessProbe fails, the pod is removed from the service endpoints so that no requests are routed to the container.

    If your workload is busy doing some startup routine before it can serve requests, it is a good idea to configure a readinessProbe for the workload.

The following types of livenessProbe and readinessProbe can be configured for Kubernetes workloads:

  • tcpSocket - the Kubelet checks if TCP connections can be opened against the container’s IP address on a specified port.
  • httpGet - an HTTP/HTTPS GET request is made at the specified path and reported successful if it returns an HTTP status code of at least 200 and less than 400.
  • exec - the Kubelet executes a specified command inside the container and checks whether the command exits with status 0 (see the sketch following this list).

More configuration details for the above probes can be found here.
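The tcpSocket and httpGet variants appear in the Rancher examples later in this post, so here is a minimal, hypothetical sketch of the exec variant only. The pod name, image, command, and timings are all illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: exec-probe-demo
spec:
  containers:
    - name: web
      image: nginx
      livenessProbe:
        exec:
          # The probe succeeds as long as this command exits with status 0
          command:
            - cat
            - /usr/share/nginx/html/index.html
        initialDelaySeconds: 10
        periodSeconds: 5
```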

Configuring Healthchecks in Rancher 2.0

Via the Rancher UI, users can add TCP or HTTP healthchecks to Kubernetes workloads. By default, Rancher asks you to configure a readinessProbe for the workload and applies a livenessProbe using the same configuration; you can also choose to define a separate livenessProbe.

If the healthchecks fail, the container is restarted per the restartPolicy defined in the workload specs. This is equivalent to the strategy parameter in rancher-compose.yml files for 1.6 services using healthchecks in Cattle.

TCP Healthcheck

While deploying a workload in Rancher 2.0, users can configure TCP healthchecks to check if a TCP connection can be opened at a specific port.

[Screenshot: TCP healthcheck configuration while deploying a workload in the Rancher 2.0 UI]

Here are the Kubernetes YAML specs showing the TCP readinessProbe configured for the Nginx workload as shown above. Rancher also adds a livenessProbe to your workload using the same config.

[Screenshot: Kubernetes YAML with the generated TCP readinessProbe and livenessProbe]
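Since the screenshot is not reproduced here, the fragment below is a hand-written approximation of that kind of spec, not Rancher’s exact output; the port, timings, and thresholds are illustrative. The comments map each field back to its 1.6 counterpart.

```yaml
containers:
  - name: nginx
    image: nginx
    readinessProbe:
      tcpSocket:
        port: 80                # port in 1.6
      initialDelaySeconds: 10   # initializing_timeout
      periodSeconds: 2          # interval
      timeoutSeconds: 2         # response_timeout
      successThreshold: 2       # healthy_threshold
      failureThreshold: 3       # unhealthy_threshold
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 2
      timeoutSeconds: 2
      successThreshold: 1       # Kubernetes requires 1 for a livenessProbe
      failureThreshold: 3
```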

Healthcheck parameters from 1.6 to 2.0:

  • port maps to tcpSocket.port
  • response_timeout maps to timeoutSeconds
  • healthy_threshold maps to successThreshold
  • unhealthy_threshold maps to failureThreshold
  • interval maps to periodSeconds
  • initializing_timeout maps to initialDelaySeconds
  • strategy maps to restartPolicy

HTTP Healthcheck

You can also configure an HTTP healthcheck by providing a path in the pod container at which the Kubelet will make HTTP/HTTPS GET requests. However, Kubernetes only supports HTTP/HTTPS GET requests, whereas healthchecks in Rancher 1.6 supported any HTTP method.

[Screenshot: HTTP healthcheck configuration in the Rancher 2.0 UI]

Here are the Kubernetes YAML specs showing the HTTP readinessProbe and livenessProbe configured for the Nginx workload as shown above.

[Screenshot: Kubernetes YAML with the generated HTTP readinessProbe and livenessProbe]
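Again as a hand-written approximation rather than Rancher’s exact output (the path, port, and values are illustrative):

```yaml
containers:
  - name: nginx
    image: nginx
    readinessProbe:
      httpGet:
        path: /index.html
        port: 80
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 2
      timeoutSeconds: 2
      successThreshold: 2
      failureThreshold: 3
    livenessProbe:
      httpGet:
        path: /index.html
        port: 80
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 2
      timeoutSeconds: 2
      successThreshold: 1       # Kubernetes requires 1 for a livenessProbe
      failureThreshold: 3
```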

Healthcheck in Action

Now let’s see what happens when a healthcheck fails and how the workload recovers in Kubernetes.

Consider the above HTTP healthcheck on our Nginx workload doing an HTTP GET on the /index.html path. To make the healthcheck fail, I opened a shell in the pod container using the Execute Shell UI option in Rancher.

[Screenshot: the Execute Shell option in the Rancher 2.0 UI]

Once inside the container, I moved the index.html file that the healthcheck GETs.

[Screenshot: shell session moving index.html inside the container]

The readinessProbe and livenessProbe checks failed, and the workload status changed to unavailable.

[Screenshot: the workload marked unavailable in the Rancher UI]

Kubernetes soon killed and recreated the pod, and the workload came back up, since the restartPolicy was set to Always.

Using kubectl, for example kubectl describe pod <pod-name>, you can see these healthcheck events.

[Screenshots: kubectl output showing the readinessProbe and livenessProbe failure events]

As a quick tip, the Rancher 2.0 UI provides the helpful option to Launch Kubectl from the Kubernetes Cluster view, where you can run native Kubernetes commands on the cluster objects.

Can We Migrate Healthchecks from Docker Compose to Kubernetes YAML?

Rancher 1.6 provided healthchecks via its own microservice, which is why the healthcheck parameters a Cattle user adds to a service appear in the rancher-compose.yml file and not in the docker-compose.yml config file. The Kompose tool we used earlier in this blog series works on standard docker-compose.yml parameters and therefore cannot parse the Rancher healthcheck constructs. So, as of now, we cannot use this tool to convert Rancher healthchecks from compose config to Kubernetes YAML.

Conclusion

As seen in this blog post, the configuration parameters available for adding TCP or HTTP healthchecks in Rancher 2.0 are very similar to those in Rancher 1.6. The healthcheck config used by Cattle services can be carried over to 2.0 without any loss of functionality.

In the upcoming article, I plan to explore how to map scheduling options that Cattle supports to Kubernetes in Rancher 2.0. Stay tuned!

Prachi Damle
Principal Software Engineer