John Patterson (@cantrobot) and Chris Lunsford run This End Out, an operations and infrastructure services company. You can find them online at https://www.thisendout.com and follow them on Twitter @thisendout.

Update: All four parts of the series are now live. You can find them here:
Part 1: Getting started with CI/CD and Docker
Part 2: Moving to Compose blueprints
Part 3: Adding Rancher for Orchestration
Part 4: Completing the Cycle with Service Discovery

This post is the first in a series in which we'd like to share the story of how we implemented a container deployment workflow using Docker, Docker Compose, and Rancher. Instead of just giving you the polished retrospective, though, we want to walk you through the evolution of the pipeline from the beginning, highlighting the pain points and the decisions that were made along the way.

Thankfully, there are many great resources to help you set up a continuous integration and deployment workflow with Docker. This is not one of them! A simple deployment workflow is relatively easy to set up. Our own experience, though, has been that building a deployment system is complicated mostly because the easy parts must be done alongside a legacy environment, with many dependencies, and while changing your dev team and ops organization to support the new processes. Hopefully, our experience of building our pipeline the hard way will help you with the hard parts of building yours.

In this first post, we'll go back to the beginning and look at the initial workflow we developed using just Docker. In future posts, we'll progress through the introduction of Docker Compose and eventually Rancher into our workflow.

To set the stage, the following events all took place at a Software-as-a-Service provider where we worked on a long-term services engagement. For the purpose of this post, we'll call the company Acme Business Company, Inc., or ABC. This project started while ABC was in the early stages of migrating its mostly-Java microservices stack from on-premise bare metal servers to Docker deployments running in Amazon Web Services (AWS). The goals of the project were not unique: lower lead times on features and better reliability of deployed services.

The plan to get there was to make software deployment look something like this: the process starts with code being changed, committed, and pushed to a git repository. This notifies our CI system to run unit tests and, if they pass, compile the code and store the result as an artifact. If that step succeeds, another job builds a Docker image with the code artifact and pushes it to a private Docker registry. Finally, we trigger a deployment of the new image to an environment. The necessary ingredients are these:
When viewed this way, the process is deceptively simple. The reality on the ground, though, is a bit more complicated. As at many other companies, there was (and still is) an organizational divide between development and operations. When code is ready for deployment, a ticket is created with the details of the application and the target environment. The ticket is assigned to operations and scheduled for execution during the weekly deployment window. Already, the path to continuous deployment and delivery is not exactly clear. In the beginning, the deployment ticket might have looked something like this:
DEPLOY-111: App: JavaService1, branch "release/1.0.1" Environment: Production
The deployment process:
docker pull registry.abc.net/javaservice1:release-1.0.1
docker ps
docker stop [container_id]
docker run -d -p 8080:8080 … registry.abc.net/javaservice1:release-1.0.1
curl localhost:8080/api/v1/version
Admittedly, this deployment process isn’t very impressive, but it’s a great first step towards continuous deployment. There are plenty of places to improve, but consider the benefits:
Alright, on to the pain points:
Given this state of affairs, we started to implement changes to address the pain points. The first advancement was to write a bash script wrapping the common steps for a deployment. A simple wrapper might look something like this:
#!/bin/bash
APPLICATION=$1
VERSION=$2

docker pull "registry.abc.net/${APPLICATION}:${VERSION}"
docker rm -f $APPLICATION
docker run -d --name "${APPLICATION}" "registry.abc.net/${APPLICATION}:${VERSION}"
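Assuming the wrapper lives on the host as something like deploy.sh (the filename is ours, purely for illustration), the ticket above collapses to a single command:

./deploy.sh javaservice1 release-1.0.1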
This works, but only for the simplest of containers: the kind that users don’t need to connect to. In order to enable host port mapping and volume mounts, we need to add application-specific logic. Here’s the brute force solution that was implemented:
#!/bin/bash
APPLICATION=$1
VERSION=$2

case "$APPLICATION" in
  java-service-1)
    EXTRA_ARGS="-p 8080:8080";;
  java-service-2)
    EXTRA_ARGS="-p 8888:8888 --privileged";;
  *)
    EXTRA_ARGS="";;
esac

docker pull "registry.abc.net/${APPLICATION}:${VERSION}"
# remove any existing container so the name can be reused
docker rm -f $APPLICATION
docker run -d --name "${APPLICATION}" $EXTRA_ARGS "registry.abc.net/${APPLICATION}:${VERSION}"
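Volume mounts follow the same pattern; a service that needs persistent storage on the host simply picks up another entry in the case statement. The service name, port, and paths below are hypothetical, purely to show the shape of the change:

case "$APPLICATION" in
  java-service-3)
    # hypothetical service that keeps persistent data on the host
    EXTRA_ARGS="-p 9090:9090 -v /data/java-service-3:/var/lib/data";;
  *)
    EXTRA_ARGS="";;
esac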
This script was installed on every Docker host to facilitate deployments. The ops engineer would log in, pass the necessary parameters, and the script would do the rest. Deployments were simpler because there was less for the engineer to do.

The problem of encoding the deployment logic didn't go away, though. We moved it back in time and turned it into a problem of committing changes to a common script and distributing those changes to hosts. In general, this is a great trade. Committing to a repo gives you benefits like code review, testing, change history, and repeatability. The less you have to think about at crucial times, the better.

Ideally, the relevant deployment details for an application would live in the same source repo as the application itself. There are many reasons why this may not be the case, not the least of which being that developers may object to having "ops" stuff in their Java repo. This is especially true for something like a deployment bash script, but also pertains to the Dockerfile itself. It comes down to a cultural issue and is worth working through, if at all possible. Although it's certainly doable to maintain separate repositories for your deployment code, you'll have to spend extra energy keeping the two in sync. But, of course, this is an article about doing it the hard way.

At ABC, the Dockerfiles started life in a dedicated repository with one folder per project, and the deploy script lived in its own repo. The Dockerfiles repository had a working copy checked out at a well-known location on the Jenkins host (say, '/opt/abc/dockerfiles'). In order to build the Docker image for an application, Jenkins would first check for a Dockerfile in a local 'docker' folder in the application repo. If not present, Jenkins would search the Dockerfiles path, copying over the Dockerfile and accompanying scripts before running 'docker build'. Since the Dockerfiles are always at master, it's possible to find yourself in a situation where the Dockerfile is ahead of (or behind) the application configuration, but in practice this mostly just works. Here's an excerpt from the Jenkins build logic:
if [ -f docker/Dockerfile ]; then
  docker_dir=docker
elif [ -f /opt/abc/dockerfiles/$APPLICATION/Dockerfile ]; then
  docker_dir=/opt/abc/dockerfiles/$APPLICATION
else
  echo "No docker files. Can't continue!"
  exit 1
fi

docker build -t $APPLICATION:$VERSION $docker_dir
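The excerpt only produces a locally tagged image; before anything can be deployed, the same job also has to retag the result for the private registry and push it. A sketch of that step, simplified from whatever the actual job wiring looked like:

docker tag "$APPLICATION:$VERSION" "registry.abc.net/$APPLICATION:$VERSION"
docker push "registry.abc.net/$APPLICATION:$VERSION"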
Over time, Dockerfiles and supporting scripts were migrated into the application source repositories. Since Jenkins was already looking in the local repo first, no changes were required to the build pipeline. After migrating the first service, the repo layout looked roughly like this:

One problem we ran into with having a separate repo was getting Jenkins to trigger a rebuild of an application if either the application source or its packaging logic changed. Since the 'dockerfiles' repo contained code for many projects, we didn't want every change to trigger every build. The solution: a well-hidden option in the Jenkins Git plugin called Included Regions. When configured, Jenkins isolates the build trigger to changes in a specific sub-directory of the repository. This let us keep all Dockerfiles in a single repository and still trigger only the builds affected by a change, instead of rebuilding every image whenever anything in the repo changed.

Another aspect of the initial workflow was that the deploy engineer had to force a build of the application image before deployment. This resulted in extra delays, especially if there was a problem with the build and the developer needed to be engaged. To reduce this delay, and pave the way to more continuous deployment, we started building Docker images on every commit to a well-known branch. This required that every image have a unique version identifier, which was not always the case if we relied solely on the official application version string. We ended up using a combination of the official version string, commit count, and commit sha:
commit_count=$(git rev-list --count HEAD)
commit_short=$(git rev-parse --short HEAD)
version_string="${version}-${commit_count}-${commit_short}"
This resulted in a version string that looks like '1.0.1-22-7e56158'. Before we end our discussion of the Dockerfile phase of our pipeline, there are a few docker run parameters worth mentioning: restart policies and resource limits. We had little use for them before we were operating a large number of containers in production, but they have proven helpful in maintaining the uptime of our Docker cluster.
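A minimal sketch of what those flags look like when added to the wrapper's docker run call, with illustrative values rather than ABC's actual production settings: --restart brings a crashed container back automatically (up to five attempts here), --memory caps how much RAM it can consume, and --cpu-shares weights its CPU time against its neighbors.

# defensive variant of the wrapper's docker run; the numbers are illustrative
docker run -d --name "${APPLICATION}" \
  --restart=on-failure:5 \
  --memory=512m \
  --cpu-shares=512 \
  $EXTRA_ARGS \
  "registry.abc.net/${APPLICATION}:${VERSION}"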
Combining restart policies and resource limits gives you greater cluster stability while minimizing the impact of any single failure and improving time to recovery. In practice, this type of safeguard gives you the time to work with the developer on the root cause, instead of being tied up fighting a growing fire.

To summarize, we started with a rudimentary build pipeline that created tagged Docker images from our source repo. We went from deploying containers using the Docker CLI to deploying them using scripts and parameters defined in code. We also looked at how we organized our deployment code, and highlighted a few Docker parameters to help Ops keep the services up and running.

At this point, we still had a gap between our build pipeline and our deployment steps. The deployment engineer was bridging that gap by logging into a server to run the deployment script. Although an improvement over where we started, there was still room for a more automated approach. All of the deployment logic was centralized in a single script, which made testing locally much more difficult for developers, who needed to install the script and muddle through its complexity. At this point, any environment-specific information handled by way of environment variables was also contained in the deployment script. Tracking down which environment variables were set for a service and adding new ones was tedious and error-prone.

In the next post, we take a look at how we addressed these pain points by deconstructing the common wrapper script and bringing the deployment logic closer to the application using Docker Compose. Go to Part 2 >>

Please also download your free copy of "Continuous Integration and Deployment with Docker and Rancher," a detailed eBook that walks through leveraging containers throughout your CI/CD process.