GlusterFS is a scalable, highly available, distributed network file system widely used by applications that need shared storage, including cloud computing, media streaming, content delivery networks, and web cluster solutions. High availability comes from data redundancy: if one node fails, another takes over without service interruption. In this post I'll show you how to create a GlusterFS cluster for Docker that you can use to store your containers' data. The storage volume is replicated across two nodes, so data remains accessible as long as at least one Gluster container is running. We'll use Rancher for Docker management and orchestration, and to test storage availability and reliability I'll deploy an Asteroids game.
Preparing the AWS environment

Before deploying the GlusterFS cluster you need to satisfy the following requirements in AWS:
Create a RancherOS instance (look for the RancherOS AMI in Community AMIs). Configure it to run Rancher Server by defining the following user data, and associate the instance with the Gluster Security Group. Once the instance is running you can browse to the Rancher UI: http://RANCHER_INSTANCE_PUBLIC_IP:8080/
#!/bin/bash
docker run -d -p 8080:8080 rancher/server:v0.17.1
I have prepared two Docker images that we will use later; here is how I built them.

The GlusterFS server image

This is the Dockerfile:
FROM ubuntu:14.04
MAINTAINER Manel Martinez <manel@nixelsolutions.com>

RUN apt-get update && \
    apt-get install -y python-software-properties software-properties-common
RUN add-apt-repository -y ppa:gluster/glusterfs-3.5 && \
    apt-get update && \
    apt-get install -y glusterfs-server supervisor

RUN mkdir -p /var/log/supervisor

ENV GLUSTER_VOL ranchervol
ENV GLUSTER_REPLICA 2
ENV GLUSTER_BRICK_PATH /gluster_volume
ENV GLUSTER_PEER **ChangeMe**
ENV DEBUG 0

VOLUME ["/gluster_volume"]

RUN mkdir -p /usr/local/bin
ADD ./bin /usr/local/bin
RUN chmod +x /usr/local/bin/*.sh

ADD ./etc/supervisord.conf /etc/supervisor/conf.d/supervisord.conf

CMD ["/usr/local/bin/run.sh"]
As you can see, we are using two replicas for the Gluster volume ranchervol. All of its data is persisted in the Docker volume /gluster_volume. Note that we are not exposing any ports, because the GlusterFS containers communicate over the Rancher network. The run.sh script is as follows:
#!/bin/bash

[ "$DEBUG" == "1" ] && set -x

prepare-gluster.sh &

/usr/bin/supervisord
It invokes another script in the background to prepare the GlusterFS cluster. This is required because Gluster commands can only be executed while the Gluster daemon is running. This is the content of the prepare-gluster.sh script:
#!/bin/bash

set -e

[ "$DEBUG" == "1" ] && set -x

if [ "${GLUSTER_PEER}" == "**ChangeMe**" ]; then
   # This node is not connecting to the cluster yet
   exit 0
fi

echo "=> Waiting for glusterd to start..."
sleep 10

if gluster peer status | grep ${GLUSTER_PEER} >/dev/null; then
   echo "=> This peer is already part of Gluster Cluster, nothing to do..."
   exit 0
fi

echo "=> Probing peer ${GLUSTER_PEER}..."
gluster peer probe ${GLUSTER_PEER}

echo "=> Creating GlusterFS volume ${GLUSTER_VOL}..."
my_rancher_ip=`echo ${RANCHER_IP} | awk -F\/ '{print $1}'`
gluster volume create ${GLUSTER_VOL} replica ${GLUSTER_REPLICA} ${my_rancher_ip}:${GLUSTER_BRICK_PATH} ${GLUSTER_PEER}:${GLUSTER_BRICK_PATH} force

echo "=> Starting GlusterFS volume ${GLUSTER_VOL}..."
gluster volume start ${GLUSTER_VOL}
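One fragile spot in prepare-gluster.sh is the fixed sleep 10 while waiting for glusterd. A possible hardening, sketched here with a hypothetical wait_for helper that polls a command until it succeeds or a timeout expires (the helper name and the timeout value are illustrative, not part of the original image):

```shell
#!/bin/bash
# Hypothetical wait_for helper: retry a command once per second until it
# succeeds, giving up after $1 seconds. Not part of the original image.
wait_for() {
  local timeout="$1"; shift
  local waited=0
  until "$@" >/dev/null 2>&1; do
    waited=$((waited + 1))
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
  done
  return 0
}

# In the real script this would be something like:
#   wait_for 30 gluster peer status
# Demonstrated here with a command that succeeds immediately:
wait_for 5 true; ready=$?
[ "$ready" -eq 0 ] && echo "daemon ready"
```

A bounded poll like this starts faster than a fixed sleep when glusterd comes up quickly, and fails loudly instead of proceeding when it never comes up.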
As we can see, if we don't provide the GLUSTER_PEER environment variable, the container only starts the GlusterFS daemon and waits for a second peer container to join the cluster. The second container needs the GLUSTER_PEER address in order to contact the first one (peer probe) and create the shared storage volume. This is the supervisor configuration file, needed to start the GlusterFS daemon:
[supervisord]
nodaemon=true

[program:glusterd]
command=/usr/sbin/glusterd -p /var/run/glusterd.pid
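Back in prepare-gluster.sh, the brick address is built from RANCHER_IP, which Rancher injects in CIDR notation. The awk call strips the prefix length; here it is in isolation, with a sample value (the IP below is only for illustration):

```shell
#!/bin/bash
# Rancher hands the container its overlay-network address in CIDR form,
# but the Gluster brick address needs the bare IP. Sample value only.
RANCHER_IP="10.42.1.5/16"
my_rancher_ip=$(echo "${RANCHER_IP}" | awk -F/ '{print $1}')
echo "${my_rancher_ip}"
```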
The following commands are required to publish the Docker image:
docker build -t nixel/rancher-glusterfs-server .
docker push nixel/rancher-glusterfs-server
The Asteroids game image

This is the image we will use to publish the Asteroids HTML5 game for testing Gluster's HA capabilities. The container acts as a GlusterFS client that mounts the shared volume where the game content is stored.
This is the Dockerfile which defines the image:
FROM ubuntu:14.04
MAINTAINER Manel Martinez <manel@nixelsolutions.com>

RUN apt-get update && \
    apt-get install -y python-software-properties software-properties-common
RUN add-apt-repository -y ppa:gluster/glusterfs-3.5 && \
    apt-get update && \
    apt-get install -y git nodejs nginx supervisor glusterfs-client dnsutils

ENV GLUSTER_VOL ranchervol
ENV GLUSTER_VOL_PATH /mnt/${GLUSTER_VOL}
ENV GLUSTER_PEER **ChangeMe**
ENV DEBUG 0
ENV HTTP_CLIENT_PORT 80
ENV GAME_SERVER_PORT 443
ENV HTTP_DOCUMENTROOT ${GLUSTER_VOL_PATH}/asteroids/documentroot

EXPOSE ${HTTP_CLIENT_PORT}
EXPOSE ${GAME_SERVER_PORT}

RUN mkdir -p /var/log/supervisor ${GLUSTER_VOL_PATH}

WORKDIR ${GLUSTER_VOL_PATH}

RUN mkdir -p /usr/local/bin
ADD ./bin /usr/local/bin
RUN chmod +x /usr/local/bin/*.sh

ADD ./etc/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
ADD ./etc/nginx/sites-available/asteroids /etc/nginx/sites-available/asteroids

RUN echo "daemon off;" >> /etc/nginx/nginx.conf
RUN rm -f /etc/nginx/sites-enabled/default
RUN ln -fs /etc/nginx/sites-available/asteroids /etc/nginx/sites-enabled/asteroids
RUN perl -p -i -e "s/HTTP_CLIENT_PORT/${HTTP_CLIENT_PORT}/g" /etc/nginx/sites-enabled/asteroids
RUN HTTP_ESCAPED_DOCROOT=`echo ${HTTP_DOCUMENTROOT} | sed "s/\//\\\\\\\\\//g"` && \
    perl -p -i -e "s/HTTP_DOCUMENTROOT/${HTTP_ESCAPED_DOCROOT}/g" /etc/nginx/sites-enabled/asteroids
RUN perl -p -i -e "s/GAME_SERVER_PORT/${GAME_SERVER_PORT}/g" /etc/supervisor/conf.d/supervisord.conf
RUN HTTP_ESCAPED_DOCROOT=`echo ${HTTP_DOCUMENTROOT} | sed "s/\//\\\\\\\\\//g"` && \
    perl -p -i -e "s/HTTP_DOCUMENTROOT/${HTTP_ESCAPED_DOCROOT}/g" /etc/supervisor/conf.d/supervisord.conf

CMD ["/usr/local/bin/run.sh"]
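The trickiest lines in this Dockerfile are the ones that escape the slashes in HTTP_DOCUMENTROOT before handing it to a s/// substitution. The same two-step transformation, shown standalone on a sample nginx line (this is a sketch using sed for the final replacement, not the exact Dockerfile command):

```shell
#!/bin/bash
# Escape each "/" in the docroot so it is safe inside a s/// replacement,
# then substitute the HTTP_DOCUMENTROOT placeholder in a sample config line.
HTTP_DOCUMENTROOT="/mnt/ranchervol/asteroids/documentroot"
HTTP_ESCAPED_DOCROOT=$(echo "${HTTP_DOCUMENTROOT}" | sed 's/\//\\\//g')
patched=$(echo "root HTTP_DOCUMENTROOT/client/;" \
  | sed "s/HTTP_DOCUMENTROOT/${HTTP_ESCAPED_DOCROOT}/")
echo "${patched}"
```

Without the escaping step, the slashes in the path would collide with the delimiters of the s/// expression.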
And this is the run.sh script:
#!/bin/bash

set -e

[ "$DEBUG" == "1" ] && set -x && set +e

if [ "${GLUSTER_PEER}" == "**ChangeMe**" ]; then
   echo "ERROR: You did not specify the GLUSTER_PEER environment variable - Exiting..."
   exit 1
fi

ALIVE=0
for PEER in `echo "${GLUSTER_PEER}" | sed "s/,/ /g"`; do
   echo "=> Checking if I can reach GlusterFS node ${PEER} ..."
   if ping -c 10 ${PEER} >/dev/null 2>&1; then
      echo "=> GlusterFS node ${PEER} is alive"
      ALIVE=1
      break
   else
      echo "*** Could not reach server ${PEER} ..."
   fi
done

if [ "$ALIVE" == 0 ]; then
   echo "ERROR: could not contact any GlusterFS node from this list: ${GLUSTER_PEER} - Exiting..."
   exit 1
fi

echo "=> Mounting GlusterFS volume ${GLUSTER_VOL} from GlusterFS node ${PEER} ..."
mount -t glusterfs ${PEER}:/${GLUSTER_VOL} ${GLUSTER_VOL_PATH}

echo "=> Setting up asteroids game..."
if [ ! -d ${HTTP_DOCUMENTROOT} ]; then
   git clone https://github.com/BonsaiDen/NodeGame-Shooter.git ${HTTP_DOCUMENTROOT}
fi

my_public_ip=`dig -4 @ns1.google.com -t txt o-o.myaddr.l.google.com +short | sed "s/\"//g"`
perl -p -i -e "s/HOST = '.*'/HOST = '${my_public_ip}'/g" ${HTTP_DOCUMENTROOT}/client/config.js
perl -p -i -e "s/PORT = .*;/PORT = ${GAME_SERVER_PORT};/g" ${HTTP_DOCUMENTROOT}/client/config.js

/usr/bin/supervisord
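The peer-selection loop at the top of run.sh is what gives the client its failover behavior: it walks the comma-separated GLUSTER_PEER list and keeps the first reachable node. The logic can be exercised in isolation by swapping the real ping for a stub (the addresses and the is_alive function below are made up for the demonstration):

```shell
#!/bin/bash
# Pick the first reachable peer from a comma-separated GLUSTER_PEER list.
# is_alive stands in for the real "ping -c 10" check so this runs anywhere;
# here we pretend only the second peer answers.
GLUSTER_PEER="10.42.1.5,10.42.2.7"
is_alive() { [ "$1" = "10.42.2.7" ]; }

SELECTED=""
for PEER in $(echo "${GLUSTER_PEER}" | sed "s/,/ /g"); do
  if is_alive "${PEER}"; then
    SELECTED="${PEER}"
    break
  fi
done
echo "selected peer: ${SELECTED:-none}"
```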
As you can see, we need to tell the container where the ranchervol volume is being served, using the GLUSTER_PEER environment variable (a comma-separated list of GlusterFS servers). Although a GlusterFS client does not need to know about every cluster node, passing several peers lets the Asteroids container mount the volume as long as at least one GlusterFS container is alive. We will prove this HA feature later. In this case we expose ports 80 (Nginx) and 443 (Node.js websocket server) so we can open the game in our browser. This is the Nginx configuration file:
server {
  listen HTTP_CLIENT_PORT;

  location / {
    root HTTP_DOCUMENTROOT/client/;
  }
}
And the following supervisord configuration is required to run Nginx and Node.js:
[supervisord]
nodaemon=true

[program:nginx]
command=/usr/sbin/nginx

[program:nodejs]
command=/usr/bin/nodejs HTTP_DOCUMENTROOT/server/server.js GAME_SERVER_PORT
Finally, the run.sh script downloads the Asteroids source code and saves it on the GlusterFS shared volume, then replaces the required parameters in the configuration files so Nginx and the Node.js server application can run. The following commands are needed to publish the Docker image:
docker build -t nixel/rancher-glusterfs-client .
docker push nixel/rancher-glusterfs-client
Now we need to create three Docker hosts: two to run the GlusterFS server containers, and a third to publish the Asteroids game. In the Rancher UI, click the + Add Host button and choose the Amazon EC2 provider. You need to specify at least the following information:
Repeat this step three times to create gluster01, gluster02, and asteroids hosts.
Now you are ready to deploy your GlusterFS cluster. First, click the + Add Container button on the gluster01 host and enter the following information:
Expand Advanced Options and follow these steps:
Now wait for the gluster01 container to be created and copy its Rancher IP address; you will need it in a moment. Then click the + Add Container button on the gluster02 host to create the second GlusterFS server container with the following configuration:
Now wait for the gluster02 container to be created, open its menu, and click the View Logs option. You will see the following messages at the bottom of the log screen, confirming that the shared volume was successfully created.
Now it is time to create our GlusterFS client container, which will publish the Asteroids game to the Internet. Click + Add Container on the asteroids host and enter the following container information:
Note that we are not configuring any container volume, because all data is stored in the GlusterFS cluster. Wait for the asteroids container to be created and open its logs. You will find something like this at the top: You will also see how the Nginx server and the Node.js application are started at the bottom: At this point your Rancher environment is up and running.
It is time to play and test GlusterFS HA capabilities. We will now stop one GlusterFS container and check that the game suffers no downtime. Browse to http://ASTEROIDS_HOST_PUBLIC_IP to access the Asteroids game; enter your name and try to explode some asteroids. Then go to the Rancher UI and stop the gluster02 container, open a new browser tab, and navigate to the game again: it is still accessible. Start gluster02 again, stop gluster01, and try once more; you are still able to play. Finally, keep gluster01 stopped, restart the asteroids container, and wait for it to start. As you can see, as long as at least one GlusterFS server container is running, you can play. If you stop both gluster01 and gluster02 you will see the game become unavailable, because its public content is no longer reachable. To recover the service, start gluster01 and/or gluster02 again.
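If you want something more systematic than refreshing the browser during these failover tests, a small probe loop works well. This is only a sketch: the curl check is commented out in favor of a stub so the loop itself runs anywhere, and ASTEROIDS_HOST_PUBLIC_IP is a placeholder you would fill in:

```shell
#!/bin/bash
# Poll the game a few times and report availability. probe() is a stub;
# in a real run you would use the commented curl check instead.
probe() {
  # curl -sf -o /dev/null "http://ASTEROIDS_HOST_PUBLIC_IP/"
  true   # stub: pretend the game answered
}

up_count=0
for i in 1 2 3; do
  if probe; then
    up_count=$((up_count + 1))
    echo "check $i: game is up"
  else
    echo "check $i: game is DOWN"
  fi
  # sleep 2   # pause between checks in a real run
done
echo "${up_count}/3 checks succeeded"
```

Running a loop like this while you stop and start the Gluster containers makes any window of unavailability visible immediately.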
Shared storage is a required feature when you deploy software that needs to share information across all nodes. In this post you have seen how to easily deploy a highly available shared storage solution for Rancher based on GlusterFS Docker images. Using an Asteroids game, you have checked that the storage remains available as long as at least one GlusterFS container is running. In future posts we will combine this shared storage solution with the Rancher Load Balancing feature, added in version 0.16, to build scalable, distributed, and highly available web server solutions ready for production use. To learn more about Rancher, please join us for our next online meetup, where we'll be demonstrating some of these features and answering your questions.

Manel Martinez is a Linux systems engineer with experience in the design and management of scalable, distributed, and highly available open source web infrastructures based on products like KVM, Docker, Apache, Nginx, Tomcat, JBoss, RabbitMQ, HAProxy, MySQL and XtraDB. He lives in Spain, and you can find him on Twitter @manel_martinezg.