Introduction

Running Docker containers securely and efficiently in production poses a number of complex challenges, most of which arise from running containers across multiple hosts. These containers may need to keep or share state, communicate with each other, and may (dis)appear at any moment. You need an automated infrastructure platform to take care of things like storage, networking, container scheduling and load balancing in a fault-tolerant and highly available manner.

In this post I will describe a Docker infrastructure based on Apache Mesos. Mesos has been battle-tested and, in my opinion, qualifies as one of the more mature infrastructure platforms for running Docker containers across multiple hosts. The goal of this post is to quickly get started and experiment with a production-grade Docker infrastructure. As such I will not explain all the (Puppet) scripts in depth; most of it will simply be a matter of starting virtual machines with Vagrant. Please leave a comment if you have any questions on the technical implementation.

Let's start by taking a look at the overall architecture.

Architecture

Docker registry: ensures the provenance of every Docker image
Nginx: load-balances incoming traffic to the Docker containers
Apache Mesos: acts as the cluster scheduler for starting Docker tasks
Mesosphere Marathon: manages the lifecycle of the Docker containers
Consul(-template): provides service discovery and dynamic configuration
Registrator: registers running Docker containers in Consul

Setup

I've created a Vagrant setup that allows you to run the whole stack locally on your laptop. For this to work you will need to add the following entries to your local hosts file:

192.168.33.11   mm1.localdomain mm1  
192.168.33.12   mm2.localdomain mm2  
192.168.33.13   mm3.localdomain mm3  
192.168.33.14   ms1.localdomain ms1  
192.168.33.15   ms2.localdomain ms2  
192.168.33.16   ms3.localdomain ms3  
192.168.33.21   ms4.localdomain ms4  
192.168.33.20   app-test app-production  

Next, you need to clone my GitHub repository to obtain all the necessary Puppet scripts, settings and auxiliary files. All the commands in the next sections assume you are in the root of this repository.

Puppet is used as the provisioner for the virtual machines and will only run when a virtual machine is created for the first time. If for some reason the Puppet run did not succeed, you can retrigger it with:

vagrant provision {{vm name}}  

Please note that the Puppet run might take a while to finish, because it sometimes needs to download quite a lot of packages and Docker images.

Docker Registry

First off, we need a private Docker registry so that we can be sure about the provenance of our Docker images. The Docker registry will run in a container and use local storage for the images through a persistent data container.

vagrant up registry  
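
Under the hood, the Puppet manifest essentially runs the official registry image with a data-only container for the image store. A minimal sketch of the equivalent docker commands (the container names are assumptions; check the Puppet scripts for the exact configuration):

docker create --name registry-data -v /var/lib/registry busybox true  
docker run -d --name registry -p 5000:5000 --volumes-from registry-data registry:2  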

The registry will be available on IP address 192.168.33.19. I will be using the registry without any security measures. This is obviously a bad idea™ in a production environment, but for local testing purposes it will do just fine. You will need to add the following option to your Docker daemon in order to access the unsecured registry:

--insecure-registry 192.168.33.19:5000

For boot2docker or Docker Machine you can find the daemon options in /var/lib/boot2docker/profile (SSH into the VM first).
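
For example, adding the flag there could look roughly like this (a sketch; your profile may already contain other settings):

# /var/lib/boot2docker/profile
EXTRA_ARGS="--insecure-registry 192.168.33.19:5000"  

Restart the Docker daemon (or the whole VM) afterwards for the option to take effect.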

Load-balancing with Nginx

Next, we need some load-balancers to route the incoming traffic to our backend containers. For this, we will use a custom Nginx container that is running consul-template to dynamically update and reload the configuration.

cd Docker/Nginx  
docker build -t 192.168.33.19:5000/nginx .  
docker push 192.168.33.19:5000/nginx  

Take a look at the GitHub repository to see the image configuration. Basically, consul-template is going to listen for any changes to the application service in Consul and reload the Nginx configuration accordingly.
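
To give an idea of what that looks like, the upstream section of such a template could be roughly along these lines (a simplified sketch, not the exact template from the repository; the service name "app" is an assumption):

upstream app {
  {{range service "app"}}server {{.Address}}:{{.Port}};
  {{end}}
}

consul-template watches Consul, rewrites the Nginx configuration from this template and triggers a reload whenever the set of registered containers changes.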

Let's create the virtual machines that are going to run the Nginx container:

vagrant up lb1 lb2  

The two load-balancers communicate with each other through the VRRP protocol to fail over the virtual IP in case one of them goes offline. The active load-balancer will be available on IP address 192.168.33.20 and via the (previously added) app-test/app-production records in your local hosts file.
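
On Linux this kind of VRRP failover is typically implemented with keepalived. Purely as an illustration (whether the repository uses keepalived, and values such as the interface and priority, are assumptions), a minimal configuration for the active node could look like:

vrrp_instance VI_1 {
    state MASTER              # BACKUP on the second load-balancer
    interface eth1            # interface on the 192.168.33.0/24 network
    virtual_router_id 51
    priority 101              # lower priority on the backup node
    virtual_ipaddress {
        192.168.33.20         # the shared virtual IP
    }
}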

Mesos: masters and slaves

To safely control our cluster we will need at least three Mesos masters. As you can see in the architecture diagram, the master servers will also be running a Consul server, the Marathon framework, ZooKeeper and a Registrator container. ZooKeeper is currently a prerequisite of Mesos, but I prefer to use Consul for my service discovery needs because it offers a number of interesting advantages and consul-template is a great additional tool. Registrator is not strictly necessary on the masters, but I added it to all Mesos masters and slaves for consistency. The Marathon framework will provide us with Docker container lifecycle management. To start the masters:

vagrant up mm1 mm2 mm3  
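
Registrator itself is just another container: it watches the local Docker daemon and registers published container ports in Consul. A typical invocation looks roughly like this (a sketch; the exact arguments used by the Puppet scripts may differ):

docker run -d --name registrator --net host \
  -v /var/run/docker.sock:/tmp/docker.sock \
  gliderlabs/registrator:latest consul://localhost:8500  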

This will start the virtual machines sequentially; it is faster to create them at the same time in separate terminal windows. To run the actual Docker containers we will need some Mesos slaves. I've created four slaves: two of them (ms1 and ms2) form the 'test' environment, while the other two (ms3 and ms4) form the 'production' environment. Registrator will add started containers to, and remove stopped ones from, the Consul service directory. Start the first slave:

vagrant up ms1  

After the first slave has finished installing you can start the rest:

vagrant up ms2 ms3 ms4  

The first slave acts as an NFS server and is required for the other slaves to successfully complete their Puppet run. I've chosen NFS as a simple way to introduce a distributed filesystem. You should consider other options like GlusterFS, Ceph, etc. if they are more appropriate for your situation. You could also skip local distributed storage altogether and look for cloud-based solutions.
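
Once ms1 is up you can quickly verify that the NFS share is exported and mounted (the export path is defined in the Puppet scripts):

showmount -e ms1.localdomain      # list the exports offered by the first slave  
mount | grep nfs                  # run on another slave, after its Puppet run  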

Overview

Okay, so now that we have a complete Mesos stack running on our local machine it is time to see how these components work together to create a Docker-based production platform. The web-based GUIs can be accessed through the following URLs:

http://mm1.localdomain:5050 (Mesos)
http://mm1.localdomain:8080 (Marathon)
http://mm1.localdomain:8500 (Consul)

In the Mesos interface you should see four active slaves and an active Marathon framework. Consul should only have entries for itself and Marathon should provide you with an empty application overview.
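
If you prefer the command line over the GUIs, the same information is exposed through the HTTP APIs, for example:

curl http://mm1.localdomain:5050/master/state.json     # Mesos cluster state, including slaves and frameworks  
curl http://mm1.localdomain:8500/v1/catalog/services   # services currently registered in Consul  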

Application deployment

I've included a small demo application (Notejam) to test-drive our infrastructure. It is a simple note-taking application that creates a login session and stores information in a sqlite3 database file. Let's build a Docker image that we can deploy:

cd Docker/App  
docker build -t 192.168.33.19:5000/app .  
docker push 192.168.33.19:5000/app  

Take a look at the GitHub repository to see the image configuration. Let's deploy the image we just built to the test slaves of our cluster:

cd Docker/App  
APP_VERSION=latest APP_ENV=test ./deploy.sh  
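
Under the hood, deploy.sh essentially posts an application definition to Marathon's REST API. A stripped-down sketch of such a request (the application id, port mapping and environment constraint attribute are assumptions; see deploy.sh for the real definition):

curl -X POST http://mm1.localdomain:8080/v2/apps \
  -H 'Content-Type: application/json' -d '{
    "id": "app-test",
    "instances": 3,
    "cpus": 0.25,
    "mem": 256,
    "container": {
      "type": "DOCKER",
      "docker": {
        "image": "192.168.33.19:5000/app:latest",
        "network": "BRIDGE",
        "portMappings": [{"containerPort": 5000, "hostPort": 0}]
      }
    },
    "env": {"APP_ENV": "test"},
    "constraints": [["environment", "CLUSTER", "test"]]
  }'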

Check the Marathon interface to see that a deployment has started. It might take a while on the first run, since the slaves still need to pull the image. Once all three containers have started you can go to http://app-test and the application should be visible. At the bottom there is a status line indicating the environment in which the application is running and the container from which it is being served; refreshing the page will show that the application is being served from multiple backends.

Let's now promote our application to the production slaves of our cluster:

cd Docker/App  
APP_VERSION=latest APP_ENV=production ./deploy.sh  

The application should now be accessible at http://app-production. I hope you can see that it would be quite easy to fit this procedure into an automated deployment. In the architecture overview you can find the necessary components.

You can scale the number of backend containers up and down through the Marathon interface. The load-balancers will automatically be updated with the container changes.
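
The same can be done through Marathon's REST API; for example, to scale the (hypothetical) app-test application to five instances:

curl -X PUT http://mm1.localdomain:8080/v2/apps/app-test \
  -H 'Content-Type: application/json' -d '{"instances": 5}'  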

What's next?

The setup described in this blog post is a solid, albeit simplistic, starting point for a production-ready Docker infrastructure. However, there are quite a few open issues remaining in the areas of storage, networking and monitoring. The next couple of blog posts will be dedicated to exploring these areas in-depth.

Please leave a comment if you have any suggestions or improvements on the things discussed in this blog.