Creating a cluster with CoreOS and Docker
26/01/2015
Continuing my studies of CoreOS, in this post I will configure a cluster on Amazon EC2 and then start a Docker service on it.
The cluster
The first thing I did was create two EC2 instances running CoreOS. For
this I used the stable image ami-3e750856, which contains one of the
latest versions of CoreOS and Docker:

Then I configured the number of instances that I wanted to create:

To form the cluster, we need somewhere to store the CoreOS nodes’
addresses and metadata. The easiest way to do this is with Etcd, as
seen in the previous post.
To use it, simply generate a new token by accessing the URL https://discovery.etcd.io/new.
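The same token can be requested from the command line. In this sketch, curl is stubbed out so it runs offline (the echoed token is the one used later in this post); remove the stub to hit the real endpoint:

```shell
# Request a fresh discovery token (each cluster needs its own).
curl() { echo "https://discovery.etcd.io/4776a05c20897e83560b40a03c62918a"; }  # offline stand-in
curl -s https://discovery.etcd.io/new
# the service replies with a unique URL to paste into cloud-config
```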
In the configuration panel, we also have to customize the startup of
the new instances, configuring the network details, the Etcd discovery
service, and the Fleet cluster manager.
We can do this with a cloud-config file, written in YAML.
This file is processed during the startup of each machine in the
cluster; a minimum configuration would be:
#cloud-config
coreos:
  etcd:
    # generate a new token from https://discovery.etcd.io/new
    discovery: https://discovery.etcd.io/4776a05c20897e83560b40a03c62918a
    # multi-region and multi-cloud deployments need to use $public_ipv4
    addr: $private_ipv4:4001
    peer-addr: $private_ipv4:7001
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
The cloud-config settings can be entered in the User data option as
text:

After finishing the configuration and creating the instances, we can
test the cluster by accessing any node and asking the fleetctl utility
to list all the machines included in the cluster:
$ ssh core@54.174.248.174
CoreOS (stable)
core@ip-172-31-34-98 ~ $ fleetctl list-machines
MACHINE		  IP			  METADATA
610aa9e3...	  172.31.34.98    -
0b735501...	  172.31.34.97    -
The cluster is configured correctly, because the command’s output lists the two machines that were created.
Creating a Service
CoreOS ships with Docker by default. For this example I will create an Nginx container from an image that I maintain:
core@ip-172-31-34-98 ~ $ docker run -d -p 80:80 infoslack/docker-nginx
Unable to find image 'infoslack/docker-nginx' locally
Pulling repository infoslack/docker-nginx
76002e20f9ce: Pulling image (latest) from infoslack...: Download complete
511136ea3c5a: Download complete
c7b7c6419568: Download complete
70c8faa62a44: Download complete
...
Status: Downloaded newer image for infoslack/docker-nginx:latest
b1fb3d2a3995f92a9ad8b5c623315c23d5822da58f6481c39f3be0bcab2727ea
Now that we have a container running Nginx, we can build our unit
files. We will use Fleet to schedule each service across the cluster;
it works as a centralized control interface that drives the systemd
instance on each cluster node.
We can start by creating the first unit file, nginx@.service; the
@ in the file name indicates that it is a template:
[Unit]
Description=Nginx web server service
After=etcd.service
After=docker.service
Requires=nginx-discovery@%i.service
[Service]
TimeoutStartSec=0
KillMode=none
ExecStartPre=-/usr/bin/docker kill nginx%i
ExecStartPre=-/usr/bin/docker rm nginx%i
ExecStartPre=/usr/bin/docker pull infoslack/docker-nginx
ExecStart=/usr/bin/docker run -d --name nginx%i -p 80:80 infoslack/docker-nginx
[X-Fleet]
X-Conflicts=nginx@*.service
Analyzing it piece by piece: the [Unit] section header is followed
by some metadata about the unit. Description holds the service
description, and the After clauses declare ordering dependencies; in
this case we wait until the Etcd and Docker services are available
before running the next lines.
Another service file is pulled in through Requires, in this case
nginx-discovery@%i.service, a unit responsible for updating Etcd
with information about our Docker service. The %i placeholder is a
variable that receives the parameter sent by Fleet.
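To make the substitution concrete, here is a small shell sketch of how a template instance like nginx@80.service maps to the %i value. This is an illustration of the naming convention only, not something Fleet itself runs:

```shell
# Illustrative only: extract the instance parameter from a template
# unit name — the value systemd/Fleet substitutes for %i.
unit="nginx@80.service"
instance="${unit#*@}"            # drop everything up to and including '@'
instance="${instance%.service}"  # drop the '.service' suffix
echo "instance (%i)  = ${instance}"       # -> instance (%i)  = 80
echo "container name = nginx${instance}"  # -> container name = nginx80
```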
Then we define how the service runs, in the [Service] section. We
will control Docker containers, but first we need to disable the
startup timeout, because the first start of the container on each
cluster node (which pulls the image) takes longer than the default
allows. Since we want to control the start and stop of the service
ourselves, we tell systemd not to kill the process by setting
KillMode to none.
Before the service starts, we need to make sure the environment is
clean, because the container is started by name and Docker only allows
a single container per name. Note that the first two ExecStartPre
instructions have an =- in their syntax: this indicates that if these
commands fail they won’t raise an error, and the startup will continue;
if a container with the name nginx%i already exists, these cleanup
tasks will succeed. The last two instructions pull the image and run
it to create the container.
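In plain shell terms, the =- prefix is comparable to ignoring a command’s exit status. A rough sketch, with docker stubbed out so it runs anywhere and simulates a first boot where no old container exists:

```shell
# Sketch: '=-' on ExecStartPre means "ignore failure and keep going",
# which in shell is '|| true'.
docker() { return 1; }         # stand-in: pretend the cleanup commands fail
docker kill nginx80 || true    # ExecStartPre=-/usr/bin/docker kill nginx%i
docker rm nginx80   || true    # ExecStartPre=-/usr/bin/docker rm nginx%i
echo "cleanup done, starting container"
```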
Finally, we want our service to run only on machines that are not
already running an Nginx service. For this we use a section called
[X-Fleet], where we can configure Fleet’s behavior; here we add a
restriction ensuring that only one Nginx service runs per node across
the entire cluster. This setting is especially interesting in larger clusters.
Etcd and Fleet
We need to record the current status of the services started in
the cluster, so we will create another service file,
nginx-discovery@.service. This new file is very similar to the
previous one; its only responsibility is to keep Etcd updated,
reporting the server’s availability:
[Unit]
Description=Announce Nginx@%i service
BindsTo=nginx@%i.service
[Service]
EnvironmentFile=/etc/environment
ExecStart=/bin/sh -c "while true; do etcdctl set /announce/services/nginx%i ${COREOS_PUBLIC_IPV4}:%i --ttl 60; sleep 45; done"
ExecStop=/usr/bin/etcdctl rm /announce/services/nginx%i
[X-Fleet]
X-ConditionMachineOf=nginx@%i.service
The BindsTo clause creates a dependency that lets us monitor the
service’s status and capture its information: if the listed service
stops, our monitoring service also stops, so if the web service fails
unexpectedly, the information in Etcd is updated accordingly. The
ExecStart directive keeps the information fresh by running the
etcdctl command in a loop, which writes the value to Etcd under
/announce/services/nginx%i with a 60-second TTL, refreshing it every
45 seconds.
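The announce loop is easy to reason about in isolation. Here is a runnable sketch with etcdctl stubbed out and only two iterations; on a real node the loop runs forever and sleeps 45 seconds between updates:

```shell
# Sketch of the ExecStart loop: refresh a 60s-TTL key so it expires
# by itself if the service dies.
etcdctl() { echo "etcdctl $*"; }   # stand-in for the real binary
i=80
COREOS_PUBLIC_IPV4=54.174.248.174
for _ in 1 2; do                   # two passes instead of 'while true'
  etcdctl set /announce/services/nginx$i ${COREOS_PUBLIC_IPV4}:$i --ttl 60
  # sleep 45                       # omitted in this sketch
done
```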
Finally, the last statement ensures that this service is started on
the same machine where the web server is running. Now that we have
templates for the two services, we can send them to the cluster using
the fleetctl command:
$ fleetctl submit nginx@.service nginx-discovery@.service
After submitting these files, we can verify that both services are now available to the cluster:
$ fleetctl list-unit-files
UNIT                      HASH     DSTATE     STATE     TARGET
nginx-discovery@.service  9531802  inactive   inactive  -
nginx@.service            1e67818  inactive   inactive  -
Now that the templates are available to the entire cluster, we need to
load them, specifying an instance name for each service; here the
instance name is 80, telling our web server which port to use:
$ fleetctl load nginx@80.service
$ fleetctl load nginx-discovery@80.service
We can check on which cluster nodes the services were loaded:
$ fleetctl list-unit-files
UNIT                        HASH      DSTATE     STATE      TARGET
nginx-discovery@.service    9531802   inactive   inactive   -
nginx-discovery@80.service  9531802   loaded     loaded     97cd08e8.../172.31.46.2
nginx@.service              1e67818   inactive   inactive   -
nginx@80.service            1e67818   launched   launched   97cd08e8.../172.31.46.2
As we can see, the services were loaded on the cluster machines. We can finally start them:
$ fleetctl start nginx@80.service
To quickly check that the web server has started and is operating normally, we can make requests to the public IP of each cluster node:
$ http -h 54.174.248.174
HTTP/1.1 200 OK
Accept-Ranges: bytes
Connection: keep-alive
Content-Length: 612
Content-Type: text/html
Date: Fri, 23 Jan 2015 01:32:54 GMT
ETag: "5418459b-264"
Last-Modified: Tue, 16 Sep 2014 14:13:47 GMT
Server: nginx/1.6.2
$ http -h 54.174.226.238
HTTP/1.1 200 OK
Accept-Ranges: bytes
Connection: keep-alive
Content-Length: 612
Content-Type: text/html
Date: Fri, 23 Jan 2015 01:33:08 GMT
ETag: "5418459b-264"
Last-Modified: Tue, 16 Sep 2014 14:13:47 GMT
Server: nginx/1.6.2
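We can also confirm the entry that nginx-discovery@80.service keeps refreshed in Etcd. On a cluster node this would be a plain etcdctl get; it is stubbed here with the value the unit announces (public IP and port from above) so the sketch runs anywhere:

```shell
# On a real node: etcdctl get /announce/services/nginx80
# returns the announced address written by the discovery unit.
etcdctl() { echo "54.174.248.174:80"; }   # stand-in with the announced value
etcdctl get /announce/services/nginx80
```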
Conclusion
Managing Docker containers on CoreOS is not as complicated as it seems, and distributing containers across a cluster is a very interesting task; it just takes a little time to become familiar with Docker, CoreOS, Fleet, and Etcd. I will continue to explore them in future posts.
Happy Hacking ;)