A quick introduction to Apache Mesos

Apache Mesos is a centralised, fault-tolerant cluster manager. It’s designed for distributed computing environments, where it provides resource isolation and management across a cluster of slave nodes.

In some ways, Mesos is the opposite of virtualisation:

  • Virtualisation splits a single physical resource into multiple virtual resources
  • Mesos joins multiple physical resources into a single virtual resource

It schedules CPU and memory resources across the cluster in much the same way the Linux kernel schedules local resources.

A Mesos cluster is made up of four major components:

  • ZooKeepers
  • Mesos masters
  • Mesos slaves
  • Frameworks


ZooKeepers

Apache ZooKeeper is a centralised configuration manager, used by distributed applications such as Mesos to coordinate activity across a cluster.

Mesos uses ZooKeeper to elect a leading master, and slaves use it to find that master when they join the cluster.
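
As a rough illustration of how this kind of election works, the sketch below simulates a common ZooKeeper recipe in plain Python: each master registers a sequentially numbered entry, and whoever holds the lowest number leads. The Election class and its methods are hypothetical names made up for this example. It doesn't talk to a real ZooKeeper, and the way Mesos actually does this may differ in detail.

```python
import itertools


class Election:
    """In-memory stand-in for a ZooKeeper election group (hypothetical)."""

    def __init__(self):
        self._counter = itertools.count()  # mimics sequential node numbering
        self._members = {}                 # sequence number -> master name

    def join(self, name):
        """Register a master and return its sequence number."""
        seq = next(self._counter)
        self._members[seq] = name
        return seq

    def leave(self, seq):
        """Remove a master, as would happen if its session were lost."""
        self._members.pop(seq, None)

    def leader(self):
        """The member with the lowest sequence number leads."""
        if not self._members:
            return None
        return self._members[min(self._members)]


election = Election()
seqs = {name: election.join(name) for name in ("master1", "master2", "master3")}
first_leader = election.leader()

# Simulate the leader failing: the next-lowest member takes over.
election.leave(seqs[first_leader])
second_leader = election.leader()
```

Here master1 leads first, and master2 takes over once master1 drops out, which is the fault-tolerance that running several masters buys you.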

Mesos masters

A Mesos master is a Mesos instance in control of the cluster.

A cluster will typically have multiple Mesos masters to provide fault-tolerance, with one instance elected the leading master.

Mesos slaves

A Mesos slave is a Mesos instance which offers resources to the cluster.

Slaves are the ‘worker’ instances: tasks are allocated to them by the Mesos master.
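
To make the master and slave roles a little more concrete, here is a toy sketch in Python. It pools the CPU and memory that each slave offers, then places a task on the first slave with enough room. The Slave class and the place_task function are invented for this example, and first-fit placement is a deliberate simplification rather than a description of how Mesos really allocates resources.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    """A worker node and the resources it still has free (hypothetical model)."""
    name: str
    cpus: float
    mem_mb: int


def cluster_totals(slaves):
    """View the whole cluster as one pool of CPU and memory."""
    return sum(s.cpus for s in slaves), sum(s.mem_mb for s in slaves)


def place_task(slaves, cpus, mem_mb):
    """First-fit placement: return the name of the chosen slave, or None."""
    for slave in slaves:
        if slave.cpus >= cpus and slave.mem_mb >= mem_mb:
            slave.cpus -= cpus
            slave.mem_mb -= mem_mb
            return slave.name
    return None


slaves = [Slave("slave1", 2.0, 2048), Slave("slave2", 4.0, 8192)]
total_before = cluster_totals(slaves)

small = place_task(slaves, cpus=1.0, mem_mb=512)     # fits on slave1
large = place_task(slaves, cpus=3.0, mem_mb=4096)    # only slave2 has room
too_big = place_task(slaves, cpus=8.0, mem_mb=1024)  # no single slave can take it
```

Note that the pooled total is only a convenient view: a task still has to fit on one slave, which is why the 8-CPU task is rejected even on a larger cluster made of small nodes.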


Frameworks

On its own, Mesos only provides the basic “kernel” layer of your cluster. It lets other applications request resources in the cluster to perform tasks, but does nothing itself.

Frameworks bridge the gap between the Mesos layer and your applications. They are higher-level abstractions which simplify the process of launching tasks on the cluster.


Chronos

Chronos is a cron-like fault-tolerant scheduler for a Mesos cluster.

You can use it to schedule jobs, receive failure and completion notifications, and trigger other dependent jobs.
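
As a sketch of what a scheduled job and a dependent job might look like, the Python below builds two job definitions as plain dictionaries. The field names (name, command, schedule, parents) and the ISO 8601 repeating-interval schedule reflect my understanding of Chronos's JSON job format, so check them against the Chronos documentation for your version. The job names and script paths are made up, and nothing is sent to a server.

```python
import json


def scheduled_job(name, command, start, period, repeats=None):
    """Build a job that repeats on an ISO 8601 interval.

    `repeats=None` produces "R", meaning repeat indefinitely.
    """
    count = "R" if repeats is None else f"R{repeats}"
    return {
        "name": name,
        "command": command,
        "schedule": f"{count}/{start}/{period}",
    }


def dependent_job(name, command, parents):
    """Build a job that is triggered by its parent jobs rather than a clock."""
    return {"name": name, "command": command, "parents": list(parents)}


# Hypothetical jobs: a nightly backup, then a report once the backup completes.
backup = scheduled_job("nightly-backup", "/usr/local/bin/backup.sh",
                       start="2014-01-01T02:00:00Z", period="P1D")
report = dependent_job("backup-report", "/usr/local/bin/report.sh",
                       parents=[backup["name"]])

backup_json = json.dumps(backup, sort_keys=True)
```

The schedule string reads as "repeat forever, starting at 02:00 UTC on 1 January 2014, once a day".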


Marathon

Marathon is the equivalent of the Linux upstart or init daemons, designed for long-running applications.

You can use it to start, stop and scale applications across the cluster.
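
Starting and scaling an application could look roughly like the Python sketch below, which assumes Marathon's v2 REST API (POST to /v2/apps to start, PUT to /v2/apps/<id> to change the instance count). The server address, application id and command are placeholders, and the requests are only built, not sent, so you can inspect them before pointing the code at a real cluster.

```python
import json
import urllib.request

# Assumed address of a Marathon instance; replace with your own.
MARATHON_URL = "http://marathon.example.com:8080"


def start_request(app_id, cmd, cpus, mem, instances):
    """Build (but don't send) a request that starts an application."""
    body = {"id": app_id, "cmd": cmd, "cpus": cpus, "mem": mem,
            "instances": instances}
    return urllib.request.Request(
        f"{MARATHON_URL}/v2/apps",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scale_request(app_id, instances):
    """Build (but don't send) a request that changes the instance count."""
    return urllib.request.Request(
        f"{MARATHON_URL}/v2/apps/{app_id}",
        data=json.dumps({"instances": instances}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


start = start_request("hello", "python -m http.server 8000", 0.25, 64, 2)
scale = scale_request("hello", 5)
# To actually send one of these: urllib.request.urlopen(start)
```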


There are a few other frameworks too, and you can also write your own, using Java, Python or C++.

The quick start guide

If you want to get a Mesos cluster up and running, you have a few options:

Using Vagrant

Vagrant and the vagrant-mesos Vagrantfile can help you quickly build:

  • a standalone Mesos instance
  • a multi-machine Mesos cluster of ZooKeepers, masters and slaves

Unfortunately, the network configuration is a bit difficult to work with - it uses a private network between the VMs, and SSH tunnelling to provide access to the cluster.

Using Mesosphere and Amazon Web Services

Mesosphere provide Elastic Mesosphere, which can quickly launch a Mesos cluster using Amazon EC2.

This is far easier to work with than the Vagrant build, but it isn’t free - around $1.50 an hour for 6 instances or $4.50 for 18.
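
To put those figures in context, the short Python calculation below works out the per-instance rate and what a cluster would cost if left running for a day. It uses only the approximate prices quoted above, so treat the results as ballpark numbers.

```python
HOURS_PER_DAY = 24

# Approximate figures quoted above: cluster size -> cost per hour in USD.
quoted = {6: 1.50, 18: 4.50}


def per_instance_hour(instances, hourly_cost):
    """Hourly cost of a single instance."""
    return hourly_cost / instances


def daily_cost(hourly_cost):
    """Cost of leaving the whole cluster running for 24 hours."""
    return hourly_cost * HOURS_PER_DAY


rates = {n: per_instance_hour(n, cost) for n, cost in quoted.items()}
days = {n: daily_cost(cost) for n, cost in quoted.items()}
```

Both sizes come out at about $0.25 per instance per hour, or roughly $36 and $108 a day, so it pays to shut the cluster down when you're not using it.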

A simpler Vagrant build

I’ve put together some Vagrantfiles to build individual components of a Mesos cluster. It’s a work in progress, but it can already build a working Mesos cluster without the networking issues. It uses bridged networking, with dynamically assigned IPs, so all instances can be accessed directly through your local network.

You’ll need the following GitHub repositories:

At the moment, a cluster is limited to one ZooKeeper, but can support multiple Mesos masters and slaves.

Each of the instances is also built with Serf to provide decentralised service discovery. You can run the serf members command on any instance to list all the instances in the cluster.
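
If you want to use that member list from a script, something like the Python sketch below could turn it into structured data. It assumes each line of output holds a name, an address:port pair, a status and an optional comma-separated list of key=value tags. The sample output, including the names, addresses and role tags, is made up for illustration, so compare it with what your own cluster prints.

```python
def parse_members(output):
    """Parse whitespace-separated member lines into a list of dicts."""
    members = []
    for line in output.strip().splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip anything that isn't a member line
        name, address, status = fields[:3]
        host, _, port = address.rpartition(":")
        if not port.isdigit():
            continue  # address isn't in the assumed host:port form
        tags = {}
        if len(fields) > 3:
            for pair in fields[3].split(","):
                key, _, value = pair.partition("=")
                tags[key] = value
        members.append({"name": name, "host": host, "port": int(port),
                        "status": status, "tags": tags})
    return members


# Made-up sample output; names, addresses and roles are illustrative only.
SAMPLE = """
zookeeper1  192.168.1.10:7946  alive   role=zookeeper
master1     192.168.1.11:7946  alive   role=mesos-master
slave1      192.168.1.12:7946  failed  role=mesos-slave
"""

members = parse_members(SAMPLE)
alive = [m["name"] for m in members if m["status"] == "alive"]
```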

To help test deployments, there’s also a MongoDB build with Serf installed:

Like the ZooKeeper instances, the MongoDB instance joins the same Serf cluster but isn’t part of the Mesos cluster.

Once your cluster is running

You’ll need to install a framework.

Mesosphere lets you choose to install Marathon on Amazon EC2, so that could be a good place to start.

Otherwise, manually installing and configuring Marathon or another framework is easy. The quick and dirty way is to install them on the Mesos masters, but it would be better if they had their own VMs.

With Marathon or Aurora, you can even run other frameworks in the Mesos cluster for scalability and fault-tolerance.