Apache Mesos: meet the data-center kernel

by mark · Published 11 April 2018 · Updated 10 April 2018


Apache Mesos, usually just called Mesos, is an open source cluster manager with an ambitious objective: to let you “program against your datacenter like it’s a single pool of resources”. Let’s take a look at what Mesos is and how it manages to deliver on its promise.

The data-center evolution from the ’90s to containers

The data-center has changed radically since the nineties. Back then there were many large servers, each dedicated to a particular task: one for the database, one for the web server, and so on. Most of the time those servers sat idle and were fully used only during spikes. What if they were unable to withstand the spikes? More powerful servers had to be bought to replace the existing ones; the newer servers could handle the spikes, but were even more underutilized the rest of the time. Since machines can fail, mission-critical servers were often duplicated and mirrored (high availability) to ensure business continuity.

Then came virtualization and virtual machines, which consolidated those workloads, made it possible to buy fewer servers, and allowed workloads to be managed in a completely new fashion. Multiple mostly idle workloads could be served by a single physical server that would otherwise have sat idle. High availability became even more efficient under virtualization, allowing administrators to set up multiple copies or to spin up new virtual machines on the fly. Virtual machines could even scale, meaning they could be made bigger or smaller. However, they introduced overhead, a necessary evil.

Then came containerization and containers, which almost erased the overhead created by virtual machines while retaining all their advantages, at the cost of added complexity. High availability is still possible, but it has to be managed at a higher level of abstraction: orchestration. This is why orchestration solutions like Kubernetes and Docker Swarm were born.

So how does Mesos fit in this scenario?

Mesos and the need for a data-center kernel

Up until now, complex features such as high availability, task scheduling and fault tolerance could be implemented at two levels: the application level and the cluster level. So, imagine both working at their own level:

A web server is running, with a backup web server ready to step in. Each resides in its own virtual machine; were the main server to fail, the backup would replace it instantly.
Now imagine the physical hosts running these two web server virtual machines. They are configured so that if one of the hosts goes down the other will step in, just like the web servers but at the physical level, thus achieving fault tolerance.

Can you spot the redundancy? Both the virtual machines and the web servers are performing the same task, but they are not aware of each other, so they carry out their cluster management operations (health checks, monitoring etc.) separately, wasting resources. The problem lies in the many layers of abstraction used in this setup.

Mesos aims to be a data-center kernel, abstracting all the physical resources and making them appear as one huge mainframe. This approach enables applications to talk to the entire data-center just as if it were a single, ordinary computer. With the servers abstracted into what looks like a single computer, the complex features mentioned above need to be performed only once, by the cluster. Hence all the programs running inside a Mesos framework immediately benefit from fault tolerance and scheduling without having to implement a custom, static cluster manager.
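To make the "single pool" idea concrete, here is a minimal sketch in Python. It is not Mesos code: the Node class, the node names and the resource figures are all made up for illustration. It only shows how per-server resources can be summed into one cluster-wide view.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical description of one physical server.
    name: str
    cpus: int
    mem_mb: int


def pool_resources(nodes):
    """Sum per-node resources into one cluster-wide view."""
    return {
        "cpus": sum(n.cpus for n in nodes),
        "mem_mb": sum(n.mem_mb for n in nodes),
    }


nodes = [
    Node("node-1", 4, 8192),
    Node("node-2", 8, 16384),
    Node("node-3", 4, 8192),
]
pool = pool_resources(nodes)
print(pool)  # {'cpus': 16, 'mem_mb': 32768}
```

Keep in mind that the pooled total is a view for whoever programs against the cluster; a single task still has to fit on one of the underlying machines.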

Mesos under the hood (Infrastructure)

The Mesos architecture is quite complex, but it can be summarized in a few key components. Mesos nodes are divided into two categories:

Master Nodes: they orchestrate the whole cluster, perform task scheduling and expose APIs. If only one Master Node is used and it goes down, the whole cluster goes down. In production environments multiple Master Nodes are used; 3 or 5 usually offer enough room to tolerate failures. Only one Master Node can be active at a time; the other Master Nodes are often called Standby Nodes.
Slave Nodes: nodes that are not Masters are slaves; they are tasked with physically executing the workload. Slave nodes run a special program called the Mesos Agent.
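Why 3 or 5 Masters rather than 2 or 4? A short sketch of the arithmetic, assuming the Masters coordinate through a simple majority quorum (an assumption here, although it is how such clusters are usually sized). The function names are mine and not part of Mesos.

```python
def quorum_size(masters: int) -> int:
    """Smallest majority out of `masters` nodes."""
    if masters < 1:
        raise ValueError("need at least one master")
    return masters // 2 + 1


def tolerated_failures(masters: int) -> int:
    """Masters that can be lost while a majority is still reachable."""
    return masters - quorum_size(masters)


for n in (1, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 1 1 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Under this assumption a fourth Master tolerates no more failures than three do, which is why odd counts are the usual choice.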

The infrastructure architecture is essentially simple, since the features provided by Mesos are implemented at the application level by frameworks (more on these in the next section). This architecture also provides resource isolation, using cgroups (the same mechanism used by containers), which supports efficient resource scheduling.

To achieve cluster integrity, Mesos uses Apache ZooKeeper. ZooKeeper is a coordination tool commonly used in Hadoop environments; in Mesos environments it is used to elect the active Master node.
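To give an idea of what an election looks like, here is a toy in-memory simulation of a recipe commonly used with ZooKeeper: every candidate registers and receives an increasing sequence number, and the candidate holding the lowest number leads. Whether Mesos follows exactly this recipe is an assumption on my part, and the ElectionSim class is a hypothetical stand-in, not the ZooKeeper API.

```python
import itertools


class ElectionSim:
    """Toy stand-in for a coordination service (not the ZooKeeper API)."""

    def __init__(self):
        self._seq = itertools.count()
        self._candidates = {}  # sequence number -> master name

    def join(self, name):
        """Register a candidate and return its sequence number."""
        seq = next(self._seq)
        self._candidates[seq] = name
        return seq

    def leave(self, seq):
        """Remove a candidate, e.g. because it crashed."""
        self._candidates.pop(seq, None)

    def leader(self):
        """The candidate with the lowest sequence number leads."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


sim = ElectionSim()
a = sim.join("master-a")
b = sim.join("master-b")
c = sim.join("master-c")
print(sim.leader())  # master-a
sim.leave(a)         # the active master fails
print(sim.leader())  # master-b
```

When the active Master disappears, one of the standby Masters takes over without anyone having to reconfigure the cluster by hand.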

Mesos frameworks (Applications)

Once the cluster is up and running, it exposes a common API to program against the datacenter. Applications that leverage the Mesos API are called Mesos Frameworks, and they are where all the magic happens. Frameworks are composed of two components:

Scheduler: this component negotiates with the cluster in order to obtain the resources needed for the workload.
Executor: the executor is the process running on the slave node. Once the scheduler and the cluster reach an agreement, the executor(s) are deployed. An executor can be a simple command or a program.
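In Mesos the negotiation is built around resource offers: the cluster offers a framework the resources available on its nodes, and the scheduler accepts what it needs and declines the rest. The sketch below imitates that exchange with plain Python objects. The Offer and Task classes and the schedule function are hypothetical and far simpler than the real Mesos scheduler API.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    # Resources the cluster offers on one node (hypothetical shape).
    node: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


def schedule(offers, tasks):
    """Place each task on the first offer that fits it.

    Returns (placements, pending): a dict of task name -> node, and the
    names of the tasks that no offer could satisfy.
    """
    remaining = [Offer(o.node, o.cpus, o.mem_mb) for o in offers]  # work on copies
    placements, pending = {}, []
    for task in tasks:
        for offer in remaining:
            if offer.cpus >= task.cpus and offer.mem_mb >= task.mem_mb:
                offer.cpus -= task.cpus
                offer.mem_mb -= task.mem_mb
                placements[task.name] = offer.node
                break
        else:
            pending.append(task.name)  # wait for a later offer
    return placements, pending


offers = [Offer("node-1", 2, 4096), Offer("node-2", 4, 8192)]
tasks = [Task("web", 1, 1024), Task("db", 4, 4096), Task("cache", 2, 2048)]
placements, pending = schedule(offers, tasks)
print(placements)  # {'web': 'node-1', 'db': 'node-2'}
print(pending)     # ['cache']
```

A task that fits nowhere is not lost: a real scheduler would simply wait for the next round of offers.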

Of course you won’t have to reinvent the wheel and write complex frameworks every time. There are many common, battle-tested frameworks for Mesos:

Marathon: a container orchestration framework used to run container workloads; you can think of it as a “systemd for Mesos”.
Chronos: a job scheduler; if Marathon is similar to systemd, Chronos is a “cron for Mesos”.
Cassandra: a popular distributed database maintained by the Apache Software Foundation; it can run as a framework.
Hadoop: the most popular open source Big Data tool is available as a framework.
Spark: a popular framework for parallel, in-memory Big Data processing.
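As a taste of what using one of these frameworks looks like, Marathon accepts application definitions as JSON documents. The sketch below builds one in Python. The field names (id, cmd, cpus, mem, instances) are the ones I know from Marathon's app definition format, but check them against the documentation for your version; the application itself and all of its values are made up.

```python
import json

# Hypothetical application: two instances of a tiny web server.
app = {
    "id": "/hello-web",
    "cmd": "python3 -m http.server 8080",
    "cpus": 0.25,    # CPU share per instance
    "mem": 128,      # memory per instance, in MB
    "instances": 2,  # how many copies to keep running
}

payload = json.dumps(app, indent=2)
print(payload)
```

A document like this is typically submitted to Marathon's REST API, which then takes care of keeping the requested number of instances running, much like an init system does on a single machine.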

Discover DC/OS: Mesos on steroids

DC/OS by Mesosphere is the principal product based on Apache Mesos. The Data Center Operating System is an enterprise solution that leverages the capabilities of Mesos to provide a unified solution for managing, operating and logging an entire data center.

DC/OS leverages common frameworks such as Marathon (developed by Mesosphere) and Chronos (originally developed at Airbnb) to provide a polished product that even allows single-click installs of complex software such as Kubernetes.

How does Mesos compare with Kubernetes?

Both Kubernetes and Mesos can manage clusters (multiple servers) and perform typical cluster management operations such as health checks and scheduling. The main difference is that Kubernetes manages containers, while Mesos aims to manage any kind of application, containers included. As a matter of fact, you can run a Kubernetes cluster on top of Mesos.


