
Containers: Who Is Containing Them?


Thanks to containers, almost anything is shippable.


Introduction

A typical way to structure an article is to start from the ground and show, piece by piece, how to build something amazing.

This was also my intention, so I started developing a Spring Boot application that reads data from a MySQL database. I configured Docker to execute the database and the web application in different containers, making the database accessible only through the application and thereby demonstrating Docker's powerful encapsulation capabilities. While I was describing all the steps needed, I asked myself: what is really happening? How are containers running and, more importantly, where are they running?

If you are interested in an example of a Spring Boot application connected to a MySQL database, with each running in its own container, you can find the code I prepared in the following Git repository and run it: docker-cluster. But if you are more interested in the answers I found about the magic happening under the hood, read the following sections.

Am I Running Virtual Machines?

This is the first question I asked myself.

When I dug deeper into the documentation on containers, I came across the difference between a virtual machine and a container. To compare the two technologies properly, I first needed to understand virtualization.

Virtualization is an old technology. It was first developed by IBM in the ’60s to handle multiple users accessing a machine running batch jobs, and it was rediscovered in the ’90s when companies started using a single host to execute multiple programs. In those days, every program was installed on a different operating system instance that ran on its own host machine, both for security reasons and because programs were compiled for a specific operating system. Virtualization allowed companies to run multiple programs, each on its own operating system, hosted by a shared physical machine. This approach also sped up software installation, since after a virtual machine was created it was possible to save its image and run that image on a different host.

At the heart of the technology is the hypervisor: a program whose job is to virtualize and partition physical resources and make them available to virtual instances. There are two kinds of hypervisors:

  • Type-1 hypervisors: installed directly on the host hardware, which gives high performance;
  • Type-2 hypervisors: installed on top of an operating system. They have lower performance, but users can install them on any server they already have.

This technology was fine as long as the goal was good runtime performance on virtualized operating systems, but in recent years a new need has emerged: high availability.

High availability is the characteristic of a system that serves requests continuously. To accomplish that, it must guarantee redundancy (elimination of single points of failure) and be able to respond when a node fails (detection of failures and auto-scaling).

Isolation (Containerization)

Despite the fact that containers are widely used, there is not yet a common definition for them.

Docker defines containers as “an abstraction at the app layer that packages code and dependencies together” and virtual machines as “an abstraction of physical hardware turning one server into many servers.” This defines what a container is but not how the abstraction is implemented.

If, for virtual machines, the technology is “virtualization,” then “containerization” could be — and in some cases is — the term for the technology that enables us to use containers.

I prefer using the term “isolation.”

Containerization recalls something related to shipping:

containerize: pack into or transport by container

While isolation recalls the implementation behind it:

Identify (something) and examine or deal with it separately.

Containers do not replicate the whole OS but talk directly to the kernel. This means they do not need to replicate an entire operating system to run, and they do not need a long time to start. They are just a set of isolated running processes that can interact with the kernel.

This set of processes can be assembled to give the experience of running a different distro on top of the host operating system's kernel. An example: on an Ubuntu machine you can run a container that acts as CentOS, another that acts as RHEL, and another that acts as Debian.

By its nature, it is not possible to have a container that acts as Windows on a Linux machine, since the kernel is not replicated: only processes and resources are isolated.
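The shared-kernel idea is easy to verify yourself. The following sketch assumes Docker is installed on a Linux host; the image names are just examples. Each container reports a different userland, yet every one of them prints the host's kernel version:

```shell
# Two containers, two different userlands, one shared kernel
# (assumes Docker is installed; image names are examples).
docker run --rm ubuntu:22.04 grep PRETTY_NAME /etc/os-release
docker run --rm alpine:3 grep PRETTY_NAME /etc/os-release

# The kernel version is the same inside and outside every container:
uname -r
docker run --rm alpine:3 uname -r
```

The two `uname -r` calls print the same string: the containers never bring their own kernel.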

With isolation, redundancy of nodes becomes easy, since containers do not need a lot of disk space to run, nor a lot of time to start up a node with a whole operating system.

An interesting observation on this definition of containers is that the host operating system is, in a sense, a container, too.

How Does the Magic Happen?

The magic happens thanks to simple ideas that have been used to build great technologies: in this case, two Linux features that have paved the way: namespaces and cgroups.

Namespaces were introduced to give a convenient way to isolate processes. This is useful in many situations where you need to prevent one process from killing others by mistake. As an example, imagine a machine on which more than one application server is running. One gets stuck, so someone lists all the Java processes and kills the failing one, but picks the wrong PID and kills the wrong process. Namespaces can isolate PIDs, networks, mounts, UTS, users, and IPC: basically, everything needed to run a sub-system. The only thing missing is resource limits: that is the cgroups' job. They limit and isolate the resource usage (CPU, memory, disk I/O, network, etc.) of processes.
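Before creating new namespaces, you can see the ones your current shell already belongs to: every process exposes its namespace membership as symlinks under /proc/<pid>/ns, and two processes sharing a namespace see the same inode number there. A quick, unprivileged check:

```shell
# Every process lists its namespaces as symlinks under /proc/<pid>/ns.
# Two processes in the same namespace point to the same inode,
# e.g.: pid -> 'pid:[...]'
ls -l /proc/self/ns
```

The pid, mnt, net, uts, user, and ipc entries shown here are exactly the kinds of namespace listed above.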

Magic in Action

The following commands have been used to test namespaces and cgroups.

For namespaces, the goal of the example is to create two PID namespaces and demonstrate that processes running in one are not visible from the other.

In a first console, I executed:

sudo unshare --fork --pid --mount-proc bash


This runs bash in a new PID namespace. Inside it, I ran the following Java class, whose only purpose is to keep a Java process alive for some time.

public class NameSpaceTest {
    public static void main(String[] args) throws Exception{
        System.out.println("Goodnight");
        Thread.sleep(600000);
    }
}


Then I opened a new terminal and ran bash in a second PID namespace. At this point, executing:

ps aux


The first namespace gave the following output:

USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND

root          1  0.1  0.1  21616  5196 pts/2    S    13:36   0:00 bash
root         18  2.5  1.1 2708968 30492 pts/2   Sl   13:37   0:00 java NameSpaceTest
root         30  0.0  0.1  30200  3120 pts/2    R+   13:37   0:00 ps aux


while the second namespace gave the following:

USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND

root          1  0.1  0.1  21352  5016 pts/3    S    13:36   0:00 bash
root         10  0.0  0.1  30200  3144 pts/3    R+   13:37   0:00 ps aux


Here there are two main things to notice:

  • Both namespaces have a PID 1;
  • The second one does not see the Java process.

After that, I ran ps aux in a third terminal, outside the forked namespaces, and as you can see, the namespaced bash processes have PIDs in the outer world, too.

… useless processes removed …

root       2817  0.0  0.1  44416  4192 pts/2    S    13:36   0:00 sudo unshare --fork --pid --mount-proc bash
root       2818  0.0  0.0  16644   760 pts/2    S    13:36   0:00 unshare --fork --pid --mount-proc bash
root       2819  0.3  0.1  21352  4812 pts/2    S+   13:36   0:00 bash
root       2827  0.0  0.1  44416  4352 pts/3    S    13:36   0:00 sudo unshare --fork --pid --mount-proc bash
root       2828  0.0  0.0  16644   688 pts/3    S    13:36   0:00 unshare --fork --pid --mount-proc bash
root       2829  0.3  0.1  21352  5016 pts/3    S+   13:36   0:00 bash
admin      2841  0.0  0.1  30200  3136 pts/1    R+   13:36   0:00 ps aux


When the Java process was running it was visible to the outer world, too:

ps aux | grep java

root       2850  0.8  1.1 2708968 30492 pts/2   Sl   13:37   0:00 java NameSpaceTest
admin      2867  0.0  0.0  17480   896 pts/1    S+   13:37   0:00 grep java
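This double identity of the Java process (PID 18 inside the namespace, 2850 outside) can also be read directly from /proc: the NSpid field of a process's status file lists its PID in every PID namespace it belongs to, outermost first. A quick unprivileged check, using the current shell as the target:

```shell
# NSpid lists a process's PID in each nested PID namespace,
# outermost first. For a process in the root namespace there is
# a single value; for the Java process above it would show both
# the outer PID and the inner one.
grep NSpid /proc/self/status
```

(The NSpid field is available on Linux 4.1 and later.)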


For cgroups, the goal of my test was to create a control group with very limited memory and then try to open some programs inside it.

I executed

sudo cgcreate -a admin -g memory:javaEnv


to create the cgroup. This command initializes a set of empty settings available at /sys/fs/cgroup/memory/javaEnv/. 

With the following command:

echo 2000000 | sudo tee /sys/fs/cgroup/memory/javaEnv/memory.kmem.limit_in_bytes


I limited the memory available to programs executing in javaEnv. Then I ran bash in this cgroup:

sudo cgexec -g memory:javaEnv bash


When I then tried to run:

libreoffice --writer


I got:

fork: Cannot allocate memory

as expected.

Afterwards, I deleted the cgroup by executing:

sudo cgdelete memory:javaEnv

Conclusion

Containers and virtual machines are often compared and, in some cases, presented as alternatives. I think they should be thought of as different technologies that address different aspects of the same principle: abstraction.

They are often used together: it is common to run Docker images on existing servers that also run other standard applications installed on virtual machines.

The important thing is to get the best from both by understanding their implementation, and that is what I tried to do in this article.

A Curiosity

You may have read that Docker is also available for Windows and that you can run Linux containers on it, so you may say: “You just said that’s impossible.” The funny answer is that it is possible, thanks to virtual machines: under the hood, Windows runs an Alpine Linux virtual machine with Docker installed on it through Hyper-V.



