
Some GlusterFS Experiments and Benchmarks


GlusterFS can provide distributed storage to Kubernetes nodes. This benchmark looks at how well it performs when geographically distributed.



The reason we used GlusterFS was to have shared storage between each node of the cluster, so we could spin up an instance of any Docker image on any node without issues: the container uses the shared storage (mounted as a volume) for its business data.
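For example, mounting the shared volume on each node and bind-mounting it into a container might look like the following sketch. The volume name, host name, and image name here are hypothetical, not taken from the article:

```shell
# Mount the GlusterFS volume "gv0" served by host "gluster1" on this node
# (the mount point /mnt/gluster is purely illustrative):
mount -t glusterfs gluster1:/gv0 /mnt/gluster

# Any container on any node can then bind-mount the shared path:
docker run -d --name app -v /mnt/gluster/app-data:/data my-app-image
```

Because every node mounts the same Gluster volume, the container sees the same `/data` no matter which node it lands on.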

Installing GlusterFS

I first installed GlusterFS across the ocean, with one server in France and another in Canada. It looked fine, but when I started using it, my first git clone on a GlusterFS mount point took so long that I had time to make coffee, drink a cup, and then drink a second one! So it was not usable in production.

I decided to test the mount point by copying a big file, just to see how fast it would be and whether the speed was acceptable. It reminded me of a good exercise by Kirk Pepperdine, optimizing a website that was far too slow because of too many connections to the database. We suffered from the same issue: lots of files mean lots of connections that each need to cross the ocean, and the handshake latency delays the copy.


Procedure Used

I decided to benchmark the filesystem. Since I didn't find many frameworks for testing a filesystem, I just wrote a small bash script that generates files with dd and /dev/urandom. I then ran this script on each partition: an ext4 partition, a Gluster partition in the same datacenter, and a Gluster partition across the ocean.

COUNT=${COUNT:-1}  # default to a single block when the caller only sets SIZE
mkdir -p "$TARGET"
for i in $(seq 1 $NUMBER); do
 dd if=/dev/urandom of=$TARGET/file_$i bs=$SIZE count=$COUNT 2>&1 | grep -v records
done

Then I called this generation script from another one, to create 1 GB of data in different configurations.

# Creating 10240 files of 100k
export NUMBER=10240
export TARGET=`pwd`/100k
export SIZE=100K
sh generate.sh > 100k.log

# Creating 1024 files of 1M
export NUMBER=1024
export TARGET=`pwd`/1M
export SIZE=1M
sh generate.sh > 1M.log

# Creating 100 files of 10M
export NUMBER=100
export TARGET=`pwd`/10M
export SIZE=10M
sh generate.sh > 10M.log

# Creating 10 files of 100M
export NUMBER=10
export COUNT=100
export TARGET=`pwd`/100M
export SIZE=1M
sh generate.sh > 100M.log

# Creating 1 file of 1G
export NUMBER=1
export TARGET=`pwd`/1G
export SIZE=1M
export COUNT=1024
sh generate.sh > 1G.log 
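The lines that `grep -v records` lets through contain the elapsed time reported by dd (GNU dd prints something like `104857600 bytes (105 MB) copied, 1.2345 s, 85.0 MB/s`). A small awk one-liner, which is not part of the original scripts, can pull the per-file seconds out of the logs for graphing:

```shell
# Extract the elapsed seconds (the field right after "copied,")
# from a dd summary line; run it over a whole *.log file the same way.
echo "104857600 bytes (105 MB) copied, 1.2345 s, 85.0 MB/s" \
  | awk -F'copied, ' '{split($2, a, " "); print a[1]}'
# → 1.2345
```

Feeding each log file through this gives one timing per file, which is how the per-file graphs below can be produced.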


Let’s see what the results were.

In the graphs below:

  • The Y-Axis is expressed in seconds.
  • The ext4 partition is represented by ext4 (blue).
  • The Gluster partition in the same datacenter is represented by gluster (orange).
  • The Gluster partition across the ocean is represented by gluster-atlantic (grey).

Benchmark for 1 GB Files

Here, only one file is copied. We can see that gluster-atlantic is 1.5 times slower, and that the difference between ext4 and gluster is about 30%. However, the replication and safety you get in return make it worth it.

Benchmark for 100 MB Files

For 100 MB files we get pretty much the same kind of result: one file took more time, but nothing abnormal.

Benchmark for 10 MB Files

For 10 MB files, ext4 pulls ahead of gluster by a factor of 2.2, and the gap between gluster and gluster-atlantic is smaller this time. We can also see spikes that seem to appear after the same amount of data has been written. The tests were run at different times, so we can suppose that GlusterFS triggers some housekeeping work when its cache is full.

Benchmark for 1 MB Files

Now gluster is closer to ext4, and we can see that crossing the Atlantic costs at least 0.4 s per file!

Benchmark for 100 KB Files

Here is the final one. The gluster-atlantic time rounds to about 0.3 s per file.

To better see the difference between ext4 and gluster, I've created a logarithmic graph (keep in mind that the Y-axis of the following graph is not linear).

Logarithmic Graph for 100 KB Files

Median Time by Size

The above graph shows how problematic small files are. Unfortunately, since I'm running my own Git server (gist), and since deploying any website or app these days is basically a git clone, this setup is unusable in production.

This benchmark confirmed what I learned a long time ago: establishing a connection takes time. There is irreducible latency, and this latency is directly related to the distance between the endpoints (sadly, the speed of light is not fast enough). That's why CDNs are so useful and so widely used.
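To put a rough number on that irreducible latency, here is a back-of-the-envelope sketch. The figures are assumptions, not measurements from the article: roughly 5,500 km between France and eastern Canada, and light traveling through fiber at about 200,000 km/s (around two thirds of c):

```shell
# Lower bound on the round-trip time: twice the distance divided by
# the speed of light in fiber, converted to milliseconds.
awk 'BEGIN { printf "%.1f ms minimum RTT\n", 2 * 5500 / 200000 * 1000 }'
# → 55.0 ms minimum RTT
```

Every round trip therefore costs at least ~55 ms before any server-side work happens, and a git clone of thousands of small files pays that cost over and over.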

It would have been fun to test with data centers on the West Coast and the East Coast of the United States, and then between the West Coast and Europe.

Geo-replication was the Solution

In the GlusterFS documentation, the following table lists the differences between replicated volumes and geo-replication:

Replicated volumes and geo-replication

With geo-replication, we get fault tolerance across data centers. We might still lose a very small amount of data from the last few operations, but we won't suffer a total loss, as it's unlikely that a whole data center will go down. If the data is crucial (like data for banks or other financial institutions), then I would probably also create a replica in a nearby data center, but definitely not thousands of miles away. This is also why AWS recommends using two availability zones for speed and reliability, and replicating across regions for disaster recovery.
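Setting up geo-replication is done with the `gluster volume geo-replication` command. A sketch with hypothetical volume and host names (see the GlusterFS documentation for the prerequisites, such as passwordless SSH between the nodes):

# Replicate the master volume "gv0" to the volume "gv0-backup"
# on the remote server in France (names are illustrative):
gluster volume geo-replication gv0 france-server::gv0-backup create push-pem
gluster volume geo-replication gv0 france-server::gv0-backup start
gluster volume geo-replication gv0 france-server::gv0-backup status

Unlike a replicated volume, this replication is asynchronous, which is what keeps writes fast on the master side.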

The Final Solution

For my cluster, I finally took another server in Canada to handle the replication and added the server in France as a geo-replication. The final architecture looks like this:

Final Architecture

You enter the cluster through any of the NGINX servers, which route you to the right Kubernetes Service, which in turn maps to the right containers.

The shared partition is handled by GlusterFS, replicated between the two servers in Canada, with an additional geo-replication to the server in France for safety.

The containers can then be shifted from one server to another without any trouble.

If I want to migrate to a new server, I just have to install it, make it join the cluster, move the containers around, and then stop the old server.
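With Kubernetes, the "move the containers around" step can be a cordon-and-drain. A sketch with a hypothetical node name:

# Stop scheduling new pods on the old node, evict its pods
# (they are rescheduled on the remaining nodes), then remove it:
kubectl cordon old-node
kubectl drain old-node --ignore-daemonsets
kubectl delete node old-node

Because the pods' data lives on the shared GlusterFS volume rather than on the node itself, they can restart anywhere in the cluster.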


Tags: gluster, kubernetes, performance, benchmark

Published at DZone with permission of Rémi Cattiau, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

