I Know What Facebook is Using - Part 2

See Gartner’s latest research on the application performance monitoring landscape and how APM suites are becoming more and more critical to the business, brought to you in partnership with AppDynamics.


In Part 1 of the 'I Know What Facebook is Using' series, we talked about Varnish, a high-performance HTTP accelerator that uses a configuration language for optimization.  In this post, we discuss a technology of which Facebook is the largest user: memcached.  memcached is a distributed memory object caching system.  What does that mean, you ask? Here's a simple explanation:

Memcached lets you borrow memory from the parts of your infrastructure where it is plentiful and makes it addressable in the areas where you need it.  Without memcached, each server can only use the spare memory it has locally.  With memcached, servers access a shared pool of available memory.  As your need for servers grows, the pool of memory grows as well, giving applications that must scale a much larger amount of memory to work with.  If you'd like to read an amusing anecdote that explains a general use case for memcached, check out the TutorialCachingStory. It describes how memcached clients work with the servers in the pool.
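
To make the "pool of memory" idea concrete, here is a minimal sketch of how a client might decide which server in the pool holds a given key. The server addresses and the pick_server helper are hypothetical names for illustration, and the plain hash-modulo scheme is an assumption on my part; real client libraries often use more elaborate schemes such as consistent hashing.

```python
import hashlib

# Hypothetical pool of memcached instances (illustrative addresses only).
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]


def pick_server(key, servers):
    """Map a key to one server in the pool with a simple hash-modulo scheme."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    # The same key always maps to the same server, so every client
    # that shares this server list looks in the same place.
    print(pick_server("user:42", SERVERS))
```

Adding a server to the list grows the pool, but with this simple scheme it also remaps many existing keys, which is one reason clients tend to prefer consistent hashing.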

Hardware Options

There are a number of options that you could leverage for your memcached rig. Lots of shops use dedicated memcached servers; others run memcached alongside their webservers. It really depends on what you can afford.  The main idea is to understand how much memory goes unused on each of your servers, and to carefully allocate a decent amount of it to memcached.  From what I have read, you should also avoid over-allocating memory to memcached: if the machine starts swapping, performance degrades significantly.
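
The sizing advice above boils down to simple arithmetic. The sketch below is one way to express it; the memcached_budget_mb helper and the 512 MB headroom default are assumptions chosen for illustration, not recommended values.

```python
def memcached_budget_mb(total_mb, used_by_other_mb, headroom_mb=512):
    """Return how many MB could go to memcached without inviting swap.

    total_mb:          physical RAM on the box
    used_by_other_mb:  what the OS and other services typically use
    headroom_mb:       safety margin left unallocated (assumed value)
    """
    spare = total_mb - used_by_other_mb - headroom_mb
    # Never return a negative budget; zero means "do not run memcached here".
    return max(spare, 0)


if __name__ == "__main__":
    # An 8 GB box where other services use about 3 GB.
    print(memcached_budget_mb(8192, 3072))  # prints 4608
```

The result is the kind of number you would then hand to memcached as its memory limit.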

Installing memcached

As with Varnish in Part 1, memcached is available as source and as binary packages for the major *nix distros.  Since memcached is a C application, you will need a recent version of GCC and a recent version of libevent when building from source.

Since Ubuntu/Debian happens to be my distro of choice, let's look at the command for fetching memcached:
sudo apt-get install memcached
The libevent dependency will be installed along with the memcached package.

Configuring your memcached instance

There are a couple of ways to get memcached running.  memcached offers a number of command line options, including:

  • -m for setting the maximum amount of RAM to use, in megabytes
  • -d for daemonizing (running it in the background as a service)
  • -v for verbose output, printed to the terminal

Chances are that if you installed it via a binary package, a few init scripts are also available for you to use.
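
As a rough illustration of how the options listed above fit together, the sketch below assembles a memcached command line in Python. The memcached_command helper and its defaults are hypothetical; only the -m, -d, and -v flags come from memcached itself.

```python
def memcached_command(memory_mb=64, daemonize=True, verbose=False):
    """Build the argument list for launching memcached."""
    args = ["memcached", "-m", str(memory_mb)]  # -m: memory limit in MB
    if daemonize:
        args.append("-d")  # -d: run in the background
    if verbose:
        args.append("-v")  # -v: verbose output
    return args


if __name__ == "__main__":
    # Print the command rather than running it; the list could be
    # passed to subprocess.run() on a machine with memcached installed.
    print(" ".join(memcached_command(memory_mb=1024)))
```

If your package ships init scripts, those are usually the better way to start the service; this only shows what the flags look like side by side.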

Best Practices for memcached Servers / Clients

Your memcached servers are best used by their clients when each of them has been configured consistently. This can be quite difficult in reality, since one memcached box may have more or less memory than the others.  Clients can assign a weight to a specific server to account for this, so take that into careful consideration as well.
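
One simple way to picture server weights is to give a heavier server more slots in the list the client hashes into. The sketch below does exactly that; the build_slots and pick_weighted helpers, the addresses, and the weights are all hypothetical, and actual client libraries may implement weighting differently.

```python
import hashlib


def build_slots(weighted_servers):
    """Expand (server, weight) pairs so heavier servers get more slots."""
    slots = []
    for server, weight in weighted_servers:
        slots.extend([server] * weight)
    return slots


def pick_weighted(key, slots):
    """Hash the key into the slot list; more slots means more keys."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return slots[int.from_bytes(digest[:4], "big") % len(slots)]


if __name__ == "__main__":
    # Hypothetical pool: the first box has twice the memory of the second.
    slots = build_slots([("10.0.0.1:11211", 2), ("10.0.0.2:11211", 1)])
    print(pick_weighted("user:42", slots))
```

Every client has to use the same server list and weights, otherwise two clients will look for the same key on different servers.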

Pay close attention to your connection objects.  Connections can run out quickly if clients are not configured properly. Check your client's configuration documentation so you know which method calls or actions create new connections, and which ones don't.
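
To show what "creating new connections" versus reusing one can look like, here is a minimal sketch of a wrapper that opens a socket once and hands back the same one on later calls. The ReusedConnection class is hypothetical and is not how any particular memcached client works; check your own client's documentation for its actual behavior.

```python
import socket


class ReusedConnection:
    """Open one connection lazily and reuse it on every later call."""

    def __init__(self, host, port, connect=socket.create_connection):
        self._address = (host, port)
        self._connect = connect  # injectable so it can be tested offline
        self._sock = None
        self.opened = 0  # how many real connections were created

    def get(self):
        # Only connect when there is no live socket yet.
        if self._sock is None:
            self._sock = self._connect(self._address)
            self.opened += 1
        return self._sock

    def close(self):
        if self._sock is not None:
            self._sock.close()
            self._sock = None
```

Calling get() a thousand times here still costs the server a single connection, whereas code that connects inside every request handler would eat into the server's connection limit.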

Watching Server Health

memcached has a lot of statistical counters that allow you to monitor the overall health of your servers. Issuing the stats command to a memcached server lets you see some of these health metrics. Those numbers include:

  • curr_connections:  lists the number of clients currently connected
  • listen_disabled_num:  counts the number of times memcached has hit its connection limit
  • Global hit rate: the share of lookups that found a cached result, which you can derive from the get_hits and get_misses counters
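
The counters above can be pulled with a few lines of Python. The sketch below sends the stats command over a plain socket, parses the "STAT name value" lines of the reply, and derives a hit rate from get_hits and get_misses. The function names and the default host and port are assumptions for illustration; fetch_stats needs a reachable memcached server, while the other two functions work on any stats text.

```python
import socket


def fetch_stats(host="127.0.0.1", port=11211, timeout=2.0):
    """Send the stats command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        data = b""
        # The reply is terminated by a line containing END.
        while not data.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii")


def parse_stats(text):
    """Turn 'STAT name value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_rate(stats):
    """Fraction of get requests that were served from the cache."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


if __name__ == "__main__":
    stats = parse_stats(fetch_stats())
    print("curr_connections:", stats.get("curr_connections"))
    print("listen_disabled_num:", stats.get("listen_disabled_num"))
    print("hit rate:", hit_rate(stats))
```

A listen_disabled_num value that keeps climbing, or a hit rate that stays low, are both signs worth investigating.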

The important thing to remember here is to keep your OS from using swap.  Knowing this up front will keep your systems running healthy.

This series on 'I Know What Facebook is Using' will continue each Friday until there are no other technologies to talk about. Tune in next week for my latest installment!
