I Know What Facebook is Using - Part 2
Introduction
In Part 1 of the 'I Know What Facebook is Using' series, we talked about Varnish, a high-performance HTTP accelerator that uses a configuration language for optimization. In this post, we discuss a technology of which Facebook is the largest user: memcached. memcached is a distributed memory object caching system. What does that mean, you say? Here's a simple explanation:
memcached lets you borrow memory from parts of your infrastructure where it is plentiful and makes it addressable in the areas where you need it. Without memcached, each server can only address the spare memory that is directly accessible to it. With memcached, servers access a shared pool of available memory. As your need for servers grows, the virtual pool of memory grows as well, so applications that have to scale get a much larger amount of memory to work with. If you'd like to read an amusing anecdote that explains a general use case for memcached, check out the TutorialCachingStory. It describes how memcached clients operate with one another.
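To make the "pool of memory" idea concrete, here is a toy sketch in Python. It is not a real memcached client: the server names are made up, each dictionary simply stands in for one node's memory, and the CRC32-modulo scheme is an assumption chosen for brevity (real clients bring their own hashing strategies). The point is only that the client decides which node owns a key, so several small caches behave like one large one.

```python
import zlib

# Hypothetical server names; each dict stands in for one memcached node's memory.
SERVERS = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]
nodes = {name: {} for name in SERVERS}

def pick_server(key, servers=SERVERS):
    """Map a key to exactly one server by hashing it (simple modulo scheme)."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]

def cache_set(key, value):
    # Store the value on whichever node the hash selects.
    nodes[pick_server(key)][key] = value

def cache_get(key):
    # Look on the same node the key was stored on; None means a cache miss.
    return nodes[pick_server(key)].get(key)

cache_set("user:42:name", "alice")
print(cache_get("user:42:name"))  # alice
print(cache_get("user:43:name"))  # None (never stored, so a miss)
```

Because every client hashes the same way, any of them can find a value that another one stored, without the servers having to talk to each other.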
Hardware Options
There are a number of options for your memcached rig. Lots of shops use dedicated memcached servers; others run memcached on the same machines as their web servers. It really depends on what you can afford. The main idea is to understand how much memory goes unused on each of your servers and to carefully allocate a decent amount of it to memcached. From my reading, you should also avoid over-allocating memory to memcached, since swapping can occur. In that case, your performance will degrade significantly.
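As a rough illustration of that budgeting exercise, the sketch below subtracts what other processes use, plus some safety headroom, from a machine's total RAM. The function name, the 20% headroom figure, and the sample numbers are all assumptions for the sake of the example, not recommendations from the memcached project.

```python
def memcached_budget_mb(total_mb, used_by_others_mb, headroom_fraction=0.2):
    """Return a conservative memory allowance for memcached, in megabytes.

    headroom_fraction is an assumed safety margin left for the OS so the
    machine is less likely to start swapping.
    """
    headroom_mb = int(total_mb * headroom_fraction)
    spare_mb = total_mb - used_by_others_mb - headroom_mb
    # Never suggest a negative allowance on a box that is already full.
    return max(spare_mb, 0)

# 8 GB box where the web server and OS already use about 3 GB:
print(memcached_budget_mb(8192, 3072))  # 3482
# 2 GB box that is nearly full: nothing left to give to memcached.
print(memcached_budget_mb(2048, 1900))  # 0
```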
Installing memcached
Just like the previous technology we discussed, memcached is available as source and as binary packages for the major *nix distros. Since memcached is a C application, you will need a recent version of GCC and a recent version of libevent when building from source.
Since Ubuntu/Debian happens to be my distro of choice, let's look at the command for fetching memcached:
sudo apt-get install memcached
The libevent dependency will be installed along with the memcached package.
Configuring your memcached instance
There are a couple of ways to get memcached running. memcached offers several command line options, including:
- -m for setting how much RAM to allocate, in megabytes
- -d for daemonizing (running it as a background service)
- -v for verbose output from the application
Chances are that if you installed it from a binary package, a few init scripts are also available for you to use.
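If you launch memcached from a deployment script rather than an init script, the options above can be assembled programmatically. The helper below is a hypothetical sketch (the function name and the example sizes are mine); it only builds the command line and prints it, and it uses nothing beyond the three flags listed above.

```python
import shlex

def memcached_command(memory_mb, daemonize=False, verbose=False, binary="memcached"):
    """Build a memcached argument list from the -m, -d and -v options."""
    args = [binary, "-m", str(memory_mb)]
    if daemonize:
        args.append("-d")  # run in the background as a daemon
    if verbose:
        args.append("-v")  # verbose output
    return args

# Foreground run with verbose output, handy while experimenting:
print(shlex.join(memcached_command(64, verbose=True)))      # memcached -m 64 -v
# Background daemon with 1 GB of cache memory:
print(shlex.join(memcached_command(1024, daemonize=True)))  # memcached -m 1024 -d
```

To actually start the process, the same list could be handed to `subprocess.run`, assuming the memcached binary is on your PATH.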
Best Practices for memcached Servers / Clients
Your memcached servers are best used by their clients when each of them has been configured consistently. This can be quite difficult in reality, since it's possible to have memcached running on a box with more or less memory than the others. Clients can assign a weight to a specific server in this scenario, so take that into careful consideration as well.
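One simple way a client can honor weights is to list a server once per unit of weight before hashing keys across the list. The sketch below assumes that approach; the pool, the weights, and the function names are hypothetical, and your client library may implement weighting differently, so check its documentation.

```python
import zlib

# Hypothetical pool of (server, weight); the box with twice the memory gets weight 2.
POOL = [("cache-a:11211", 1), ("cache-b:11211", 2)]

def build_buckets(pool):
    """Repeat each server once per unit of weight."""
    buckets = []
    for server, weight in pool:
        buckets.extend([server] * weight)
    return buckets

def pick_weighted(key, buckets):
    """Hash the key onto the bucket list; heavier servers own more buckets."""
    return buckets[zlib.crc32(key.encode("utf-8")) % len(buckets)]

buckets = build_buckets(POOL)
counts = {server: 0 for server, _ in POOL}
for i in range(3000):
    counts[pick_weighted(f"key:{i}", buckets)] += 1

# cache-b should end up with roughly twice as many keys as cache-a.
print(counts)
```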
Pay close attention to your connection objects. Connections can run out quickly if they are not configured properly. Check your client's configuration documentation so you know which method calls or actions create new connections and which ones don't.
Watching Server Health
memcached has a lot of statistical counters that let you monitor the overall health of your servers. Issuing the stats command to a memcached server shows you some of these health metrics. Those numbers include:
- curr_connections: the number of clients currently connected
- listen_disabled_num: the number of times memcached has hit its connection limit
- Global hit rate: how often your app has found a result in the cache, which you can derive from counters such as get_hits and get_misses
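The stats command replies with plain text, one "STAT name value" line per counter, followed by END. The sketch below parses such a reply and derives a hit rate from the get_hits and get_misses counters. The sample reply is hard-coded with made-up values so the example runs without a server; in practice you would read the text from a connection to your memcached instance.

```python
# Made-up sample of what a stats reply looks like; real replies contain many more counters.
SAMPLE_STATS = (
    "STAT curr_connections 12\r\n"
    "STAT listen_disabled_num 0\r\n"
    "STAT get_hits 900\r\n"
    "STAT get_misses 100\r\n"
    "END\r\n"
)

def parse_stats(raw):
    """Turn 'STAT name value' lines into a dict of name -> value strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats

def hit_rate(stats):
    """Fraction of lookups that found a cached value (0.0 if there were none)."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0

stats = parse_stats(SAMPLE_STATS)
print(stats["curr_connections"])  # 12
print(f"{hit_rate(stats):.1%}")   # 90.0%
```

A listen_disabled_num value that keeps climbing between checks is a hint that clients are being held up by the connection limit.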
The important thing to remember here is to keep your OS from using swap. Knowing this up front will keep your systems running healthy.
This series on 'I Know What Facebook is Using' will continue each Friday until there are no other technologies to talk about. Tune in next week for my latest installment!