Mike Solomon - Scalability at YouTube
(Skip to the 10 minute mark for the beginning of Mike's presentation)
Most of YouTube's infrastructure is written in Python: there are currently over a million lines of Python code, and many YouTube systems start out as a single Python file that grows into a large ecosystem over the years. As Mike put it:
A scalable system is one that's not in your way, that you're sort of unaware of. It's not buzzwords or anything like that, it's just about a general problem solving ethos… You need flexibility to solve problems and the minute you over specify something, you paint yourself into a corner.
YouTube is a prime example of this flexibility. For those who weren't aware, YouTube began as a dating site, and had it remained one, Mike would have been giving a very different presentation.
Mike equates distributed applications to weather systems and says that debugging them is about as deterministic as predicting the weather. YouTube uses jitter, a technique that adds some random variance to things like cache expirations. This prevents "thundering herds", where many cached entries expire at the same moment and every machine hits the backend at once to refresh them. For example, if your general expiration time is 24 hours, jitter lets you vary that time from 18 to 30 hours for each machine.
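The 18-to-30-hour example can be sketched in a few lines of Python. This is only an illustration of the general idea, not YouTube's actual code: the function name `jittered_ttl` and the constants are hypothetical, and the 25% spread is simply what turns a 24 hour base into an 18 to 30 hour range.

```python
import random

# Hypothetical values chosen to match the 24 hour example in the text.
BASE_TTL_SECONDS = 24 * 60 * 60   # general expiration time: 24 hours
JITTER_FRACTION = 0.25            # +/- 25% of 24 hours gives 18 to 30 hours


def jittered_ttl(base=BASE_TTL_SECONDS, fraction=JITTER_FRACTION):
    """Return an expiration time scaled by a random factor.

    The factor is drawn uniformly from [1 - fraction, 1 + fraction], so
    entries cached at the same moment do not all expire together.
    """
    return base * random.uniform(1.0 - fraction, 1.0 + fraction)


if __name__ == "__main__":
    # Each machine (or each cache entry) would draw its own value.
    for machine in range(3):
        hours = jittered_ttl() / 3600
        print(f"machine {machine}: expire after {hours:.1f} hours")
```

Because every caller draws its own expiration time, refreshes are spread across a 12 hour window instead of landing on the backend at the same instant.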
Mike goes on to cover a number of scalability techniques, including:
Divide and Conquer - partition the work out; simple, loosely coupled connections between the pieces are extremely valuable here.
Approximate Correctness - the system is what it appears to be, so if a user doesn't know that something is missing, then technically, it isn't.
Expert Knob Twiddling - adjust your system's consistency models based on the data you're processing.
Cheating - or rather, "knowing how to fake data." The fastest function call is the one that doesn't happen, so sometimes faking data is good enough.
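As one way to picture "the fastest function call is the one that doesn't happen", here is a minimal sketch of a counter that serves a slightly stale value instead of asking the backend on every read. It is an assumption-laden illustration, not something described in the talk: `ApproximateCounter`, `fetch_count`, and `max_age_seconds` are all hypothetical names.

```python
import time


class ApproximateCounter:
    """Serve a possibly stale count rather than querying the backend each time.

    `fetch_count` is any callable that returns the real (expensive) value.
    Reads within `max_age_seconds` of the last fetch reuse the old value,
    so the displayed number is approximately correct and usually free.
    """

    def __init__(self, fetch_count, max_age_seconds=60.0, clock=time.monotonic):
        self._fetch_count = fetch_count
        self._max_age = max_age_seconds
        self._clock = clock          # injectable so the class can be tested
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        is_stale = (
            self._fetched_at is None
            or now - self._fetched_at >= self._max_age
        )
        if is_stale:
            # The only place the expensive call happens.
            self._value = self._fetch_count()
            self._fetched_at = now
        return self._value
```

How stale is acceptable depends on the data: a view count can lag by a minute without anyone noticing, while other values may need much tighter guarantees.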
Finally, Mike touched on the efficiency of Python:
While C is more efficient, Python provides for greater scalability. Mike explains that many costs in Python are counterintuitive, such as the cost of garbage collection, and that efficiency in Python is mostly about knowing what not to do. To offset Python's efficiency cost, YouTube uses efficient libraries like wiseguy, pycurl, and spitfire.
There's a lot of really powerful metaprogramming things [in Python] and how they interact and how dynamic you make things has a pretty direct correlation to how expensive it is to run your Python app.
In this case dumb = fast: the less dynamic the code, the cheaper it is to run, and simple code is also easier to grep for and easier to maintain. The more complex a codebase is, the harder it is to decode.
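A small sketch can make the cost of dynamism concrete. The classes below are hypothetical and not from the talk. A plain attribute read is handled inside the interpreter, while a proxy that forwards attributes through `__getattr__` runs an extra Python-level function on every read. The example counts those extra calls rather than timing them, so the result does not depend on the machine.

```python
class Video:
    """Plain object: attribute reads need no extra Python-level calls."""

    def __init__(self, title):
        self.title = title


class DynamicProxy:
    """Forwards unknown attributes to a wrapped object via __getattr__."""

    def __init__(self, target):
        self._target = target
        self.hook_calls = 0

    def __getattr__(self, name):
        # Python calls this only after normal attribute lookup has failed,
        # so every forwarded read pays for a failed lookup plus this call.
        self.hook_calls += 1
        return getattr(self._target, name)


if __name__ == "__main__":
    video = Video("Scalability at YouTube")
    proxy = DynamicProxy(video)

    for _ in range(1000):
        video.title   # direct read, no hook involved
        proxy.title   # same value, but one __getattr__ call per read

    print("extra Python-level calls through the proxy:", proxy.hook_calls)
```

The proxy is also harder to grep: searching the codebase for `title` leads to `Video`, but nothing in `DynamicProxy` mentions the attribute it serves.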
So overall, when working with scalability in mind, keep things simple. Find the simplest solution to your problem that has the loosest, most practical guarantees.