It's no secret that Netflix kicked off their move to a cloud-based infrastructure back in 2008. The changeover from a monolithic architecture to one dominated by microservices now just seems like common sense to many developers — if not inevitable.
After all, as cybersecurity consultant Chris Eng pointed out on Twitter a couple of years back, "A single missing semicolon brought down the entire (monolithic) Netflix website for several hours in 2008."
But it's hard to fathom just how deep that overriding philosophy goes. For example, ever wonder how they can stream high-definition movies and shows over what seems like limited bandwidth?
Head in the Clouds
Fortunately, the Netflix crew keeps its goings-on fairly transparent. As they explained on their blog back in December, their entire encoding process takes place in the cloud. "The video encoding pipeline runs on EC2 Linux cloud instances," according to the blog. "The elasticity of the cloud enables us to seamlessly scale up when more titles need to be processed and scale down to free up resources."
Starting at the source video, they have a system in place to check its quality. This system is automated (because of course it is), scanning the source for compression artifacts and the like that would transfer to the viewer. "Garbage in means garbage out," the post goes on to say.
Assuming the video gets the thumbs up, it's just a matter of filtering it into their cloud-based encoding setup. If you're curious, their procedure looks a lot like this:
The Netflix gang uses several codecs to handle the variety of screens used to watch their content. "At Netflix we stream to a heterogenous set of viewing devices," their post says. "This requires a number of codec profiles: VC1, H.264/AVC Baseline, H.264/AVC Main and HEVC. We also support varying bandwidth scenarios for our members, all the way from sub-0.5 Mbps cellular to 100+ Mbps high-speed Internet."
For really big files, such as 4K video, they break the process into chunks that can run in parallel. The bigger the file, the more chunks they create to handle it.
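To make the chunking idea concrete, here's a minimal sketch in Python. It is not Netflix's pipeline: the function names, the chunk size, and the `encode_chunk` stand-in are all hypothetical, and a real setup would hand each chunk to an actual encoder on its own cloud instance rather than a local thread.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(total_frames, chunk_frames):
    """Return (start, end) frame ranges that together cover the whole title."""
    return [
        (start, min(start + chunk_frames, total_frames))
        for start in range(0, total_frames, chunk_frames)
    ]


def encode_chunk(frame_range):
    """Hypothetical stand-in for a real encoder call on one chunk."""
    start, end = frame_range
    return f"encoded[{start}:{end}]"


def encode_title(total_frames, chunk_frames=1000, workers=4):
    """Encode every chunk in parallel and return the pieces in order."""
    chunks = split_into_chunks(total_frames, chunk_frames)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the results line up with the chunks.
        return list(pool.map(encode_chunk, chunks))
```

With a fixed chunk size, a longer title simply produces more chunks, which is the "bigger file, more chunks" behavior described above.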
All Sizes Fit One
All right, that's nifty, but how do they get it to my screen? Or, more importantly, my screens? After all, I enjoy Netflix whether it's on my phone, my laptop, my desktop monitor, or my TV. How do they handle that? Do they just break the videos up into a bunch of resolutions and go from there?
Well, they used to. But now they prefer a more elegant, more personalized solution.
Netflix's system takes their entire library and decides how complex each and every title is. That's where the service differs from a lot of traditional broadcasters. Usually, a movie or show is assigned a single fixed bitrate or, at most, a set of resolutions that are each paired with a static bitrate.
As mentioned, that's how Netflix itself did it for a while. Then, they stumbled upon a realization.
Let's say I want to watch a cartoon — Young Justice, for example. Sure, it's packed with action and sound effects, but Robin and Kid Flash are never going to be more than two-dimensional renderings. Cadmus Labs looks ominous, but that animation is almost never going to be more complex than a high-budget action show — Marvel's Daredevil, for instance.
As impressive as Young Justice is (season one, at least), it's never going to need the resources that Daredevil requires to transmit cleanly and clearly. Yeah, Aqualad's embassy brawl with Sportsmaster was cool, but did you see any of Murdock's 57 protracted hallway fight sequences? The man deserves a bit of extra bandwidth.
Meanwhile, Young Justice can transmit just fine using fewer resources, which can be allocated elsewhere.
Netflix uses the peak signal-to-noise ratio (PSNR) to help determine visual quality. It's measured on a logarithmic scale and generally expressed in decibels. To keep it brief, PSNR compares the maximum possible power of a signal with the power of the corrupting noise that distorts it. It's generally a quick and dirty way to determine video quality. By Netflix's own admission, it isn't a perfect metric, but it works for their purposes.
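For the curious, the standard textbook definition of PSNR is 10 times the base-10 logarithm of the squared maximum sample value divided by the mean squared error. Here's a small Python sketch of that formula. It assumes 8-bit samples (a maximum of 255) and treats a frame as a flat list of pixel values, which is a simplification for illustration and not a description of Netflix's tooling.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio, in dB, between two equal-length sample lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error: the average power of the "corrupting noise".
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10 * math.log10(max_value ** 2 / mse)
```

The higher the number, the closer the encoded frame is to the source.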
To prove the diversity of its titles, and to show why a per-title system is a good idea, Netflix's staff compiled 100 of them on a graph.
The target for very good quality is 45 dB, but 38 dB is acceptable. The lower the bitrate a title needs to reach those targets, the more visually simple it tends to be. And if something can be encoded at 1080p with a bitrate of 2,000 kbps, then there's no need for a one-size-fits-all pipeline that transmits it at 4,500 kbps.
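That selection logic can be sketched in a few lines of Python. The 45 dB and 38 dB targets come from the discussion above, but the helper name and the example measurements are hypothetical, made up purely to show how a simple title and a complex one would land on different bitrates.

```python
def lowest_bitrate_for_target(measurements, target_db):
    """Return the smallest bitrate (kbps) whose PSNR meets the target, or None.

    `measurements` is a list of (bitrate_kbps, psnr_db) pairs for one title.
    """
    candidates = [kbps for kbps, db in measurements if db >= target_db]
    return min(candidates) if candidates else None


# Illustrative, invented numbers: not real Netflix measurements.
simple_title = [(1000, 39.0), (2000, 45.5), (4500, 48.0)]
complex_title = [(1000, 31.0), (2000, 35.0), (4500, 38.5)]
```

In this made-up data, the simple title hits the 45 dB target at 2,000 kbps, while the complex one only clears the 38 dB bar at 4,500 kbps and never reaches 45 dB at all.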
Here's the difference, courtesy of Netflix:
The Method to the Madness
So, how do they figure out what shows need which bitrates? An algorithm, of course.
In the team's own words: "To design the optimal per-title bitrate ladder, we select the total number of quality levels and the bitrate-resolution pair for each quality level according to several practical constraints. For example, we need backward-compatibility (streams are playable on all previously certified Netflix devices), so we limit the resolution selection to a finite set — 1920x1080, 1280x720, 720x480, 512x384, 384x288 and 320x240. In addition, the bitrate selection is also limited to a finite set, where the adjacent bitrates have an increment of roughly 5%."
Well, 5% of what?
The ideal bitrate is one that reaches (or very nearly reaches) Pareto efficiency. It's the point where neither of two measurements (in this case, visual quality and efficiency) can be improved any further without making the other one worse.
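One way to picture that is to filter a pile of trial encodes down to the ones nothing else beats. The Python sketch below keeps only the (bitrate, quality) points for which no other point offers equal or better quality at an equal or lower bitrate. It's a generic Pareto-front filter written for illustration, with a hypothetical function name, and it shouldn't be read as the exact method Netflix uses.

```python
def pareto_front(points):
    """Keep the (bitrate, quality) points that no other point dominates.

    A point is dominated when another point has a bitrate that is no higher
    and a quality that is no lower, and is strictly better in at least one.
    """
    front = []
    best_quality = float("-inf")
    # Walk from the cheapest bitrate upward; on ties, try higher quality first.
    for bitrate, quality in sorted(points, key=lambda p: (p[0], -p[1])):
        if quality > best_quality:
            front.append((bitrate, quality))
            best_quality = quality
    return front
```

Anything left off the front is a wasteful encode: it spends more bits for no gain in quality.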
So, starting there, or as close to it as they can possibly get, they introduce other bitrates at 5% intervals. According to Netflix's research, that's just under 1 just-noticeable difference (JND). The JND is something out of experimental psychology. It's the amount of stimulus needed for a person to recognize that something has changed.
So, by keeping the bitrate increments to under 1 JND, a viewer should theoretically only see a change when several steps are taken at once. In effect, Netflix takes the stance that those shifts should be as fluid as possible. Or at least as fluid as economically feasible for them.
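Here's a rough Python sketch of what a set of candidate bitrates spaced roughly 5% apart could look like. It assumes each step is 5% above the one before it, which is one reading of "adjacent bitrates have an increment of roughly 5%". The function name and the starting and maximum values are hypothetical.

```python
def candidate_bitrates(start_kbps, max_kbps, step=0.05):
    """List bitrates from start_kbps up to max_kbps, each about 5% above the last."""
    rates = []
    rate = float(start_kbps)
    while rate <= max_kbps:
        rates.append(round(rate))  # whole kbps is precise enough for a sketch
        rate *= 1 + step
    return rates
```

Because every rung sits only about 5% above its neighbor, hopping one rung up or down should stay below what a viewer can notice.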
Their whole process really shines for those low-intensity titles. Animation and live-action dramas that focus on characters rather than explosions (not that there's anything wrong with a good car crash) can be transmitted far more efficiently this way than with the older model, which just paired bitrates to resolutions.
The end result is a fluid bitrate ladder that is personalized not just to your resolution and bandwidth, but to the title that you're watching. It's just one of those things that makes sense in hindsight and makes you wonder why it didn't happen sooner.
Then again, Netflix has been making us say that for years now, so maybe it shouldn't be so surprising.