Since Netflix’s new rating system was released in April, we’ve seen a flurry of opinion articles coming down hard on the change (with over-the-top, clickbaity titles, I might add).
There is even a petition going around to change it back.
Although the change has received largely negative reviews (#meta), I believe it to be a mighty improvement from an AI and business perspective.
Instead of using the overwhelmingly common 5-star system, Netflix has switched to a thumbs-down or thumbs-up approach. While this change has received coverage from a wide array of online media outlets, not a single one (according to my research) mentions how this impacts Netflix’s collaborative filtering learning algorithms.
One major reason Netflix made this change is to make better use of entry data within its learning algorithms.
To do so, they needed to move from an objective rating system to a subjective one.
People like the 5-star rating because it is viewed as objective. It basically pools together all the different ratings it received and shows you the average. Google Play’s interface is a great example to better understand this. Let’s take a look at the Netflix app:
We can see the app is well liked: most people rated it 5 stars, and all ratings put together average 4.4/5. The same goes for content on Netflix. With enough people using it, you tend to get a fairly good idea of whether a movie or TV show is good. While this was one of the strengths of the 5-star system, it is also one of the main causes for the change. Obviously, Netflix doesn’t want you to log in and see plenty of 1-star or 2-star movies. You would quickly think less of Netflix as a content provider and producer, but I would argue this is only part of the equation.
The five-star rating system is considered objective, but many, if not most, people naturally step out of the system’s intended use. Rather than objectively trying to assess the worth of the content, most people understand the principle that content will be presented to them according to how they rate it, creating a bias in how the system is used. I, for example, am very much guilty of this. I used to rate shows either 1 or 5 stars. My goal was to accelerate content filtering according to my tastes. Some TV shows and movies that could’ve been potentially good to others fell victim to my not-so-objective behavior and the binary rating system I crafted for myself could’ve skewed ratings for others. Even though Orange Is the New Black is a great show, it is not to my liking, so I gave it 1 star. With consumers more and more informed on the principles of recommendation engines, this type of biased behavior could make a big difference in the precision of movie recommendations.
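To get a feel for how much this kind of strategic rating can move an average, here is a small Python sketch. The ratings are invented for illustration (they are not Netflix data), and the names `honest_ratings` and `strategic_ratings` are hypothetical.

```python
from statistics import mean

# Hypothetical star ratings for one title, where every viewer tries
# to judge the quality of the show objectively.
honest_ratings = [4, 4, 5, 3, 4, 5, 4, 4, 3, 4]

# The same ten viewers, except three of them now rate strategically:
# the show is not to their taste, so they give it 1 star to steer
# their own recommendations, regardless of its quality.
strategic_ratings = [4, 4, 5, 1, 4, 5, 1, 4, 1, 4]

honest_avg = mean(honest_ratings)
strategic_avg = mean(strategic_ratings)

print(f"Average with objective ratings: {honest_avg:.1f}")
print(f"Average with strategic ratings: {strategic_avg:.1f}")
```

In this made-up case, three strategic raters out of ten are enough to pull the displayed average from 4.0 down to 3.3, even though nobody changed their opinion of the show's quality.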
Instead of fighting consumer habits, Netflix adapted to them. In my mind, as a marketer, this is always a good thing. When you think about it, I was already using the 5-star system as a binary (thumbs up/down). Netflix wants you to think about whether a title is right for you. It’s all about personalization: moving from a collective, community rating system to a more personalized one.
It’s All About Entry Data
With the 5-star system, things tend to even out when enough people are involved, but with AI, data validity and precision are key to driving valuable results. Building a recommendation engine that uses machine learning to curate content appropriately is far more challenging if your data is unreliable.
Machine learning can be used to assess a large amount of your consumer behavior, compare it to similar consumers, and suggest content that should match those habits. Part of Netflix’s recommendation engine uses collaborative filtering. Simply put, it is a three-step process in which you collect user information, form a matrix to calculate associations, and finally make a recommendation.
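The three steps can be sketched in a few lines of Python. This is a minimal, user-based collaborative filtering toy, assuming thumbs are stored as +1 and -1 and using cosine similarity as the measure of association. It is not Netflix's actual algorithm, and the users, titles, and function names are all made up.

```python
from math import sqrt

# Step 1: collect user feedback. +1 = thumbs up, -1 = thumbs down,
# and a missing title means "not rated". This dict of dicts acts as
# a sparse user-by-title matrix. All data here is hypothetical.
ratings = {
    "alice": {"Show A": 1, "Show B": -1, "Show C": 1},
    "bob": {"Show A": 1, "Show B": -1, "Show D": 1},
    "carol": {"Show A": -1, "Show B": 1, "Show D": -1},
}


def cosine_similarity(a, b):
    """Step 2: association between two users (unrated titles count as 0)."""
    shared = set(a) & set(b)
    dot = sum(a[title] * b[title] for title in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Step 3: score unseen titles by similarity-weighted votes."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for title, vote in other_ratings.items():
            if title not in ratings[user]:
                scores[title] = scores.get(title, 0.0) + sim * vote
    # Highest score first: the titles most likely to match this user.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


print(recommend("alice", ratings))
```

Here Alice agrees with Bob and disagrees with Carol on the shows they have in common, so Bob's thumbs up and Carol's thumbs down on "Show D" both push it up Alice's list.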
When working on a learning algorithm, you aim for the most accurate prediction. You end up fighting for tiny percentage points. On a scale as large as Netflix’s, every percentage point can mean a difference of millions of dollars.
That is precisely why Netflix used to hold an open competition with a grand prize of $1,000,000. From 2006 to 2009, “the Netflix Prize sought to substantially improve the accuracy of predictions about how much someone is going to enjoy a movie based on their movie preferences.” Plans for a follow-up contest were dropped in March 2010 after privacy concerns led to a class action lawsuit against Netflix.
At some point, significantly improving the algorithm’s performance is hardly possible unless you change how your data is sourced. Moving from a five-valued variable (1 to 5 stars) to a simpler binary input reduces complexity and provides cleaner, more consistent data.
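One way to picture why binary input can be cleaner: two people with the same taste may use a star scale very differently, yet give identical thumbs. The sketch below is only an illustration. The `to_thumbs` helper and its rule (a thumbs up for anything at or above the user's own average) are my own assumptions, not how Netflix handled its existing star ratings.

```python
def to_thumbs(star_ratings):
    """Map one user's star ratings to +1 / -1 relative to their own average.

    Hypothetical rule for illustration: a title rated at or above the
    user's personal average counts as a thumbs up, anything below it
    as a thumbs down.
    """
    average = sum(star_ratings.values()) / len(star_ratings)
    return {
        title: (1 if stars >= average else -1)
        for title, stars in star_ratings.items()
    }


# Two made-up users with the same taste but different rating habits.
harsh = {"Show A": 3, "Show B": 1, "Show C": 3}      # never goes above 3
generous = {"Show A": 5, "Show B": 3, "Show C": 5}   # never goes below 3

print(to_thumbs(harsh))
print(to_thumbs(generous))
```

On the star scale these two users look quite different (the harsh rater's favorite gets the same 3 stars as the generous rater's least favorite), but as thumbs they produce exactly the same signal.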
A bonus of the change was an increase in engagement. Netflix reported a 200% increase in user ratings. This means more data, and more data tends to mean more accurate predictions.
Combine the influence on data quality with the fact that we can now use a rating system (with respect to our personal preferences) that's no longer about good or bad content but about matching content, and the reason why they made the change becomes clear.