Our post received over 4,200 upvotes, and more than 20,000 visitors rated over 250,000 movies and shows…in just one day. We weren’t giving movies away or incentivizing users in any secret way – people were just really excited about the “graphy” and connected nature of the product, and they dove right in.
Some quick context about Reddit and the subreddit we posted to: r/InternetIsBeautiful looks for “Awesome websites that offer a unique service.” Posts are meant to be a conversation — so a product launch is a great way to get real-time feedback on what users like, love, hate or want.
Now that the dust has settled, we thought we would look at some real user feedback on the NextQueue features that people loved and explain how our media personalization platform, The Entertainment Graph, uses Neo4j to make those features possible.
Here are some of the things we heard:
“Upvoted and shared to my friends for two reasons. One being this service is great. The second is that you actually make the distinction between Fantasy and Sci-fi.”
The Entertainment Graph powers NextQueue in part by connecting more than 4,500 traits to over 60,000 premium movies and shows. We can easily combine these traits to generate what we call GraphGenres – meaningful lists of content that share a specific set of traits – and we can run algorithms against the set that would be extremely difficult in a traditional RDBMS.
Modeling all that data in a graph allows us to take a very nuanced approach to describing content with genres, moods, tones, themes, styles and other important terms. That nuance shows in the little things, like splitting out Fantasy and Sci-Fi, but also in the overall results of the recommendations.
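To make the idea of combining traits concrete, here is a minimal Python sketch of how a GraphGenre-style list could be derived from title-to-trait connections. It is an illustration only, not the production code: the names `TITLE_TRAITS`, `build_trait_index` and `graph_genre` are hypothetical, the titles and traits are toy data, and a plain in-memory dictionary stands in for the graph.

```python
from collections import defaultdict

# Hypothetical toy data: title -> connected traits (not real Entertainment Graph data).
TITLE_TRAITS = {
    "Title A": {"sci-fi", "dystopian", "dark"},
    "Title B": {"fantasy", "epic", "dark"},
    "Title C": {"sci-fi", "dark", "cerebral"},
    "Title D": {"fantasy", "whimsical"},
}


def build_trait_index(title_traits):
    """Invert title -> traits into trait -> titles (like walking trait edges backwards)."""
    index = defaultdict(set)
    for title, traits in title_traits.items():
        for trait in traits:
            index[trait].add(title)
    return index


def graph_genre(index, required_traits):
    """Return the sorted titles that are connected to every required trait."""
    trait_sets = [index.get(trait, set()) for trait in required_traits]
    if not trait_sets:
        return []
    return sorted(set.intersection(*trait_sets))


index = build_trait_index(TITLE_TRAITS)
dark_sci_fi = graph_genre(index, ["sci-fi", "dark"])
dark_fantasy = graph_genre(index, ["fantasy", "dark"])
```

Because Fantasy and Sci-Fi are separate traits in this toy model, `dark_sci_fi` and `dark_fantasy` come out as different lists even though both share the "dark" trait.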
“I’m so happy someone created this – It was almost obvious when you go to Netflix or IMDb and you see a ‘people also liked’ option and you ended up in an infinite loop of the same 4-5 items.”
While building The Entertainment Graph we focused a lot on two qualities: variety and versatility. We didn’t want to create a massive echo chamber that keeps bringing back the same tired suggestions, and we didn’t want to build a rigid platform that limits the possibilities for our API partners.
Here’s the reality: Netflix can only recommend media to you from what they have available. The same goes for Hulu and Amazon Prime. But because NextQueue brings together all your online services, it has the advantage of a larger catalog of titles to recommend from.
We’ve also built our API to be very flexible – it accepts one or more factors such as Movies, Shows, Traits, Collections, Users, or almost anything else you can think of, and then you can add filters like MPAA rating, Rotten Tomatoes score or release year to narrow down the results.
Fast graph traversals using low-level extension APIs allow us to determine the content that is the most relevant within the constraints provided. Because of the traversal speed, we can consider many more data points that bring helpful context to the recommendations, such as the connected traits, the user’s preferences and the activities of other users with similar tastes.
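The "factors plus filters" pattern described above can be sketched in a few lines of Python. This is a simplified stand-in under stated assumptions, not the actual API or its Neo4j extension code: the `Title` class, the `recommend` function, its parameters and the toy catalog are all hypothetical, and relevance is approximated by counting traits shared with the seed titles.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Title:
    """Hypothetical catalog entry; field names are assumptions for illustration."""
    name: str
    year: int
    rating: str  # an MPAA-style rating label
    traits: frozenset = field(default_factory=frozenset)


CATALOG = [
    Title("Seed Movie", 1999, "R", frozenset({"sci-fi", "cerebral", "dark"})),
    Title("Candidate 1", 2010, "PG-13", frozenset({"sci-fi", "cerebral"})),
    Title("Candidate 2", 1985, "PG", frozenset({"sci-fi"})),
    Title("Candidate 3", 2014, "R", frozenset({"sci-fi", "cerebral", "dark"})),
    Title("Candidate 4", 2012, "PG-13", frozenset({"romance"})),
]


def recommend(catalog, seed_names, min_year=None, allowed_ratings=None, limit=10):
    """Rank titles by traits shared with the seeds, after applying the filters."""
    by_name = {title.name: title for title in catalog}
    # Weight each trait by how many seed titles carry it.
    trait_weights = Counter()
    for name in seed_names:
        trait_weights.update(by_name[name].traits)

    scored = []
    for title in catalog:
        if title.name in seed_names:
            continue  # never recommend the seeds themselves
        if min_year is not None and title.year < min_year:
            continue
        if allowed_ratings is not None and title.rating not in allowed_ratings:
            continue
        score = sum(trait_weights[trait] for trait in title.traits)
        if score > 0:
            scored.append((score, title.name))

    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


unfiltered = recommend(CATALOG, ["Seed Movie"])
filtered = recommend(CATALOG, ["Seed Movie"], min_year=2000, allowed_ratings={"PG-13"})
```

In a real graph the scoring step would be a traversal from the seed nodes rather than a loop over the whole catalog, which is where traversal speed matters. The sketch only shows how seeds and filters combine into a ranked list.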
“I’ve tried a number of similar resources in the hopes of finding reliably accurate recommendations, and I have to say that this is by far the best resource I’ve encountered so far. After picking my initial 3 movies, about 80-90% of recommendations were right on point. I imagine that as I continue inputting my preferences, it will only get more accurate. Keep up the good work!”
We actually validate our recommendations against our real user dataset of a million taste profiles, allowing us to assign confidence scores to our results. For example, with ten data points for a user, we have an average of 90% confidence that they will like the first four recommendations we return. So this Reddit user wasn’t far off!
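One common way to arrive at a number like this is a hold-out check: hide part of each known taste profile, generate recommendations from the rest, and measure how many of the top results the user really did like. The post does not describe the exact validation method, so the Python sketch below is an assumption. `confidence_at_k` and `toy_recommender` are hypothetical names and the profiles are toy data.

```python
from collections import Counter

# Hypothetical toy profiles: each list is one user's liked titles, in input order.
PROFILES = [
    ["a", "b", "c", "d"],
    ["a", "b", "d", "e"],
    ["a", "c", "d", "f"],
]


def toy_recommender(known):
    """Stand-in recommender: most-liked titles overall, minus what the user already gave us.

    Popularity is counted over all profiles, so this is only a placeholder and
    not a fair model to evaluate.
    """
    popularity = Counter(title for profile in PROFILES for title in profile)
    candidates = [title for title in popularity if title not in known]
    return sorted(candidates, key=lambda title: (-popularity[title], title))


def confidence_at_k(profiles, recommend_fn, n_inputs, k):
    """Average share of the top-k recommendations found in each user's held-out likes."""
    hit_rates = []
    for liked in profiles:
        if len(liked) <= n_inputs:
            continue  # nothing left to validate against
        known, held_out = liked[:n_inputs], set(liked[n_inputs:])
        recs = recommend_fn(known)[:k]
        if not recs:
            continue
        hits = sum(1 for title in recs if title in held_out)
        hit_rates.append(hits / len(recs))
    return sum(hit_rates) / len(hit_rates) if hit_rates else 0.0


score = confidence_at_k(PROFILES, toy_recommender, n_inputs=2, k=2)
```

With `n_inputs=10` and `k=4` on a large set of real profiles, a result near 0.9 would correspond to the kind of figure quoted above.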
The recommendation engine gets better as people add more data to their taste profile, but that improvement isn’t just a straight line – we’ve made some big leaps forward over time.
By analyzing the graph database, we are able to find clusters of users who share similar interests in content. These clusters evolve over time, so our algorithms constantly scan the graph to identify new clusters and to place users more precisely within new and existing ones.
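As a rough illustration of what finding clusters of like-minded users can look like, the Python sketch below links users whose liked titles overlap strongly and then groups them into connected components. The post does not name the clustering algorithm in use, so Jaccard similarity, the 0.5 threshold and the names `jaccard`, `cluster_users` and `LIKES` are all assumptions chosen for simplicity.

```python
from itertools import combinations

# Hypothetical toy data: user -> set of liked title ids.
LIKES = {
    "ann": {"t1", "t2", "t3"},
    "bob": {"t2", "t3", "t4"},
    "cat": {"t7", "t8"},
    "dan": {"t7", "t8", "t9"},
    "eve": {"t5"},
}


def jaccard(a, b):
    """Overlap between two sets: size of intersection divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_users(likes, threshold=0.5):
    """Group users into connected components of a 'similar taste' graph."""
    users = sorted(likes)
    neighbors = {user: set() for user in users}
    # Add an edge between every pair of users whose tastes are similar enough.
    for u, v in combinations(users, 2):
        if jaccard(likes[u], likes[v]) >= threshold:
            neighbors[u].add(v)
            neighbors[v].add(u)

    clusters, seen = [], set()
    for start in users:
        if start in seen:
            continue
        # Depth-first walk collects everyone reachable from this user.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


clusters = cluster_users(LIKES)
```

Rerunning a grouping like this as new ratings arrive is one simple way for clusters to shift over time. A user whose likes change can fall below the threshold with one group and above it with another.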