# A Deep Dive Into Recommendation Algorithms With Netflix Case Study and NVIDIA Deep Learning Technology

### Take a deep dive into recommendation algorithms, which drive user engagement and revenue on major internet platforms.

## What Are Recommendation Algorithms?

Recommendation engines are the secret behind much of what happens on the Internet, be it Amazon, Netflix, Flipkart, YouTube, TikTok, or even LinkedIn, Facebook, X (Twitter), Snapchat, Medium, Substack, and HackerNoon. All of these sites, and nearly every content curation or product marketplace site on the Internet, make their big bucks from recommendation algorithms.

Simply put, a recommendation algorithm builds a model of your likes, dislikes, favorites, and the genres and items you prefer. When you make a transaction on the site, it practically reads your mind and predicts the next product you are most likely to buy. Some of the recommendation algorithms on YouTube and TikTok are so accurate that they can keep users hooked for hours. I would be surprised if even one reader had never had a YouTube binge that started with just ten minutes of scrolling and clicking or tapping.

This leads to better customer engagement, a better customer experience, and increased revenue for the platform itself. Addiction is built upon the scary accuracy of these ultra-optimized algorithms.

This is how these giants build their audience.

The monthly visitors to YouTube, TikTok, Instagram and Facebook are (source):

- Facebook: 2.9 Billion
- YouTube: 2.2 Billion
- Instagram: 1.4 Billion
- TikTok: 1 Billion

And the secret to their success: fantastic recommendation algorithms.

## Types of Recommendation Algorithms

### Collaborative Filtering (User-Based)

User-based collaborative filtering is a recommendation technique that assumes users with similar preferences will have similar tastes. It utilizes user-item interaction data to identify similarities between users, often employing measures such as cosine similarity or Pearson correlation. The method predicts a user's ratings or preferences based on the ratings given by similar users.

However, it can face challenges, such as the cold-start problem for new users who have not yet interacted with the system, and scalability issues may arise when dealing with a large number of users.

```
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def user_based_cf(ratings_matrix, user_id, k=5):
    # Pairwise cosine similarity between user rows (users x users)
    similarities = cosine_similarity(ratings_matrix)
    user_similarities = similarities[user_id]
    # Sort descending and skip the first entry, which is normally the user itself
    similar_users = np.argsort(user_similarities)[::-1][1:k+1]
    recommendations = np.zeros(ratings_matrix.shape[1])
    for similar_user in similar_users:
        recommendations += ratings_matrix[similar_user]
    # Unweighted average of the neighbours' ratings for every item
    return recommendations / k
```

- Uses cosine similarity to calculate user similarities
- Finds the `k` most similar users to the target user
- Aggregates the ratings of similar users to generate recommendations
- Returns the average rating for each item from similar users
- Simple implementation that can be easily modified or extended
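The text above also mentions Pearson correlation as a similarity measure. The following is a minimal sketch of that variant, assuming a small dense ratings array in which 0 means "not rated" and treating those zeros as real ratings for simplicity. The function names `pearson_user_similarity` and `predict_with_pearson` are illustrative, and the neighbours are weighted by their similarity rather than averaged equally.

```python
import numpy as np

def pearson_user_similarity(ratings_matrix):
    """Pearson correlation between every pair of user rows.

    Simplification: zeros (unrated items) are treated as real ratings.
    """
    sims = np.corrcoef(np.asarray(ratings_matrix, dtype=float))
    return np.nan_to_num(sims)  # users with constant rows get similarity 0

def predict_with_pearson(ratings_matrix, user_id, k=2):
    ratings = np.asarray(ratings_matrix, dtype=float)
    sims = pearson_user_similarity(ratings)[user_id].copy()
    sims[user_id] = -np.inf                          # never pick the target user
    neighbours = np.argsort(sims)[::-1][:k]          # k most correlated users
    weights = np.clip(sims[neighbours], 0.0, None)   # ignore negatively correlated users
    if weights.sum() == 0:
        return np.zeros(ratings.shape[1])
    # Similarity-weighted average of the neighbours' ratings
    return weights @ ratings[neighbours] / weights.sum()

# Toy data: users 0 and 1 share tastes, users 2 and 3 share the opposite tastes
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
])
predicted = predict_with_pearson(ratings, user_id=0, k=2)
```

Because Pearson correlation subtracts each user's mean, it is less sensitive than raw cosine similarity to users who rate everything high or everything low.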

### Collaborative Filtering (Item-Based)

Item-based collaborative filtering assumes that users will prefer items similar to those they have liked in the past. It calculates the similarity between items based on user ratings or interactions. This approach is often more scalable than user-based collaborative filtering, particularly when there are many users and fewer items. It allows for the pre-computation of item similarities, which can make real-time recommendations faster.

While it handles new users better than user-based methods, it may struggle with new items that lack sufficient ratings. Additionally, it is less affected by changes in user preferences over time.

```
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def item_based_cf(ratings_matrix, item_id, k=5):
    # Transpose so rows are items, then compare items by their rating columns
    similarities = cosine_similarity(ratings_matrix.T)
    item_similarities = similarities[item_id]
    # Sort descending and skip the first entry, which is normally the item itself
    similar_items = np.argsort(item_similarities)[::-1][1:k+1]
    recommendations = np.zeros(ratings_matrix.shape[0])
    for similar_item in similar_items:
        recommendations += ratings_matrix[:, similar_item]
    # One score per user: the average rating given to the similar items
    return recommendations / k
```

- Transposes the rating matrix to calculate item-item similarities
- Finds the `k` most similar items to the target item
- Aggregates user ratings for similar items
- Returns the average rating for each user based on similar items
- Efficient for systems with more users than items
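The pre-computation mentioned above can be sketched as follows: the item-item similarity matrix is built once, and each request then only needs a matrix-vector product. This is an illustrative NumPy-only version, not a production design. The helper names `item_similarity_matrix` and `recommend_items` are assumptions, and 0 is assumed to mean "not rated".

```python
import numpy as np

def item_similarity_matrix(ratings_matrix):
    """Cosine similarity between item columns (items x items), computed once."""
    norms = np.linalg.norm(ratings_matrix, axis=0)
    norms[norms == 0] = 1.0                   # avoid dividing by zero for unrated items
    normalized = ratings_matrix / norms
    sims = normalized.T @ normalized
    np.fill_diagonal(sims, 0.0)               # an item should not vote for itself
    return sims

def recommend_items(ratings_matrix, item_sims, user_id, n=2):
    """Score unseen items by similarity to the items the user already rated."""
    user_ratings = ratings_matrix[user_id]
    scores = item_sims @ user_ratings         # similarity-weighted sum of the user's ratings
    scores[user_ratings > 0] = -np.inf        # hide items the user has already rated
    return np.argsort(scores)[::-1][:n]

ratings = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [0, 1, 5, 4],
    [1, 0, 4, 0],
])
item_sims = item_similarity_matrix(ratings)   # offline step
top_items = recommend_items(ratings, item_sims, user_id=3, n=2)  # online step
```

In this toy data, user 3 rated item 2 highly, so item 3 (often rated alongside item 2) ranks ahead of item 1.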

### Matrix Factorization

Matrix factorization decomposes the user-item interaction matrix into lower-dimensional matrices, assuming that user preferences and item characteristics can be represented by latent factors. Techniques such as Singular Value Decomposition (SVD) or Alternating Least Squares (ALS) are commonly used for this purpose.

This approach can efficiently handle large, sparse datasets and often provides better accuracy compared to memory-based collaborative filtering methods. Additionally, it can incorporate regularization techniques to prevent overfitting, enhancing the model's generalization to unseen data.

```
import numpy as np

def matrix_factorization(R, P, Q, K, steps=5000, alpha=0.0002, beta=0.02):
    # R: ratings (0 = unobserved); P: users x K; Q: items x K (float NumPy arrays)
    Q = Q.T
    for step in range(steps):
        # Gradient step on every observed rating
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    eij = R[i][j] - np.dot(P[i,:], Q[:,j])
                    for k in range(K):
                        p_ik = P[i][k]  # keep the old value so both updates use the same point
                        P[i][k] += alpha * (2 * eij * Q[k][j] - beta * P[i][k])
                        Q[k][j] += alpha * (2 * eij * p_ik - beta * Q[k][j])
        # Regularized squared error over the observed ratings
        e = 0
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    e += pow(R[i][j] - np.dot(P[i,:], Q[:,j]), 2)
                    for k in range(K):
                        e += (beta/2) * (pow(P[i][k], 2) + pow(Q[k][j], 2))
        if e < 0.001:
            break
    return P, Q.T
```

- Implements a basic matrix factorization algorithm
- Uses gradient descent to minimize the error between predicted and actual ratings
- Incorporates regularization to prevent overfitting
- Iteratively updates user and item latent factors
- Stops when error falls below a threshold or maximum steps are reached
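The section also names Alternating Least Squares (ALS) as a common technique. Below is a minimal sketch of ALS under the same convention that 0 means "unobserved". It alternately fixes the item factors and solves a small ridge regression for each user, then does the same for each item. The function name `als`, the hyperparameter values, and the toy matrix are illustrative assumptions.

```python
import numpy as np

def als(R, K=2, reg=0.1, iterations=20, seed=0):
    """Alternating Least Squares on the observed (non-zero) entries of R."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    mask = R > 0
    P = rng.normal(scale=0.1, size=(n_users, K))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, K))   # item factors
    ridge = reg * np.eye(K)
    for _ in range(iterations):
        # Fix Q, solve a regularized least-squares problem per user
        for u in range(n_users):
            Qu = Q[mask[u]]
            if len(Qu):
                P[u] = np.linalg.solve(Qu.T @ Qu + ridge, Qu.T @ R[u, mask[u]])
        # Fix P, solve a regularized least-squares problem per item
        for i in range(n_items):
            Pi = P[mask[:, i]]
            if len(Pi):
                Q[i] = np.linalg.solve(Pi.T @ Pi + ridge, Pi.T @ R[mask[:, i], i])
    return P, Q

def regularized_loss(R, P, Q, reg=0.1):
    """Squared error on observed entries plus the L2 penalty that ALS minimizes."""
    mask = R > 0
    err = (R - P @ Q.T)[mask]
    return float(err @ err + reg * ((P ** 2).sum() + (Q ** 2).sum()))

R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)
P, Q = als(R)
predicted = P @ Q.T   # dense matrix of predicted ratings
```

Unlike the gradient descent version, each ALS sub-step has a closed-form solution, so there is no learning rate to tune, and the per-user and per-item solves are independent of each other.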

### Content-Based Filtering

Content-based filtering recommends items based on their features and user preferences. It builds a profile for each user and item based on their characteristics.

Techniques such as TF-IDF (Term Frequency-Inverse Document Frequency) for text analysis and cosine similarity for matching are commonly employed. This approach effectively addresses the new item problem, as it does not rely on prior user interactions.

However, it may suffer from overspecialization, resulting in a lack of diversity in recommendations. Additionally, effective implementation requires good feature engineering to ensure that the relevant characteristics of items are accurately captured.

```
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def content_based_filtering(item_descriptions, user_profile, k=5):
    # item_descriptions: list of strings; user_profile: one string describing the user's tastes
    vectorizer = TfidfVectorizer()
    item_vectors = vectorizer.fit_transform(item_descriptions)
    # Project the user profile into the same TF-IDF vocabulary as the items
    user_vector = vectorizer.transform([user_profile])
    similarities = cosine_similarity(user_vector, item_vectors)
    # Indices of the k items closest to the profile, best first
    top_items = np.argsort(similarities[0])[::-1][:k]
    return top_items
```

- Uses TF-IDF to convert text descriptions into numerical vectors
- Calculates cosine similarity between user profile and item descriptions
- Returns the top `k` most similar items to the user profile
- Efficient for systems with well-defined item features
- Can be easily extended to include multiple feature types
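The text says a profile is built for each user, but the function above expects that profile as ready-made text. One common option, sketched here as an assumption rather than as the article's method, is to average the TF-IDF vectors of the items the user liked. To stay self-contained, the sketch uses a small hand-rolled smoothed TF-IDF with whitespace tokenization. The names `tfidf_matrix` and `recommend_from_likes` are illustrative.

```python
import math
from collections import Counter

import numpy as np

def tfidf_matrix(docs):
    """Tiny TF-IDF: raw term counts times a smoothed IDF, rows L2-normalized."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    X = np.zeros((n_docs, len(vocab)))
    for row, doc in enumerate(tokenized):
        for tok, count in Counter(doc).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1
            X[row, index[tok]] = count * idf
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return X / norms

def recommend_from_likes(item_descriptions, liked_ids, k=1):
    X = tfidf_matrix(item_descriptions)
    profile = X[liked_ids].mean(axis=0)   # user profile = centroid of liked items
    scores = X @ profile                  # proportional to cosine similarity (rows are unit length)
    scores[liked_ids] = -np.inf           # do not recommend what the user already liked
    return np.argsort(scores)[::-1][:k]

items = [
    "space adventure with robots",
    "romantic comedy in paris",
    "robots fight in space war",
    "french romantic drama",
]
top = recommend_from_likes(items, liked_ids=[0], k=1)
```

Because the profile is made only of things the user already liked, this sketch also illustrates the overspecialization risk described above: it can only surface items that share vocabulary with past likes.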

### Hybrid Recommendation System

Hybrid recommendation systems combine two or more recommendation techniques to leverage their respective strengths. By integrating multiple approaches, hybrid systems can mitigate the weaknesses of individual methods, such as the cold-start problem. Common combinations include collaborative and content-based filtering. Various methods are used for combining these techniques, such as weighted, switching, mixed, or meta-level approaches.

Hybrid systems often provide more robust and accurate recommendations compared to single-approach systems. However, effective implementation requires careful tuning to balance the different components and ensure optimal performance.

```
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def hybrid_recommender(ratings_matrix, content_matrix, user_id, alpha=0.5, k=5):
    # Both matrices need one row per user (content_matrix holds user-level
    # content features) so the two similarity matrices have the same shape
    cf_similarities = cosine_similarity(ratings_matrix)
    content_similarities = cosine_similarity(content_matrix)
    # Weighted blend: alpha=1 is pure collaborative, alpha=0 is pure content-based
    hybrid_similarities = alpha * cf_similarities + (1 - alpha) * content_similarities
    user_similarities = hybrid_similarities[user_id]
    similar_users = np.argsort(user_similarities)[::-1][1:k+1]
    recommendations = np.zeros(ratings_matrix.shape[1])
    for similar_user in similar_users:
        recommendations += ratings_matrix[similar_user]
    return recommendations / k
```

- Combines collaborative filtering and content-based similarities
- Uses a weighted sum approach with parameter alpha
- Finds similar users based on the hybrid similarity
- Generates recommendations from similar users' ratings
- Allows for easy adjustment of the balance between CF and content-based approaches
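The code above is a weighted hybrid at the similarity level. The switching approach mentioned earlier can be sketched at the score level instead: use content-based scores alone while a user is still in cold start, and blend in collaborative scores once enough ratings exist. The threshold `min_ratings`, the min-max scaling, and the function names are illustrative assumptions.

```python
import numpy as np

def min_max_scale(scores):
    """Rescale scores to [0, 1] so the two recommenders are comparable."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span

def hybrid_scores(cf_scores, content_scores, n_user_ratings, alpha=0.5, min_ratings=5):
    content = min_max_scale(content_scores)
    if n_user_ratings < min_ratings:
        # Switching: too little history for collaborative filtering, use content only
        return content
    # Weighted: blend the two score vectors once the user has enough history
    return alpha * min_max_scale(cf_scores) + (1 - alpha) * content

cf = [4.0, 2.0, 0.0]          # per-item scores from a collaborative model
content = [0.1, 0.9, 0.5]     # per-item scores from a content-based model
cold_start = hybrid_scores(cf, content, n_user_ratings=1)
warm = hybrid_scores(cf, content, n_user_ratings=20)
```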

### Singular Value Decomposition (SVD)

Singular Value Decomposition (SVD) is a matrix factorization technique that decomposes a matrix into three components: U, Σ, and V^T. In this decomposition, U and V represent the left and right singular vectors, respectively, while Σ contains the singular values.

SVD reduces dimensionality by retaining only the top `k` singular values, which helps uncover latent factors in user-item interactions.

This method is efficient for handling large, sparse matrices commonly found in recommendation systems. Additionally, SVD provides a good balance between accuracy and computational efficiency, making it a popular choice for generating recommendations.

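```
import numpy as np
from scipy.sparse.linalg import svds

def svd_recommender(ratings_matrix, k=5):
    # svds needs floating-point input, and k must be smaller than both matrix dimensions
    U, s, Vt = svds(ratings_matrix.astype(float), k=k)
    sigma = np.diag(s)
    # Rank-k reconstruction: U * Sigma * V^T
    predicted_ratings = np.dot(np.dot(U, sigma), Vt)
    return predicted_ratings
```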

- Uses SciPy's `svds` function to perform truncated SVD
- Reconstructs the rating matrix using only the top `k` singular values
- Returns a dense matrix of predicted ratings for all user-item pairs
- Efficient for large, sparse rating matrices
- Can be easily integrated into a larger recommendation system
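To turn the reconstructed matrix into actual recommendations, a typical next step is to pick the highest-scoring items a user has not rated yet. The sketch below assumes a small dense matrix and uses NumPy's full `np.linalg.svd`, truncated to `k` components, as a stand-in for a sparse solver. Subtracting each user's mean first is a common but optional step, and computing that mean over the zeros is a simplification. The function names are illustrative.

```python
import numpy as np

def low_rank_predictions(ratings_matrix, k=2):
    R = np.asarray(ratings_matrix, dtype=float)
    user_means = R.mean(axis=1, keepdims=True)      # simplification: zeros count toward the mean
    U, s, Vt = np.linalg.svd(R - user_means, full_matrices=False)
    # Keep only the k largest singular values and add the means back
    return (U[:, :k] * s[:k]) @ Vt[:k] + user_means

def top_n_unseen(predictions, ratings_matrix, user_id, n=2):
    scores = predictions[user_id].copy()
    scores[np.asarray(ratings_matrix)[user_id] > 0] = -np.inf   # mask already-rated items
    return np.argsort(scores)[::-1][:n]

ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
])
predictions = low_rank_predictions(ratings, k=2)
suggested = top_n_unseen(predictions, ratings, user_id=1, n=2)
```

If `n` exceeds the number of unrated items, the masked items fill the remaining slots, so callers may want to cap `n` accordingly.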

### Tensor Factorization

The technique of tensor factorization extends traditional matrix factorization to multi-dimensional data, allowing the incorporation of contextual information such as time and location into recommendations. It utilizes methods like the CP decomposition, which decomposes a tensor into a sum of component tensors, capturing complex interactions between multiple factors. This approach requires more data and computational resources compared to two-dimensional methods, as it deals with higher-dimensional arrays.

However, it can provide highly personalized and context-aware recommendations by leveraging the additional dimensions of data. The increased complexity of the data structure allows for a more nuanced understanding of user preferences in various contexts, enhancing the overall recommendation accuracy.

```
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

def tensor_factorization_recommender(tensor, rank=10):
    # CP (PARAFAC) decomposition into `rank` rank-one components
    factors = parafac(tensor, rank=rank)
    # kruskal_to_tensor was renamed to cp_to_tensor in newer TensorLy releases
    reconstructed_tensor = tl.cp_to_tensor(factors)
    return reconstructed_tensor
```

- Uses the TensorLy library for tensor operations and decomposition
- Applies PARAFAC decomposition to the input tensor
- Reconstructs the tensor from the decomposed factors
- Returns the reconstructed tensor as recommendations
- Can handle multi-dimensional data (e.g., user-item-context)
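To make the "sum of component tensors" idea concrete, the following NumPy-only sketch rebuilds a user x item x context tensor from three factor matrices, which is what the reconstruction step of a CP model computes. The factor matrices here are random placeholders rather than the output of a fitted decomposition, and the function name `cp_reconstruct` is illustrative.

```python
import numpy as np

def cp_reconstruct(user_f, item_f, context_f, weights=None):
    """Sum of rank-one tensors: T[u, i, c] = sum_r w[r] * U[u, r] * I[i, r] * C[c, r]."""
    rank = user_f.shape[1]
    if weights is None:
        weights = np.ones(rank)
    return np.einsum("r,ur,ir,cr->uic", weights, user_f, item_f, context_f)

rng = np.random.default_rng(0)
rank = 3
user_f = rng.random((4, rank))      # 4 users
item_f = rng.random((5, rank))      # 5 items
context_f = rng.random((2, rank))   # 2 contexts, e.g. weekday vs. weekend

scores = cp_reconstruct(user_f, item_f, context_f)

# The same tensor written out explicitly as a sum of outer products
explicit = sum(
    np.multiply.outer(np.multiply.outer(user_f[:, r], item_f[:, r]), context_f[:, r])
    for r in range(rank)
)
```

Each entry `scores[u, i, c]` is the predicted affinity of user `u` for item `i` in context `c`, so the same user can receive different rankings in different contexts.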

### Neural Collaborative Filtering

Deep learning-based recommendation systems combine collaborative filtering techniques with neural networks. This approach allows for learning non-linear user-item interactions, which traditional matrix factorization methods may struggle with. Deep learning recommenders typically use embedding layers to represent users and items in a dense, low-dimensional space. This enables easy integration of additional features or side information, such as user demographics or item descriptions, to enhance the recommendation performance.

When trained on large datasets, deep learning-based systems can often outperform traditional matrix factorization methods in terms of accuracy. However, this advantage comes at the cost of increased computational complexity and the need for large amounts of data.

Deep learning recommenders also require careful hyperparameter tuning to achieve optimal results, making them more challenging to implement and maintain compared to simpler collaborative filtering approaches.

```
import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, Flatten, Dense, Concatenate
from tensorflow.keras.models import Model

def neural_collaborative_filtering(num_users, num_items, embedding_size=100):
    # One integer ID per example for the user and for the item
    user_input = Input(shape=(1,), dtype='int32', name='user_input')
    item_input = Input(shape=(1,), dtype='int32', name='item_input')
    # Learn a dense vector for every user and every item
    user_embedding = Embedding(num_users, embedding_size, name='user_embedding')(user_input)
    item_embedding = Embedding(num_items, embedding_size, name='item_embedding')(item_input)
    user_vecs = Flatten()(user_embedding)
    item_vecs = Flatten()(item_embedding)
    # The MLP on the concatenated vectors models non-linear user-item interactions
    concat = Concatenate()([user_vecs, item_vecs])
    dense1 = Dense(128, activation='relu')(concat)
    dense2 = Dense(64, activation='relu')(dense1)
    # Sigmoid output: probability that the user interacts with the item
    output = Dense(1, activation='sigmoid')(dense2)
    model = Model(inputs=[user_input, item_input], outputs=output)
    model.compile(optimizer='adam', loss='binary_crossentropy')
    return model
```

- Uses TensorFlow and Keras to build a neural network model
- Creates embedding layers for users and items
- Concatenates user and item embeddings
- Adds dense layers for learning non-linear interactions
- Returns a compiled model ready for training
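Because the model ends in a sigmoid and is trained with binary cross-entropy, it needs both positive and negative examples. Interaction logs usually contain only positives, so a common approach is to sample items the user has not interacted with as negatives. The sketch below shows one simple way to build such training arrays. The function name `build_training_pairs` and the sampling ratio are illustrative assumptions, and treating unseen items as negatives is itself a simplification.

```python
import numpy as np

def build_training_pairs(interactions, num_items, num_negatives=2, seed=0):
    """Turn (user, item) positives into labeled arrays with sampled negatives."""
    rng = np.random.default_rng(seed)
    seen = {}
    for user, item in interactions:
        seen.setdefault(user, set()).add(item)

    users, items, labels = [], [], []
    for user, item in interactions:
        users.append(user)
        items.append(item)
        labels.append(1)                       # observed interaction
        unseen = [j for j in range(num_items) if j not in seen[user]]
        if not unseen:
            continue                           # user has interacted with every item
        for neg in rng.choice(unseen, size=num_negatives):
            users.append(user)
            items.append(int(neg))
            labels.append(0)                   # sampled "negative"
    return (np.array(users, dtype="int32"),
            np.array(items, dtype="int32"),
            np.array(labels, dtype="float32"))

interactions = [(0, 1), (0, 2), (1, 0)]
users, items, labels = build_training_pairs(interactions, num_items=5)
```

The three arrays line up with the two inputs and the single output of the model above, so they could be passed to Keras as `model.fit([users, items], labels, ...)`.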

## The Case Study of Netflix

The journey of Netflix’s recommendation algorithm began with CineMatch in 2000, a collaborative filtering algorithm that used member ratings to estimate how much a user would enjoy a movie. In 2006, the Netflix Prize of 1 million USD was launched to challenge data scientists to create a model that would beat CineMatch by 10%. The prize was awarded in 2009, although Netflix has said that it did not put the full winning ensemble into production.
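The Netflix Prize measured that 10% improvement in terms of root mean squared error (RMSE) on held-out ratings. As a small illustration of the arithmetic, using made-up ratings and predictions rather than actual competition data:

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean squared error between predicted and true ratings."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def relative_improvement(baseline_rmse, candidate_rmse):
    """Fraction by which the candidate lowers the baseline's RMSE."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse

actual = [4, 3, 5, 1]
baseline_pred = [3, 3, 4, 2]          # hypothetical baseline model
candidate_pred = [3.5, 3, 4.5, 1.5]   # hypothetical challenger model

baseline = rmse(baseline_pred, actual)
candidate = rmse(candidate_pred, actual)
gain = relative_improvement(baseline, candidate)
```

A challenger needed `gain >= 0.10` against CineMatch to qualify for the grand prize.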

Netflix soon began to accumulate users, and in 2007 it shifted to streaming. Viewers were exposed to reinforcement learning algorithms and clustering algorithms that generated suggestions in real time. As the algorithm improved, more and more users switched to Netflix, simply because of the effectiveness of its recommendations. Almost 80% of the content viewed on Netflix is suggested by the recommendation algorithm.

The company estimates that the effectiveness of the recommendation algorithm saves it 1 billion USD annually by retaining users it would otherwise lose.

Netflix uses advanced machine learning techniques and clustering, with a system of over 1300 clusters based on the metadata of the films that users watch. This allows them to deliver highly optimized suggestions to their users. But Netflix soon ran into a problem: scale. As its user base grew toward the hundreds of millions, and eventually past 200 million, Netflix went all in on cloud computing.

Simply put, they migrated all of their data to Amazon Web Services (AWS), starting in 2008. The transition took years and finished in 2015. Netflix reportedly saves 1 billion USD a year using AWS. AWS also has built-in support for machine learning, which Netflix uses to the full. Netflix reportedly used over 100,000 AWS servers and 1,000 Kinesis shards for its global audience back in 2022.

Since 2015, Netflix has also offered its own productions: thousands of movies and shows in a wide variety of formats. Netflix’s recommender algorithms are highly automated and run thousands of A/B tests on users per day. Today, Netflix’s subscriber base exceeds 280 million.

While Netflix now faces stiff competition, especially from Disney+, whose parent company owns the Marvel and Star Wars franchises, the company aims to hit 500 million subscribers by 2025.

Last year, Netflix earned a whopping 31 billion USD in revenue.

The major parts of its current recommendation systems involve:

- **Reinforcement learning:** Depending upon user behavior, Netflix changes the content on the screen in real time. Thus, the system is in a state of constant flux and changes depending upon the user’s interactions.
- **Deep neural networks:** Because of the scale of the data (over 15,000 shows and almost 300 million users), standard ML techniques are not easy to apply. Deep learning is used extensively, using NVIDIA’s technology. (See the end of this article for a program that uses NVIDIA’s latest Merlin deep learning technology.)
- **Matrix factorization:** By effectively performing Singular Value Decomposition (SVD) on very large and highly sparse matrices, Netflix estimates how strongly each user is attracted to certain genres and shows.
- **Ensemble learning:** Clever combinations of the algorithms listed above adjust the recommendations on the fly so that no two users see the same screen. This personalization is what creates the big bucks and keeps Netflix on top of all the OTT platforms.

And all these models and optimizations run hundreds of thousands of times a day for hundreds of millions of users.

### Modern Deep Learning Technology

With such scales, no single computer can run these ML models alone. That is why AWS runs ML algorithms in a distributed fashion over thousands of machines.

NVIDIA has recently released several products to enable recommendation systems at scale, and NVIDIA's GPU clusters play a big part in executing ML algorithms. One of these products is Merlin, a high-performance framework for recommender systems that is optimized to run on thousands of machines and deliver superior results. This was perhaps only a matter of time, as dataset sizes grew far beyond what single computers could process.

Modern recommendation systems use deep learning extensively. As a part of DL, GPU/TPU computing systems are extensively used to speed up the computation.

Some of NVIDIA’s recent offerings for Merlin include:

### NVIDIA Recommender Systems

*(From Announcing NVIDIA Merlin: An Application Framework for Deep Recommender Systems)*

Available as open-source projects:

**NVTabular**

NVTabular is a feature engineering and preprocessing library, designed to quickly and easily manipulate terabyte-scale datasets. It is especially suitable for recommender systems, which require a scalable way to process additional information, such as user and item metadata and contextual information. It provides a high-level abstraction to simplify code and accelerates computation on the GPU using the RAPIDS cuDF library. Using NVTabular, with just 10-20 lines of high-level API code, you can set up a data engineering pipeline and achieve up to 10X speedup compared to optimized CPU-based approaches while experiencing no dataset size limitations, regardless of the GPU/CPU memory capacity.

**HugeCTR**

HugeCTR is a highly efficient GPU framework designed for recommender model training, which targets both high performance and ease of use. It supports both simple deep models and also state-of-the-art hybrid models such as W&D, Deep Cross Network, and DeepFM. We are also working on enabling DLRM with HugeCTR. The model details and hyperparameters can be specified easily in JSON format, allowing for quick selection from a range of common models.

**TensorRT and Triton Server for Inference**

NVIDIA TensorRT is an SDK for high-performance DL inference. It includes a DL inference optimizer and runtime that delivers low latency and high throughput for inference applications. TensorRT can accept trained neural networks from all DL frameworks using a common interface, the open neural network exchange format (ONNX).

NVIDIA Triton Inference Server provides a cloud-inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server. Triton Server can serve DL recommender models using several backends, including TensorFlow, PyTorch (TorchScript), ONNX runtime, and TensorRT runtime.

## Code Example

The following code example shows an actual preprocessing workflow required to transform the 1-TB Criteo Ads dataset, implemented with just a dozen lines of code using NVTabular. Briefly, numerical and categorical columns are specified. Next, we define an NVTabular workflow and supply a set of train and validation files. Then, preprocessing operations are added to the workflow, and data is persisted to disk. In comparison, custom-built processing codes, such as the NumPy-based data util in Facebook’s DLRM implementation, can have 500-1000 lines of code for the same pipeline.

```
# Note: this snippet follows the early NVTabular API shown in the announcement;
# later NVTabular releases changed the Workflow interface.
import nvtabular as nvt
import glob

cont_names = ["I"+str(x) for x in range(1, 14)] # specify continuous feature names
cat_names = ["C"+str(x) for x in range(1, 27)] # specify categorical feature names
label_names = ["label"] # specify target feature
columns = label_names + cat_names + cont_names # all feature names

# initialize Workflow
proc = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_names)

# create datasets from input files
train_files = glob.glob("./dataset/train/*.parquet")
valid_files = glob.glob("./dataset/valid/*.parquet")
train_dataset = nvt.dataset(train_files, gpu_memory_frac=0.1)
valid_dataset = nvt.dataset(valid_files, gpu_memory_frac=0.1)

# add feature engineering and preprocessing ops to Workflow
proc.add_cont_feature([nvt.ops.ZeroFill(), nvt.ops.LogOp()])
proc.add_cont_preprocess(nvt.ops.Normalize())
proc.add_cat_preprocess(nvt.ops.Categorify(use_frequency=True, freq_threshold=15))

# compute statistics, transform data, export to disk
proc.apply(train_dataset, shuffle=True, output_path="./processed_data/train", num_out_files=len(train_files))
proc.apply(valid_dataset, shuffle=False, output_path="./processed_data/valid", num_out_files=len(valid_files))
```
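For readers without a GPU, the following NumPy sketch illustrates conceptually what the four preprocessing steps in that workflow do to a column. It is a CPU stand-in written for this article, not NVTabular's implementation. The exact behaviors are assumptions: missing values filled with zero, negatives clamped so the logarithm is defined, `log(x + 1)` as the log transform, and rare categories sharing index 0.

```python
from collections import Counter

import numpy as np

def zero_fill(x):
    """Replace missing values with 0 (assumption: negatives are clamped to 0 as well)."""
    x = np.nan_to_num(np.asarray(x, dtype=float), nan=0.0)
    return np.clip(x, 0.0, None)

def log_op(x):
    """Compress heavy-tailed counts with log(x + 1)."""
    return np.log1p(x)

def normalize(x):
    """Standardize to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    return (x - x.mean()) / (std if std > 0 else 1.0)

def categorify(values, freq_threshold=15):
    """Map categories to integer IDs; categories seen fewer than freq_threshold times share ID 0."""
    counts = Counter(values)
    frequent = sorted(v for v, c in counts.items() if c >= freq_threshold)
    mapping = {v: idx + 1 for idx, v in enumerate(frequent)}
    return np.array([mapping.get(v, 0) for v in values]), mapping

# Continuous column: fill, log-transform, then standardize
raw = [np.nan, -2.0, 3.0]
filled = zero_fill(raw)
logged = log_op(filled)
scaled = normalize(logged)

# Categorical column: "b" is too rare at this threshold and falls into bucket 0
ids, mapping = categorify(["a", "a", "b"], freq_threshold=2)
```

Grouping rare categories into one bucket keeps the embedding tables of the downstream model small, which matters when categorical features have millions of distinct values.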

The entire technology stack can be found at the following GitHub repository:

## Conclusion

Recommendation systems have come a long way.

From simple statistical modeling, content-based filtering, and collaborative filtering, we have moved on to deep learning neural networks, HPC nodes, matrix factorization, and its extension to higher dimensions, tensor factorization.

The most profitable recommender system for streaming belongs to Netflix, which runs its entire machine learning stack on the cloud with AWS.

Recommender systems are used everywhere, from Google to Microsoft to Amazon to Flipkart. They are a critical part of the modern-day enterprise, and there is hardly a company online that does not use them in one form or another.

There are many companies today that offer custom recommendation systems online.

Some of the leading ones include:

- **Netflix:** Known for its sophisticated recommendation engine that analyzes user viewing habits to suggest movies and TV shows
- **Amazon:** Utilizes a powerful recommendation engine that suggests products based on user purchase history and browsing behavior
- **Spotify:** Employs a recommendation system that curates music playlists and song suggestions based on user listening history
- **YouTube:** Uses a recommendation engine to suggest videos based on users' viewing patterns and preferences
- **LinkedIn:** Recommends jobs, connections, and content based on user profiles and professional history
- **Zillow:** Suggests real estate properties tailored to user preferences and search history
- **Airbnb:** Provides accommodation recommendations based on user travel history and preferences
- **Uber:** Recommends ride options based on user preferences and previous rides
- **IBM Corporation:** A leader in the recommendation engine market, offering various AI-driven solutions
- **Google LLC (Alphabet Inc.):** Provides recommendation systems across its platforms, leveraging extensive data analytics

Hopefully, one day, your company will be one among this elite list. And all the best for your enterprise.

Regardless of which sector you are in, if you have an online presence, you need to use recommendation systems one way or another. Continue to explore this segment, and if you build real expertise, rest assured that you will be highly in demand.

Never stop learning. Keep up the enthusiasm. Always believe in your infinite potential for growth. Your future is in your hands. Make it extraordinary!

