Movie Recommendation App using Spring Data and Redis
This blog explains how to build a movie recommendation app using Spring Data and Redis, a NoSQL database. We will use a NoXML approach and try to identify the nuances of a NoSQL database.
The movie recommendation application stores the ratings that users give to different movies, computes similarity scores between users, and recommends movies. It is based on an example in the book 'Programming Collective Intelligence'. We will use the Redis ZSet to store data. A ZSet is a sorted set that keeps its members ordered by a score supplied with each member.
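To make the ZSet behaviour concrete, here is a minimal in-memory sketch in plain Java. The class ZSetSketch and its methods are hypothetical stand-ins written for illustration; they are not part of the project or of Redis, and they only imitate the ordering rule (ascending score, ties broken by member name).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a Redis ZSet (illustration only).
class ZSetSketch {
    private final Map<String, Double> scores = new HashMap<>();

    // Like ZADD: insert the member, or overwrite its score if it already exists.
    void add(String member, double score) {
        scores.put(member, score);
    }

    // Like ZRANGE key 0 -1: all members in ascending score order,
    // with equal scores ordered by member name.
    List<String> range() {
        List<String> members = new ArrayList<>(scores.keySet());
        members.sort((a, b) -> {
            int byScore = Double.compare(scores.get(a), scores.get(b));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return members;
    }
}
```

Adding the same member twice keeps a single entry with the latest score, which is why a user's ZSet can hold exactly one rating per movie.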
The code is located on GitHub.
Configuring Redis
Install Redis by following the installation instructions on the Redis site, then run 'redis-server'. That's it.
Configuring Spring Data
The Spring Data documentation explains how to configure and use Spring Data with Redis. The most important part is to add the Spring milestone/snapshot repositories to your pom.xml, but I won't repeat that here. Let us take a JavaConfig-based approach to configuring Spring.
We need the following configuration to set up the RedisConnectionFactory and a StringRedisTemplate. If you are not familiar with Spring JavaConfig, you might object that calling getConnectionFactory() directly means the RedisConnectionFactory is no longer a singleton. That is what the cglib dependency in your pom.xml is for: Spring enhances the config class so that calls to a @Bean method return the single shared bean.
@Configuration
public class Config {

    @Bean
    public RedisConnectionFactory getConnectionFactory() {
        JedisConnectionFactory cf = new JedisConnectionFactory();
        return cf;
    }

    @Bean
    public StringRedisTemplate getRedisTemplate() {
        return new StringRedisTemplate(getConnectionFactory());
    }
}
Data Model
So how do you create a non-relational data model? In the NoSQL world you optimize your data model for your use cases, and you give concerns such as data duplication less importance (which is why purists hate NoSQL solutions).
Our use cases are:
1. Store rating for movies by a user.
2. Compute similarity between users.
3. Recommend movies.
Here is the code in the all-in-one *UberDao*. Ignore the lines that add Movies and Users. The StringRedisTemplate is autowired into the DAO.
We create ZSetOperations bound to the user key and then add the ratings for the movies. The method also maintains a ZSet per movie that maps users to their rating of that movie (this is required for use case #3). If we did not maintain this duplicate data, extra logic would be required to extract the same data later (space vs. time).
@Component
public class UberDao {

    @Autowired
    private StringRedisTemplate srt;

    public void addRatings(String user, Map<String, Double> ratings) {
        // Start a transaction: the commands below are queued
        srt.multi();
        srt.boundSetOps("Users").add(user);
        // ZSet keyed by user: movie => rating
        BoundZSetOperations<String, String> boundZSetOps = srt.boundZSetOps(user);
        for (Map.Entry<String, Double> mr : ratings.entrySet()) {
            srt.boundSetOps("Movies").add(mr.getKey());
            // ZSet keyed by movie: user => rating (duplicated on purpose)
            srt.boundZSetOps(mr.getKey()).add(user, mr.getValue());
            boundZSetOps.add(mr.getKey(), mr.getValue());
        }
        // Run all queued commands in one batch
        srt.exec();
    }
    // ... remaining DAO methods are not shown in this excerpt
}
One observation here: the data model looks like a Map. Yes, it does. Redis is a key-value store, and the point to note is that the database is an extension of the application. There is no impedance mismatch between the datastore and the application model. Is that right or wrong? I will leave that question open.
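The Map resemblance can be shown with a small plain-Java sketch that mirrors what addRatings writes. The class RatingsModelSketch and its field names are hypothetical and exist only for illustration; the two maps stand in for the per-user and per-movie ZSets, and the sample data in any usage is made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory mirror of the two ZSet families written by addRatings.
class RatingsModelSketch {
    // user -> (movie -> rating): mirrors the ZSet stored under each user key
    final Map<String, Map<String, Double>> byUser = new HashMap<>();
    // movie -> (user -> rating): the deliberately duplicated reverse index
    final Map<String, Map<String, Double>> byMovie = new HashMap<>();

    void addRatings(String user, Map<String, Double> ratings) {
        for (Map.Entry<String, Double> e : ratings.entrySet()) {
            String movie = e.getKey();
            double rating = e.getValue();
            byUser.computeIfAbsent(user, k -> new HashMap<>()).put(movie, rating);
            byMovie.computeIfAbsent(movie, k -> new HashMap<>()).put(user, rating);
        }
    }
}
```

Every rating is written twice, once per index. That is the space cost paid so that "all ratings for a movie" can be read with a single lookup instead of a scan over every user.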
Redis lets you make transactional updates to counters and run server-side set operations such as UNION and INTERSECT. The multi and exec calls above show how a group of updates is made transactional.
Computing similarity
Similarity between users can be computed either by calculating the Euclidean distance between their ratings for the movies they have in common or by finding the correlation between those ratings. The class Recommend implements both (please refer to the source code on GitHub).
To get the common movies for two users we could fetch each user's movies and loop over them in client code. But Redis has a built-in intersect mechanism for such *social* tasks. We use zInterStore to compute the difference between the two users' ratings and then compute the Euclidean distance from those differences. See the class Recommend for details of calculating similarity scores, and 'Programming Collective Intelligence' for the background.
public Map<String, Double> getScoreDiff(final String p1, final String p2) {
    Map<String, Double> mScoreMap = new HashMap<String, Double>();
    final String combinedKey = p1 + ":" + p2;
    Set<Tuple> movieAndScores = srt.execute(new RedisCallback<Set<Tuple>>() {
        @Override
        public Set<Tuple> doInRedis(RedisConnection con)
                throws DataAccessException {
            // Emits a new ZSet holding only the movies both users rated.
            // With weights {1, -1} and SUM, each score is (p1 rating - p2 rating).
            con.zInterStore(combinedKey.getBytes(), Aggregate.SUM, new int[] {1,-1}, p1.getBytes(), p2.getBytes());
            // Expire the temporary key after 120 seconds.
            con.expire(combinedKey.getBytes(), 120);
            // Note: this reads back only members whose score lies between 1 and 20.
            return con.zRangeByScoreWithScore(combinedKey.getBytes(), 1, 20);
        }
    });
    // Copy the (movie, score difference) tuples into a map.
    for (Tuple t : movieAndScores) {
        mScoreMap.put(new String(t.getValue()), t.getScore());
    }
    return mScoreMap;
}
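As a rough illustration of what the weighted intersection produces and how a similarity score can be derived from it, here is a plain-Java sketch that needs no Redis server. The class ScoreDiffSketch is hypothetical, and the similarity form 1 / (1 + distance) is an assumption; the project's Recommend class may use a different normalisation. Unlike the range query in getScoreDiff, this sketch keeps every common movie, including zero and negative differences.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical client-side equivalent of the weighted intersection (illustration only).
class ScoreDiffSketch {

    // Mimics an intersection with weights {1, -1} and SUM aggregation:
    // only movies rated by both users survive, scored as (rating1 - rating2).
    static Map<String, Double> scoreDiff(Map<String, Double> r1, Map<String, Double> r2) {
        Map<String, Double> diff = new HashMap<>();
        for (Map.Entry<String, Double> e : r1.entrySet()) {
            Double other = r2.get(e.getKey());
            if (other != null) {
                diff.put(e.getKey(), e.getValue() - other);
            }
        }
        return diff;
    }

    // Assumed similarity form: 1 / (1 + Euclidean distance).
    // Returns 0 when the users share no movies.
    static double similarity(Map<String, Double> diff) {
        if (diff.isEmpty()) {
            return 0.0;
        }
        double sumOfSquares = 0.0;
        for (double d : diff.values()) {
            sumOfSquares += d * d;
        }
        return 1.0 / (1.0 + Math.sqrt(sumOfSquares));
    }
}
```

Identical ratings give a distance of 0 and therefore a similarity of 1; the score falls towards 0 as the ratings diverge.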
A person can be compared with every other person and then a list of the top 5/10 people with similar tastes can be found. The movies that those people see could be of interest.
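A sketch of that ranking step, in plain Java and under the same assumed similarity form of 1 / (1 + Euclidean distance), might look like this. TopMatchesSketch and topMatches are hypothetical names used only for illustration, and the ratings map stands in for data that the real application would read from Redis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical ranking of other users by similarity (illustration only).
class TopMatchesSketch {

    // Assumed similarity: 1 / (1 + Euclidean distance over common movies), 0 if none shared.
    static double similarity(Map<String, Double> a, Map<String, Double> b) {
        double sumOfSquares = 0.0;
        int shared = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                double d = e.getValue() - other;
                sumOfSquares += d * d;
                shared++;
            }
        }
        return shared == 0 ? 0.0 : 1.0 / (1.0 + Math.sqrt(sumOfSquares));
    }

    // Returns up to n other users, most similar first.
    static List<String> topMatches(String user, Map<String, Map<String, Double>> ratings, int n) {
        Map<String, Double> mine = ratings.get(user);
        List<String> others = new ArrayList<>(ratings.keySet());
        others.remove(user);
        // Sort descending by similarity to the given user.
        others.sort((x, y) -> Double.compare(
                similarity(mine, ratings.get(y)),
                similarity(mine, ratings.get(x))));
        return new ArrayList<>(others.subList(0, Math.min(n, others.size())));
    }
}
```

The comparator recomputes similarities on every comparison, which is fine for a handful of users; a larger data set would want the scores computed once and cached.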
Recommendations
To compute recommendations for a user, you create a rating, weighted by similarity score, for each movie that the user has not seen. For this you need the ratings for a movie from all users. The class Recommend (method getRecommendations) does this. You can play with the class Ratings to change the feed data and see how the recommendations change.
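The weighting can be sketched in plain Java as follows. This is an assumption about the general technique (a similarity-weighted average, as described in 'Programming Collective Intelligence'), not a copy of the project's getRecommendations; the class RecommendSketch and its parameter names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical similarity-weighted recommendation (illustration only).
class RecommendSketch {

    // seen: movies the target user already rated
    // similarityToOthers: other user -> similarity score with the target user
    // ratingsOfOthers: other user -> (movie -> rating)
    // Returns movie -> predicted rating for movies the target user has not seen.
    static Map<String, Double> recommend(Map<String, Double> seen,
                                         Map<String, Double> similarityToOthers,
                                         Map<String, Map<String, Double>> ratingsOfOthers) {
        Map<String, Double> weightedSum = new HashMap<>();
        Map<String, Double> similaritySum = new HashMap<>();
        for (Map.Entry<String, Double> other : similarityToOthers.entrySet()) {
            double sim = other.getValue();
            Map<String, Double> theirs = ratingsOfOthers.get(other.getKey());
            if (sim <= 0 || theirs == null) {
                continue; // ignore users with no similarity or no ratings
            }
            for (Map.Entry<String, Double> r : theirs.entrySet()) {
                if (seen.containsKey(r.getKey())) {
                    continue; // only score unseen movies
                }
                weightedSum.merge(r.getKey(), sim * r.getValue(), Double::sum);
                similaritySum.merge(r.getKey(), sim, Double::sum);
            }
        }
        // Normalise so that movies rated by many users are not favoured automatically.
        Map<String, Double> predicted = new HashMap<>();
        for (Map.Entry<String, Double> e : weightedSum.entrySet()) {
            predicted.put(e.getKey(), e.getValue() / similaritySum.get(e.getKey()));
        }
        return predicted;
    }
}
```

Dividing by the sum of similarities keeps the prediction on the same scale as the original ratings, so the result can be sorted and shown directly as a ranked list.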
How do I run it
I am using a test case (MovieTest) to capture the different steps (there are no assertions in it). The steps need to be run in sequence, one after the other. I did not find a JUnit runner for JavaConfig in Spring, so we have to initialise the application context in the test case ourselves.
public class MovieTest {

    private AnnotationConfigApplicationContext ctx;
    private UberDao dao;
    private Recommend recomender;

    @Before
    public void init() {
        // No JUnit runner to run the app with JavaConfig.
        ctx = new AnnotationConfigApplicationContext(Config.class);
        ctx.scan("xebia.moviez.dao");
        ctx.scan("xebia.moviez.service");
        dao = ctx.getBean(UberDao.class);
        recomender = ctx.getBean(Recommend.class);
    }
    // ... test methods are not shown in this excerpt
}
Conclusion
Creating applications with a NoSQL database can be difficult at first, because we tend to try to build a relational model in a NoSQL database. NoSQL databases have the schema information embedded in code, and without that information the data is more or less useless. NoSQL is not fit for everything; it has its use cases, and they are not only about scalability. Imagine if someone had created an application using only HashMaps before the term NoSQL existed.