Using Map and Reduce View for Ranking
Starting from version 2.0, Couchbase Server offers a powerful way of creating indexes for JSON documents through the concept of views.
Using views, it is possible to define primary indexes, composite indexes and aggregations, which let you:
. query documents on different JSON properties
. create statistics and aggregates
Views generate materialized indexes, so they provide a fast and efficient way of executing pre-defined queries.
This blog provides a simple example of how a view using map and reduce can be created to index a JSON document attribute and also to determine document ranking based on that attribute.
Using map and reduce is a fast and efficient way to determine ranking; it can scale across millions of users and still provide very fast rank lookups. Thanks Aaron for teaching me this!
This can be used, for instance, for ranking users based on score or experience.
This blog illustrates the concept with a user document that has 2 attributes, name and experience: we index the document on experience and then determine each user's ranking from that attribute.
We will first write some Java code that connects to Couchbase Server and creates User documents. The following Java code is self-contained and will create the users.
It uses Google's Gson library to create the JSON objects.
package blog;

import com.couchbase.client.CouchbaseClient;
import com.couchbase.client.CouchbaseConnectionFactoryBuilder;
import com.google.gson.Gson;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.util.ArrayList;

class UserDoc {
    String name;
    long experience;

    UserDoc(String name, long experience) {
        this.name = name;
        this.experience = experience;
    }
}

/**
 *
 * @author alexis
 */
public class RankView {
    private static CouchbaseClient client;

    public static void main(String[] args) throws UnsupportedEncodingException, IOException {
        ArrayList<URI> nodes = new ArrayList<>();
        // Add one or more nodes of your cluster (exchange the IP with yours)
        nodes.add(URI.create("http://127.0.0.1:8091/pools"));
        // Try to connect to the client
        CouchbaseConnectionFactoryBuilder cfb = new CouchbaseConnectionFactoryBuilder();
        cfb.setOpTimeout(10000);
        cfb.setReadBufferSize(1024);
        cfb.setShouldOptimize(true);
        cfb.setTimeoutExceptionThreshold(100);
        try {
            client = new CouchbaseClient(cfb.buildCouchbaseConnection(nodes, "default", ""));
        } catch (Exception e) {
            System.err.println("Error connecting to Couchbase: " + e.getMessage());
            System.exit(1);
        }
        UserDoc user = null;
        // Creates users
        for (int i = 0; i < 10; i++) {
            user = new UserDoc("User" + i, Math.round(Math.random()*1000));
            Gson json = new Gson();
            String jsonString = json.toJson(user);
            client.set(user.name, 0, jsonString);
        }
        client.shutdown();
    }
}
After running this program (please change the URL or bucket name as appropriate) you should now have 10 users in your bucket.
The next step is to create a User design document with a Rank view.
The first step is to create a simple map function for the Rank view which will emit the experience attribute:
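function (doc, meta) {
    // Test the type rather than truthiness so that an experience of 0 is still indexed
    if (typeof doc.experience === "number") {
        emit(doc.experience, null);
    }
}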
Doing so creates an index on the experience attribute, but it does not yet let us determine the ranking.
This is where adding a reduce fits in. We will add the simple built-in _count reduce.
The full view is the map function above with _count as its reduce function.
The reduce function counts the User documents emitted by the map function, that is, those with a known experience value.
Without any query parameters, it will output 10, which is the number of documents that we have created.
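To make the map and _count steps concrete, here is a minimal in-memory sketch in plain Java. It does not talk to Couchbase: the CountViewSketch class, its Doc type and the sample documents are all hypothetical stand-ins used only to mimic what the view does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for the Rank view; not Couchbase API.
final class CountViewSketch {
    // Stand-in for a stored JSON document.
    static final class Doc {
        final String name;
        final Long experience; // null when the attribute is missing

        Doc(String name, Long experience) {
            this.name = name;
            this.experience = experience;
        }
    }

    // Map step: emit the experience of every document that has one.
    static List<Long> map(List<Doc> docs) {
        List<Long> keys = new ArrayList<>();
        for (Doc doc : docs) {
            if (doc.experience != null) {
                keys.add(doc.experience);
            }
        }
        return keys;
    }

    // Reduce step: like _count, return the number of emitted rows.
    static long count(List<Long> keys) {
        return keys.size();
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("User0", 512L),
                new Doc("User1", 97L),
                new Doc("NoExperience", null));
        // Only the two documents with an experience value are counted.
        System.out.println(count(map(docs))); // prints 2
    }
}
```

The document without an experience attribute is never emitted, so it does not contribute to the count.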
In order to look up the ranking of a specific user, we first need to fetch that user's document to get its experience (User5 in this example):
// Look up a specific user
String jsonString = (String) client.get("User5");
user = json.fromJson(jsonString, UserDoc.class);
From there, we can create a query that filters this count with a range query in descending order: it starts at the maximum value, to capture all the users with a greater experience, and ends at the experience value of our user:
View view = client.getView("User", "Rank");
Query query = new Query();
query.setRangeStart(Long.toString(Long.MAX_VALUE));
query.setRangeEnd(Long.toString(user.experience));
// Exclude the end key so the user (and anyone tied with them) is not counted
query.setInclusiveEnd(false);
query.setDescending(true);
query.setReduce(true);
As such, the reduce outputs the number of users who have a greater experience than that user. The ranking is simply that number + 1.
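ViewResponse response = client.query(view, query);
Iterator<ViewRow> itr = response.iterator();
// Stays 0 if the reduce returns no row, i.e. nobody has more experience
long higher = 0;
while (itr.hasNext()) {
    ViewRow row = itr.next();
    higher = Long.parseLong(row.getValue());
}
// Parentheses matter: without them Java concatenates "1" to the string
System.out.println("Rank: " + (higher + 1));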
This will output the Rank based on the experience for that user such as:
Rank: 7
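The arithmetic behind the rank can be checked without a server. The sketch below is a hypothetical in-memory model, not Couchbase code: a TreeMap plays the role of the sorted view index, and the RankSketch class and the sample experience values are made up for illustration. It assumes the user's own key is excluded from the range, so users tied on experience share the same rank.

```java
import java.util.TreeMap;

// Hypothetical in-memory model of the rank lookup; not Couchbase API.
final class RankSketch {
    // Sorted stand-in for the view index: experience -> number of users who emitted it.
    private final TreeMap<Long, Integer> index = new TreeMap<>();

    // Equivalent of emit(doc.experience, null) for one document.
    void emit(long experience) {
        Integer seen = index.get(experience);
        index.put(experience, seen == null ? 1 : seen + 1);
    }

    // Count the users with strictly greater experience, then add 1.
    long rankOf(long experience) {
        long higher = 0;
        // tailMap(key, false) covers keys above `experience`; the key itself is excluded.
        for (int users : index.tailMap(experience, false).values()) {
            higher += users;
        }
        return higher + 1;
    }

    public static void main(String[] args) {
        RankSketch view = new RankSketch();
        for (long experience : new long[] {120L, 845L, 430L, 845L, 77L}) {
            view.emit(experience);
        }
        // Two users (both at 845) are ahead of the user with 430.
        System.out.println("Rank: " + view.rankOf(430L)); // prints Rank: 3
    }
}
```

In this model the two users with 845 both get rank 1, and the user with 77 gets rank 5.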
The full Java code (again self-contained) is:
package blog;

import com.couchbase.client.CouchbaseClient;
import com.couchbase.client.CouchbaseConnectionFactoryBuilder;
import com.couchbase.client.protocol.views.Query;
import com.couchbase.client.protocol.views.View;
import com.couchbase.client.protocol.views.ViewResponse;
import com.couchbase.client.protocol.views.ViewRow;
import com.google.gson.Gson;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Iterator;

class UserDoc {
    String name;
    long experience;

    UserDoc(String name, long experience) {
        this.name = name;
        this.experience = experience;
    }
}

/**
 *
 * @author alexis
 */
public class RankView {
    private static CouchbaseClient client;

    public static void main(String[] args) throws UnsupportedEncodingException, IOException {
        ArrayList<URI> nodes = new ArrayList<>();
        // Add one or more nodes of your cluster (exchange the IP with yours)
        nodes.add(URI.create("http://127.0.0.1:8091/pools"));
        // Try to connect to the client
        CouchbaseConnectionFactoryBuilder cfb = new CouchbaseConnectionFactoryBuilder();
        cfb.setOpTimeout(10000);
        cfb.setReadBufferSize(1024);
        cfb.setShouldOptimize(true);
        cfb.setTimeoutExceptionThreshold(100);
        try {
            client = new CouchbaseClient(cfb.buildCouchbaseConnection(nodes, "default", ""));
        } catch (Exception e) {
            System.err.println("Error connecting to Couchbase: " + e.getMessage());
            System.exit(1);
        }
        UserDoc user = null;
        Gson json = new Gson();
        // Creates users
        for (int i = 0; i < 10; i++) {
            user = new UserDoc("User" + i, Math.round(Math.random() * 1000));
            String jsonString = json.toJson(user);
            client.set(user.name, 0, jsonString);
        }
        // Look up a specific user
        String jsonString = (String) client.get("User5");
        user = json.fromJson(jsonString, UserDoc.class);
        View view = client.getView("User", "Rank");
        Query query = new Query();
        query.setRangeStart(Long.toString(Long.MAX_VALUE));
        query.setRangeEnd(Long.toString(user.experience));
        // Exclude the end key so the user (and anyone tied with them) is not counted
        query.setInclusiveEnd(false);
        query.setDescending(true);
        query.setReduce(true);
        ViewResponse response = client.query(view, query);
        Iterator<ViewRow> itr = response.iterator();
        // Stays 0 if the reduce returns no row, i.e. nobody has more experience
        long higher = 0;
        while (itr.hasNext()) {
            ViewRow row = itr.next();
            higher = Long.parseLong(row.getValue());
        }
        // Parentheses matter: without them Java concatenates "1" to the string
        System.out.println("Rank: " + (higher + 1));
        client.shutdown();
    }
}
Using a map and reduce view for ranking makes it possible to look up a rank quickly and efficiently, without any additional processing on the client side.
To learn more about views and queries in Couchbase, read: http://www.couchbase.com/docs/couchbase-devguide-2.1.0/indexing-querying-data.html