
Update a Fixed Number of MongoDB Documents


Recently I worked on a project that uses MongoDB as the source data system, R for analysis, and MongoDB again for output storage.

In this project we faced an unusual problem. We were using R to process source data stored in MongoDB, and when we gave R a large number of documents to analyze, it slowed down and became a bottleneck. To avoid this, we had to process a fixed number of documents in R per batch.

To achieve this we needed some kind of record number in MongoDB, but MongoDB, being a distributed database, does not provide a built-in sequential number. Our MongoDB source was also being populated by a distributed real-time stream, which made implementing such logic on the application side impractical.

To give a fixed number of documents in MongoDB a batchId field, we implemented the following algorithm:

1. Find the documents that do not yet have a batchId field.

2. Sort them by a timestamp field.

3. Limit the number of documents (say, 10000).

4. Add the batchId field to these documents and save them (the value of batchId comes from an audit table).
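The four steps above can be sketched without a database at all. The following is only an illustration, assuming an in-memory list of maps as a stand-in for the collection. The class name BatchTagger and the method tagNextBatch are hypothetical and not part of any MongoDB API. The field names batchId and systemTime follow the ones used in this article.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-in for the collection: each document is a Map.
class BatchTagger {

    // Tags up to batchSize of the oldest untagged documents with batchId
    // and returns how many documents were tagged.
    static int tagNextBatch(List<Map<String, Object>> docs, int batchSize, int batchId) {
        List<Map<String, Object>> batch = docs.stream()
                // Step 1: keep documents that have no batchId yet
                .filter(d -> d.get("batchId") == null)
                // Step 2: oldest first, by the systemTime field
                .sorted(Comparator.comparingLong(
                        (Map<String, Object> d) -> (Long) d.get("systemTime")))
                // Step 3: cap the batch at batchSize documents
                .limit(batchSize)
                .collect(Collectors.toList());
        // Step 4: write the batchId into each selected document
        for (Map<String, Object> d : batch) {
            d.put("batchId", batchId);
        }
        return batch.size();
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (long t : new long[] {30L, 10L, 20L}) {
            Map<String, Object> d = new HashMap<>();
            d.put("systemTime", t);
            docs.add(d);
        }
        // With a batch size of 2, only the two oldest documents are tagged.
        int tagged = tagNextBatch(docs, 2, 1);
        System.out.println(tagged + " documents tagged: " + docs);
    }
}
```

Calling tagNextBatch again with the next batchId picks up the documents left untagged by the previous call, which is how successive fixed-size batches would be formed in this sketch.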

The MongoDB shell command for this is:

db['collection1'].find({batchId: null}).sort({systemTime: 1}).limit(10000).forEach(
  function (e) {
    // get value of batchId from audit table
    e.batchId = 1;
    db['collection1'].save(e);
  }
);

Using the above code we appended a batchId to the MongoDB documents and picked only the current batchId for analysis in R.
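Picking only the current batch amounts to filtering on the batchId field. As a rough illustration, again assuming an in-memory list of maps in place of the collection, the selection could look like the sketch below. The class name CurrentBatchSelector and the method selectBatch are hypothetical. Against MongoDB itself this would correspond to a find on batchId with the current value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: selects the documents that belong to one batch.
class CurrentBatchSelector {

    // Returns the documents whose batchId equals currentBatchId, in list order.
    // Documents without a batchId are skipped.
    static List<Map<String, Object>> selectBatch(List<Map<String, Object>> docs, int currentBatchId) {
        List<Map<String, Object>> selected = new ArrayList<>();
        for (Map<String, Object> d : docs) {
            Object id = d.get("batchId");
            if (id instanceof Integer && (Integer) id == currentBatchId) {
                selected.add(d);
            }
        }
        return selected;
    }

    // Builds a small test document; a null batchId means "not yet batched".
    static Map<String, Object> doc(long systemTime, Integer batchId) {
        Map<String, Object> d = new HashMap<>();
        d.put("systemTime", systemTime);
        if (batchId != null) {
            d.put("batchId", batchId);
        }
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc(10L, 1));
        docs.add(doc(20L, 2));
        docs.add(doc(30L, 1));
        docs.add(doc(40L, null));
        // Only the documents tagged with batch 1 are handed to the analysis step.
        System.out.println(selectBatch(docs, 1).size() + " documents in batch 1");
    }
}
```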

The Java code equivalent to the above MongoDB shell command is:

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.Mongo;

public class UpdateMongoBatchId {
	public static void main(String[] args) {
		// batchId for this run, passed as the first command-line argument
		int batchId = Integer.parseInt(args[0]);

		// Declared outside the try block so the finally block can close it
		Mongo mongo = null;
		try {
			mongo = new Mongo("10.x.x.x", 27017);
			DB db = mongo.getDB("dbname");
			DBCollection coll1 = db.getCollection("collname");

			// Match documents whose batchId is missing or null
			BasicDBObject searchQuery = new BasicDBObject();
			searchQuery.put("batchId", null);
			// Oldest documents first
			BasicDBObject sortOrder = new BasicDBObject();
			sortOrder.put("systemTime", 1);

			// MongoVariables.BATCH_SIZE is a project constant that is not shown here
			DBCursor cursor = coll1.find(searchQuery).sort(sortOrder).limit(MongoVariables.BATCH_SIZE);
			try {
				while (cursor.hasNext()) {
					DBObject currDocument = cursor.next();
					currDocument.put("batchId", batchId);
					coll1.save(currDocument);
				}
			} finally {
				cursor.close();
			}
			System.out.println("Updated batchId to MongoDB");
		} catch (Exception e) {
			// Report the failure instead of silently swallowing it
			System.err.println("Failed to update batchId: " + e.getMessage());
		} finally {
			if (mongo != null) {
				mongo.close();
			}
		}
	}
}


Published at DZone with permission of
