Binary Storage with MongoDB and GridFS


Nuxeo has a ton of storage options. MongoDB and GridFS are superb for binary storage. Learn more about using MongoDB, GridFS, and Nuxeo.


In a previous blog post, I discussed the storage options for the Nuxeo Platform, specifically its support for MongoDB and the use cases it suits. Today, let’s talk about GridFS for binary storage and see how the Nuxeo Platform’s support for different storage and query subsystems lets you reach the ideal configuration for your specific use cases.

Let’s take a look at the Nuxeo storage architecture again, in this case configured with MongoDB and GridFS.
Figure: MongoDB - BlobManager
You can see that the Nuxeo Platform uses different subsystems for the Document (metadata) and the Blob (binaries). A Document is a set of metadata values and various attributes (Facets), together with references to binaries, which the Repository combines in its output to the application. A binary in Nuxeo is wrapped by a Document. This allows us to use the most appropriate storage for metadata and for binaries. The Nuxeo Document Store and Blob Store are both pluggable, so you can choose the implementation behind each of these abstractions.

Benefits of Using GridFS

MongoDB users will certainly be familiar with GridFS. By providing a Blob Store backed by GridFS to replace the File System for binary storage, we bring Document and Blob storage together inside MongoDB.

There are two areas of benefit here:

First, the GridFS capabilities themselves are great. Instead of dealing with the File System directly, GridFS uses the functionality of MongoDB to provide replication, distribution, and redundancy. It does this by breaking every binary into chunks of configurable size and storing them as individual records in a collection designated for them, while another collection stores one record of information per file. The GridFS API is built on top of MongoDB and handles all the chunking, rebuilding of files, and read/write access. Since the files are broken into chunks, you can access the chunks you need for read or write without streaming the whole file. You can also download and stream multiple pieces simultaneously to potentially increase throughput.
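To make the chunking scheme concrete, here is a minimal in-memory sketch in Python. It is not the GridFS API and does not talk to MongoDB: the class `ToyGridFS` and its methods are hypothetical names invented for illustration. It only mimics the layout described above, with one metadata record per file and one record per numbered chunk, and shows how a byte range can be served from just the chunks that cover it. The field names (`length`, `chunkSize`) and the 255 KiB default follow the GridFS convention.

```python
from dataclasses import dataclass, field

DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size, in bytes


@dataclass
class ToyGridFS:
    """In-memory stand-in for the two GridFS collections (illustration only)."""

    chunk_size: int = DEFAULT_CHUNK_SIZE
    files: dict = field(default_factory=dict)   # file id -> metadata record
    chunks: dict = field(default_factory=dict)  # (file id, n) -> chunk bytes
    chunk_reads: int = 0                        # number of chunk lookups so far

    def put(self, file_id: str, data: bytes) -> None:
        # One metadata record per file ...
        self.files[file_id] = {
            "_id": file_id,
            "length": len(data),
            "chunkSize": self.chunk_size,
        }
        # ... and one record per chunk, numbered from 0.
        for n, start in enumerate(range(0, len(data), self.chunk_size)):
            self.chunks[(file_id, n)] = data[start:start + self.chunk_size]

    def read_range(self, file_id: str, offset: int, size: int) -> bytes:
        """Return up to `size` bytes from `offset`, touching only the chunks needed."""
        if offset < 0 or size < 0:
            raise ValueError("offset and size must be non-negative")
        meta = self.files[file_id]
        end = min(offset + size, meta["length"])
        if offset >= end:
            return b""
        first = offset // meta["chunkSize"]      # first chunk covering the range
        last = (end - 1) // meta["chunkSize"]    # last chunk covering the range
        parts = []
        for n in range(first, last + 1):
            self.chunk_reads += 1
            parts.append(self.chunks[(file_id, n)])
        skip = offset - first * meta["chunkSize"]
        return b"".join(parts)[skip:skip + (end - offset)]
```

With a chunk size of 4 bytes, a 26-byte file is stored as 7 chunks, and reading 6 bytes from offset 5 looks up only chunks 1 and 2. A real deployment would go through a MongoDB driver's GridFS support rather than anything like this class.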

The second benefit of combining MongoDB with GridFS is the common set of management and administration tools and methods. You no longer need separate backup and infrastructure strategies for the database and the File System, which makes administration more efficient.

Configuring the Nuxeo Platform to Use GridFS

Configuring the Nuxeo Platform to use GridFS instead of the File System is an easy and transparent process. Since the Nuxeo Platform is extension-based, it’s just a matter of deploying an XML-based configuration that specifies the Blob Manager implementation and a couple of properties.

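 <extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
     <blobprovider name="default">
         <!-- the <class> element naming the Blob Manager implementation is not shown in this excerpt -->
         <property name="server">localhost</property>
         <property name="dbname">nuxeo</property>
         <property name="bucket">nxblobs</property>
     </blobprovider>
 </extension>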
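As a rough illustration of what these properties amount to, the Python sketch below parses a copy of the configuration (with the closing tags written out and without the element that names the implementation class) and derives the MongoDB collection names a GridFS bucket uses. The helper names `blob_provider_properties` and `gridfs_collections` are hypothetical. The `<bucket>.files` / `<bucket>.chunks` naming is the standard GridFS convention; that the Nuxeo provider passes the `bucket` value to GridFS unchanged is an assumption here.

```python
import xml.etree.ElementTree as ET

# Copy of the configuration from the article, closed so that it parses.
CONFIG = """
<extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
  <blobprovider name="default">
    <property name="server">localhost</property>
    <property name="dbname">nuxeo</property>
    <property name="bucket">nxblobs</property>
  </blobprovider>
</extension>
"""


def blob_provider_properties(xml_text: str) -> dict:
    """Collect the <property> name/value pairs of the first <blobprovider>."""
    root = ET.fromstring(xml_text.strip())
    provider = root.find("blobprovider")
    if provider is None:
        raise ValueError("no <blobprovider> element found")
    return {
        prop.get("name"): (prop.text or "").strip()
        for prop in provider.findall("property")
    }


def gridfs_collections(bucket: str) -> tuple:
    """GridFS keeps file records in <bucket>.files and chunks in <bucket>.chunks."""
    return (f"{bucket}.files", f"{bucket}.chunks")


if __name__ == "__main__":
    props = blob_provider_properties(CONFIG)
    print(props["server"], props["dbname"], gridfs_collections(props["bucket"]))
```

Under those assumptions, the binaries would end up in the `nuxeo` database, in the collections `nxblobs.files` and `nxblobs.chunks`, which is also where backup and monitoring tooling would find them.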

The GridFS integration is available as a Marketplace package: MongoDB GridFS Storage.

This article was written by Mike Obrebski


Published at DZone with permission of Mike Obrebski.

