
Binary Storage with MongoDB and GridFS

Nuxeo has a ton of storage options. MongoDB and GridFS are superb for binary storage. Learn more about using MongoDB, GridFS, and Nuxeo.


In a previous blog post, I discussed various storage options for the Nuxeo Platform and specifically support for MongoDB and appropriate use cases. Today, let’s talk about GridFS for binary storage and learn more about the flexibility of the Nuxeo Platform to support various storage and query subsystems to achieve the ideal configuration for your specific use cases.

Let’s take a look at the Nuxeo storage architecture again, in this case configured with MongoDB and GridFS.
[Figure: MongoDB - BlobManager]

You can see that the Nuxeo Platform uses different subsystems for the Document (metadata) and the Blob (binaries). A Document is a set of metadata values and various attributes (Facets), together with references to binaries; the Repository combines these in its output to the application. A binary in Nuxeo is wrapped by a Document. This allows us to use the most appropriate storage for metadata and for binaries. The Nuxeo Document Store and Blob Store are both pluggable, so you can choose the implementation behind each abstraction.
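
To make the Document/Blob split concrete, here is a minimal Python sketch of a repository with a pluggable blob store. It is an illustration only, not Nuxeo code: `BlobStore`, `InMemoryBlobStore`, `Document`, `Repository`, and the `"file:content"` key are all hypothetical names, and using a SHA-256 digest as the blob key is an assumption made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


class BlobStore(Protocol):
    """Hypothetical interface: anything that stores and returns bytes by key."""

    def put(self, data: bytes) -> str: ...

    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore:
    """Stand-in for a file system backed or GridFS backed store."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # Content-derived key (an assumption made for this sketch).
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


@dataclass
class Document:
    """Metadata plus references to binaries, never the binaries themselves."""

    title: str
    facets: List[str] = field(default_factory=list)
    blob_keys: Dict[str, str] = field(default_factory=dict)


class Repository:
    """Combines document metadata with blobs fetched from the plugged-in store."""

    def __init__(self, blob_store: BlobStore) -> None:
        self._blob_store = blob_store
        self._documents: Dict[str, Document] = {}

    def create(self, doc_id: str, title: str, content: bytes) -> Document:
        doc = Document(title=title)
        doc.blob_keys["file:content"] = self._blob_store.put(content)
        self._documents[doc_id] = doc
        return doc

    def read_content(self, doc_id: str) -> bytes:
        doc = self._documents[doc_id]
        return self._blob_store.get(doc.blob_keys["file:content"])
```

Swapping the storage backend then means passing a different `BlobStore` implementation to `Repository`; the document side of the code does not change.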

Benefits of Using GridFS

MongoDB users will certainly be familiar with GridFS. By providing a Blob Store implemented on GridFS to replace the File System for binary storage, we bring Document and Blob storage together in MongoDB.

There are two areas of benefits here:

First, the GridFS capabilities themselves are great. Instead of dealing directly with the File System, GridFS uses MongoDB’s own functionality to provide replication, distribution, and redundancy. It does this by breaking every binary into chunks of configurable size and storing them as individual records in a dedicated collection, while a second collection stores a record with each file’s information. The GridFS API is built on top of MongoDB and handles all the chunking, rebuilding of files, and read/write access. Since the files are broken into chunks, you can access the chunks you need for read or write without streaming the whole file. You can also download and stream multiple pieces simultaneously to potentially increase throughput.
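
The chunking idea can be sketched in a few lines of Python. This is an in-memory simulation, not the GridFS API or a MongoDB driver: the two dictionaries stand in for the files and chunks collections, the `put` and `read_range` helpers are hypothetical, and the 4-byte chunk size is chosen only to keep the demo readable (real GridFS chunks are far larger, 255 KiB by default). The `length`, `chunkSize`, and chunk number `n` mirror the field names GridFS uses.

```python
import math
from typing import Dict, List, Tuple

CHUNK_SIZE = 4  # bytes; deliberately tiny for the demo

# Stand-ins for the two GridFS collections (assumption: plain dicts, no MongoDB).
files: Dict[str, dict] = {}                # one metadata record per file
chunks: Dict[Tuple[str, int], bytes] = {}  # keyed by (files_id, n)


def put(file_id: str, data: bytes, chunk_size: int = CHUNK_SIZE) -> None:
    """Split data into fixed-size chunks and record the file's metadata."""
    files[file_id] = {"length": len(data), "chunkSize": chunk_size}
    for n in range(math.ceil(len(data) / chunk_size)):
        chunks[(file_id, n)] = data[n * chunk_size:(n + 1) * chunk_size]


def read_range(file_id: str, start: int, end: int) -> Tuple[bytes, List[int]]:
    """Return bytes [start, end) and the chunk numbers that had to be fetched."""
    meta = files[file_id]
    size = meta["chunkSize"]
    end = min(end, meta["length"])
    if start >= end:
        return b"", []
    first, last = start // size, (end - 1) // size
    touched = list(range(first, last + 1))
    joined = b"".join(chunks[(file_id, n)] for n in touched)
    offset = start - first * size
    return joined[offset:offset + (end - start)], touched
```

For a 12-byte file stored with `put("f", b"abcdefghijkl")`, calling `read_range("f", 5, 7)` fetches only chunk 1 and never touches chunks 0 and 2, which is what makes partial reads cheap.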

The second benefit of combining MongoDB with GridFS is the common set of management and administration tools and methods. You no longer need separate backup and infrastructure strategies for the database and the File System, so it’s efficient from an administration perspective.

Configuring the Nuxeo Platform to use GridFS

Configuring the Nuxeo Platform to use GridFS instead of the File System is an easy and transparent process. Since the Nuxeo Platform is extension-based, it’s just a matter of deploying an XML-based configuration that specifies the Blob Manager implementation and a couple of properties.

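 <extension target="org.nuxeo.ecm.core.blob.BlobManager" point="configuration">
     <blobprovider name="default">
         <!-- the element naming the Blob Manager implementation class is missing from this excerpt -->
         <property name="server">localhost</property>
         <property name="dbname">nuxeo</property>
         <property name="bucket">nxblobs</property>
     </blobprovider>
 </extension>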

The GridFS integration is available as a Nuxeo Marketplace package: MongoDB GridFS Storage.

This article was written by Mike Obrebski



Published at DZone with permission of Mike Obrebski. See the original article here.

Opinions expressed by DZone contributors are their own.
