DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Data Engineering
  3. Databases
  4. Building MongoDB Applications with Binary Files Using GridFS: Part 1

Building MongoDB Applications with Binary Files Using GridFS: Part 1

Francesca Krihely user avatar by
Francesca Krihely
·
Jan. 06, 15 · Interview
Like (0)
Save
Tweet
Share
7.24K Views

Join the DZone community and get the full member experience.

Join For Free

Use Cases

This is a two part series that explores a powerful feature in MongoDB for managing large files called GridFS. In the first part we discuss use cases appropriate for GridFS, and in part 2 we discuss how GridFS works and how to use it in your apps.

In my position at MongoDB, I speak with many teams that are building applications that manage large files (videos, images, PDFs, etc.) along with supporting information that fits naturally into JSON documents. Content management systems are an obvious example of this pattern, where is it necessary to both manage binary artifacts as well as all the supporting information about those artifacts (e.g., author, creation date, workflow state, version information, classification tags, etc.).

MongoDB manages data as documents, with a maximum size of 16MB. So what happens when your image, video, or other file exceeds 16MB? GridFS is a specification implemented by all of the MongoDB drivers that manages large files and their associated metadata as a group of small files. When you query for a large file, GridFS automatically reassembles the smaller files into the original large file.

Content management is just one of the many uses of GridFS. For example, McAfee optimizes delivery of analytics and incremental updates in MongoDB as binary packages for efficiently delivery to customers. Pearson stores student data in GridFS and leverages MongoDB’s replication to distribute data and keep it synchronized across facilities.

There are also many other systems with this requirement, many of which are in healthcare. Hospitals and other care-providing organizations want to centralize patient record information to provide a single view of the patient making this information more accessible to doctors, care providers, and patients themselves. The central system improves the quality of the patient care (care providers get a more complete and global view of the patient’s health status) as well as the efficiency of providing care.

A typical patient health record repository includes general information about the patient (name, address, insurance provider, etc.) along with all various types of medical records (office visits, blood tests and labs, medical procedures, etc.). Much of this information fits nicely into JSON documents and the flexibility of MongoDB makes it easy to accommodate variability in content and structure. Healthcare applications also manage large binary files such as radiology tests, x-ray and MRI images, CAT scans, or even legacy medical records created by scanning paper documents. It is not uncommon for a large hospital or insurance provider to have 50-100 TB of patient data and as much as 90% is large binary files.

When building an application like this, organizations usually have to decide whether to store the binary data in a separate repository, or along with the metadata. With MongoDB, there are a number of compelling reasons for storing the binary data in the same system as the rest of the information. They include:

  • The resulting application will have a simpler architecture: one system for all types of data;
  • Document metadata can be expressed using the rich flexible document structure, and documents can be retrieved using all the flexibility of MongoDB’s query language;
  • MongoDB’s high availability (replica sets) and scalability (sharding) infrastructure can be leveraged for binary data as well as the structured data;
  • One consistent security model for authenticating and authorizing access to the metadata and files;
  • GridFS doesn’t have the limitations of some filesystems, like number of documents per directory, or file naming rules.

Fortunately, GridFS makes working with large binary files in MongoDB easy. In part two, I will walk through in detail how you use GridFS to store and manage binary content in MongoDB. In the meantime if you’re looking to learn more, download our architecture guide:

DOWNLOAD ARCHITECTURE GUIDE

 

MongoDB application

Published at DZone with permission of Francesca Krihely, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Remote Debugging Dangers and Pitfalls
  • Load Balancing Pattern
  • The Importance of Delegation in Management Teams
  • Beginners’ Guide to Run a Linux Server Securely

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: