How Big is Your MongoDB?

As your MongoDB grows in size, information from the db.stats() diagnostic command (or the database “Stats” tab in our management portal) becomes increasingly helpful for evaluating hardware requirements.

We frequently get questions about the dataSize, storageSize and fileSize metrics, so we want to help developers better understand how MongoDB storage works and what these particular metrics mean.

MongoDB storage structure basics

First, we’ll go over the basics of how MongoDB stores your data.

Data files

Every MongoDB instance consists of a namespace file, journal files and data files. For our discussion, we’ll focus only on data files, since that is where all of the data and indexes for your database reside.

Data files store BSON documents, indexes, and MongoDB-generated metadata in structures called extents. Each data file is made up of multiple extents.


Extents are logical containers within data files used to store documents and indexes.

Photo of data files and extents

The above diagram illustrates the relationship between data files and extents. Note:

  • Data and indexes are each contained in their own sets of extents; no extent will ever contain content for more than one collection
  • Data and indexes are never contained within the same extent
  • The data and indexes for a collection will usually span multiple extents
  • When a new extent is needed, MongoDB will attempt to use available space within current data files. If no space can be found, MongoDB will create new data files.
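The last rule in the list above can be sketched as a toy model. This is only an illustration of the "reuse free space first, otherwise create a new file" idea, not MongoDB's actual allocator; the function name allocateExtent, the file names and all sizes are hypothetical.

```javascript
// Toy model only: file names and sizes are hypothetical, and this is
// not MongoDB's real allocation code.
function allocateExtent(dataFiles, extentSize, newFileSize) {
  // Prefer free space in a data file that already exists.
  for (const file of dataFiles) {
    if (file.size - file.used >= extentSize) {
      file.used += extentSize;
      return { file: file.name, createdNewFile: false };
    }
  }
  // No existing file has room, so a new data file is created.
  if (extentSize > newFileSize) {
    throw new RangeError("extent does not fit in a new data file");
  }
  const file = { name: `db.${dataFiles.length}`, size: newFileSize, used: extentSize };
  dataFiles.push(file);
  return { file: file.name, createdNewFile: true };
}

const files = [{ name: "db.0", size: 64, used: 48 }];
const first = allocateExtent(files, 16, 128);  // fits in the free space of db.0
const second = allocateExtent(files, 16, 128); // db.0 is now full, so db.1 is created
console.log(first, second);
```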

Metrics from db.stats()

Now that we understand the basics of how MongoDB storage is organized, we can explore metrics commonly examined with db.stats(): dataSize, storageSize and fileSize.
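As a minimal sketch, the three metrics can be pulled out of the document that db.stats() returns. The helper name pickSizeMetrics and the sample values are made up for illustration; in the mongo shell you would pass the real result of db.stats() instead of the sample object.

```javascript
// Hypothetical helper: keeps only the three size metrics discussed here.
// In the mongo shell: pickSizeMetrics(db.stats())
function pickSizeMetrics(stats) {
  const { dataSize, storageSize, fileSize } = stats;
  return { dataSize, storageSize, fileSize };
}

// Made-up values in bytes, standing in for real db.stats() output.
const sampleStats = {
  db: "example",
  collections: 3,
  dataSize: 1000,
  storageSize: 4096,
  fileSize: 16384,
};

const sizeMetrics = pickSizeMetrics(sampleStats);
console.log(sizeMetrics);
```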


Picture of MongoDB dbStats dataSize

The dataSize metric is the sum of the sizes (in bytes) of all the documents and padding stored in the database.

While dataSize does decrease when you delete documents, dataSize does not decrease when documents shrink because the space used by the original document has already been allocated (to that particular document) and cannot be used by other documents.

Likewise, if a user updates a document with more data, dataSize will remain the same as long as the new document fits within the padded space originally allocated to it.
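The behavior described above can be mimicked with a small model in which every record reserves a fixed number of bytes (document plus padding). The record layout, field names and sizes are assumptions made for illustration and do not reflect MongoDB's on-disk format.

```javascript
// Toy model: each record reserves `allocated` bytes (document plus padding).
function dataSizeOf(records) {
  return records.reduce((sum, record) => sum + record.allocated, 0);
}

// Updates the document in place when it still fits in its reserved space.
// Moving a document that has outgrown its space is not modeled here.
function updateInPlace(record, newDocBytes) {
  if (newDocBytes > record.allocated) return false;
  record.docBytes = newDocBytes;
  return true;
}

const records = [
  { docBytes: 100, allocated: 128 },
  { docBytes: 200, allocated: 256 },
];

const before = dataSizeOf(records);
updateInPlace(records[0], 60);   // document shrinks, reserved space is unchanged
updateInPlace(records[1], 250);  // document grows but still fits in its padding
const afterUpdates = dataSizeOf(records);
records.pop();                   // deleting a document releases its reserved space
const afterDelete = dataSizeOf(records);

console.log(before, afterUpdates, afterDelete); // 384 384 128
```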


Photo of MongoDB dbStats storageSize

The storageSize metric is equal to the size (in bytes) of all the data extents in the database. This number is larger than dataSize because it includes yet-unused space (in data extents) and space vacated by deleted or moved documents within extents.

The storageSize does not decrease as you remove or shrink documents.


Photo of MongoDB dbStats fileSize

The fileSize metric is equal to the size (in bytes) of all the data extents, index extents and yet-unused space (in data files) in the database. This metric represents the storage footprint of your database on disk. fileSize is larger than storageSize because it includes index extents and yet-unused space in data files.

While fileSize does decrease when you delete a database, fileSize does not decrease as you remove collections, documents or indexes.
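Taken together, the three definitions imply a simple layering of the on-disk footprint, which can be sketched as follows. The helper name breakDownFootprint and the sample numbers are hypothetical; the arithmetic relies only on the relationships stated above.

```javascript
// Splits fileSize into the layers described in this article.
// In the mongo shell: breakDownFootprint(db.stats())
function breakDownFootprint(stats) {
  const { dataSize, storageSize, fileSize } = stats;
  return {
    // Documents plus their padding.
    documentsAndPadding: dataSize,
    // Yet-unused or vacated space inside data extents.
    unusedInDataExtents: storageSize - dataSize,
    // Index extents plus yet-unused space in data files.
    indexesAndUnusedFileSpace: fileSize - storageSize,
  };
}

// Made-up values in bytes, standing in for real db.stats() output.
const sample = { dataSize: 1000, storageSize: 4096, fileSize: 16384 };
const parts = breakDownFootprint(sample);
console.log(parts);
```

The three parts always add up to fileSize, so a large gap between dataSize and fileSize can be attributed to one layer or another.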

What now?

That’s it! The next time someone asks you how big your database is, you’ll know what to tell them.

Published at DZone with permission of Chris Chang, DZone MVB. See the original article here.
