Managing Disk Space in MongoDB

By Chris Chang · Feb. 06, 14 · Tutorial
In our previous post on MongoDB storage structure and dbStats metrics, we covered how MongoDB stores data and the differences between the dataSize, storageSize and fileSize metrics. We can now apply this knowledge to evaluate strategies for re-using MongoDB disk space.

When documents or collections are deleted, they leave empty record blocks within the data files. MongoDB attempts to reuse this space when possible, but it never returns the space to the file system. This is why fileSize never decreases, no matter how much you delete from a database.

If your app frequently deletes documents, or if your fileSize is significantly larger than the size of your data plus indexes, you can use one of the methods below to reclaim free space.
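As a rough illustration of that check, the sketch below compares fileSize with dataSize plus indexSize taken from a dbStats result. The field names come from dbStats; the function names, the sample numbers, and the 2x threshold are assumptions chosen for illustration, not a MongoDB recommendation.

```python
def reclaimable_ratio(stats):
    """Return fileSize divided by (dataSize + indexSize) for a dbStats-style dict."""
    used = stats["dataSize"] + stats["indexSize"]
    if used == 0:
        # No data at all: any allocated file space is pure overhead.
        return float("inf") if stats["fileSize"] > 0 else 1.0
    return stats["fileSize"] / used


def should_reclaim(stats, threshold=2.0):
    """True when the data files are at least `threshold` times larger than data plus indexes.

    The default threshold is an arbitrary, illustrative value.
    """
    return reclaimable_ratio(stats) >= threshold


# Hypothetical dbStats values in bytes, as you might copy them from db.stats().
sample_stats = {
    "dataSize": 400_000_000,
    "indexSize": 100_000_000,
    "fileSize": 2_000_000_000,
}

ratio = reclaimable_ratio(sample_stats)
print(f"fileSize is {ratio:.1f}x data plus indexes; reclaim: {should_reclaim(sample_stats)}")
```

With these sample numbers the data files are four times larger than the data and indexes they hold, so the database would be a candidate for one of the methods below.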

Getting your free space back

Compacting individual collections

You can compact individual collections using the compact command. This command rewrites and defragments all data in a collection, as well as all of the indexes on that collection.

Important notes on compacting:

  • This operation blocks all other database activity while it runs and should be used only when downtime for your database is acceptable. If you are running a replica set, you can compact the secondaries first so that the primary is not blocked, then use failover to make the primary a secondary before compacting it.
  • Compacting individual collections will not reduce your storage footprint on disk (i.e., your fileSize) but it will defragment the collections you compact.
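The replica set advice above amounts to an ordering: compact every secondary first, then step the primary down and compact it last. Here is a minimal sketch of that ordering. The host names and the collection name are hypothetical, and the code only builds a plan from a plain list of (host, state) pairs; it does not connect to a server.

```python
def compaction_order(members):
    """Order replica set members so that secondaries are compacted before the primary.

    `members` is a list of (host, state) tuples with state "PRIMARY" or "SECONDARY".
    """
    secondaries = [host for host, state in members if state == "SECONDARY"]
    primaries = [host for host, state in members if state == "PRIMARY"]
    return secondaries + primaries


def compaction_plan(members, collection):
    """Return the steps as readable strings; nothing here talks to a real server."""
    state_of = dict(members)
    steps = []
    for host in compaction_order(members):
        if state_of[host] == "PRIMARY":
            # Fail over first so the node is a secondary while it is compacted.
            steps.append(f"{host}: step down and wait for a new primary")
        steps.append(f"{host}: run the compact command on '{collection}'")
    return steps


# Hypothetical three-node replica set.
members = [
    ("db1.example.net", "PRIMARY"),
    ("db2.example.net", "SECONDARY"),
    ("db3.example.net", "SECONDARY"),
]

for step in compaction_plan(members, "events"):
    print(step)
```

In the mongo shell, the compaction step corresponds to db.runCommand({ compact: "events" }) and the failover step to rs.stepDown().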

Compacting one or more databases

For a single-node MongoDB deployment, you can use the db.repairDatabase() command to compact all the collections in the database. This operation rewrites all the data and indexes for each collection in the database from scratch and thereby compacts and defragments the entire database.

To compact all the databases on your server process, you can stop your mongod process and run it with the "--repair" option.

Important notes on running a repair:

  • This operation blocks all other database activity when running and should be used only when downtime for your database is acceptable.
  • Running a repair requires free disk space equal to the size of your current data set plus 2 GB. You can use space in a different volume than the one that your mongod is running in by specifying the "--repairpath" option.
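The free-space rule above is simple arithmetic, so it is easy to check before you start. The sketch below is one way to do that; the function names are hypothetical, and it assumes that "2 GB" means 2 × 1024³ bytes and that you already know your data set size (for example from fileSize).

```python
import shutil

TWO_GB = 2 * 1024 ** 3  # assumed interpretation of the extra 2 GB of headroom


def repair_space_needed(data_set_bytes):
    """Free space a repair needs: the current data set size plus 2 GB."""
    return data_set_bytes + TWO_GB


def can_repair(data_set_bytes, free_bytes):
    """True when the target volume has enough free space for the repair."""
    return free_bytes >= repair_space_needed(data_set_bytes)


def free_bytes_at(path):
    """Free bytes on the volume that holds `path` (e.g. the repair path)."""
    return shutil.disk_usage(path).free


# Hypothetical 10 GB data set, checked against the current directory's volume.
data_set = 10 * 1024 ** 3
print(repair_space_needed(data_set) / 1024 ** 3)  # 12.0
print(can_repair(data_set, free_bytes_at(".")))
```

If the check fails for the volume your mongod runs in, point free_bytes_at at a directory on another volume and pass that directory to "--repairpath".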

Compacting all databases on a server by re-syncing replica set nodes

For a multi-node MongoDB deployment, you can resync a secondary from scratch to reclaim space. By resyncing each node in your replica set you effectively rewrite the data files from scratch and thereby defragment your database.

Please note that if your cluster has only two electable nodes, you will sacrifice high availability during the resync because the secondary is completely wiped before syncing.

If your app is sensitive to downtime, we recommend a process similar to the one we use here at MongoLab, which we call a "rolling node replacement." This process replaces each node in your cluster in turn by bringing a new node into the cluster, replicating the data to that new node and removing the old node. In this way, your cluster can maintain the same level of redundancy during the compaction as during normal operations.
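To see why this keeps redundancy intact, here is a small simulation of the idea. It is only a model under simplifying assumptions: node names are hypothetical, the new node is treated as fully synced before the old one is removed, and elections and voting are ignored. It is not a description of MongoLab's actual tooling.

```python
def rolling_replacement(old_nodes, new_nodes):
    """Simulate replacing each old node in turn.

    Returns the list of cluster members after every step, starting with the
    original membership.
    """
    if len(old_nodes) != len(new_nodes):
        raise ValueError("need exactly one new node per old node")

    members = list(old_nodes)
    history = [list(members)]
    for old, new in zip(old_nodes, new_nodes):
        members.append(new)   # bring the new node in and let it replicate the data
        history.append(list(members))
        members.remove(old)   # remove the old node once the new one has caught up
        history.append(list(members))
    return history


# Hypothetical three-node cluster and its replacements.
history = rolling_replacement(["old-a", "old-b", "old-c"], ["new-a", "new-b", "new-c"])
for members in history:
    print(len(members), members)
```

The member count never drops below the original three, which is the property that a wipe-and-resync of an existing secondary cannot offer.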

A tip about efficiently using space

usePowerOf2Sizes

Setting the usePowerOf2Sizes option is a proactive approach to reusing space in collections that experience frequent document moves or deletions. This option supersedes the default padding factor mechanism and reduces the impact of fragmentation within the collection by allocating space for each document in sizes that follow the powers of 2. Setting this option for a specific collection makes it less likely that documents in that collection need to be moved when they grow in size, less likely that a document will need to be moved more than once in its lifetime, and more likely that space left by moving documents can be reused by new or other moved documents.
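The effect is easier to see with numbers. The sketch below is a simplified model of power-of-2 allocation, not MongoDB's actual allocator: the function names are hypothetical, the 32-byte minimum is an assumption, and any special handling of very large documents is ignored.

```python
def power_of_2_allocation(size_bytes, minimum=32):
    """Round a record size up to the next power of 2 (simplified model)."""
    if size_bytes <= 0:
        raise ValueError("size must be positive")
    allocation = minimum
    while allocation < size_bytes:
        allocation *= 2
    return allocation


def needs_move(old_size, new_size):
    """A document has to move only when it outgrows its current allocation."""
    return new_size > power_of_2_allocation(old_size)


for size in (100, 600, 900, 1025):
    print(size, "->", power_of_2_allocation(size))

print(needs_move(600, 900))   # False: both sizes fit in a 1024-byte slot
print(needs_move(600, 1100))  # True: 1100 bytes no longer fits in 1024
```

Because every freed slot is one of a small set of standard sizes, a slot vacated by one document is much more likely to fit the next document that needs space.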

Thanks for reading!

We hope the above strategies help guide you in evaluating options for reusing empty space in your MongoDB deployment.


Published at DZone with permission of Chris Chang. See the original article here.
