Reclaiming Disk Space From MongoDB

If you have used MongoDB, you probably have noticed that it follows a default disk usage policy a bit like "take what you can, give nothing back." However, there will be situations where you don't want to allow MongoDB to keep hogging all of your disk space to itself. So, how would you reclaim this disk space? Read on and find out.

By Chamal Nanayakkara · Apr. 22, 16 · Tutorial

If you have used MongoDB, you have probably noticed that it follows a default disk usage policy a bit like "take what you can, give nothing back." Here's a simple example: say you have 10 GB of data in a MongoDB database, and you delete 3 GB of it. Even though your database now holds only 7 GB of data, the unused 3 GB is not released to the OS. MongoDB keeps holding on to the entire 10 GB of disk space it had before, so that it can use the same space to accommodate new data. You can easily see this for yourself by running db.stats():

[Figure: a db.stats() example, showing dataSize, storageSize, and fileSize]

The dataSize field shows the size of the data in the database, while storageSize shows the size of the data plus unused/freed space. The fileSize field, which is essentially the space your database takes up on disk, includes the size of the data, the indexes, and unused/freed space. Note that fileSize is reported only when the storage engine is MMAPv1.
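To make the relationship between these numbers concrete, here is a minimal sketch that estimates how much space is allocated but not holding data. The helper name estimateUnusedSpace and the sample values are hypothetical; only the field names dataSize and storageSize come from db.stats(), and the calculation assumes storageSize is data plus unused/freed space, as described above.

```javascript
// Hypothetical helper: estimate allocated-but-unused bytes from a
// db.stats()-style object. Assumes storageSize = data + unused/freed space.
function estimateUnusedSpace(stats) {
  const unused = stats.storageSize - stats.dataSize;
  // Clamp at zero: with a storage engine that compresses data, storageSize
  // can be smaller than dataSize, and this estimate is then not meaningful.
  return Math.max(0, unused);
}

// Made-up sample mirroring the 10 GB / 7 GB example (values in bytes).
const GB = 1024 * 1024 * 1024;
const sampleStats = { dataSize: 7 * GB, storageSize: 10 * GB };
const unusedGB = estimateUnusedSpace(sampleStats) / GB;
console.log(unusedGB + " GB allocated but unused"); // 3 GB allocated but unused
```

In the mongo shell, the same helper could be called as estimateUnusedSpace(db.stats()) against a real database.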

MongoDB is commonly used to store large quantities of data, often in read-heavy situations where data manipulation operations are comparatively rare. In this kind of situation, it makes sense to anticipate that if you had to handle a certain amount of data before, you might have to handle a similar amount again. Nevertheless, there will be situations (your development environment, for example) where you don't want MongoDB to keep hogging all your disk space. So, how would you reclaim this disk space? Depending on your setup and the storage engine you're using, you have a few choices.

Compact

The compact command works at the collection level, so each collection in your database has to be compacted one by one. Compacting rewrites the collection's data and indexes to remove fragmentation. In addition, if your storage engine is WiredTiger, the compact command also releases unused disk space back to the system. You're out of luck if your storage engine is the older MMAPv1, though: it will still rewrite the collection, but it will not release the unused disk space. Running the compact command blocks all other operations at the database level, so you have to plan for some downtime.

Usage example:

db.runCommand({compact:'collectionName'})
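Since compact has to be run per collection, a loop can save some typing. The following is a rough sketch rather than a tested maintenance script: the function name compactAllCollections is hypothetical, and skipping collections whose names start with "system." is an assumption made here for safety. It relies only on db.getCollectionNames() and db.runCommand(), both available in the mongo shell.

```javascript
// Sketch: compact every collection in a database, one at a time.
// `database` is expected to behave like the mongo shell's `db` object.
function compactAllCollections(database) {
  const results = {};
  database.getCollectionNames().forEach(function (name) {
    // Assumption: leave system collections alone.
    if (name.indexOf("system.") === 0) {
      return;
    }
    // Each call blocks other operations on the database while it runs.
    results[name] = database.runCommand({ compact: name });
  });
  // Map of collection name to the command's response document.
  return results;
}
```

In the mongo shell this would be invoked as compactAllCollections(db). Because every call blocks the database, the downtime adds up across collections.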

Repair

If your storage engine is MMAPv1, this is your way forward. The repairDatabase command is used for checking and repairing errors and inconsistencies in your data. It performs a rewrite of your data, freeing up any unused disk space along with it. Like compact, it will block all other operations on your database. Running repairDatabase can take a lot of time depending on the amount of data in your db, and it will also completely remove any corrupted data it finds.

repairDatabase needs free space equal to the size of the data in your database plus an additional 2 GB. It can be run either from the system shell or from within the mongo shell. Depending on the amount of data you have, it may be necessary to assign a separate volume for this using the --repairpath option.

Usage examples

In the system shell

mongod --repair --repairpath /mnt/vol1

In the mongo shell

db.repairDatabase()

In the mongo shell, with runCommand

db.runCommand({repairDatabase:1})
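Before starting a repair, it can help to check the free-space rule mentioned above (data size plus 2 GB). The sketch below only does the arithmetic; the helper names are hypothetical, and how you obtain the free bytes on the target volume is left to you. Using dataSize from db.stats() as the "size of the data" is also an assumption.

```javascript
const TWO_GB = 2 * 1024 * 1024 * 1024;

// Hypothetical helper: bytes of free space a repair needs,
// following the rule above (size of the data plus 2 GB).
function repairSpaceNeeded(dataSizeBytes) {
  return dataSizeBytes + TWO_GB;
}

// Hypothetical helper: true if the volume has enough room for the repair.
function hasRoomForRepair(dataSizeBytes, freeBytes) {
  return freeBytes >= repairSpaceNeeded(dataSizeBytes);
}

// Made-up numbers: 7 GB of data needs 9 GB free.
const ONE_GB = 1024 * 1024 * 1024;
console.log(hasRoomForRepair(7 * ONE_GB, 8 * ONE_GB));  // false
console.log(hasRoomForRepair(7 * ONE_GB, 10 * ONE_GB)); // true
```

If the check fails for the default data volume, that is the case where pointing --repairpath at a separate, larger volume becomes necessary.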

Resync

In a replica set, unused disk space can be released by running an initial sync. This involves stopping the mongod instance, emptying the data directory, and then restarting to allow it to reconstruct the data through replication.
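The three options can be summarized as a small decision helper. This is only an illustration of the reasoning in this article, not an official recommendation: the function name reclaimOptions is hypothetical, and it assumes the engine names "wiredTiger" and "mmapv1" as reported by db.serverStatus().storageEngine.name.

```javascript
// Hypothetical helper: list the space-reclaiming options described in
// this article for a given storage engine and deployment type.
function reclaimOptions(engineName, isReplicaSetMember) {
  const options = [];
  if (engineName === "wiredTiger") {
    // compact releases unused space on WiredTiger.
    options.push("compact");
  } else if (engineName === "mmapv1") {
    // compact does not release space on MMAPv1; repairDatabase does.
    options.push("repairDatabase");
  }
  if (isReplicaSetMember) {
    // An initial sync rebuilds the data files from another member.
    options.push("resync");
  }
  return options;
}

console.log(reclaimOptions("mmapv1", true)); // [ 'repairDatabase', 'resync' ]
```

In the mongo shell, the first argument could be taken from db.serverStatus().storageEngine.name.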

Published at DZone with permission of Chamal Nanayakkara. See the original article here.
