A Common MongoDB Scaling Problem: Running the 'Shard Collection' Command

In the last couple of weeks, we’ve been getting a lot of questions like the one below. (No one asked this specific question; it’s just similar to the questions we’ve been getting.)

I ran shardcollection, but it didn’t return immediately and I didn’t know what was going on, so I killed the shell and tried deleting a shard and then running the ‘shard collection’ command again and then I started killing ops and then I turned the balancer off and then I turned it back on and now I’m not sure what’s going on…

Aaaaagh! Stop running commands!

If a single server is like a TIE fighter, then a sharded cluster is like the Death Star: you’ve got more power, but you’re not making any sudden movements. For any configuration change you make, at least four servers (usually more) have to talk to each other, and often a great deal of data has to get processed. If you ran all of the commands above on a big MongoDB install, everything would eventually work itself out (except for the part about killing random operations; that sort of depends on what got killed), but it could take a long time.

I think these questions stem from sharding being nerve-wracking: the documentation says what commands to run, but then nothing much seems to happen and everything seems slow and the command line doesn’t return a response (immediately). Meanwhile, you have hundreds of gigabytes of production data and MongoDB is chugging along doing… something.

So, I added some new sections to Scaling MongoDB on what to expect when you shard a big collection: if you run x in the shell, you’ll see y in the log, then MongoDB will be doing z until you see w. The sections cover what it’s doing, what you’ll see, and how (and whether) you should react. In general: a sharding operation that hasn’t returned yet isn’t done, so keep your eye on the logs and don’t panic.
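
To make the "wait and watch, don't rerun" advice concrete, here is a minimal sketch of a read-only polling helper. It is not from the book: the name `wait_for` and its parameters are hypothetical, and the caller supplies the actual "is it done yet?" check. The point is only that waiting can be automated without issuing any further commands against the cluster.

```python
import time
from typing import Callable


def wait_for(check: Callable[[], bool],
             timeout_s: float = 3600.0,
             interval_s: float = 10.0,
             sleep: Callable[[float], None] = time.sleep,
             clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll `check` until it returns True or `timeout_s` elapses.

    Hypothetical helper for illustration. It is read-only by design:
    it never re-issues the sharding command, kills operations, or
    touches the balancer. It only asks "done yet?" and waits.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True          # the caller's check says we are done
        if clock() >= deadline:
            return False         # gave up waiting; nothing was changed
        sleep(interval_s)        # be patient before asking again
```

What goes into `check` depends on your setup and is an assumption to verify against your MongoDB version. One possibility, when connected to a mongos with a driver such as pymongo, is to look for the collection's namespace in `config.collections`. Whatever check you choose, a `False` result after the timeout means "still working or unknown", not "run more commands".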

I’ve also added a section on backing up config servers and updated the chunk size information. If you bought the eBook, you can update to the latest version for free to get the new info. (I love this eBook update system!) The update should be going out this week.

Let me know if there’s any other info that you think is missing and I’ll add it for future updates.

Source: http://www.snailinaturtleneck.com/blog/2011/02/24/scaling-mongodb-update/
