Over a million developers have joined DZone.

The Rise of Big Data

· Big Data Zone

Learn how you can maximize big data in the cloud with Apache Hadoop. Download this eBook now. Brought to you in partnership with Hortonworks.

I was helping a MongoDB user with sharding one time. His chunks weren’t splitting and I was trying to diagnose the issue. His shard key looked reasonable, he didn’t have any errors in his log, and manually splitting the chunks worked. Finally, I looked at how much data he was storing: only a few MB per chunk. “Oh, I see the problem,” I told him. “It looks like your chunks are too small to split, you just need more data.”

“No, my data is huge, enormous,” he said.

“Um, okay. If you keep inserting data, it should split.”

“This is a bug. My data is big.”

We argued back and forth a bit, but I managed to back off from having called his data small and convince him it wasn’t a bug. That day, I learned that people take their data size very personally.

Posts that an algorithm decided were similar to this one:

Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now.  Brought to you in partnership with Hortonworks

Topics:

Published at DZone with permission of Kristina Chodorow, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}