Why I Said Goodbye to MongoDB

By Oren Eini · Apr. 18, 2012

We were asked several times to respond to this post about the reasons Kiip moved away from MongoDB.

On the surface, RavenDB and MongoDB are really similar. Looking at the good parts in the Kiip post, both offer schemalessness, easy replication, a rich query language, and access from multiple languages.

But under the hood, RavenDB operates in a completely different way than MongoDB does. The vast majority of the issues that Kiip ran into are actually low-level (really low-level, in some cases) issues that shouldn't be visible to the user at all.

Non-Counting B-Trees

The fact that MongoDB uses non-counting B-trees? The only reason the user cares about that is that it actually impacts performance, but the Kiip blog mentions a bunch of other issues related to it.

In RavenDB, we use Lucene as the indexing format, and we really don't care about the actual format of the indexes. We natively support count() and limit/skip, because we feel those are core parts of what most users need. In fact, our API lets you get the total count of results of a paged query as a by-product of actually making the query; there isn't any additional cost for doing this.
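
To make the difference concrete, here is a minimal sketch of what paging plus a total count looks like on the MongoDB side, written against the modern PyMongo API (the database and collection names are made up): the cursor that returns the page knows nothing about the total, so the count is a separate operation against the server.

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]  # hypothetical database and collection

query = {"status": "shipped"}
page = list(orders.find(query).skip(40).limit(20))  # page 3, 20 results per page
total = orders.count_documents(query)               # second round trip, just for the count
print(f"showing {len(page)} of {total} results")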

Poor Memory Management

MongoDB relies on the OS to do the memory management, letting the OS memory manager do its work. That is actually quite a smart decision, because I can guarantee that more work has gone into optimizing the OS memory manager than could ever have been invested by the MongoDB project. But that is just part of the work.

In RavenDB, we are a managed application, so we don't have direct control over memory. That doesn't mean we don't manage it. We have several layers of caching in place, exactly because we know more than the OS about our own usage scenarios. In many cases, even if you are making a totally new request, it will never hit the disk, because we keep track of hot data and make sure that it resides in memory. This applies to both indexes and documents, mind. And during the indexing process, we are very careful about memory management.

Sure, the OS memory manager is more optimized, but the database knows what is going on and can predict its own usage patterns. That is how RavenDB does a lot of magic relating to auto-configuration.
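
As an illustration of the idea (a toy sketch, not RavenDB's internals), an application-level cache can pin known-hot documents in memory instead of trusting the OS page cache to guess correctly:

from collections import OrderedDict

class HotDocumentCache:
    """Keeps the most recently used documents resident in memory."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self._docs = OrderedDict()  # insertion order doubles as LRU order

    def get(self, doc_id, load_from_disk):
        if doc_id in self._docs:
            self._docs.move_to_end(doc_id)  # hot path: mark as recently used
            return self._docs[doc_id]
        doc = load_from_disk(doc_id)        # cold path: actually hit the disk
        self._docs[doc_id] = doc
        if len(self._docs) > self.capacity:
            self._docs.popitem(last=False)  # evict the least recently used document
        return doc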

Uncompressed Field Names

In MongoDB, it is considered good practice to shorten field names for space optimization, but MongoDB doesn't do it for you automatically.

RavenDB doesn't compress field names, but at the same time, it isn't good practice to do so. In fact, I think this is a horrible little mess. There are a lot of arguments against compressing field names, not least of which is that it makes it pretty hard to figure out what you are actually trying to do. The raw data, something you look at fairly frequently when debugging and troubleshooting, becomes harder to work with and manage:

{
  "a2": "Nathan ",
  "d3": "",
  "a3": "2012-05-17T00:00:00.0000000",
  "h3": "2012-04-15T00:00:00.0000000",
  "r2": "archanid@sample.com",
  "o2": "8169cd4a-babf-4015-a3c7-4d503642e021",
  "o1": "products/nhprof"
}

Anyone want to figure out what this document is about? And at least in this one, the data itself tells you a lot about the actual content.

There are far better alternatives. In RavenDB, we do full response/request compression, and we allow document compression on disk as well. If we ever got to the point where this was a serious problem (and so far it isn't, even on large data sets), it would be less than a week of work to implement string interning inside RavenDB, so we would use the same string references for field names.
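
The interning idea itself fits in a few lines; here is an illustrative Python sketch (CPython semantics assumed, nothing RavenDB-specific):

import sys

# Two documents arriving over the wire produce two separate key strings...
key_from_doc1 = "".join(["customer", "Name"])  # built at runtime, so not interned
key_from_doc2 = "".join(["customer", "Name"])
assert key_from_doc1 is not key_from_doc2      # two copies in memory

# ...interning collapses them into one shared reference, paying the cost once.
assert sys.intern(key_from_doc1) is sys.intern(key_from_doc2)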

Global Write Lock

MongoDB (as of the current version at the time of writing: 2.0) has a process-wide write lock. … At this point, all other operations, including reads, are blocked because of the write lock.

Now, to be fair, we also have a write lock, but it isn't nearly as bad as it is in MongoDB. RavenDB's write lock is actually for... writes, and it doesn't interfere with either reads or indexes. It is on the list of things to remove, but the crazy part is: so far, and we have really demanding users, no one cares. The reason no one cares is that this is a really small lock, it only affects writes, and it is not a stop-the-world type of thing.
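
To illustrate the distinction (a toy sketch, not RavenDB's actual engine), here is a store whose write lock serializes writers only; readers never block, because writers publish immutable snapshots:

import threading

class SnapshotStore:
    def __init__(self):
        self._snapshot = {}  # treated as immutable once published
        self._write_lock = threading.Lock()

    def read(self, key):
        return self._snapshot.get(key)  # no lock: reads see a consistent snapshot

    def write(self, key, value):
        with self._write_lock:              # writers wait only on each other
            updated = dict(self._snapshot)  # copy-on-write
            updated[key] = value
            self._snapshot = updated        # atomic reference swap publishes the change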

Safe Off by Default

I am just going to let Kiip's words stand for themselves (emphasis mine):

This is a crazy default, although useful for benchmarks. As a general analogy: it's like a car manufacturer shipping a car with air bags off, then shrugging and saying "you could've turned it on" when something goes wrong.

RavenDB's entire philosophy is built around safe by default. That is the only thing that really makes sense, because otherwise... well... here is what happened at Kiip:

We lost a sizable amount of data at Kiip for some time before realizing what was happening and using safe saves where they made sense (user accounts, billing, etc.).
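
For the record, here is what opting in to acknowledged writes looks like in today's PyMongo (the collection names are made up): w=0 roughly corresponds to the old fire-and-forget default the quote is complaining about, while w=1, the current driver default, waits for the primary to acknowledge each write.

from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017")
db = client["shop"]

# Errors are silently dropped, much like the old unsafe default.
fire_and_forget = db.get_collection("events", write_concern=WriteConcern(w=0))
# The driver raises an exception if the primary rejects the write.
acknowledged = db.get_collection("billing", write_concern=WriteConcern(w=1))

fire_and_forget.insert_one({"kind": "click"})
acknowledged.insert_one({"account": "users/1", "amount": 42})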

Offline Table Compaction

Every now and then, you need to take down MongoDB and let it compact its on-disk data. This is another stop-the-world operation, and the only way to keep up when you do so is to have a hot standby ready.

RavenDB does all maintenance tasks while the server is up and serving requests. You don't need any downtime just because RavenDB needs to rearrange some data on disk; we take care of that live, with no interruption in service.
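
For reference, this is the kind of maintenance in question, issued here via PyMongo against a made-up collection; in the 2.x era the server blocked while it ran, hence the hot standby:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
result = client["shop"].command("compact", "orders")  # blocks the node in MongoDB 2.x
print(result)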

Secondaries Do Not Keep Hot Data in RAM

As Kiip explains it:

The primary doesn't relay queries to secondary servers, preventing secondaries from maintaining hot data in memory. This severely hinders the "hot-standby" feature of replica sets, since the moment the primary fails and switches to a secondary, all the hot data must be once again faulted into memory.

RavenDB doesn't do so either, but for a drastically different reason. As I mentioned earlier, the way RavenDB works is quite different: when you are running a hot standby node, it will get the new data from the server and index it. We keep the index open, so a lot of the data is already going to be in memory. For the rest, as I mentioned, we have several layers of caches that help prevent the need to page gigabytes of data into memory.
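
One common mitigation on the MongoDB side (my aside, not something the Kiip post prescribes) is to route some reads to secondaries explicitly, so their caches stay warm:

from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb://localhost:27017", replicaSet="rs0")
warm_reads = client["shop"].get_collection(
    "orders", read_preference=ReadPreference.SECONDARY_PREFERRED
)
doc = warm_reads.find_one({"status": "shipped"})  # served by a secondary when one is available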

Conclusion

As an utterly unbiased observer (smile), I can say that RavenDB rocks.

What we are actually seeing here is that RavenDB puts its emphasis on different things. I really care about making the common application-level scenarios easy and nice to work with. And I have spent enough time supporting production-level apps that I tried very hard to make sure RavenDB can take care of itself in most scenarios, without any hand-holding.

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.
