MyDrive runs "an AWS cloud-hosted data processing platform powered in part by a chain of Resque workers." This week, Karl Matthias, an engineer at MyDrive, posted an article about the company's switch from MongoDB to Cassandra, along with statistics showing significant performance improvements.
MongoDB behaved reasonably well, but the unpredictability of the load times for different sizes of work was troubling and made pipeline tuning difficult. Some queries would return 30 documents and others 300. With the way Mongo is designed (and most relational datastores also), this resulted in a varying IO load...
With the way our workload behaves, it seems that with MongoDB the number of writes was noticeably impacting read performance. With Cassandra, this does not seem to be the case for our scenario.
-- Karl Matthias
The comments on the blog post contain plenty of additional detail about the MyDrive systems.