Exploring PyMongo, and When to Use ODMs Instead of the Real Thing
On Object-Document Mapping
One of the PyMongo questions I get asked most frequently is whether people should use PyMongo directly or use one of the ODMs (Object-Document Mappers: think ORM for non-relational databases) that have been written on top of it. My recommendation has always been to “just use PyMongo”, and I’ll try to explain why in this post. While that makes this post a bit Python-specific, I think most of the thinking here applies equally well to Ruby, PHP, etc.
Avoiding New Software
MongoDB is new software. It’s used in a lot of high profile production deployments and I think it’s a great choice for almost all web application development, but it is still a relatively young project. From a practical point of view that youth means that you’re more likely to run into bugs, or just situations that aren’t heavily documented yet or where best practices have yet to be established. With that in mind, I think it makes sense to limit your exposure to new software. Again, I think the benefits of MongoDB make it a worthwhile choice. Adding an ODM to the mix is depending on even more new software, though, so you might need to be prepared to do some trailblazing (bug reports, patches, etc.).
Sticking with the Lingua Franca
The best part about MongoDB is the community. The mailing list is always active and incredibly helpful, in spite of not yet being a Fiesta list :). One of the problems we’ve seen in scaling out the community/support infrastructure is that there is a great deal of diversity in how MongoDB is being used. There are dozens of different language drivers, each of which has at least a couple of idiosyncrasies. When a question comes in to the mailing list, the first step in responding is always figuring out what the question targets: is it a MongoDB question or is it a PyMongo/Ruby-driver/etc. question? For questions in the latter category, is it something general enough that it can be answered by anyone with MongoDB experience, or is it territory that only the PHP expert can answer?
Questions that are general are much more likely to get a quick, thorough answer: there are many more people who are qualified to answer them. By sticking close to the “lingua franca” of MongoDB (in Python that’s PyMongo), you’ll have an easier time getting help when you need it.
Understanding What’s Under the Hood
Finally, I suggest using PyMongo directly because, even if you use an ODM, in all likelihood at some point you’ll need to know how to use PyMongo directly. As with ORMs, ODMs tend to handle the basic cases very nicely. Unfortunately, most applications end up needing to perform at least one or two operations that aren’t covered cleanly by the OxM: developers end up needing to drop out of the abstraction and write raw SQL (or “raw” PyMongo operations).
Since you should plan on needing to look “under the hood” at some point anyway, I think it makes sense to start by learning how things work in PyMongo. That way, if you decide to use an ODM later on you’ll have a good sense of what it’s doing. When things are behaving weirdly or when you need to do something “advanced” you’ll already have the experience necessary to go back under the hood.
It’s important to note that using PyMongo directly is a lot different than embedding SQL queries in your application code. One of the great things about MongoDB is how the drivers manage to feel “native” in so many different languages. In Python, you’re just working with dictionaries. This might not be as nice an abstraction as proper model classes, but it’s certainly a step up from raw SQL.
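To make the “just dictionaries” point concrete, here is a minimal sketch. The field names and the add_member helper are made up for illustration, and groups is assumed to be a PyMongo collection (something like pymongo.MongoClient().mydb.groups, where the database name is hypothetical):

```python
# A document is an ordinary dict; nothing here is PyMongo-specific yet.
group = {
    "name": "book-club",  # hypothetical field names
    "members": ["ann@example.com", "bo@example.com"],
    "settings": {"public": False},
}

# Queries and updates are dicts too, written with MongoDB's operator syntax.
# Matching a scalar against an array field matches any element of the array,
# and dotted keys reach into embedded documents.
query = {"members": "ann@example.com", "settings.public": False}
update = {"$addToSet": {"members": "cy@example.com"}}


def add_member(groups, query, update):
    """Apply the update to the first document matching the query.

    `groups` is assumed to be a pymongo Collection; update_one() is the
    PyMongo 3+ method for updating a single document.
    """
    return groups.update_one(query, update)
```

There is no query-builder object or string templating involved: the same dicts you would type into a Python shell are what gets handed to the driver.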
A Quick Disclaimer
I had the good fortune of working with the developers of MongoEngine, MongoKit and Ming (three of the more popular Python ODMs) during my tenure leading the PyMongo project. These people are smart and I think their software is good. So even after reading this post I recommend you check out the projects and decide for yourself whether you want to use one of them; they aren’t for me but they might be for you.
Our Setup
I’ve explained why we use PyMongo directly instead of using an ODM, but I haven’t explained how we use it. Here’s a quick look at the setup we’re using for Fiesta:
We’ve got a single module, db.py, where we place helper methods for any interaction with PyMongo. These methods are mostly one or two lines. Here’s an example:
    def new_group(group):
        db.groups.insert(group, safe=True)
        return group
The reason for putting everything in one module is mainly organizational - we can easily see all of the different types of queries and commands that we’re using.
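A note for readers on newer drivers: insert() and the safe=True flag come from older PyMongo releases. In PyMongo 3 and later the single-document call is insert_one(), and writes made through MongoClient are acknowledged by default. A sketch of what a db.py-style helper might look like there follows. Passing the collection in as a parameter is my assumption to keep the example self-contained (the helper above uses a module-level db instead), and group_by_name is a hypothetical second helper:

```python
def new_group(groups, group):
    """Insert a group document and return it.

    `groups` is assumed to be a pymongo Collection. insert_one() adds the
    generated _id to `group` in place when the dict does not have one.
    """
    groups.insert_one(group)
    return group


def group_by_name(groups, name):
    """Return the group document with this name, or None if there is none."""
    return groups.find_one({"name": name})
```

Keeping helpers this small is what makes the single-module approach work: the module reads as a catalogue of every query the application issues.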
There are some places where we call these methods directly from other pieces of the application, but for the most part we tend to wrap up the calls in home-grown Model classes (e.g. a Group model or a User model). That way we can put validation and other niceties directly into the model code. Our model infrastructure is home-grown, but I think something like DictShield might be the right combination of simplicity & ease of use here.
When using our own models or something like DictShield the end result isn’t that much different than using an ODM (although we do a bit more work to hook everything together). I think the advantage, however, is that we’re keeping validation and modeling completely separate from interacting with the database. That should alleviate most of the concerns I pointed out above.
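As a rough illustration of that separation, here is a minimal sketch of a model that knows how to validate itself and turn into a plain dict, but never touches the database. This is not our actual model code and not DictShield’s API; the Group fields and the validation rules are assumptions made up for the example:

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Group:
    """Hypothetical model: validation only, no database access."""

    name: str
    members: list = field(default_factory=list)

    def validate(self):
        """Raise ValueError if the group is not fit to be stored."""
        if not self.name.strip():
            raise ValueError("group name must not be empty")
        for address in self.members:
            # Deliberately crude check, just to show where rules would live.
            if "@" not in address:
                raise ValueError(f"not an email address: {address!r}")

    def to_document(self):
        """Validate, then return the plain dict that goes to the db layer."""
        self.validate()
        return asdict(self)
```

The dict returned by to_document() is what gets handed to a helper like new_group. The model never imports PyMongo, and the database module never needs to know that models exist.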
I’d love to hear from other people on this topic. Have you had success with ODMs or are you using an approach similar to ours? What works and what doesn’t?
Published at DZone with permission of Daniel Gottlieb, DZone MVB. See the original article here.