Getting Started with MongoDB and Python
If you've been following this blog for a while, you've seen me mention MongoDB more than once. One exciting thing for me is that I'll be co-teaching a tutorial at PyCon this year on Python and MongoDB that will cover MongoDB, PyMongo, and Ming. So to hopefully whet your appetite for learning more at the tutorial, I thought I'd write a few posts covering MongoDB, PyMongo, and Ming from a beginner's perspective.
What is MongoDB?
From MongoDB.org:
MongoDB (from "humongous") is a scalable, high-performance, open source NoSQL database.
Well, that's not all that enlightening, so I'll expand a bit here on MongoDB's features...
MongoDB is a document database
MongoDB is a document database, which means that instead of storing "rows" in
"tables" like you do in a relational database, you store "documents" in
"collections." Documents are basically JSON objects
(technically BSON). This distinguishes MongoDB from other
NoSQL-type databases such as key-value stores (e.g. Tokyo
Cabinet), column family stores (e.g. Cassandra), or
column stores (e.g. MonetDB).
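To make the "document" idea concrete, here is a minimal sketch in plain Python, with no MongoDB server involved. The field names and values are made up for illustration, and the list is only a stand-in for a real collection.

```python
import json

# A hypothetical blog-post "document": nested fields and lists live in one record.
post = {
    "title": "Getting Started with MongoDB and Python",
    "state": "published",
    "tags": ["mongodb", "python"],
    "author": {"name": "example-author", "karma": 42},
}

# Documents in the same collection do not have to share the same fields.
draft = {"title": "Untitled", "state": "draft"}

# Stand-in for a collection: just a list of documents.
collection = [post, draft]

# Round-trip through JSON to show that a document is plain structured data.
encoded = json.dumps(post, sort_keys=True)
decoded = json.loads(encoded)
```

In a relational schema the tags and the author would probably live in separate tables; here they travel with the post itself.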
MongoDB has a flexible query language
This is one thing that makes MongoDB a pleasure to work with, particularly if you come from another NoSQL database where querying is either restrictive (key-value stores, which can only be queried by key) or cumbersome (something like CouchDB, which requires you to write a map-reduce query). MongoDB has a BSON-based query language that's a bit more restrictive than SQL, but you can still use it to get a lot done.
Here's an example of a simple MongoDB query that we use at SourceForge to find all the blog posts for a project:
blog_post.find({'state':'published','app_config_id':{'$in':app_con
There are also several other operators like '$lt', '$nin', '$not', and '$or' that
allow you to construct quite complex queries, though you are somewhat restricted
compared to what you can do in SQL (even with a single table).
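To give a feel for how these operators read, here is a toy matcher in plain Python that evaluates a small subset of them ('$in', '$nin', '$lt', and '$or') against in-memory dictionaries. It is only an illustration of the semantics, not how MongoDB evaluates queries, and the sample documents are invented.

```python
def matches(doc, query):
    """Toy matcher for a small subset of MongoDB-style queries (illustration only)."""
    for field, cond in query.items():
        if field == "$or":
            # $or takes a list of sub-queries; at least one must match.
            if not any(matches(doc, sub) for sub in cond):
                return False
            continue
        value = doc.get(field)
        if isinstance(cond, dict):
            # This sketch treats every dict condition as {operator: argument}.
            for op, arg in cond.items():
                if op == "$in":
                    ok = value in arg
                elif op == "$nin":
                    ok = value not in arg
                elif op == "$lt":
                    ok = value is not None and value < arg
                else:
                    raise ValueError("unsupported operator: %s" % op)
                if not ok:
                    return False
        elif value != cond:
            # A plain value means an equality test.
            return False
    return True


# Hypothetical documents standing in for a blog_post collection.
posts = [
    {"state": "published", "app_config_id": 1, "views": 10},
    {"state": "draft", "app_config_id": 1, "views": 500},
    {"state": "published", "app_config_id": 3, "views": 5},
]

# Published posts belonging to app configs 1 or 2.
query = {"state": "published", "app_config_id": {"$in": [1, 2]}}
found = [p for p in posts if matches(p, query)]

# Drafts, or anything with fewer than 8 views.
query_or = {"$or": [{"state": "draft"}, {"views": {"$lt": 8}}]}
found_or = [p for p in posts if matches(p, query_or)]
```

With a real collection, the same query dictionaries would be passed straight to find(), and the server does the matching.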
MongoDB is fast and scalable
A single MongoDB node can comfortably serve thousands of requests per second on cheap hardware. When you need to scale beyond that, you can use either replication (keeping several copies of the data on different servers) or sharding (partitioning the data across servers). MongoDB even includes logic to automatically load-balance your shards as your database and load increase.
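The difference between the two strategies can be sketched with plain dictionaries standing in for servers. This is a toy model under invented assumptions (three copies, three shards, hand-picked split points on a string key); it says nothing about how MongoDB actually chooses or rebalances ranges.

```python
import bisect

# Replication: every server keeps a full copy of the data.
replicas = [dict() for _ in range(3)]


def replicated_write(key, value):
    for replica in replicas:
        replica[key] = value


# Sharding: each server owns one range of the key space.
# Hypothetical split points: shard0 holds keys < "h",
# shard1 holds "h" <= key < "p", shard2 holds keys >= "p".
SPLIT_POINTS = ["h", "p"]
SHARD_NAMES = ["shard0", "shard1", "shard2"]
shards = {name: {} for name in SHARD_NAMES}


def shard_for(key):
    """Pick the shard whose key range contains `key`."""
    return SHARD_NAMES[bisect.bisect_right(SPLIT_POINTS, key)]


def sharded_write(key, value):
    shards[shard_for(key)][key] = value


for name in ["apple", "mango", "zebra"]:
    replicated_write(name, {"name": name})
    sharded_write(name, {"name": name})
```

After the loop, each replica holds all three documents, while each shard holds exactly one. Replication buys redundancy and read capacity; sharding spreads both data and write load.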
Getting Started with MongoDB
While MongoDB is fairly straightforward to install on (64-bit) systems, there are also a couple of companies, MongoLab and MongoHQ, that provide a free tier of MongoDB hosting, which is great for getting started. I've been using MongoLab for my own things, for no particular reason, and I can recommend them. It's what I have experience with, so that's what I'll cover here.
Let's assume you sign up for a MongoLab account. Once you've done this, you can create a database using their web-based control panel. Click on the database and you'll note the connection info at the top of the page:
(Your server name and port number may be different.) At this point, most tutorials would tell you to install and launch the 'mongo' command-line tool to begin exploring your database. We'll skip that here and use the Python driver PyMongo directly. I like to use virtualenv and ipython myself, so that's the approach I'll take here:
$ virtualenv mongo
... install messages ...
$ source mongo/bin/activate
(mongo) $ pip install pymongo ipython
... install messages ...
(mongo) $ ipython
... banner message ...
Now that we're in ipython, we'll go ahead and connect to the database and create a document.
In [1]: import pymongo
In [2]: conn = pymongo.Connection('mongodb://tutorial-test:u3ZYh136@ds029187.mongolab.com:29187/tutorial-test')
In [3]: db = conn['tutorial-test']
In [4]: db.test_collection.insert({})
Out[4]: ObjectId('4f16f5c7eb03306a92000000')
In [5]: db.test_collection.find()
Out[5]: <pymongo.cursor.Cursor at 0x7fbb9006f350>
In [6]: list(db.test_collection.find())
Out[6]: [{u'_id': ObjectId('4f16f5c7eb03306a92000000')}]
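A note for readers on newer drivers: the session above uses pymongo.Connection and insert(), which were the PyMongo API when this was written. PyMongo 3.0 and later replace Connection with MongoClient, and insert() gave way to insert_one(). The sketch below shows roughly the same steps in that newer style. The helper names connect and insert_and_list are made up for this example, and running connect() assumes pymongo is installed and a server is reachable.

```python
def connect(uri, db_name):
    """Return a database handle (needs `pip install pymongo` and a reachable server)."""
    # Imported lazily so insert_and_list() below can be used without pymongo.
    from pymongo import MongoClient
    return MongoClient(uri)[db_name]


def insert_and_list(collection, document):
    """Insert one document, then return its generated _id and all documents."""
    result = collection.insert_one(document)
    # find() returns a lazy cursor; list() pulls the documents from it.
    return result.inserted_id, list(collection.find())
```

Usage would look something like `db = connect('mongodb://USER:PASSWORD@HOST:PORT/DBNAME', 'DBNAME')` followed by `insert_and_list(db.test_collection, {})`, with the placeholders replaced by your own connection info.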
Well, that's it for now. I'll be posting several follow-up articles in this series that will go into more detail on how to do various queries and updates using PyMongo, the MongoDB Python driver, as well as how to effectively use Ming, so stay tuned!