Handling Indexes in MongoDB With Python

In this multi-faceted post, we take a look at how to use MongoEngine to create indexes in MongoDB via Python, and how different setups impact overhead.

MongoEngine is an Object Document Mapper (ODM) for working with MongoDB from Python. The ODM layer maps an object model to a document database, much as an ORM maps an object model to a relational database. ODMs like MongoEngine offer relational-database-like features at the application level, such as schema enforcement, foreign keys, and field-level constraints.

Many good resources are available to learn the use of MongoEngine including a tutorial here.

In this post, we discuss the MongoEngine programming construct for creating indexes from Python and the performance overhead associated with it.

Automatic Index Creation in MongoEngine

By default, MongoEngine stores documents in a collection named after the class, converted to lowercase (snake_case). For example, the User class shown below is stored in a collection named user. A model must inherit from the MongoEngine class Document to become a mapped object.

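from mongoengine import Document, StringField

class User(Document):
    # Fields used by the examples in this post
    name = StringField()
    email = StringField()
    address = StringField()

    meta = {
        'indexes': [
            {'fields': ['+name']},   # ascending index on name
            {'fields': ['#email']},  # hashed index on email
        ]
    }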
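
The prefix on each entry in 'fields' selects the index type: '+' for ascending, '-' for descending, '#' for hashed, and '$' for text, with no prefix meaning ascending. As a rough illustration of that mapping, here is a minimal sketch in plain Python. The helper to_index_key is hypothetical and not part of MongoEngine; it only mimics how a prefixed field string corresponds to an index key.

```python
# Hypothetical helper (not part of MongoEngine): maps a prefixed field
# string from meta['indexes'] to a (field name, index type) pair.
PREFIX_TO_INDEX_TYPE = {
    "+": 1,         # ascending
    "-": -1,        # descending
    "#": "hashed",  # hashed index
    "$": "text",    # text index
}


def to_index_key(field_spec):
    """Translate e.g. '+name' into ('name', 1); no prefix means ascending."""
    prefix = field_spec[:1]
    if prefix in PREFIX_TO_INDEX_TYPE:
        return field_spec[1:], PREFIX_TO_INDEX_TYPE[prefix]
    return field_spec, 1


print(to_index_key("+name"))   # ('name', 1)
print(to_index_key("#email"))  # ('email', 'hashed')
```

These pairs correspond to index keys of the form { name: 1 } and { email: "hashed" } on the server.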


The User class defined above declares two indexes: an ascending index on name and a hashed index on email. MongoEngine creates each declared index on the collection via a createIndex/ensureIndex call at the first write operation, and it attempts to create these indexes again every time a document is inserted into the collection. For example,

User(name="Ross", email="ross@example.com", address="127, Baker Street").save()


This call results in three command requests to the database server: two to ensure that the name and email indexes exist on the user collection, and one to perform the actual insert.

COMMAND [conn8640] command admin.$cmd command: createIndexes { createIndexes: "user", indexes: [ { background: false, name: "name_1", key: { name: 1 } } ] } keyUpdates:0 writeConflicts:0 numYields:0 reslen:149 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { W: 1 } } } protocol:op_query 0ms
COMMAND [conn8640] command admin.$cmd command: createIndexes { createIndexes: "user", indexes: [ { background: false, name: "email_hashed", key: { email: "hashed" } } ] } keyUpdates:0 writeConflicts:0 numYields:0 reslen:149 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { W: 1 } } } protocol:op_query 0ms
COMMAND [conn8640] command admin.user command: insert { insert: "user", ordered: true, documents: [ { name: "Ross", email: "ross@example.com", address: "127, Baker Street", _id: ObjectId('584419df01f38269dd9d63c1') } ], writeConcern: { w: 1 } } ninserted:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:40 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { W: 1 } }, Collection: { acquireCount: { w: 1 } } } protocol:op_query 0ms


Sending these extra index commands with every write is acceptable for applications where the write load is low to moderate. However, if your application is write-intensive, the additional requests have a serious adverse impact on write performance.
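
To put rough numbers on that overhead, the following sketch counts the commands sent for a batch of saves. It is a simplified model rather than a measurement: it assumes that every save re-issues one createIndexes command per declared index, as in the log above, and the function name commands_sent is hypothetical.

```python
def commands_sent(num_saves, num_indexes, reissue_index_commands=True):
    """Estimate the commands sent to the server for a batch of save() calls.

    Simplified model (an assumption, not a measurement): when index
    commands are re-issued, each save sends one createIndexes command
    per declared index plus the insert itself.
    """
    index_commands = num_indexes if reissue_index_commands else 0
    return num_saves * (index_commands + 1)


# Two declared indexes (name and email), 10,000 saves
print(commands_sent(10_000, 2))                                # 30000
print(commands_sent(10_000, 2, reissue_index_commands=False))  # 10000
```

Under these assumptions, two declared indexes triple the number of requests that a write-heavy workload sends to the server.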

Avoiding Auto Index Creation

If 'auto_create_index' is set to False in the meta dictionary, MongoEngine skips the automatic creation of indexes, and no extra createIndexes requests are sent during write operations. Turning off automatic index creation is also useful in production systems, where indexes are typically applied during database deployment. For example,

meta = {
    'auto_create_index': False,
    'indexes': [
        .....
    ]
}


If you are designing a write-intensive application, it makes sense to decide on your indexes during the schema design phase and create them before the application is deployed. If you plan to add indexes to existing collections, it is better to follow the MongoDB documentation on building indexes on replica sets. With this approach, we take the servers down one at a time and build the indexes on each of them.

Use MongoEngine's create_index method to create indexes from within the application:

User.create_index(keys, background=False, **kwargs)
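
When automatic index creation is turned off, the declared indexes still have to be built somewhere, for example in a deployment script. The sketch below relies on MongoEngine's ensure_indexes() class method, which creates the indexes listed in a document's meta dictionary. It is an alternative to calling create_index for each index by hand, not the approach the original post prescribes. The helper name ensure_all_indexes and the database name in the comments are hypothetical.

```python
def ensure_all_indexes(document_classes):
    """Build the declared indexes once for each MongoEngine Document class.

    Intended to run at deployment time, so that regular writes do not
    need to send any index commands.
    """
    built = []
    for document_class in document_classes:
        # ensure_indexes() creates the indexes declared in meta['indexes']
        document_class.ensure_indexes()
        built.append(document_class.__name__)
    return built


# Typical use in a deployment script (assumes mongoengine is installed,
# a MongoDB server is reachable, and "mydb" is a placeholder name):
#
#   from mongoengine import connect
#   connect("mydb")
#   ensure_all_indexes([User])
```

Keeping the helper free of connection logic means the same function can be called from a migration step or a management command.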


You can also use the ScaleGrid UI to help you build indexes in a ‘Rolling fashion’ with no downtime. For more details refer to our MongoDB index building blog post.

Published at DZone with permission of Vishal Kumawat, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
