
Testing Network Errors With MongoDB


Someone asked on Twitter today for a way to trigger a connection failure between MongoDB and the client. This would be terribly useful when you're testing your application's handling of network hiccups.

You have options: you could use mongobridge to proxy between the client and the server, and at just the right moment, kill mongobridge.

Or you could use packet-filtering tools to accomplish the same: iptables on Linux and ipfw or pfctl on Mac and BSD. You could use one of these tools to block MongoDB's port at the proper moment, and unblock it afterward.

There's yet another option, not widely known, that you might find simpler: use a MongoDB "failpoint" to break your connection.

Failpoints are our internal mechanism for triggering faults in MongoDB so we can test their consequences. Read about them on Kristina's blog. They're not meant for public consumption, so you didn't hear about them from me.

The first step is to start MongoDB with the special command-line argument:

mongod --setParameter enableTestCommands=1

Next, log in with the mongo shell and tell the server to abort the next two network operations:

> db.adminCommand({
...   configureFailPoint: 'throwSockExcep',
...   mode: {times: 2}
... })
2014-03-20T20:31:42.162-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed

The server obeys you instantly, before it even replies, so the command itself appears to fail. But fear not: you've simply seen the first of the two network errors you asked for. You can trigger the next error with any operation:

> db.collection.count()
2014-03-20T20:31:48.485-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed

The third operation succeeds:

> db.collection.count()
2014-03-20T21:07:38.742-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-03-20T21:07:38.742-0400 reconnect 127.0.0.1:27017 (127.0.0.1) ok
1

There's a final "failed" message that I don't understand, but the shell reconnects and the command returns the answer, "1".
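If you arm the failpoint from a test script rather than by hand, it may help to wrap the command document in a couple of small helpers. This is only a sketch: the function names (failPointCommand, breakNext, disarm) are hypothetical, and the 'off' mode is my assumption about how to disable a failpoint that still has failures left; it isn't shown in the session above.

```javascript
// Build a configureFailPoint command document like the one in the session.
// `mode` is either {times: n}, as used above, or a string such as 'off'.
function failPointCommand(name, mode) {
  return { configureFailPoint: name, mode: mode };
}

// Ask the server to break the next `n` network operations (n = 2 above).
function breakNext(n) {
  if (typeof n !== 'number' || n < 1 || Math.floor(n) !== n) {
    throw new Error('n must be a positive integer');
  }
  return failPointCommand('throwSockExcep', { times: n });
}

// Assumption: mode 'off' disarms a failpoint that is still active.
function disarm() {
  return failPointCommand('throwSockExcep', 'off');
}
```

In the mongo shell you would pass the result to db.adminCommand, for example db.adminCommand(breakNext(2)). Remember that the command that arms the failpoint itself counts as the first broken operation.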

You could use this failpoint when testing a driver or an application. If you don't know exactly how many operations you need to break, you could set times to 50 and, at the end of your test, continue attempting to reconnect until you succeed.
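The "keep trying until you succeed" step could look roughly like the sketch below. It is written for Node.js and runs without a server: flakyOperation is a hypothetical stand-in that imitates a failpoint set to {times: n}, and retryUntilSuccess is an illustrative helper, not part of any MongoDB driver. In a real test you would pass in a function that performs an actual driver operation, and you would have to check how your driver reports the error.

```javascript
// Call `operation` until it stops throwing, at most `maxAttempts` times.
// Returns { value, failures }, where failures counts the errors swallowed.
function retryUntilSuccess(operation, maxAttempts) {
  var failures = 0;
  var lastError = null;
  while (failures < maxAttempts) {
    try {
      return { value: operation(), failures: failures };
    } catch (err) {
      lastError = err;
      failures += 1;
    }
  }
  throw new Error('gave up after ' + failures + ' attempts: ' + lastError);
}

// Stand-in for a server with a failpoint set to {times: n}: the returned
// function throws n times, then returns `result` on every later call.
function flakyOperation(n, result) {
  var remaining = n;
  return function () {
    if (remaining > 0) {
      remaining -= 1;
      throw new Error('simulated socket exception');
    }
    return result;
  };
}

// Two simulated failures, then the count of 1 from the session above.
var outcome = retryUntilSuccess(flakyOperation(2, 1), 50);
console.log(outcome.value, outcome.failures); // prints "1 2"
```

Capping the number of attempts keeps a test from hanging forever if the server never comes back, which is a different bug from the one you set out to test.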

Ugly, perhaps, but if you want a simple way to cause a network error, this could be a reasonable approach.


Published at DZone with permission of A. Jesse Jiryu Davis, DZone MVB. See the original article here.

