Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

The Best Multi-Threading Debugging Tool in Microsoft Excel

DZone's Guide to

The Best Multi-Threading Debugging Tool in Microsoft Excel

· Database Zone
Free Resource

Check out the IT Market Clock report for recommendations on how to consolidate and replace legacy databases. Brought to you in partnership with MariaDB.

In any system that gets to a certain size, especially one that is multi threaded, there are a certain class of bugs that are really a bitch to figure out.

They are usually boil down to some sort of a race condition. I have been doing that recently, trying to fix a bunch of race conditions in RavenDB. The problem with race conditions is that they are incredibly hard to reproduce, and even when you can reproduce them, you can’t really debug them.

I came up with the following test harness to try to narrow things down. Basically, create a test case that sometimes fails, and run it a thousand times.

Odds are that you’ll get the failure, but that isn’t the important bit. The important bit is that you setup logging properly. In my case, I set things up so the logging output to CSV and each test run had a different file.

This gives me output that looks like this:

23:54:34.5952,Raven.Database.Server.HttpServer,Debug,Request #  10: GET     -    12 ms - <default>  - 200 - /indexes/dynamic/Companies?query=&start=0&pageSize=1&aggregation=None,,11
23:54:34.5952,Raven.Client.Document.SessionOperations.QueryOperation,Debug,"Stale query results on non stale query '' on index 'dynamic/Companies' in 'http://reduction:8079/', query will be retried, index etag is: bcf0fed1-d975-43b4-bfb7-65221ef06b99",,1
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,"No work was found, workerWorkCounter: 10, for: TasksExecuter, will wait for additional work",,6
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,Incremented work counter to 11 because: WORK BY IndexingExecuter,,12
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,"No work was found, workerWorkCounter: 11, for: TasksExecuter, will wait for additional work",,6
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,"No work was found, workerWorkCounter: 11, for: ReducingExecuter, will wait for additional work",,21
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,"No work was found, workerWorkCounter: 11, for: IndexingExecuter, will wait for additional work",,12
23:54:34.6952,Raven.Client.Document.SessionOperations.QueryOperation,Debug,Executing query '' on index 'dynamic/Companies' in 'http://reduction:8079/',,1
23:54:34.8172,Raven.Database.Indexing.Index.Querying,Debug,Issuing query on index Temp/Companies for all documents,,11
23:54:34.8172,Raven.Storage.Esent.StorageActions.DocumentStorageActions,Debug,Document with key 'companies/1' was found,,11

Not really helpful in figure out what is going on, right? Except for one tiny thing, we load them to Excel, and we use conditional formatting to get things to look like this:

image

The reason this is so helpful? You can actually see the threads interleaving. This usually help me get to roughly the right place, and then I can add additional logging so I can figure out better what is actually going on.

I was able to detect and fix several race conditions using this approach.

And yes, I know that this is basically printf() debugging. But at least it is printf() debugging with pretty colors.

Interested in reducing database costs by moving from Oracle Enterprise to open source subscription?  Read the total cost of ownership (TCO) analysis. Brought to you in partnership with MariaDB.

Topics:

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}