
Performance Analysis With pyArango: Part II, Inspecting Transactions


Following up on Part 1 of this series, this installment will look into inspecting transactions in the initialization and test code.


Following the previous blog post on performance analysis with pyArango, where we had a look at graphing simple queries with statsd, we will now dig deeper into inspecting transactions. First, we split the code into initialization code and test code.

Initialization Code

We load the collection with simple documents, each carrying two counter attributes, and create a hash index on only one of them:

#!/usr/bin/python
from pyArango.connection import *
from pyArango.collection import *

conn = Connection(username="root", password="")

db = conn["_system"]

# Create the 'test' collection, or empty it if it already exists:
if not db.hasCollection('test'):
    testCol = db.createCollection('Collection', name='test')
else:
    testCol = db.collections['test']
    testCol.truncate()

# Insert 100,000 simple documents:
i = 0
while i < 100000:
    i += 1
    testCol.createDocument({'foo': 'bar', 'count': i, 'counter': i}).save()

# Index only 'count'; 'counter' deliberately stays unindexed:
testCol.ensureHashIndex(['count'], sparse=False)

Application Code

Transaction Code

We create a simple ArangoDB transaction containing three queries:

    function(params) {
      var db = require('@arangodb').db;
      var startOne = Date.now();
      var q1 = db._query(`FOR doc IN test FILTER doc.count == @i - 1 RETURN doc`, {i:params.i})
      var startTwo = Date.now()
      var q2 = db._query(`FOR doc IN test FILTER doc.counter == @i - 1 RETURN doc`, {i:params.i})
      var startThree = Date.now()
      var q3 = db._query(`RETURN 1`)
      var end = Date.now();
      return {
        tq1: startTwo - startOne,
        tq2: startThree - startTwo,
        tq3: end - startThree,
        all: end - startOne
      }
    }

We pass the counter in as a parameter, the same way we used it in the last article (Performance Analysis With pyArango: Part I). We take the current time between the three queries and return the deltas to the invoking program.
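The bookkeeping pattern used above, taking a timestamp before each step and returning the pairwise deltas, can be sketched in plain Python as well; `timed_steps` is a hypothetical helper for illustration, not part of the pyArango driver:

```python
import time

def timed_steps(*steps):
    """Run each callable in turn and return per-step wall-clock deltas
    in milliseconds, mirroring the Date.now() bookkeeping in the
    transaction above."""
    marks = [time.monotonic()]
    for step in steps:
        step()
        marks.append(time.monotonic())
    # tq1, tq2, ... are the per-step deltas; 'all' is the total:
    deltas = {"tq%d" % n: (b - a) * 1000.0
              for n, (a, b) in enumerate(zip(marks, marks[1:]), start=1)}
    deltas["all"] = (marks[-1] - marks[0]) * 1000.0
    return deltas
```

The returned dictionary has the same shape as the transaction's result, so it could be fed to `statsdc.timing()` in the same loop.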

Joining the Transaction Code Into the Python Script

Here we have the complete application code including a transaction that we want to monitor:

#!/usr/bin/python
import sys
import statsd
from pyArango.connection import *
from pyArango.collection import *
 
statsdc = statsd.StatsClient('127.0.0.1', 8125)
conn = Connection(username="root", password="", statsdClient = statsdc)
db = conn["_system"]
testCol = db.collections['test']
 
transaction = '''
    function(params) {
      var db = require('@arangodb').db;
      var startOne = Date.now();
      var q1 = db._query(`FOR doc IN test FILTER doc.count == @i - 1 RETURN doc`, {i:params.i})
      var startTwo = Date.now()
      var q2 = db._query(`FOR doc IN test FILTER doc.counter == @i - 1 RETURN doc`, {i:params.i})
      var startThree = Date.now()
      var q3 = db._query(`RETURN 1`)
      var end = Date.now();
      return {
        tq1: startTwo - startOne,
        tq2: startThree - startTwo,
        tq3: end - startThree,
        all: end - startOne
      }
    }
'''
 
i = 0
while i < 100000:
    i += 1
    aql = '''
    FOR doc IN test FILTER doc.count == @i - 1 RETURN doc
    '''
    db.AQLQuery(aql, rawResults=True, batchSize=1, count=True, bindVars={'i': i})
    # Run the transaction and pick the 'result' part of the reply:
    times = db.transaction(['test'], transaction, params={'i': i})['result']
    for which in times:
        statsdc.timing(which, times[which])

The Python script picks the result part of the reply, which contains the return values of the transaction. It then iterates over the values and sends them to statsd, so we can later have a look at them using kcollectd.
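Under the hood, each `statsdc.timing(name, value)` call sends one small UDP datagram in the statsd wire format. As a minimal sketch, here the packet is formatted by hand (the client library does this for you; `statsd_timing_packet` is purely illustrative):

```python
def statsd_timing_packet(metric, ms):
    # statsd wire format for timers: "<name>:<value>|ms"
    return ("%s:%d|ms" % (metric, ms)).encode()

# e.g. the tq2 delta of one transaction run:
print(statsd_timing_packet("tq2", 60))  # b'tq2:60|ms'
```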

We chose three graphs; in one we put the Q1 and Q3 times. If you look at Q1, you will remember it from our last article: it is covered by an index. Q3 is really simple, so we wouldn't expect much to be going on there. The graph confirms this: Q1 takes up to 2 ms, Q3 up to 1.8 ms.

With Q2, however, we added a query that is neither dead simple nor backed by an index. You can see that it takes up to 60 ms and dominates the total time of the transaction.

Once again we got useful data by instrumenting parts of our application with timers that we feed into statsd.

What's Next

In Part III of this series, we're going to implement a user scenario to measure the achievable capacity of a given system. Using that, we are going to see what to look at on clusters. Your questions, comments, and feedback are more than welcome on our community Slack or Twitter @dothebart. 'Til the next time!



Published at DZone with permission of
