Neo4j and Gatling Sitting in a Tree, Performance T-E-S-T-ING
I was introduced to the open-source performance testing tool Gatling a few months ago by Dustin Barnes and fell in love with it. It has an easy-to-use DSL, and even though I don’t know a lick of Scala, I was able to figure out how to use it. It creates pretty awesome graphics and takes care of a lot of work for you behind the scenes. They have great documentation and a pretty active Google group where newbies and questions are welcome.
It ships with Scala, so all you need to do is create your tests and
use the command line to execute them. I’ll show you how to do a few basic
things, like test that you have everything working, then we’ll create
nodes and relationships, and then query those nodes.
We start things off with the import statements:
import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._
Then we start right off with our simulation. For this first test, we are just going to get the root node via the REST API. We specify our Neo4j server; in this case I am testing on localhost (you’ll want to run your test code and Neo4j server on different machines when doing this for real). Next we specify that we accept JSON in return. For our test scenario, for a duration of 10 seconds, we’ll get "/db/data/node/0" and check that Neo4j returns the HTTP status code 200 (meaning everything is OK). We’ll pause between 0 and 5 milliseconds between calls to simulate actual users, and in our setup we’ll specify that we want 100 users.
class GetRoot extends Simulation {
  // Neo4j server under test; we ask for JSON responses
  val httpConf = httpConfig
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")

  // For 10 seconds, each user fetches the root node and checks for HTTP 200
  val scn = scenario("get root")
    .during(10) {
      exec(
        http("get root node")
          .get("/db/data/node/0")
          .check(status.is(200)))
      .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).protocolConfig(httpConf)
  )
}
We’ll call this file "GetRoot.scala" and put it in the user-files/simulations/neo4j directory:
gatling-charts-highcharts-1.4.0/user-files/simulations/neo4j/
We can run our code with:
~$ bin/gatling.sh
We’ll get a prompt asking us which test we want to run:
GATLING_HOME is set to /users/maxdemarzi/projects/gatling-charts-highcharts-1.4.0
Choose a simulation number:
     [0] GetRoot
     [1] advanced.AdvancedExampleSimulation
     [2] basic.BasicExampleSimulation
Choose the number next to GetRoot and press Enter.
Next you’ll get prompted for an id, or you can just go with the default by pressing Enter again:
Select simulation id (default is 'getroot'). Accepted characters are a-z, A-Z, 0-9, - and _
If you want to add a description, you can:
Select run description (optional)
Finally it starts for real:
================================================================================
2013-02-14 17:18:03                                                  10s elapsed
---- get root ------------------------------------------------------------------
Users  : [#################################################################]100%
          waiting:0     / running:0     / done:100
---- Requests ------------------------------------------------------------------
> get root node                                              OK=58457  KO=0
================================================================================

Simulation finished.
Simulation successful.
Generating reports...
Reports generated in 0s.
Please open the following file : /users/maxdemarzi/projects/gatling-charts-highcharts-1.4.0/results/getroot-20130214171753/index.html
The progress bar is a measure of the total number of users who have completed their task, not of how much of the simulation is done, so don’t worry if it stays at zero for a long while and then jumps quickly to 100%. You can also see the OK (passed) and KO (failed) request counts. Lastly, it creates a great HTML-based report for us. Let’s take a look:
Here you can see statistics about the response times as well as the requests per second. So that’s great, we can get the root node, but that’s not very interesting. Let’s create some nodes:
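import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._

class CreateNodes extends Simulation {
  val httpConf = httpConfig
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")

  // Cypher query sent as the JSON body of every request
  val createNode = """{"query": "create me"}"""

  // Each user posts the create query 1000 times
  val scn = scenario("create nodes")
    .repeat(1000) {
      exec(
        http("create node")
          .post("/db/data/cypher")
          .body(createNode)
          .asJSON
          .check(status.is(200)))
      .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}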
In this case, we are setting 100 users to create 1000 nodes each, with a ramp time of 10 seconds. We’ll run this simulation just like before, but choose CreateNodes. Once it’s done, take a look at the report and scroll down a bit to see the chart of the number of requests per second:
You can see the number of users ramp up over the first 10 seconds and fade at the end. Let’s go ahead and connect some of these nodes together.
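Before doing that, it is worth a quick check of how many nodes the CreateNodes run should have produced, because that number bounds the node ids worth picking at random. The sketch below is plain Scala with no Gatling dependency; the object and value names are made up for illustration, and the even spread of user starts across the ramp is an assumption about how the ramp behaves, not something measured here.

```scala
// Back-of-the-envelope numbers for the CreateNodes simulation.
// All names here are hypothetical; only the three inputs come from the simulation.
object CreateNodesLoad {
  val users = 100            // scn.users(100)
  val repeatsPerUser = 1000  // .repeat(1000)
  val rampSeconds = 10       // .ramp(10)

  // Each user posts one create query per repetition.
  val totalRequests: Int = users * repeatsPerUser

  // Assumption: user starts are spread evenly over the ramp period.
  val usersStartedPerSecond: Double = users.toDouble / rampSeconds

  def main(args: Array[String]): Unit = {
    println("total create requests: " + totalRequests)
    println("users started per second during ramp: " + usersStartedPerSecond)
  }
}
```

On a fresh database where every request succeeded, that works out to roughly 100,000 nodes available to link.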
We’ll add JSONObject to the import statements, and since I want to see which nodes we link together, we’ll print the details of each request. I am randomly choosing two ids and passing them to a Cypher query to create the relationships:
import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._
import util.parsing.json.JSONObject

class CreateRelationships extends Simulation {
  val httpConf = httpConfig
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")
    // Print the body of every request so we can see which ids were chosen
    .requestInfoExtractor(request => {
      println(request.getStringData)
      Nil
    })

  val rnd = new scala.util.Random

  // Store two random node ids in the session as a JSON object
  val chooseRandomNodes = exec((session) => {
    session.setAttribute("params", JSONObject(Map("id1" -> rnd.nextInt(100000),
                                                  "id2" -> rnd.nextInt(100000))).toString())
  })

  val createRelationship = """start node1=node({id1}), node2=node({id2}) create unique node1-[:knows]->node2"""
  // "${params}" is left as a placeholder that is filled from the session for each request
  val cypherQuery = """{"query": "%s", "params": %s }""".format(createRelationship, "${params}")

  val scn = scenario("create relationships")
    .during(30) {
      exec(chooseRandomNodes)
      .exec(
        http("create relationships")
          .post("/db/data/cypher")
          .header("X-Stream", "true")
          .body(cypherQuery)
          .asJSON
          .check(status.is(200)))
      .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}
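To see how the request body comes together without running Gatling at all, here is a minimal sketch in plain Scala. The query string and template are copied from the simulation; `paramsJson` and `render` are hypothetical helpers that stand in for `JSONObject(...).toString()` and for the session placeholder substitution, so treat them as an approximation of what happens at request time rather than Gatling’s actual code path.

```scala
import scala.util.Random

object CypherBodySketch {
  val createRelationship =
    "start node1=node({id1}), node2=node({id2}) create unique node1-[:knows]->node2"

  // Same template as in the simulation; "${params}" stays a literal placeholder here.
  val template: String =
    """{"query": "%s", "params": %s }""".format(createRelationship, "${params}")

  // Hypothetical helper: builds the params object by hand instead of using JSONObject.
  def paramsJson(id1: Int, id2: Int): String =
    """{"id1" : %d, "id2" : %d}""".format(id1, id2)

  // Hypothetical helper: plain string replacement standing in for the session lookup.
  def render(params: String): String =
    template.replace("${params}", params)

  def main(args: Array[String]): Unit = {
    val rnd = new Random
    // nextInt(100000) yields ids from 0 to 99999
    println(render(paramsJson(rnd.nextInt(100000), rnd.nextInt(100000))))
  }
}
```

Each call to `render` yields one complete JSON body with a query and a params object, which is what gets posted to /db/data/cypher.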
When you run the simulation, you’ll see a stream of the parameters we sent in our POST requests:
{"query": "start node1=node({id1}), node2=node({id2}) create unique node1-[:knows]->node2", "params": {"id1" : 98468, "id2" : 20147} }
{"query": "start node1=node({id1}), node2=node({id2}) create unique node1-[:knows]->node2", "params": {"id1" : 83557, "id2" : 26633} }
{"query": "start node1=node({id1}), node2=node({id2}) create unique node1-[:knows]->node2", "params": {"id1" : 22386, "id2" : 99139} }
You can turn this off, but I just wanted to make sure the ids were random, and it helps when debugging. Now we can query the graph. For this next simulation, I want to see the answers returned from Neo4j, and I want to see the nodes related to 10 random nodes passed in as a JSON array. Notice it’s a bit different from before, and we are also checking to see if we got "data" back in our response.
import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._
import util.parsing.json.JSONArray

class QueryGraph extends Simulation {
  val httpConf = httpConfig
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")
    // Print every response body so we can see what Neo4j returned
    .responseInfoExtractor(response => {
      println(response.getResponseBody)
      Nil
    })
    .disableResponseChunksDiscarding

  val rnd = new scala.util.Random
  val nodeRange = 1 to 100000

  // Store 10 random node ids in the session as a JSON array
  val chooseRandomNodes = exec((session) => {
    session.setAttribute("node_ids", JSONArray(List.fill(10)(nodeRange(rnd.nextInt(nodeRange.length)))).toString())
  })

  val getNodes = """start nodes=node({ids}) match nodes -[:knows]-> other_nodes return id(other_nodes)"""
  val cypherQuery = """{"query": "%s", "params": {"ids": %s}}""".format(getNodes, "${node_ids}")

  val scn = scenario("query graph")
    .during(30) {
      exec(chooseRandomNodes)
      .exec(
        http("query graph")
          .post("/db/data/cypher")
          .header("X-Stream", "true")
          .body(cypherQuery)
          .asJSON
          .check(status.is(200))
          .check(jsonPath("data")))
      .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}
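The id-picking step can be tried on its own in plain Scala. This is only a sketch: `pick` and `toJsonArray` are hypothetical helper names, and `mkString` is used in place of `JSONArray` so the snippet runs without the JSON library, which means the exact text may differ slightly from what the simulation sends.

```scala
import scala.util.Random

object RandomNodeIds {
  // Same range as in the simulation
  val nodeRange = 1 to 100000

  // Hypothetical helper: pick `count` ids uniformly at random, with replacement.
  def pick(rnd: Random, count: Int): List[Int] =
    List.fill(count)(nodeRange(rnd.nextInt(nodeRange.length)))

  // Hypothetical helper: render the ids as a JSON array of numbers.
  def toJsonArray(ids: List[Int]): String =
    ids.mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    val ids = pick(new Random, 10)
    println(toJsonArray(ids))
  }
}
```

Because the ids are drawn with replacement, the same node can show up more than once in a single request.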
If we take a look at the details tab of the report for the query graph simulation, we see a small spike in the middle:
This is a tell-tale sign of a JVM garbage collection taking place, and we may want to look into that. Edit your neo4j/conf/neo4j-wrapper.conf file and uncomment the garbage collection logging line, and add date stamps to gain better visibility into the issue:
# Uncomment the following line to enable garbage collection logging
wrapper.java.additional.4=-Xloggc:data/log/neo4j-gc.log
wrapper.java.additional.5=-XX:+PrintGCDateStamps
Neo4j performance tuning deserves its own blog post, but at least now you have a great way of testing your performance as you tweak JVM, cache, hardware, load balancing, and other parameters. Don’t forget that while testing Neo4j directly is pretty cool, you can use Gatling to test your whole web application too and measure end-to-end performance.
Published at DZone with permission of Max De Marzi, DZone MVB. See the original article here.