Neo4j Cluster Performance Tuning

We examine some tips for tuning your Neo4j cluster that will help ensure your installation is running as optimally as possible.

by Moshe Kaplan · Jun. 29, 16 · Database Zone · Tutorial

Neo4j is one of the leading graph databases these days, and it is very popular in recommendation systems, fraud detection, and social network scenarios.

While a single instance (as included in the Community Edition) performs very well (usually with a response time under 10ms), you may face performance challenges in cluster mode.

Why Should You Expect Performance Degradation in a Neo4j Cluster?

Two simple reasons:

  • A Neo4j cluster is a master-slave cluster with automatic failover (much like MongoDB). However, unlike MongoDB, the client's driver does not detect the primary node; a server-side load balancer does.
  • Cluster replication is synchronous by default, unlike MongoDB's asynchronous default behavior.

How Much Will It Cost Us?

  • The nodes of the cluster should sit behind a load balancer (LB). If you choose AWS ELB, it adds 7 to 30ms of latency, according to our measurements below. ELB latency grows as requests and responses become larger (see the details at the bottom). Note: Implementing a MongoDB-like driver could be a great improvement and would help save latency and minimize system cost. Plus, it's a great idea for a side project!
  • The nodes behind the ELB replicate changes from the master to the slaves. The level of synchronization is controlled by the ha.tx_push_factor parameter, which has a default value of 1. This parameter controls the number of slaves that should receive the commit before the client gets an answer. By setting it to 0, you avoid synchronous replication and get results similar to a single node. In our tests, changing the factor saved 70ms on average (and much more at peak times), leaving an average of 40ms per query (including the ELB cost).
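
The client-side alternative to the load balancer mentioned in the note above could start from something like the following minimal sketch. It assumes each instance exposes the HA status endpoint /db/manage/server/ha/master, which answers 200 on the master and 404 elsewhere; check that path against your Neo4j version, and note that it may require credentials when authentication is enabled. The function names and node URLs are hypothetical.

```python
from typing import Callable, Iterable, Optional
import urllib.error
import urllib.request

# Assumed HA status path; verify it for the Neo4j version in use.
HA_MASTER_PATH = "/db/manage/server/ha/master"


def http_is_master(base_url: str, timeout: float = 1.0) -> bool:
    """Probe one instance; True only if it answers 200 on the master endpoint."""
    try:
        with urllib.request.urlopen(base_url + HA_MASTER_PATH, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # A 404 (not the master), a refused connection, or a timeout
        # all mean this node cannot be used as the master right now.
        return False


def find_master(
    nodes: Iterable[str],
    is_master: Callable[[str], bool] = http_is_master,
) -> Optional[str]:
    """Return the first node that reports itself as master, or None."""
    for node in nodes:
        if is_master(node):
            return node
    return None
```

A client would call find_master with its list of instance URLs (for example "http://neo4j-1:7474", a made-up host name), send writes to the result, and repeat the lookup after a failover. The probe is passed in as a parameter so the lookup logic can be exercised without a running cluster.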

You can find the differences below, where, in the tested environment, a Community Edition instance was replaced by a three-node cluster behind an ELB:

  • In the left section, you can see an average of 9ms in the initial state (single community edition instance).
  • In the middle, you can see a fluctuating response time of 40ms (reads) to 300ms (writes) for a three-node cluster behind an ELB with the ha.tx_push_factor parameter at its default value of 1.
  • In the right section, you can see a steady 40ms for both reads and writes for a three-node cluster behind an ELB with the ha.tx_push_factor parameter set to 0 (async replication).

Figure: Neo4j performance as measured by Datadog client-side metrics
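
Switching a node from the middle configuration to the right-hand one comes down to a single line in its configuration file. The helper below is a small sketch of how that edit could be scripted. The function name is hypothetical, and the file to edit depends on the Neo4j version, so the sample text here is only an illustration. Keep the trade-off in mind: with a push factor of 0, a commit acknowledged to the client may not yet exist on any slave.

```python
def set_property(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with key=value set, replacing an active entry or appending one."""
    new_line = f"{key}={value}"
    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        name = stripped.split("=", 1)[0].strip()
        # Only active (uncommented) assignments of exactly this key are rewritten.
        if not stripped.startswith("#") and "=" in stripped and name == key:
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # drop any duplicate assignments of the same key
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Illustrative input; a real configuration file contains many more settings.
sample = "ha.server_id=1\nha.tx_push_factor=1\n"
updated = set_property(sample, "ha.tx_push_factor", "0")
```

The instance has to be restarted for the new value to take effect.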


Bottom Line

High availability comes at a cost. A better load-balancing implementation and the right choice of synchronization model can help you get the performance you need.

Additional Ways to Gather Data on Neo4j Performance:

  • Enable the slow query log and filter by server time.
  • Install monitoring such as Datadog at the application level.
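
Application-level timing of the kind used for the measurements above can be sketched as a small context manager. This is a generic illustration rather than the setup used in the tests: the metric name is made up, and the report callback stands in for whatever your monitoring client provides, for instance a StatsD-style timing call.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def timed(metric: str, report: Callable[[str, float], None]) -> Iterator[None]:
    """Measure the wall-clock time of the enclosed block and report it in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Report even when the query raises, so failures show up in the latency data.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        report(metric, elapsed_ms)


# Example: collect samples in a list instead of sending them to a monitoring agent.
samples = []
with timed("neo4j.query.time", lambda name, ms: samples.append((name, ms))):
    time.sleep(0.01)  # stand-in for the real query call
```

Because the timer wraps the whole client call, the reported value includes the load balancer and network time, which is what separates it from the server time seen in the slow query log.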

Published at DZone with permission of Moshe Kaplan, DZone MVB. See the original article here.