When Spark Jobs On Cassandra Start Hanging

How to keep Spark jobs from hanging when Apache Cassandra nodes go down.

By Ryan Svihla · Apr. 22, 2016 · Tutorial

This was hyper obvious once I saw the executor logs and the database schema, but it had me befuddled at first, and the change in behavior with one node down should have made it obvious sooner.

The code was simple: read a bunch of information, do some minor transformations, and flush it to Cassandra. Nothing crazy. But during the user's fault-tolerance testing, the job would seemingly hang indefinitely whenever a node was down.
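
For context, the job was roughly of this shape. The sketch below is not the user's actual code; the keyspace, table, and column names are made up, and it assumes the Spark Cassandra Connector's RDD API:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._ // cassandraTable, saveToCassandra, SomeColumns

// A minimal sketch of the kind of job described above.
// Keyspace, table, and column names are hypothetical.
object FlushToCassandra {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("flush-to-cassandra")
      .set("spark.cassandra.connection.host", "10.0.0.1")
    val sc = new SparkContext(conf)

    sc.cassandraTable("source_ks", "events")                        // read a bunch of information
      .map(row => (row.getString("id"), row.getLong("value") * 2))  // minor transformation
      .saveToCassandra("sink_ks", "events_out", SomeColumns("id", "value")) // flush to Cassandra

    sc.stop()
  }
}
```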

If One Node Takes Down Your App, Do You Have Any Replicas?

That was it; in fact, that’s always it: when something mysteriously “just stops,” you usually have a small cluster and no replicas (RF 1). Now, one may ask why anyone would ever run Cassandra with one replica, and while I concede that is a very fair question, that was the case here.

For example, if I have RF 1 and three nodes, a row I write only goes to one of those nodes. If that node dies, then how would I retrieve the data? Wouldn’t the read just fail? OK, wait a minute: why is the job not failing?
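
To make that concrete, here is roughly what the two replication settings look like. The keyspace names are made up, and the CQL is executed through the connector's CassandraConnector just to keep everything in one place:

```scala
import org.apache.spark.SparkConf
import com.datastax.spark.connector.cql.CassandraConnector

// Hypothetical keyspace names. With RF 1 every row has exactly one copy,
// so losing the node that owns it makes the row unreachable; with RF 3
// there are two more copies to read from or write to.
val conf = new SparkConf().set("spark.cassandra.connection.host", "10.0.0.1")

CassandraConnector(conf).withSessionDo { session =>
  // The situation described above: a single copy of every row.
  session.execute(
    "CREATE KEYSPACE IF NOT EXISTS single_copy_ks " +
      "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}")

  // What you actually want on a cluster of three or more nodes.
  session.execute(
    "CREATE KEYSPACE IF NOT EXISTS replicated_ks " +
      "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}")
}
```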

It’s the Defaults!

This is a bit of misdirection: the other nodes were timing out (probably a slow IO layer). If we read the connector docs, we find query.retry.count, which defaults to 10 retries, and read.timeout_ms, which defaults to 120000 (and, confusingly, is also the write timeout). That works out to 1.2 million milliseconds, or 20 minutes, to fail a single task that keeps timing out. If Spark retries that task 4 times (the default), it could take 80 minutes to fail the job, assuming of course that all the writes time out.
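
Spelled out as arithmetic (property names as they appear in the connector docs; defaults can differ between connector versions, so treat the numbers as the ones quoted above):

```scala
// Worst-case math with the defaults quoted above.
val readTimeoutMs     = 120000L // spark.cassandra.read.timeout_ms default (2 minutes)
val connectorRetries  = 10      // spark.cassandra.query.retry.count default
val sparkTaskAttempts = 4       // spark.task.maxFailures default

val msPerTask = readTimeoutMs * connectorRetries // 1,200,000 ms = 20 minutes per task
val msPerJob  = msPerTask * sparkTaskAttempts    // 4,800,000 ms = 80 minutes per job

println(s"Worst case to fail one hopeless task: ${msPerTask / 60000} minutes")
println(s"Worst case to fail the whole job:     ${msPerJob / 60000} minutes")
```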

The Fix

Short term

  • Don’t let any nodes fall over.
  • Drop the read timeout (read.timeout_ms) down to 10 seconds; this will at least let you fail fast.
  • Drop output.batch.size.bytes down. The default is 1 KB (originally I messed this up and wrote 1 MB; thanks to Guda Uma Shanker for the keen eye, as 1 MB would be a horrible default). Halve it until you stop having issues. A configuration sketch follows this list.
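
A fail-fast configuration along those lines might look like the following; the exact values are illustrative, not prescriptions:

```scala
import org.apache.spark.SparkConf

// Illustrative fail-fast settings; tune the numbers for your own cluster.
val conf = new SparkConf()
  .setAppName("flush-to-cassandra")
  .set("spark.cassandra.connection.host", "10.0.0.1")
  .set("spark.cassandra.read.timeout_ms", "10000")       // fail a stuck request after 10 s instead of 120 s
  .set("spark.cassandra.query.retry.count", "2")         // additional lever: stop retrying a sick node so quickly
  .set("spark.cassandra.output.batch.size.bytes", "512") // halve the 1 KB default batch size
```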

Long term

  • Use a proper RF of 3 (see the schema sketch after this list).
  • I still think the default read timeout of 120 seconds is way too high. Try 30 seconds at least.
  • Get better IO; usually local SSDs will get you where you need to go.
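
Bumping an existing keyspace to RF 3 is a one-line schema change. The keyspace and datacenter names below are made up, and after the change you still need to run a repair (for example, nodetool repair) so the new replicas receive the existing data:

```scala
import org.apache.spark.SparkConf
import com.datastax.spark.connector.cql.CassandraConnector

// Hypothetical keyspace and datacenter names. After raising the replication
// factor, run a repair on each node so existing rows are streamed to their
// new replicas; until then, the extra copies only apply to newly written data.
val conf = new SparkConf().set("spark.cassandra.connection.host", "10.0.0.1")

CassandraConnector(conf).withSessionDo { session =>
  session.execute(
    "ALTER KEYSPACE single_copy_ks " +
      "WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3}")
}
```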

Published at DZone with permission of Ryan Svihla, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
