Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Efficient Retry/Backoff Mechanisms

DZone's Guide to

Efficient Retry/Backoff Mechanisms

· DevOps Zone
Free Resource

The Nexus Suite is uniquely architected for a DevOps native world and creates value early in the development pipeline, provides precise contextual controls at every phase, and accelerates DevOps innovation with automation you can trust. Read how in this ebook.

In day to day life as a systems engineer there aren’t too many opportunities to build solutions based purely on theoretical studies or research papers – generally solutions are built for you and you “just” need to make them work, tie up the spaghetti, shave the yak and so on. Entering the world of programming and software development though, a lot of new problems come up that you can find your own solution to. On one hand this can be extremely liberating – speaking from experience, having for the first time in my career being employed as a software engineer for just over a year now. You can let your creativity go wild, dream up whatever crazy solution and then implement it and see if it works.

On the other hand, there is a vast body of existing work out there to build on, almost exclusively by people smarter than you are and who have put a lot of effort and thought into solving a specific problem. Some good examples in recent times are Paxos and similarly Raft consensus algorithms, F1 Distributed Database, BigTable, Map/Reduce and many many more. A great blog that I try to keep on top of is Werner Vogels’ All Things Distributed, where he maintains a mix of AWS-related announcement-style posts and musings on the various research papers he is reading while on flights between engagements. Needless to say that the Google Research Papers are another valuable source of information, as well as a lot of things coming out of the various SIG* organizations (e.g. SIGOPS, SIGGRAPH).

But back to my point. Having recently coded up a few systems which hurl copious amounts of requests at various AWS services, I very quickly found out that you WILL get errors and need to handle them in a smart way. Not retrying is not an option, as is retrying indefinitely with no limit. Linearly-spaced retries with a limit tend to also suffer from abusing the target system beyond what it can tolerate, before giving up entirely and perhaps prematurely. Once you start to implement a system with exponential backoff similar to how most multi-access medium algorithms work (think Ethernet with CSMA/CD), you are finally starting to get somewhere.

Someone at $EMPLOYER recently linked to this paper which details backoff mechanisms utilising a Fibonacci sequence. It is very interesting work, and taking the time to read and understand such research allows you to build stronger, more resilient software with characteristics far exceeding what you could have come up with on your own. This in itself I find fascinating and addictive. I strongly recommend giving it a read and utilizing this approach to building software.



The DevOps Zone is brought to you in partnership with Sonatype Nexus.  See how the Nexus platform infuses precise open source component intelligence into the DevOps pipeline early, everywhere, and at scale. Read how in this ebook

Topics:

Published at DZone with permission of Oliver Hookins, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}