
SQL Test Data and Non-Evenly Distributed Randoms


Need to generate test data in your SQL database? The team over at Periscope recently published a couple of blog posts reminding us that a uniformly random distribution is not always the most useful choice.

As pointed out in their first post on the matter, "Beyond Random() — Normal Distributions in SQL", uniform distributions rarely resemble real data. A more realistic choice is the normal distribution, for which the folks at Periscope recommend the Marsaglia Polar Method, which "converts a pair of uniformly distributed random numbers into a pair of normally distributed random numbers." In the post, they show the steps for using SQL to feed random numbers, generated with the help of generate_series, into the Marsaglia formulas.
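As a rough illustration of the polar method itself, here is a minimal Python sketch; it is not the SQL from Periscope's post. The function names (marsaglia_pair, normal_samples) and the mean, stddev, and seed parameters are assumptions made for this example.

```python
import math
import random


def marsaglia_pair(rng):
    """Return two independent standard-normal values via the polar method."""
    while True:
        # Draw a point uniformly from the square [-1, 1] x [-1, 1].
        u = rng.uniform(-1.0, 1.0)
        v = rng.uniform(-1.0, 1.0)
        s = u * u + v * v
        # Reject points outside the unit circle, and the origin itself.
        if 0.0 < s < 1.0:
            factor = math.sqrt(-2.0 * math.log(s) / s)
            return u * factor, v * factor


def normal_samples(n, mean=0.0, stddev=1.0, seed=None):
    """Generate n normally distributed values (hypothetical helper)."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        x, y = marsaglia_pair(rng)
        # Shift and scale the standard-normal pair to the requested shape.
        out.extend((mean + stddev * x, mean + stddev * y))
    return out[:n]
```

Each accepted pair of uniform values yields two normal values, so roughly half as many loop iterations are needed as output rows. In SQL, the same rejection step would typically become a filter on the rows that generate_series produces.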


These formulas produce a Gaussian bell curve (figure in the original post; credit: Periscope.io).

In a subsequent blog post, Periscope goes over another distribution type: the Poisson Distribution. They explain the Poisson distribution like this:

Let's say you typically sell 5 widgets per day. How likely is it that you'll sell 5 widgets tomorrow? What about between 4 and 6 widgets tomorrow? Obviously we can't just guess randomly. And the normal distribution won't help either.

Fortunately, this is what the Poisson Distribution is for. Its formula is:

$$P(k) = \frac{R^k e^{-R}}{k!}$$

Our Poisson Distribution formula takes 3 inputs:

  • R: Our known rate, in this case 5.
  • e: Euler's Number, 2.71828.
  • k: tomorrow's expected rate.
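To make the widget example concrete, the following Python sketch evaluates the standard Poisson probability mass function, with R as the known rate and k as the number of sales being evaluated. The function names poisson_pmf and poisson_between are hypothetical and do not come from Periscope's post.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k events when the average count is `rate`."""
    return rate ** k * math.exp(-rate) / math.factorial(k)


def poisson_between(lo, hi, rate):
    """Probability of seeing between lo and hi events, inclusive."""
    return sum(poisson_pmf(k, rate) for k in range(lo, hi + 1))


# Widget example from the text: we typically sell 5 per day.
p_exactly_5 = poisson_pmf(5, 5)
p_4_to_6 = poisson_between(4, 6, 5)
print(f"P(5 widgets) = {p_exactly_5:.4f}")
print(f"P(4 to 6 widgets) = {p_4_to_6:.4f}")
```

With a rate of 5, selling exactly 5 widgets tomorrow comes out at roughly 0.175, and selling between 4 and 6 at roughly 0.497.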

This creates a distribution whose plot appears in the original post (credit: Periscope.io).

Periscope's blog entries both give specific details on using these distributions for test data in SQL. It's worth a look; you can check out their full blog at https://periscope.io/blog.

