Let's Talk About Data Generators

In this article, learn more about data generators and look at generator patterns, pattern repositories, reverse regular expressions, and more.

Having relevant data in your database is crucial for application testing, but it is not always easy to obtain. Copies of production databases are rarely available because of data privacy constraints, so in practice a data generator is often the only realistic option.

I’m going to present some of the features that I consider important for a data generator. You can keep them in mind while you’re browsing for a generator or when you’re building your own. 

Generator Patterns

Generator patterns define what the data in a table will look like. Some of the most commonly used patterns are:

  • Boolean: one of the simplest data types, used to store whether a value is true or false

  • Numeric Patterns: cover several numeric data types (integer, long, double, etc.)

  • Sequential: generates a sequence of values that starts at one value and ends at another

  • Date and Time Stamp: generates values in date and time formats

  • Text Patterns: generate anything from one-word texts (first name, country, city) to more complex phrases (using Reverse Regular Expression Text)
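As a rough illustration of the patterns listed above, here is a minimal Python sketch with one function per pattern. The function names, default ranges, and the seeded random generator are assumptions made for this example; they are not taken from any particular tool.

```python
import random
import string
from datetime import datetime, timedelta


def gen_boolean(rng):
    """Boolean pattern: True or False with equal probability."""
    return rng.random() < 0.5


def gen_integer(rng, low=0, high=1000):
    """Numeric pattern: an integer in the inclusive range [low, high]."""
    return rng.randint(low, high)


def gen_sequence(start, end, step=1):
    """Sequential pattern: values from start up to and including end."""
    return iter(range(start, end + 1, step))


def gen_timestamp(rng, start=datetime(2020, 1, 1), days=365):
    """Date and time pattern: a random second within `days` days after `start`."""
    return start + timedelta(seconds=rng.randrange(days * 24 * 3600))


def gen_word(rng, length=8):
    """Text pattern: a random lowercase word of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


rng = random.Random(42)  # seeded so repeated runs produce the same rows
ids = gen_sequence(100, 102)
rows = [
    (next(ids), gen_boolean(rng), gen_integer(rng), gen_timestamp(rng), gen_word(rng))
    for _ in range(3)
]
```

Each tuple in `rows` stands for one generated table row: a sequential id, a boolean, an integer, a timestamp, and a short word.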

Pattern Repository

If the data generator you use includes a pattern repository, it will save you a lot of time. The repository usually contains the most commonly used data patterns (first name, address, email, zip code, etc.). Some generators will also infer the kind of data a column holds from the column's name.
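To make the idea concrete, a pattern repository with name-based inference could look roughly like the Python sketch below. The `PATTERNS` entries, the `NAME_HINTS` keywords, and the `infer_pattern` helper are all hypothetical; real tools ship far larger repositories and smarter matching.

```python
import random

# Hypothetical repository: pattern name -> function producing one value.
PATTERNS = {
    "first_name": lambda rng: rng.choice(["Ana", "John", "Maria", "Wei"]),
    "email": lambda rng: f"user{rng.randint(1, 9999)}@example.com",
    "zip_code": lambda rng: f"{rng.randint(0, 99999):05d}",
}

# Keywords searched for in a column name, mapped to a repository entry.
NAME_HINTS = [
    ("mail", "email"),
    ("zip", "zip_code"),
    ("postal", "zip_code"),
    ("first", "first_name"),
]


def infer_pattern(column_name):
    """Return the repository pattern suggested by the column name, or None."""
    lowered = column_name.lower()
    for keyword, pattern in NAME_HINTS:
        if keyword in lowered:
            return pattern
    return None


def generate_for_column(column_name, rng):
    """Generate one value for a column, or None if no pattern matches its name."""
    pattern = infer_pattern(column_name)
    return PATTERNS[pattern](rng) if pattern else None
```

A column named `customer_email` would be matched to the `email` pattern, while a column such as `amount` has no match and would need a pattern chosen by hand.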

Reverse Regular Expressions

Being able to compose Reverse Regular Expressions lets you define your own data patterns. Instead of checking whether a string matches an expression, the generator produces strings that match it.

The Reverse Regular Expressions follow the Java syntax for regular expressions. They can even generate whole strings of text.

Values From File

Sometimes you want the generator to pick values at random from a specific file, for example a list of product names or cities that you maintain yourself.
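A minimal Python sketch of this pattern follows. It assumes a plain text file with one candidate value per line; the file layout and the helper names `load_values` and `pick_values` are illustrative choices rather than a fixed format.

```python
import random


def load_values(path):
    """Read one candidate value per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def pick_values(values, n, rng):
    """Draw n values at random, with replacement."""
    return [rng.choice(values) for _ in range(n)]
```

Drawing with replacement means the file can be much shorter than the number of rows you need to generate.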

Values From Primary Key

Let’s suppose that you have two tables linked by a foreign key. The parent table already contains data, but the child table doesn’t. With this pattern, the generator reads the existing values from the parent’s primary key and uses only those values for the child column, which keeps the foreign key valid.
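Here is a small, self-contained Python sketch of that idea using an in-memory SQLite database. The `customer` and `orders` tables, their columns, and the `existing_keys` helper are made up for the example.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customer(id), "
    "total REAL)"
)
# The parent table already has data; note that the keys need not be contiguous.
conn.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(1, "Ana"), (2, "John"), (7, "Maria")],
)


def existing_keys(conn, table, column):
    """Collect the values currently stored in the parent's key column."""
    # Table and column names are interpolated, so pass only trusted identifiers.
    return [row[0] for row in conn.execute(f"SELECT {column} FROM {table}")]


rng = random.Random(0)
parent_ids = existing_keys(conn, "customer", "id")
# Every child row takes its foreign key from the values that really exist.
rows = [(rng.choice(parent_ids), round(rng.uniform(5, 500), 2)) for _ in range(10)]
conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)", rows)
conn.commit()
```

Because every `customer_id` is drawn from `parent_ids`, none of the inserts can violate the foreign key constraint, even with gaps in the parent keys.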

Programmatic Language 

Being able to use a programming language inside the data generator greatly expands what you can produce.

I’m going to give you some examples of what you can do just by using Groovy:

  • Generate more complex values, like a JSON object
  • Combine the values from two columns
  • Create custom computations from two columns in the same table
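The three ideas above can be sketched as follows. The example is written in Python rather than Groovy, purely to illustrate the logic; the column names and the `make_row` function are hypothetical, and a real tool would expose its own scripting hooks.

```python
import json
import random


def make_row(rng):
    """Build one row whose derived columns depend on other generated columns."""
    first = rng.choice(["Ana", "John", "Maria"])
    last = rng.choice(["Pop", "Smith", "Lee"])
    quantity = rng.randint(1, 10)
    unit_price = round(rng.uniform(1, 100), 2)
    return {
        "first_name": first,
        "last_name": last,
        # Combine the values from two columns.
        "full_name": f"{first} {last}",
        "quantity": quantity,
        "unit_price": unit_price,
        # Custom computation from two columns in the same row.
        "total": round(quantity * unit_price, 2),
        # A more complex value: a JSON object serialized to text.
        "profile": json.dumps({"name": first, "active": rng.random() < 0.5}),
    }


row = make_row(random.Random(1))
```

The same structure carries over to Groovy or any other scripting language a generator supports: generate the base columns first, then derive the dependent ones from them.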


A tool that satisfies all my data generation requirements is the Random Data Generator integrated with DbSchema. I already use DbSchema for designing my databases, so having a powerful data generator built in is a huge plus for me.

It integrates everything I presented above and it’s very flexible. 

If you only need a data generator, there are some nice tools on the market, like Datprof or Mockaroo (free), but they are not as full-featured.
