Let's Talk About Data Generators
In this article, learn more about data generators and look at generator patterns, pattern repositories, reverse regular expressions, and more.
Having relevant data in your database is crucial for application testing, but it’s not always easy to get. Copies of real databases are usually off limits because of data privacy, so the only practical option is a data generator.
I’m going to present some of the features that I consider important for a data generator. You can keep them in mind while you’re browsing for a generator or when you’re building your own.
Generator patterns define what the data in the table will look like. Some of the most commonly used patterns are:
Boolean: one of the simplest data types, used to define whether a value is true or false
Numeric Patterns: these cover several numeric data types (integer, long, double, etc.)
Sequential: generates a sequence of values starting from one value and ending with another
Date and Time Stamp: generates values in date and time formats
Text Patterns: used to generate anything from one-word texts (first name, country, city) to more complex phrases (using Reverse Regular Expression Text)
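To make the list above concrete, here is a minimal Python sketch with one small function per pattern. The function names (`gen_boolean`, `gen_sequence`, and so on) and the sample values are my own illustrations, not the API of any particular tool.

```python
import random
from datetime import datetime, timedelta


def gen_boolean(rng):
    """Boolean pattern: True or False with equal probability."""
    return rng.random() < 0.5


def gen_integer(rng, low=0, high=100):
    """Numeric pattern: a random integer between low and high, inclusive."""
    return rng.randint(low, high)


def gen_sequence(start, end, step=1):
    """Sequential pattern: every value from start up to and including end."""
    return list(range(start, end + 1, step))


def gen_timestamp(rng, start, end):
    """Date and time pattern: a random moment between start and end."""
    span_seconds = int((end - start).total_seconds())
    return start + timedelta(seconds=rng.randint(0, span_seconds))


def gen_text(rng, words):
    """Text pattern: pick one value from a list of sample words."""
    return rng.choice(words)


rng = random.Random(42)  # a fixed seed keeps test data reproducible
row = {
    "active": gen_boolean(rng),
    "age": gen_integer(rng, 18, 90),
    "created_at": gen_timestamp(rng, datetime(2020, 1, 1), datetime(2020, 12, 31)),
    "city": gen_text(rng, ["Berlin", "Lisbon", "Oslo"]),
}
```

Passing the random number generator in as a parameter is a deliberate choice here: with a fixed seed, the same test data can be regenerated on every run.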
If the data generator that you use integrates a pattern repository, it will save you a lot of time. Usually, the repository contains the most commonly used data patterns (first name, address, email, zip code, etc.). Some generators will also deduce the pattern for a column from the column's name.
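As a rough illustration of how a repository lookup by column name could work, here is a small Python sketch. `PATTERN_REPOSITORY`, `deduce_pattern`, and the matching rule (a simple substring check on the normalized name) are assumptions made for this example; real tools may use very different heuristics.

```python
import random
import re

# Hypothetical repository: pattern name -> function that generates one value.
PATTERN_REPOSITORY = {
    "first_name": lambda rng: rng.choice(["Ana", "John", "Maria"]),
    "email": lambda rng: f"user{rng.randint(1, 999)}@example.com",
    "zip_code": lambda rng: f"{rng.randint(0, 99999):05d}",
}


def _normalize(name):
    """Lowercase the name and drop everything except letters."""
    return re.sub(r"[^a-z]", "", name.lower())


def deduce_pattern(column_name):
    """Return the first repository pattern whose name appears in the column name."""
    normalized = _normalize(column_name)
    for pattern_name in PATTERN_REPOSITORY:
        if _normalize(pattern_name) in normalized:
            return pattern_name
    return None


def generate_for_column(column_name, rng):
    """Generate one value for a column, based on the deduced pattern."""
    pattern_name = deduce_pattern(column_name)
    if pattern_name is None:
        raise KeyError(f"no pattern found for column {column_name!r}")
    return PATTERN_REPOSITORY[pattern_name](rng)
```

With this sketch, a column called `Customer_Email` resolves to the `email` pattern, while a column called `notes` resolves to nothing and has to be configured by hand.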
Reverse Regular Expressions
Being able to compose Reverse Regular Expressions enables you to generate your own data patterns.
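Reverse Regular Expressions are based on the Java standards for regular expressions. Instead of matching text against an expression, the generator produces text that matches it, so they can even be used to generate whole strings of text.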
Values From File
Sometimes you may want the generator to pick values at random from a specific file.
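A minimal sketch of this pattern in Python could look like the following. It assumes a plain text file with one value per line; the function names and the file layout are illustrative choices, not something a specific tool prescribes.

```python
import random


def load_values(path):
    """Read one value per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def pick_values(values, count, rng):
    """Pick `count` values at random; the same value may appear more than once."""
    return [rng.choice(values) for _ in range(count)]
```

For example, `pick_values(load_values("cities.txt"), 100, random.Random(7))` would produce 100 city names, where `cities.txt` is a hypothetical file that you supply.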
Values From Primary Key
Let’s suppose that you have two tables linked by a foreign key. The parent table already has data in it, but the child table doesn’t. Using this pattern, the generator will pick values from the existing primary keys of the parent table, so every generated foreign key is valid.
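The following Python sketch shows the idea with an in-memory SQLite database. The `customers` and `orders` tables, their columns, and the `fill_child` function are invented for this example.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the foreign key
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " amount REAL)"
)
# The parent table already has data in it.
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ana"), (7, "John"), (12, "Maria")],
)


def fill_child(conn, rows, rng):
    """Fill the child table using only keys that exist in the parent table."""
    parent_keys = [row[0] for row in conn.execute("SELECT id FROM customers")]
    for _ in range(rows):
        conn.execute(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
            (rng.choice(parent_keys), round(rng.uniform(5, 500), 2)),
        )
    conn.commit()


fill_child(conn, 10, random.Random(3))
```

Because the keys are read from the parent table first, none of the inserts can violate the foreign key constraint, even when the parent keys have gaps (1, 7, 12 in this case).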
Being able to use a programming language in data generation opens up many more options.
I’m going to give you some examples of what you can do just by using Groovy:
- Generate more complex values, like a JSON object
- Combine the values from two columns
- Create custom computations from two columns in the same table
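The three ideas above can be sketched in a few lines. I'm using Python here instead of Groovy to keep the example easy to run on its own, so treat it as an illustration of the approach rather than a script you can paste into a generator. The column names and sample values are made up.

```python
import json
import random


def build_row(rng):
    """Build one row whose derived columns depend on other columns."""
    first_name = rng.choice(["Ana", "John", "Maria"])
    last_name = rng.choice(["Pop", "Smith", "Rossi"])
    quantity = rng.randint(1, 10)
    unit_price = round(rng.uniform(1, 50), 2)
    return {
        "first_name": first_name,
        "last_name": last_name,
        # Combine the values from two columns.
        "full_name": f"{first_name} {last_name}",
        "quantity": quantity,
        "unit_price": unit_price,
        # Custom computation from two columns in the same row.
        "total": round(quantity * unit_price, 2),
        # A more complex value: a JSON object stored as text.
        "profile": json.dumps({"name": first_name, "newsletter": rng.random() < 0.5}),
    }


row = build_row(random.Random(11))
```

The point is that the derived columns stay consistent with the columns they come from, which independent random patterns cannot guarantee.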
A tool that satisfies all my requirements for data generation is the Random Data Generator integrated with DbSchema. I already use this tool for designing my databases, so this powerful data generator is a huge plus for me.
It integrates everything I presented above and it’s very flexible.
If you only need a data generator, there are some nice tools on the market, like Datprof or Mockaroo (free), but in my experience they are not as full-featured.