DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
Building Scalable Real-Time Apps with AstraDB and Vaadin
Register Now

Trending

  • What to Pay Attention to as Automation Upends the Developer Experience
  • How Web3 Is Driving Social and Financial Empowerment
  • Microservices With Apache Camel and Quarkus (Part 2)
  • Application Architecture Design Principles

Trending

  • What to Pay Attention to as Automation Upends the Developer Experience
  • How Web3 Is Driving Social and Financial Empowerment
  • Microservices With Apache Camel and Quarkus (Part 2)
  • Application Architecture Design Principles
  1. DZone
  2. Data Engineering
  3. Databases
  4. Generating Test Data from the Mysql> Prompt

Generating Test Data from the Mysql> Prompt

Peter Zaitsev user avatar by
Peter Zaitsev
·
Sep. 18, 14 · Interview
Like (0)
Save
Tweet
Share
5.41K Views

Join the DZone community and get the full member experience.

Join For Free

Originally Written by Justin Swanhart

There are a lot of tools that generate test data.  Many of them have complex XML scripts or GUI interfaces that let you identify characteristics about the data. For testing query performance and many other applications, however, a simple quick and dirty data generator which can be constructed at the MySQL command line is useful.

First, let’s talk about what kind of data you can easily create with MySQL function calls:

You can generate a decimal number between zero and another number using the MySQL RAND() function like the following query (here between 0 and 10000):

SELECT RAND() * 10000;

Similarly, you can generate a random integer by adding the FLOOR() function:

SELECT FLOOR(RAND() * 10000)

You can generate a random string of 32 characters using MD5():

SELECT MD5(RAND() * 10000)

You can return a random integer between 500 and 1000 with the following:

SELECT FLOOR( 500 + RAND() * (1000 - 500))

You can return a random string from a list of strings by using a table to hold the list. A subselect can select a random name from the list of names.

create table names(id int auto_increment primary key, name varchar(20));
insert into names (name) values ('Justin','Jerry','James','Josh','Julien');
select (select name from names where id = 1 + rand() * 4);

Now we can generate a “fact” table with many rows using fairly simple SQL statements.

First create a table to generate data into:

CREATE TABLE fact (
  dim1 int,
  dim2 int,
  name varchar(20),
  hash varchar(32),
  measure1 double
);

Seed the table with one initial row:

INSERT INTO fact
VALUES (1,1,'Justin',md5(''), .1);

Now grow the table by selecting from the table but providing new random values for the inserted rows:

INSERT INTO fact
SELECT FLOOR(1+ rand()*9999),
       FLOOR(1 + rand()*499),
       (select name from names where id = 1 + rand() * 4),
       MD5(1+rand()*9999),
       rand()
 FROM fact;

As you repeat the INSERT … SELECT, the table will grow exponentially. You may want to add a LIMIT clause to the INSERT … SELECT to reduce the amount of data generated as the table grows.

You will create a table with an even data distribution for each column. You can then add some queries to add skew, either using INSERT … SELECT or UPDATE, for example:

INSERT INTO fact
SELECT 1,1,'Justin',md5(''), .1
  FROM fact
 LIMIT 10000;

That will skew the values by creating many rows with the same data as our initial row.

Using these simple tools, you can generate a data set that is great for testing purposes. For example, dim1 might be a customer_id and dim2 a product_id, and you would populate those tables with 10000 and 500 rows, respectively.


Test data Database MySQL sql

Published at DZone with permission of Peter Zaitsev, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Trending

  • What to Pay Attention to as Automation Upends the Developer Experience
  • How Web3 Is Driving Social and Financial Empowerment
  • Microservices With Apache Camel and Quarkus (Part 2)
  • Application Architecture Design Principles

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com

Let's be friends: