DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
The Latest "Software Integration: The Intersection of APIs, Microservices, and Cloud-Based Systems" Trend Report
Get the report
  1. DZone
  2. Data Engineering
  3. Databases
  4. Provisioning SQL Databases With Initial Sets of Data Using Java and MockNeat

Provisioning SQL Databases With Initial Sets of Data Using Java and MockNeat

Look at how we can programmatically provision a relational database with initial sets of data by programmatically generating and executing complex SQL Inserts statements.

Andrei Ciobanu user avatar by
Andrei Ciobanu
·
Sep. 13, 18 · Tutorial
Like (3)
Save
Tweet
Share
4.50K Views

Join the DZone community and get the full member experience.

Join For Free

MockNeat is a "faker"-like Java library that is used to mock complex Java objects or to generate arbitrary data for a wide range of applications.

In this tutorial, I am going to talk about how we can programmatically provision a relational database with initial sets of data by programmatically generating and executing complex SQL Inserts statements.

For the demo, we are going to use MySQL and a sample SQL schema (HR App) that can be found here:

Image title


The schema contains 7 tables with different relantionships between them. 

1. Setting up the Database

If you don't have MySQL installed, you can download the community edition it from here. 

If you prefer to use docker as a quick alternative, you can set up and start MySQL with the following commands:

#This will pull the official mysql docker image 
docker pull mysql

After the download finishes, you can start the MySQL container like this:

# This is will start a mysql instance on port 3306 (no password will be required)
docker run -p 3306:3306 -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d mysql --default-authentication-plugin=mysql_native_password

At this point, you should have MySQL up and running. 

The easiest way to work and connect to MySQL is to install MySQL Worbench. By default, the IDE comes with a pre-defined connection to localhost:3306 (where 3306 is the default port on which MySQL operates). 

Image title

The next step is to create the HR Schema. The DDL script I've used to create the tables is the following.

On purporse, I didn't include any constraints (like FKs) in the definition of the tables. Those constraints can be added after the actual inserts are generated and executed. 

DROP SCHEMA IF EXISTS hr;

CREATE SCHEMA hr COLLATE = utf8_general_ci;

USE hr;

CREATE TABLE regions (
region_id INT (11) UNSIGNED NOT NULL,
region_name VARCHAR(25),
PRIMARY KEY (region_id)
);

CREATE TABLE countries (
country_id CHAR(2) NOT NULL,
country_name VARCHAR(40),
region_id INT (11) UNSIGNED NOT NULL,
PRIMARY KEY (country_id)
);


CREATE TABLE locations (
location_id INT (11) UNSIGNED NOT NULL AUTO_INCREMENT,
street_address VARCHAR(40),
postal_code VARCHAR(12),
city VARCHAR(30) NOT NULL,
state_province VARCHAR(25),
country_id CHAR(2) NOT NULL,
PRIMARY KEY (location_id)
);

CREATE TABLE departments (
department_id INT (11) UNSIGNED NOT NULL,
department_name VARCHAR(30) NOT NULL,
manager_id INT (11) UNSIGNED,
location_id INT (11) UNSIGNED,
PRIMARY KEY (department_id)
);

CREATE TABLE jobs (
job_id VARCHAR(10) NOT NULL,
job_title VARCHAR(35) NOT NULL,
min_salary DECIMAL(8, 0) UNSIGNED,
max_salary DECIMAL(8, 0) UNSIGNED,
PRIMARY KEY (job_id)
);

CREATE TABLE employees (
employee_id INT (11) UNSIGNED NOT NULL,
first_name VARCHAR(20),
last_name VARCHAR(25) NOT NULL,
email VARCHAR(25) NOT NULL,
phone_number VARCHAR(20),
hire_date DATE NOT NULL,
job_id VARCHAR(10) NOT NULL,
salary DECIMAL(8, 2) NOT NULL,
commission_pct DECIMAL(2, 2),
manager_id INT (11) UNSIGNED,
department_id INT (11) UNSIGNED,
PRIMARY KEY (employee_id)
);

CREATE TABLE job_history (
employee_id INT (11) UNSIGNED NOT NULL,
start_date DATE NOT NULL,
end_date DATE NOT NULL,
job_id VARCHAR(10) NOT NULL,
department_id INT (11) UNSIGNED NOT NULL
);

ALTER TABLE job_history ADD UNIQUE INDEX (
employee_id,
start_date
);

If everything went well, you should see a confirmation message:

Image title

2. Generate the SQL Inserts for the Schema Using MockNeat

The installation page for MockNeat can be found here.

The library is published as a Maven repository in jcenter(), so you can easily include it in your gradle builds:

repositories {
    // something else
    jcenter()
}

dependencies {
      // something else
     compile 'net.andreinc.mockneat:mockneat:0.2.3'
}

Or if you prefer maven:

<repositories>
    <repository>
        <id>jcenter</id>
        <url>https://jcenter.bintray.com/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>net.andreinc.mockneat</groupId>
        <artifactId>mockneat</artifactId>
        <version>0.2.3</version>
    </dependency>
</dependencies>

My recommendation is to always check the official github page or the jcenter() page to get the latest version.

We can start generating the DML Insert Statements using the sqlInserts() method.

The first step is to create a MockNeat instance (or re-using one of the predefined ones):

MockNeat m = MockNeat.threadLocal();

Afterwards we can jump into generating the the actual SQL inserts for the first table: "regions".

// ----- Regions -----
String[] regionNames = new String[] {"Europe", "Americas", "Asia", "Middle East and Africa" };
SQLTable regions = m.sqlInserts()
                    .tableName("regions") 
                    .column("region_id", m.intSeq().start(1)) // A sequence 1,2,3...
                  // At each step we iterate for the values in the regionNames[] array
                    .column("region_name", m.seq(regionNames), TEXT_BACKSLASH) 
                    .table(regionNames.length) // The number of rows
                    .val();

The SQLTable toString() method is overwritten to generate all the corresponding SQL inserts associated with the table. So if we print the "regions" object, we will obtain something like:

System.out.println(regions);

/**

Output:

INSERT INTO regions (region_id, region_name) VALUES (1, 'Europe');
INSERT INTO regions (region_id, region_name) VALUES (2, 'Americas');
INSERT INTO regions (region_id, region_name) VALUES (3, 'Asia');
INSERT INTO regions (region_id, region_name) VALUES (4, 'Middle East and Africa');

**/

The next step is to generate the data for the "countries" table. MockNeat already defines two dictionaries containing names and iso codes for 241 countries (DictType.COUNTRY_NAME and DictType.COUNTRY_ISO_CODE_2). What we can do is to iterate over the data contained in those dictionaries using the seq() method. 

int numCountries = 241;
SQLTable countries = m.sqlInserts()
                      .tableName("countries")
                      .column("country_id",  m.seq(COUNTRY_ISO_CODE_2), TEXT_BACKSLASH)
                      .column("country_name", m.seq(COUNTRY_NAME), TEXT_BACKSLASH)
                      .column("region_id", regions.fromColumn("region_id"))
                      .table(numCountries)
                      .val();

For the "region_id", we will randomly take data from the regions table using the fromColumn() method. This approach is particularly useful when we want to generate relational data and enforce Foreign Key constraints.

If we were to print the rows associated with SQLTable, we will obtain something like this:

System.out.println(countries);

/**

Output

NSERT INTO countries (country_id, country_name, region_id) VALUES ('AF', 'Afghanistan', 2);
INSERT INTO countries (country_id, country_name, region_id) VALUES ('AL', 'Albania', 1);
INSERT INTO countries (country_id, country_name, region_id) VALUES ('DZ', 'Algeria', 1);
INSERT INTO countries (country_id, country_name, region_id) VALUES ('AS', 'American Samoa', 4);
INSERT INTO countries (country_id, country_name, region_id) VALUES ('AD', 'Andorra', 3);
INSERT INTO countries (country_id, country_name, region_id) VALUES ('AO', 'Angola', 2);
INSERT INTO countries (country_id, country_name, region_id) VALUES ('AI', 'Anguilla', 3);
....and so on
**/

Generating data for the "locations" table will involve writing our own custom MockUnits (arbitrary data generators). 

We will need an arbitrary data generator for the "postal_code" and "street_address" column.

MockUnitString streetAddressGen = 
        m.fmt("#{num} #{noun} #{end}")
         .param("num", m.ints().range(10, 2000))
         .param("noun", m.words().nouns().format(CAPITALIZED))
         .param("end", m.from(new String[]{"Ave", "St", "Blvd", "Rd"}));

So using the fmt() method, we will be able to generate street addresses with the following format:

System.out.println(streetAddressGen.val());
System.out.println(streetAddressGen.val());
System.out.println(streetAddressGen.val());

/** 

Output:

1009 Maroons Blvd
286 Berberine Ave
1454 Eatables Ave

**/

Using a similar approach, we will write our own MockUnit to generate data for the "postal_code" column.

MockUnitString postalCodeGen = 
        m.fmt("#{word1} #{word2}")
         .param("word1", m.strings().size(3).format(UPPER_CASE))
         .param("word2", m.strings().size(3).format(UPPER_CASE));

The format of the postal codes is:

System.out.println(postalCodeGen.val());
System.out.println(postalCodeGen.val());
System.out.println(postalCodeGen.val());

/** 

Output:

ENO OYW
OTQ NKZ
BCE TB8

**/

Now we can reuse our generators to actually generate data for the "locations" table:

int numLocations = 100;
SQLTable locations = 
        m.sqlInserts()
         .tableName("locations")
         .column("location_id", m.intSeq().start(1000).increment(100))
         .column("street_address", streetAddressGen, TEXT_BACKSLASH)
         .column("postal_code", postalCodeGen, TEXT_BACKSLASH)
         .column("city", m.cities().us(), TEXT_BACKSLASH)
         .column("state_province", m.cities().capitals(), TEXT_BACKSLASH)
         .column("country_id", countries.fromColumn("country_id"), TEXT_BACKSLASH)
         .table(numLocations)
         .val();

If we were to print the "locations" object, the SQL inserts will look like:

System.out.println(locations);
/** 

Output:

INSERT INTO locations (location_id, street_address, postal_code, city, state_province, country_id) VALUES (1000, '15 Olms St', '9Z3 DOA', 'Weaver', 'Ulaanbaatar', 'MS');
INSERT INTO locations (location_id, street_address, postal_code, city, state_province, country_id) VALUES (1100, '205 Dah Rd', 'M4D D7X', 'Land of Pines', 'Baghdad', 'CR');
INSERT INTO locations (location_id, street_address, postal_code, city, state_province, country_id) VALUES (1200, '848 Mirliton Blvd', 'JSA ARO', 'Millersburg', 'Prague', 'MT');
INSERT INTO locations (location_id, street_address, postal_code, city, state_province, country_id) VALUES (1300, '1585 Supplementer Rd', '97R RF7', 'Sparta', 'São Tomé', 'PY');
... and so on

**/

The next step is to start generating "departments" information. This will be quite straightforward. The only prerequisite will be to have a list of manager IDs already generated. We will use this list for other tables in the future.

// Manager ids
// Generating a list of 20 manager ids in the range [1..1000).
List<Integer> managerIds = m.ints()
.range(1, 1000)
.list(() -> new ArrayList<>(), 20)
.val();

// We will have 10 departments
int depNum = 10;

SQLTable departments = m.sqlInserts()
  .tableName("departments")
  .column("department_id", m.intSeq().start(0).increment(10))
  .column("department_name",m.seq(DEPARTMENTS), TEXT_BACKSLASH)
  .column("manager_id", m.from(managerIds))
  .column("location_id", locations.fromColumn("location_id"))
  .table(depNum)
  .val();

After the departments are generated, we can start generating the actual "jobs":

// ----- JOBS -----
String[] jobNames = new String[] {"Analyst",
                "Consultant",
                "Senior Consultant",
                "Manager",
                "Software Architect",
                "Senior Manager",
                "Director"};

String[] jobIds = new String[] { "A",
                "C",
                "SC",
                "M",
                "SA",
                "SM",
                "D" };

int numJobs = jobNames.length;
int minSalary = 2000;
int maxSalary = 5000;

SQLTable jobs = m.sqlInserts()
                .tableName("jobs")
                .column("job_id", m.seq(jobIds), TEXT_BACKSLASH)
                .column("job_title", m.seq(jobNames), TEXT_BACKSLASH)
                .column("min_salary", minSalary + "")
                .column("max_salary", maxSalary + "")
                .table(numJobs)
                .val();

System.out.println(jobs);

The last and the most complex table is the "employees". Here, we will need to write additional coding logic to make sure every employee has a managerId associated with it.

//  ----- EMPLOYEES -----
int numEmployes = 1000;

SQLTable employees = m.sqlInserts()
  .tableName("employees")
  .column("employee_id", m.intSeq())
  .column("first_name", m.names().first(), TEXT_BACKSLASH)
  .column("last_name", m.names().last(), TEXT_BACKSLASH)
  .column("email", m.emails().domain("corp.com"), TEXT_BACKSLASH)
  .column("phone_number", m.regex("\\+30 [0-9]{9}"), TEXT_BACKSLASH)
  .column("hire_date",
        m.localDates()
         .past(LocalDate.of(2000, 1, 1))
         .display(BASIC_ISO_DATE), TEXT_BACKSLASH)
  .column("job_id", jobs.fromColumn("job_id"), TEXT_BACKSLASH)
  .column("salary", m.ints().range(minSalary, maxSalary))
  .column("commission_pct", "NULL")
  .column("manager_id", "NULL" /* TO UPDATE LATER */)
  .column("department_id", departments.fromColumn("department_id"))
  .table(numEmployes)
  .val();

// Updating employees with their manager ID
// - One person has no manager ID - it should be a director level person
// - Manager id and employee_id should be different

// We update all people with a manager_id
employees.updateAll((i, insert) -> {
            Integer employeeId = parseInt(insert.getValue("employee_id"));
            Integer managerId = Integer.MIN_VALUE;
            while(employeeId == (managerId = m.from(managerIds).val()));
            insert.setValue("manager_id", managerId + "");
});

// One of the directors doesn't have a manager id as he is the CO
employees.selectFirstWhere(sqlInsert -> sqlInsert.getValue("job_id").equals("D"))
 .get()
         .setValue("manager_id", "NULL");

System.out.println(employees);

3. Putting It All Together

If we were to put all together, the resulting script would look like:

import net.andreinc.mockneat.MockNeat;
import net.andreinc.mockneat.abstraction.MockUnitString;
import net.andreinc.mockneat.unit.text.sql.SQLTable;
import org.junit.Test;

import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

import static java.lang.Integer.parseInt;
import static java.time.format.DateTimeFormatter.BASIC_ISO_DATE;
import static net.andreinc.mockneat.types.enums.DictType.COUNTRY_ISO_CODE_2;
import static net.andreinc.mockneat.types.enums.DictType.COUNTRY_NAME;
import static net.andreinc.mockneat.types.enums.DictType.DEPARTMENTS;
import static net.andreinc.mockneat.types.enums.StringFormatType.CAPITALIZED;
import static net.andreinc.mockneat.types.enums.StringFormatType.UPPER_CASE;
import static net.andreinc.mockneat.unit.text.sql.escapers.MySQL.TEXT_BACKSLASH;

public class SQLInsertsTest {


    public static void main(String[] args) {
        MockNeat m = MockNeat.threadLocal();

        // ----- Regions -----
        String[] regionNames = new String[] {"Europe",
                "Americas",
                "Asia",
                "Middle East and Africa" };
        SQLTable regions = m.sqlInserts()
                .tableName("regions")
                .column("region_id",
                        m.intSeq().start(1))
                .column("region_name",
                        m.seq(regionNames), TEXT_BACKSLASH)
                .table(regionNames.length)
                .val();
        System.out.println(regions);

        // ----- Countries -----
        int numCountries = 241;
        SQLTable countries = m.sqlInserts()
                .tableName("countries")
                .column("country_id",  m.seq(COUNTRY_ISO_CODE_2), TEXT_BACKSLASH)
                .column("country_name", m.seq(COUNTRY_NAME), TEXT_BACKSLASH)
                .column("region_id", regions.fromColumn("region_id"))
                .table(numCountries)
                .val();
        System.out.println(countries);

        // ----- Locations -----
        int numLocations = 100;

        MockUnitString streetAddressGen = m.fmt("#{num} #{noun} #{end}")
                .param("num",
                        m.ints().range(10, 2000))
                .param("noun",
                        m.words().nouns().format(CAPITALIZED))
                .param("end",
                        m.from(new String[]{"Ave", "St", "Blvd", "Rd"}));

        MockUnitString postalCodeGen = m.fmt("#{word1} #{word2}")
                .param("word1",
                        m.strings().size(3).format(UPPER_CASE))
                .param("word2",
                        m.strings().size(3).format(UPPER_CASE));


        SQLTable locations = m.sqlInserts()
                .tableName("locations")
                .column("location_id",
                        m.intSeq().start(1000).increment(100))
                .column("street_address",
                        streetAddressGen, TEXT_BACKSLASH)
                .column("postal_code",
                        postalCodeGen, TEXT_BACKSLASH)
                .column("city",
                        m.cities().us(), TEXT_BACKSLASH)
                .column("state_province",
                        m.cities().capitals(), TEXT_BACKSLASH)
                .column("country_id",
                        countries.fromColumn("country_id"), TEXT_BACKSLASH)
                .table(numLocations)
                .val();

        System.out.println(locations);

        // ----- Departments -----

        // Manager ids
        List<Integer> managerIds = m.ints()
                .range(1, 1000)
                .list(() -> new ArrayList<>(), 20)
                .val();

        int depNum = 10;

        SQLTable departments = m.sqlInserts()
                .tableName("departments")
                .column("department_id",
                        m.intSeq().start(0).increment(10))
                .column("department_name",
                        m.seq(DEPARTMENTS), TEXT_BACKSLASH)
                .column("manager_id",
                        m.from(managerIds))
                .column("location_id",
                        locations.fromColumn("location_id"))
                .table(depNum)
                .val();

        System.out.println(departments);

        // ----- JOBS -----
        String[] jobNames = new String[] {"Analyst",
                "Consultant",
                "Senior Consultant",
                "Manager",
                "Software Architect",
                "Senior Manager",
                "Director"};

        String[] jobIds = new String[] { "A",
                "C",
                "SC",
                "M",
                "SA",
                "SM",
                "D" };

        int numJobs = jobNames.length;
        int minSalary = 2000;
        int maxSalary = 5000;


        SQLTable jobs = m.sqlInserts()
                .tableName("jobs")
                .column("job_id",
                        m.seq(jobIds), TEXT_BACKSLASH)
                .column("job_title",
                        m.seq(jobNames), TEXT_BACKSLASH)
                .column("min_salary",
                        minSalary + "")
                .column("max_salary",
                        maxSalary + "")
                .table(numJobs)
                .val();

        System.out.println(jobs);

        //  ----- EMPLOYEES -----

        int numEmployes = 1000;

        SQLTable employees = m.sqlInserts()
                .tableName("employees")
                .column("employee_id",
                        m.intSeq())
                .column("first_name",
                        m.names().first(), TEXT_BACKSLASH)
                .column("last_name",
                        m.names().last(), TEXT_BACKSLASH)
                .column("email",
                        m.emails().domain("corp.com"), TEXT_BACKSLASH)
                .column("phone_number",
                        m.regex("\\+30 [0-9]{9}"), TEXT_BACKSLASH)
                .column("hire_date",
                        m.localDates()
                                .past(LocalDate.of(2000, 1, 1))
                                .display(BASIC_ISO_DATE), TEXT_BACKSLASH)
                .column("job_id",
                        jobs.fromColumn("job_id"), TEXT_BACKSLASH)
                .column("salary",
                        m.ints().range(minSalary, maxSalary))
                .column("commission_pct",
                        "NULL")
                .column("manager_id",
                        "NULL" /* TO UPDATE LATER */)
                .column("department_id",
                        departments.fromColumn("department_id"))
                .table(numEmployes)
                .val();

        // Updating employees with their manager ID
        // - One person has no manager ID - it should be a director level person
        // - Manager id and employee_id should be different

        // We update all people with a manager_id
        employees.updateAll((i, insert) -> {
            Integer employeeId = parseInt(insert.getValue("employee_id"));
            Integer managerId = Integer.MIN_VALUE;

            while(employeeId == (managerId = m.from(managerIds).val()));

            insert.setValue("manager_id", managerId + "");
        });

        // One of the directors doesn't have a manager id as he is the CO
        employees.selectFirstWhere(sqlInsert -> sqlInsert.getValue("job_id").equals("D"))
                .get()
                .setValue("manager_id", "NULL");

        System.out.println(employees);
    }
}

Thanks for reading! Let me know what you think by writing in the comments section!

Database Relational database sql Data (computing) MySQL Java (programming language)

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • 10 Things to Know When Using SHACL With GraphDB
  • Custom Validators in Quarkus
  • What Is API-First?
  • Best Navicat Alternative for Windows

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: