Making Pivot Tables With Java Streams From Databases

You can create Pivot Tables with data from a database in pure Java, without writing SQL. Learn how you can leverage Java Streams for analyzing database content.

By Per-Åke Minborg · May. 26, 18 · Tutorial

Raw data from database rows and tables does not provide much insight to human readers. Instead, humans are much more likely to see data patterns if we perform some kind of aggregation on the data before it is presented to us. A pivot table is a specific form of aggregation where we can apply operations like sorting, averaging, or summing, and also often a grouping of column values.

In this article, I will show how you can compute pivot tables of data from a database in pure Java without writing a single line of SQL. You can easily reuse and modify the examples in this article to fit your own specific needs.

In the examples below, I have used open-source Speedment, which is a Java Stream ORM, and the open-source Sakila film database content for MySQL. Speedment works for any major relational database type such as MySQL, PostgreSQL, Oracle, MariaDB, Microsoft SQL Server, DB2, AS400, and more.

Pivoting

I will construct a Map of Actor objects and, for each Actor, a corresponding Map from film rating to the number of films with that rating in which the Actor has appeared. Here is an example of what a pivot entry for a specific Actor might look like, expressed verbally:

"John Doe participated in 9 films that were rated 'PG-13' and 4 films that were rated 'R'."
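
Before bringing the database in, the shape of that result can be sketched in plain Java. The following is a minimal illustration only: the Row class and the sample data are made up for this sketch and are not part of Speedment or Sakila.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PivotSketch {

    // Hypothetical stand-in for one joined row: an actor name and a film rating.
    static final class Row {
        final String actor;
        final String rating;

        Row(String actor, String rating) {
            this.actor = actor;
            this.rating = rating;
        }
    }

    // Groups rows by actor, then by rating, and counts the rows in each cell.
    static Map<String, Map<String, Long>> pivot(List<Row> rows) {
        return rows.stream()
            .collect(Collectors.groupingBy(
                row -> row.actor,
                Collectors.groupingBy(row -> row.rating, Collectors.counting())));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("JOHN DOE", "PG-13"),
            new Row("JOHN DOE", "PG-13"),
            new Row("JOHN DOE", "R"),
            new Row("JANE ROE", "G"));

        Map<String, Map<String, Long>> pivot = pivot(rows);
        System.out.println(pivot.get("JOHN DOE").get("PG-13")); // prints 2
        System.out.println(pivot.get("JOHN DOE").get("R"));     // prints 1
    }
}
```

The outer map plays the role of the pivot rows (one per actor) and each inner map holds the pivot columns (one count per rating).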

We are going to compute pivot values for all actors in the database. The Sakila database has three tables of interest for this particular application:

  1. "film" containing all the films and how the films are rated (e.g. "PG-13", "R", etc.).

  2. "actor" containing (made up) actors (e.g. "MICHAEL BOLGER", "LAURA BRODY", etc.).

  3. "film_actor" which links films and actors together in a many-to-many relationship.
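
To make the many-to-many relationship concrete, here is a rough in-memory sketch of what an inner join over these three tables does. The classes, field names, and sample rows are hypothetical simplifications, not the entity classes that Speedment generates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class JoinSketch {

    // Hypothetical, heavily simplified versions of the three tables.
    static final class Film {
        final int filmId;
        final String rating;
        Film(int filmId, String rating) { this.filmId = filmId; this.rating = rating; }
    }

    static final class Actor {
        final int actorId;
        final String name;
        Actor(int actorId, String name) { this.actorId = actorId; this.name = name; }
    }

    static final class FilmActor {
        final int actorId;
        final int filmId;
        FilmActor(int actorId, int filmId) { this.actorId = actorId; this.filmId = filmId; }
    }

    // Resolves every link row to "ACTOR NAME: rating". Link rows whose film or
    // actor is missing are dropped, which is what an inner join does.
    static List<String> join(List<FilmActor> links, List<Film> films, List<Actor> actors) {
        Map<Integer, Film> filmsById = films.stream()
            .collect(Collectors.toMap(f -> f.filmId, f -> f));
        Map<Integer, Actor> actorsById = actors.stream()
            .collect(Collectors.toMap(a -> a.actorId, a -> a));
        return links.stream()
            .filter(l -> filmsById.containsKey(l.filmId) && actorsById.containsKey(l.actorId))
            .map(l -> actorsById.get(l.actorId).name + ": " + filmsById.get(l.filmId).rating)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Film> films = Arrays.asList(new Film(10, "PG-13"), new Film(11, "R"));
        List<Actor> actors = Arrays.asList(
            new Actor(1, "MICHAEL BOLGER"), new Actor(2, "LAURA BRODY"));
        List<FilmActor> links = Arrays.asList(
            new FilmActor(1, 10), new FilmActor(1, 11), new FilmActor(2, 10));
        join(links, films, actors).forEach(System.out::println);
    }
}
```

Each link row yields exactly one joined row, so an actor appears once per film he or she participated in. That is the stream of rows the pivot is later computed from.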

The first part of the solution involves joining these three tables together. Joins are created using Speedment's JoinComponent, which can be obtained like this:

// Visit https://github.com/speedment/speedment
// to see how a Speedment app is created. It is easy!
Speedment app = ...;

JoinComponent joinComponent = app.getOrThrow(JoinComponent.class);


Once we have the JoinComponent, we can start defining the Join relations that we need to compute our pivot table:

Join<Tuple3<FilmActor, Film, Actor>> join = joinComponent
        .from(FilmActorManager.IDENTIFIER)
        .innerJoinOn(Film.FILM_ID).equal(FilmActor.FILM_ID)
        .innerJoinOn(Actor.ACTOR_ID).equal(FilmActor.ACTOR_ID)
        .build(Tuples::of);


build() takes a method reference Tuples::of that will resolve to a constructor that takes three entities of type FilmActor, Film and Actor, and that will create a compound immutable Tuple3 comprising those specific entities. Tuples are built into Speedment.

Armed with our Join object, we can now create our pivot Map using a standard Java Stream obtained from the Join object:

// Assumes: import static java.util.stream.Collectors.counting;
//          import static java.util.stream.Collectors.groupingBy;
Map<Actor, Map<String, Long>> pivot = join.stream()
    .collect(
        groupingBy(
            // Applies Actor as a first classifier
            Tuple3::get2,
            groupingBy(
                // Applies rating as second level classifier
                tu -> tu.get1().getRating().get(),
                counting() // Counts the elements
            )
        )
    );


Now that the pivot Map has been computed, we can print its content like this:

// pivot keys: Actor, values: Map<String, Long>
pivot.forEach((k, v) -> {
    System.out.format(
        "%22s  %5s %n",
        k.getFirstName() + " " + k.getLastName(),
        v
    );
});


This will produce the following output:

        MICHAEL BOLGER  {PG-13=9, R=3, NC-17=6, PG=4, G=8} 
           LAURA BRODY  {PG-13=8, R=3, NC-17=6, PG=6, G=3} 
     CAMERON ZELLWEGER  {PG-13=8, R=2, NC-17=3, PG=15, G=5}
...


Mission completed! In the code above, the method Tuple3::get2 will retrieve the third element from the tuple (an Actor) whereas the method tu.get1() will retrieve the second element from the tuple (a Film).
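
The rating columns in the output above appear in whatever order the underlying maps happen to iterate. If a stable order is preferred, one option is the three-argument overload of Collectors.groupingBy, which accepts a map factory. The sketch below assumes in-memory rows; the Row class and sample data are hypothetical and stand in for the joined tuples.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedPivotSketch {

    // Hypothetical stand-in for one joined row.
    static final class Row {
        final String actor;
        final String rating;

        Row(String actor, String rating) {
            this.actor = actor;
            this.rating = rating;
        }
    }

    // Same two-level grouping as before, but both levels are collected into
    // TreeMaps so that keys iterate in their natural (lexicographic) order.
    static Map<String, Map<String, Long>> sortedPivot(List<Row> rows) {
        return rows.stream()
            .collect(Collectors.groupingBy(
                row -> row.actor,
                TreeMap::new,
                Collectors.groupingBy(row -> row.rating, TreeMap::new, Collectors.counting())));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("LAURA BRODY", "R"),
            new Row("MICHAEL BOLGER", "PG-13"),
            new Row("LAURA BRODY", "G"),
            new Row("LAURA BRODY", "R"));

        sortedPivot(rows).forEach((actor, counts) ->
            System.out.format("%22s  %s%n", actor, counts));
    }
}
```

With this variant, actors print alphabetically and the ratings inside each entry are ordered as strings, so "PG" sorts before "PG-13".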

Speedment will render SQL code automatically from Java and convert the result to a Java Stream. If we enable Stream logging, we can see exactly how the SQL was rendered:

SELECT 
    A.`actor_id`,A.`film_id`,A.`last_update`, 
    B.`film_id`,B.`title`,B.`description`,
    B.`release_year`,B.`language_id`,B.`original_language_id`,
    B.`rental_duration`,B.`rental_rate`,B.`length`,
    B.`replacement_cost`,B.`rating`,B.`special_features`,
    B.`last_update`, C.`actor_id`,C.`first_name`,
    C.`last_name`,C.`last_update`
FROM 
    `sakila`.`film_actor` AS A
INNER JOIN 
    `sakila`.`film` AS B ON (B.`film_id` = A.`film_id`) 
INNER JOIN 
    `sakila`.`actor` AS C ON (C.`actor_id` = A.`actor_id`)


Joins With Custom Tuples

As we noticed in the example above, we have no actual use of the FilmActor object in the Stream since it is only used to link Film and Actor entities together during the Join phase. Also, the generic Tuple3 had general get0(), get1() and get2() methods that did not say anything about what they contained.

All this can be fixed by defining our own custom "tuple" called ActorRating like this:

private static class ActorRating {
    private final Actor actor;
    private final String rating;

    public ActorRating(FilmActor fa, Film film, Actor actor) {
        // fa is not used. See below why
        this.actor = actor;
        this.rating = film.getRating().get();
    }

    public Actor actor() {
        return actor;
    }

    public String rating() {
        return rating;
    }

}


When Join objects are built using the build() method, we can provide a custom constructor that we want to apply to the incoming entities from the database. This is a feature that we are going to use as depicted below:

Join<ActorRating> join = joinComponent
    .from(FilmActorManager.IDENTIFIER)
    .innerJoinOn(Film.FILM_ID).equal(FilmActor.FILM_ID)
    .innerJoinOn(Actor.ACTOR_ID).equal(FilmActor.ACTOR_ID)
    .build(ActorRating::new); // Use a custom constructor

Map<Actor, Map<String, Long>> pivot = join.stream()
    .collect(
        groupingBy(
            ActorRating::actor,
            groupingBy(
                ActorRating::rating,
                counting()
            )
         )
    );


In this example, we provided a class with a constructor (the method reference ActorRating::new gets resolved to new ActorRating(fa, film, actor)) that just discards the linking FilmActor object altogether. The class also provides better names for its properties, which makes the code more readable. The solution with the custom ActorRating class produces exactly the same output as the first example, but it looks much nicer when used. I think writing a custom tuple is worth the extra effort over using generic tuples in most cases.
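
How a constructor reference binds to a three-argument function can be shown in isolation. In the sketch below, TriFunction is a hypothetical interface defined only for illustration (the parameter type that build() really accepts may differ), and the entity classes are bare stand-ins.

```java
class ConstructorRefSketch {

    // Hypothetical stand-ins for the generated entity classes.
    static final class FilmActor { }

    static final class Film {
        final String rating;
        Film(String rating) { this.rating = rating; }
    }

    static final class Actor {
        final String name;
        Actor(String name) { this.name = name; }
    }

    // Hypothetical three-argument function type used for this sketch only.
    @FunctionalInterface
    interface TriFunction<A, B, C, R> {
        R apply(A a, B b, C c);
    }

    static final class ActorRating {
        final Actor actor;
        final String rating;

        // Parameter order mirrors the join order: link row, film, actor.
        ActorRating(FilmActor fa, Film film, Actor actor) {
            this.actor = actor; // fa is accepted but intentionally ignored
            this.rating = film.rating;
        }
    }

    // ActorRating::new resolves to the three-argument constructor above.
    static final TriFunction<FilmActor, Film, Actor, ActorRating> CONSTRUCTOR =
        ActorRating::new;

    public static void main(String[] args) {
        ActorRating ar = CONSTRUCTOR.apply(
            new FilmActor(), new Film("PG-13"), new Actor("LAURA BRODY"));
        System.out.println(ar.actor.name + " " + ar.rating); // prints LAURA BRODY PG-13
    }
}
```

The compiler matches the constructor by the argument types of the functional interface, so the constructor's parameters have to appear in the same order as the joined tables.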

Using Parallel Pivoting

One cool thing with Speedment is that it supports the Stream method parallel() out-of-the-box. So, if you have a server with many CPUs, you can take advantage of all those CPU cores when running database queries and joins. This is what parallel pivoting would look like:

Map<Actor, Map<String, Long>> pivot = join.stream()
    .parallel()  // Make our Stream parallel
    .collect(
        groupingBy(
            ActorRating::actor,
            groupingBy(
                ActorRating::rating,
                counting()
            )
         )
    );


We only have to add a single line of code to get parallel aggregation. The default parallel split strategy kicks in when we reach 1024 elements. Thus, parallel pivoting will only take place on tables or joins larger than this. It should be noted that the Sakila database only contains 1000 films, so we would have to run the code on a bigger database to actually be able to benefit from parallelism.
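
The effect of parallel() on the result can be checked without a database. The sketch below uses synthetic, deterministically generated rows (the actor ids and the rating assignment are made up) and compares a sequential and a parallel run of the same two-level grouping.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelPivotSketch {

    static final String[] RATINGS = {"G", "PG", "PG-13", "R", "NC-17"};

    // Builds a pivot over n synthetic rows. Row i belongs to actor (i % 200)
    // and carries the rating RATINGS[(i / 200) % 5].
    static Map<Integer, Map<String, Long>> pivot(int n, boolean parallel) {
        IntStream indices = IntStream.range(0, n);
        if (parallel) {
            indices = indices.parallel(); // the only difference between the two runs
        }
        return indices.boxed()
            .collect(Collectors.groupingBy(
                i -> i % 200,
                Collectors.groupingBy(i -> RATINGS[(i / 200) % 5], Collectors.counting())));
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, Long>> sequential = pivot(10_000, false);
        Map<Integer, Map<String, Long>> parallel = pivot(10_000, true);
        System.out.println(sequential.equals(parallel)); // prints true
    }
}
```

Because groupingBy and counting merge partial results correctly, the parallel run yields the same map as the sequential one. Whether it is actually faster depends on the data volume and the hardware.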

Take it for a Spin!

In this article, we have shown how you can compute pivot data from a database in Java without writing a single line of SQL code. Visit Speedment open-source on GitHub to learn more.

Read more about other features in the User's Guide.

Published at DZone with permission of Per-Åke Minborg, DZone MVB. See the original article here.
