DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
  1. DZone
  2. Data Engineering
  3. Databases
  4. Making Pivot Table Using Java Streams

Making Pivot Table Using Java Streams

Pivot tables are great for summarizing data. Fortunately, you can use Streams to prepare and sort your data so you can put it into a more readable format.

Jay Sridhar user avatar by
Jay Sridhar
CORE ·
Apr. 12, 17 · Tutorial
Like (32)
Save
Tweet
Share
41.85K Views

Join the DZone community and get the full member experience.

Join For Free

Today, let's see how we can implement a pivot table using Java 8 Streams. Raw data by itself does not deliver much insight to humans. We need some kind of data aggregation to discern patterns in raw data. A pivot table is one such instrument. Other more visual methods of aggregation include graphs and charts.

In previous articles, we showed how to use Java 8 Streams to perform SQL-like slicing and dicing of raw data. Today’s article will build on some of those examples, so if the material here appears a bit dense to you, I suggest you browse those two articles first.

If you do not like this “raw data” approach to creating a pivot table and would like Excel to be your medium of choice, we have you covered.

CSV Data Represented as a POJO

We use the following POJO representing baseball players and their salaries.

public class Player {
    private int year;
    private String teamID;
    private String lgID;
    private String playerID;
    private int salary;

    // defined getters and setters here
}


The data comes from a simple CSV file – no quoted fields, no multi-line fields, and no commas inside a field. We can use a simple Regex pattern to parse the CSV and load the data into a List. The data looks like this and contains about 26428 rows:

yearID,teamID,lgID,playerID,salary
1985,ATL,NL,barkele01,870000
1985,ATL,NL,bedrost01,550000
1985,ATL,NL,benedbr01,545000
1985,ATL,NL,campri01,633333
1985,ATL,NL,ceronri01,625000
...


The CSV data is loaded using Streams with code similar to:

Pattern pattern = Pattern.compile(",");
try (BufferedReader in = new BufferedReader(new FileReader(filename));){
    List<Player> players = in
        .lines()
        .skip(1)
        .map(line -> {
                String[] arr = pattern.split(line);
                return new Player(Integer.parseInt(arr[0]),
                                  arr[1],
                                  arr[2],
                                  arr[3],
                                  Integer.parseInt(arr[4]));
            })
        .collect(Collectors.toList());
}


Define the Pivot Columns Class

The following class defines the class that we use as the container for the pivot columns. These are the columns that the data is grouped by. If using SQL, these columns would appear in the “GROUP BY” clause.

public class YearTeam
{
    public int year;
    public String teamID;

    public YearTeam(int year,String teamID) {
        this.year = year;
        this.teamID = teamID;
    }

    @Override
    public boolean equals(Object other)
    {
        if ( other == null ) return false;
        if ( this == other ) return true;
        if ( other instanceof YearTeam ) {
            YearTeam yt = (YearTeam)other;
            if ( year == yt.year && teamID.equals(yt.teamID) )
                return true;
        }
        return false;
    }

    @Override
    public int hashCode()
    {
        int hash = 1;
        hash = hash * 17 + year;
        hash = hash * 31 + teamID.hashCode();
        return hash;
    }

    @Override
    public String toString()
    {
        StringBuilder sbuf = new StringBuilder();
        sbuf.append('[').append(year).append(", ").append(teamID)
            .append(']');
        return sbuf.toString();
    }
}


The fields are defined with the “public” accessor merely for convenience. For your application, you can make it private and add getters and/or setters as needed.

The class overrides equals() and hashCode(), since it will be used as the key when storing in a Map.

Grouping Data Using Streams

And here we read the CSV data, create a POJO for each row, and group the data using the pivot columns class defined above.

Map<YearTeam,List<Player>> grouped = in
    .lines()
    .skip(1)
    .map(line -> {
            String[] arr = pattern.split(line);
            return new Player(Integer.parseInt(arr[0]),
                              arr[1],
                              arr[2],
                              arr[3],
                              Integer.parseInt(arr[4]));
        })
    .collect(Collectors.groupingBy(x-> new YearTeam(x.getYear(), x.getTeamID())));


At this point, the data has been collected in a Map properly grouped by the specified columns.

Printing the Pivot Table as a CSV

Let us print the data in the pivot table as a CSV so we can load it into Excel for comparison. While printing the data, we apply the aggregate function summingLong(). Java 8 Streams also provide averagingLong() which gives you the average. Need more, or all of it at once? summarizingLong() dumps all of it in your face. Enjoy!

CSV Column Headers

We use values of the teamID as column headers. Collect and print them as shown. We use a TreeSet here to have it sorted alphabetically.

Set<String> teams = grouped
    .keySet()
    .stream()
    .map(x -> x.teamID)
    .collect(Collectors.toCollection(TreeSet::new));

System.out.print(',');
teams.stream().forEach(t -> System.out.print(t + ","));
System.out.println();


Printing the Data

And here is the complete pivot table being built and printed. For each year and team, we extract the list of players, perform a summing operation and print the total.

Set<Integer> years = grouped
    .keySet()
    .stream()
    .map(x -> x.year)
    .collect(Collectors.toSet());

years
    .stream()
    .forEach(y -> {
            System.out.print(y + ",");
            teams.stream().forEach(t -> {
                    YearTeam yt = new YearTeam(y, t);
                    List<Player> players = grouped.get(yt);
                    if ( players != null ) {
                        long total = players
                            .stream()
                   .collect(Collectors.summingLong(Player::getSalary));
                        System.out.print(total);
                    }
                    System.out.print(',');
                });
            System.out.println();
        });


Comparing Output in Excel

Loading the CSV output in Excel shows the data as follows:

Comparing it with Excel’s own pivot table, it shows the data to be identical. (For some reason, column “MON” has appeared first in Excel as shown below. Maybe it is one of the many “features” available. The values are identical though.)

And that, my friends, is one way you can prepare a pivot table from inside Java using plain Collections. Go find some cool uses for it!

Summary

A pivot table is a useful data summarizing tool. It is available in most data analytics software including Excel., Here we learned how to build the same data structure using Java 8 Streams. We used grouping-by and summing to achieve the functionality.

Database Pivot table Stream (computing) Java (programming language) Data (computing) sql

Published at DZone with permission of Jay Sridhar, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Software Maintenance Models
  • All the Cloud’s a Stage and All the WebAssembly Modules Merely Actors
  • Developer Productivity: The Secret Sauce to Building Great Dev Teams
  • LazyPredict: A Utilitarian Python Library to Shortlist the Best ML Models for a Given Use Case

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: