DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

SBOMs are essential to circumventing software supply chain attacks, and they provide visibility into various software components.

Related

  • Introducing Graph Concepts in Java With Eclipse JNoSQL
  • Jakarta NoSQL 1.0: A Way To Bring Java and NoSQL Together
  • Handling Embedded Data in NoSQL With Java
  • Simplify NoSQL Database Integration in Java With Eclipse JNoSQL 1.1.3

Trending

  • Top NoSQL Databases and Use Cases
  • MCP Client Agent: Architecture and Implementation
  • Why API-First Frontends Make Better Apps
  • Cloud Hardware Diagnostics for AI Workloads
  1. DZone
  2. Data Engineering
  3. Databases
  4. Java 8 Steaming (groupBy Example)

Java 8 Steaming (groupBy Example)

Learn more about Java 8 Streams and how they can be applied to aggregate functions.

By 
Bhupender G user avatar
Bhupender G
·
May. 07, 19 · Tutorial
Likes (9)
Comment
Save
Tweet
Share
29.2K Views

Join the DZone community and get the full member experience.

Join For Free

In this article, I would like to share few scenarios where we will see how the Java 8 Stream and related functions can be useful for reading a file and applying some aggregate functions, which we generally do in our day-to-day SQL/queries on the database side.

So, I am taking an example of employee CSV files, which is shown below:

EMP_NAME,EMP_ID,DEPT_ID,SAL
EMP2,1,10,300
EMP3,2,20,50
EMP4,3,20,25
EMP5,4,11,100
EMP6,5,10,200
EMP7,6,10,200
EMP1,7,10,500
EMP8,8,11,101


I would like to start with a use case that we execute on the database side, generally.

Scenario A

Assuming the above records are present in the Employee table, we get the total salary of employees at the departmental level.

Hint: GROUP BY enables you to use aggregate functions on groups of data returned from a query, as shown below.

SQL :: SELECT DEPT_ID, SUM(SAL) FROM EMP_TABLE GROUP BY DEPT_ID;


Now, we can see that the only difference is that we have the records in our CSV files instead of the database rows.

So, what we need to do is, basically, take Java objects representing records from the CSV file. Each line in the file can be assumed to have the record that we used in the database.

So, we will start with our resource or plain object, which isEmployee.

public class Employee {

private String empName;
private long empID;
private long deptID;
private long salary;

// have your setters and getters as well

}


Now, let's have our test program (main function) perform an aggregate function.

Step 1: CSV File

Read the CSV file and map it to the Java object. This method will convert all rows to a List of employees

private static List<Employee> mapCSV(String empCSVFilePath) {
List<Employee> empList = new ArrayList<Employee>();

try {
    File inputF = new File(empCSVFilePath);
    InputStream inputFS = new FileInputStream(inputF);
    BufferedReader br = new BufferedReader(new InputStreamReader(inputFS));

// Skip the header since its just coloumn name in table and in CSV file but not the data.
    empList = br.lines().skip(1).map(csv2EmpObj).collect(Collectors.toList());
    br.close();
} catch (IOException e) {
    System.err.println(e.getMessage());
}

return empList;
}


Step 2: Mapper Function

Write the mapper function,  csv2EmpObj, to bind all of your rows to a plain Java object:

private static Function<String, Employee> csv2EmpObj = (line) -> {
    String[] record = line.split(",");// This can be delimiter which
    Employee employee = new Employee();
    if (record[0] != null && record[0].trim().length() > 0) {
        employee.setEmpName(record[0]);
    }
    if (record[1] != null && record[1].trim().length() > 0) {
        employee.setEmpID(Long.valueOf(record[1]));
    }
    if (record[2] != null && record[2].trim().length() > 0) {
        employee.setDeptID(Long.valueOf(record[2]));
    }
    if (record[3] != null && record[3].trim().length() > 0) {
        employee.setSalary(Long.valueOf(record[3]));
    }
    return employee;
};


Step 3: Apply Java 8 Stream Features

Keep in mind that the requirement was to get the sum of the salary using groupBy department (dept_id).

So, we will first have the streams of the collection (list of employees), which we had from our earlier step, and then, we apply the following steaming API/methods.

First, we have the collect method. This is basically the result based on the Collector used (input, functions)

When determining the Collector to use, your input/accumulator will be dep_id and the reduction/result would be the sum of the salary.

So, the function will look like below:

public static Map<Long, Long> totalSalaryForDepartment(List<Employee> empList) {
    return empList.stream()
    .collect(Collectors.groupingBy(
    Employee::getDeptID,
    Collectors.summingLong(Employee::getSalary)));
}


Step 4: Print It

Your main function of the test program will look like this:

public static void main(String[] args) {

    String empFileName = "C:\\tmp\\Employee.csv";
    List<Employee> listOFEmployees = process(empFileName); // Step 1 & Step 2.
    Map<Long, Long> groupByDeptID = totalSalaryForDepartment(listOFEmployees); // Step 3.
    groupByDeptID.entrySet().stream().forEach(System.out::println); // Step 4.

  //

System.out.println("Employee with max salary : "+mostExpensiveResource(listOFEmployees));
}


Scenario B 

Find the employee with the maximum salary:

public static Employee mostExpensiveResource(List<Employee> employees) {
    return employees.stream()
    .collect(Collectors.<Employee>maxBy(
    Comparator.comparing(Employee::getSalary))).get();
}

// you can tweak to find minimum one as well and  there are other scenarios like filtering, average etc.


The best practices have not been followed; this is just an example to showcase day-to-day use cases of an aggregate function through the Java 8 streaming API.

Happy streaming!

Database Java (programming language)

Opinions expressed by DZone contributors are their own.

Related

  • Introducing Graph Concepts in Java With Eclipse JNoSQL
  • Jakarta NoSQL 1.0: A Way To Bring Java and NoSQL Together
  • Handling Embedded Data in NoSQL With Java
  • Simplify NoSQL Database Integration in Java With Eclipse JNoSQL 1.1.3

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • [email protected]

Let's be friends: