How to Read a Large CSV File With Java 8 and Stream API

Scenario: you have to parse a large CSV file (~90 MB), that is, read the file and create one Java object for each of its lines. What do you do?

by Eugen Hoble · Sep. 28, 16 · Tutorial

Scenario: you have to parse a large CSV file (~90 MB), that is, read the file and create one Java object for each line. In the real-life case behind this article, the CSV file contains around 380,000 lines.

Assumption: you already know the path of the CSV file before using the code below.

The following code will read the file and create one Java object per line.

// Needs java.io.*, java.util.ArrayList, java.util.List and java.util.stream.Collectors
private List<YourJavaItem> processInputFile(String inputFilePath) {

    List<YourJavaItem> inputList = new ArrayList<>();

    // try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream(new File(inputFilePath))))) {

      // skip the header of the csv
      inputList = br.lines().skip(1).map(mapToItem).collect(Collectors.toList());
    } catch (IOException e) {
      // FileNotFoundException is a subclass of IOException, so this one catch
      // covers both; listing both in a multi-catch does not compile
      // ....
    }

    return inputList;
}

A few notes on the code above:

  • lines(): returns a Stream<String> with one element per line of the file; the lines are read lazily as the stream is consumed.

  • skip(1): skips the first line in the CSV file, which in this case is the header of the file.

  • map(mapToItem): applies the mapToItem function to each remaining line in the file.

  • collect(Collectors.toList()): creates a list containing all the items created by the mapToItem function.
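The four steps above can be tried out without a file on disk. The following is a minimal sketch, assuming a hypothetical two-column CSV and a hypothetical Item class (neither comes from the original code); it runs the same lines().skip(1).map(...).collect(...) pipeline over a StringReader:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical names throughout: PipelineSketch, Item, and the sample data are illustrative only.
class PipelineSketch {

    static class Item {
        final String number;
        final String name;

        Item(String number, String name) {
            this.number = number;
            this.name = name;
        }
    }

    // Turns one CSV line into one Item (assumes exactly two comma-separated columns).
    static final Function<String, Item> MAP_TO_ITEM = line -> {
        String[] p = line.split(",");
        return new Item(p[0], p[1]);
    };

    static List<Item> parse(String csv) {
        try (BufferedReader br = new BufferedReader(new StringReader(csv))) {
            return br.lines()                       // Stream<String>, one element per line
                     .skip(1)                       // drop the header line
                     .map(MAP_TO_ITEM)              // String -> Item
                     .collect(Collectors.toList()); // gather the items into a List
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Item> items = parse("number,name\n1,apple\n2,pear\n");
        System.out.println(items.size() + " items, first name: " + items.get(0).name);
    }
}
```

With the sample input, the header line is dropped and the two data lines become two Item objects.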

The mapToItem function looks like this:

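// COMMA is assumed to be a constant defined in the same class
private static final String COMMA = ",";

private Function<String, YourJavaItem> mapToItem = (line) -> {

  String[] p = line.split(COMMA); // a CSV has comma-separated lines

  YourJavaItem item = new YourJavaItem();

  item.setItemNumber(p[0]); // <-- this is the first column in the csv file
  // split() never yields null elements, but a short line yields fewer columns
  if (p.length > 3 && !p[3].trim().isEmpty()) {
    item.setSomeProperty(p[3]);
  }
  // more initialization goes here

  return item;
};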
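One detail worth checking before indexing into p: String.split with a single argument drops trailing empty strings, so a line whose last columns are empty yields a shorter array than the header suggests. Below is a minimal sketch of the difference, using a hypothetical class name and sample line; passing a negative limit keeps the trailing empty fields:

```java
import java.util.Arrays;

// Hypothetical class name and sample line, for illustration only.
class SplitSketch {

    // Default split: trailing empty strings are removed from the result.
    static String[] splitDefault(String line) {
        return line.split(",");
    }

    // A negative limit keeps trailing empty strings, so every column gets a slot.
    static String[] splitKeepingEmpty(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "42,widget,,";
        System.out.println(Arrays.toString(splitDefault(line)));
        System.out.println(Arrays.toString(splitKeepingEmpty(line)));
    }
}
```

Neither form handles quoted fields that contain commas; files like that need a real CSV parser rather than split.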

Performance Consideration

In my tests, reading a 90 MB CSV file with the approach described above took around 700 ms when run from inside Eclipse.
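To get a comparable number on your own machine, a rough timing harness could look like the sketch below. Everything in it is an assumption made for illustration: the class name, the synthetic three-column file, and the use of String[] rows in place of real item objects. It is not the measurement setup used for the figure above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical harness: class name, file contents, and row count are illustrative only.
class TimingSketch {

    // Reads all data lines (header skipped) and splits them; stands in for the real parsing.
    static List<String[]> readRows(Path csv) {
        try (BufferedReader br = Files.newBufferedReader(csv, StandardCharsets.UTF_8)) {
            return br.lines().skip(1).map(l -> l.split(",")).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a synthetic CSV with one header line and the given number of rows.
    static Path writeSample(int rows) {
        try {
            Path tmp = Files.createTempFile("sample", ".csv");
            try (BufferedWriter w = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                w.write("id,name,price\n");
                for (int i = 0; i < rows; i++) {
                    w.write(i + ",item" + i + ",9.99\n");
                }
            }
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path csv = writeSample(380_000);
        long start = System.nanoTime();
        List<String[]> rows = readRows(csv);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(rows.size() + " rows in " + elapsedMs + " ms");
        Files.delete(csv);
    }
}
```

Timings vary with hardware, disk cache, and JIT warm-up, so a single run is only a rough indication.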

It is probably even faster in production.

Not bad. Happy coding!
