
How to Read a Large CSV File With Java 8 and Stream API

By Eugen Hoble · Sep. 28, 16 · Tutorial

Scenario: you need to parse a large CSV file (~90 MB), that is, read the file and create one Java object for each line. In this real-life case, the CSV file contains around 380,000 lines.

Assumption: you already know the path of the CSV file before using the code below.

The following code will read the file and create one Java object per line.

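import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

private List<YourJavaItem> processInputFile(String inputFilePath) {

    List<YourJavaItem> inputList = new ArrayList<>();

    // try-with-resources closes the reader even if parsing a line fails
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream(inputFilePath)))) {

      // skip the header of the csv
      inputList = br.lines().skip(1).map(mapToItem).collect(Collectors.toList());
    } catch (IOException e) {
      // FileNotFoundException is a subclass of IOException, so the two cannot
      // share a multi-catch; this single catch handles both
      // .... (error handling elided in the original)
    }

    return inputList;
}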
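As an alternative sketch, not the approach used in the article, the same read can be written with java.nio.file.Files.lines, which takes an explicit charset and returns a stream that should be closed with try-with-resources. The class name FilesLinesSketch, the helper names, and the temporary sample file are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; not part of the original article.
class FilesLinesSketch {

    // Returns every line of the file except the header.
    static List<String> dataLines(Path csv) {
        // The stream holds an open file handle, so it is closed via try-with-resources.
        try (Stream<String> lines = Files.lines(csv, StandardCharsets.UTF_8)) {
            return lines.skip(1).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a tiny sample file, reads it back, and returns the number of data lines.
    static int demo() {
        try {
            Path tmp = Files.createTempFile("sample", ".csv");
            try {
                Files.write(tmp,
                        Arrays.asList("itemNumber,name", "A-1,first", "B-2,second"),
                        StandardCharsets.UTF_8);
                return dataLines(tmp).size();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```

Whether this is preferable depends on the project; the main practical difference from the article's code is that the charset is stated explicitly instead of falling back to the platform default.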

A few notes on the stream pipeline in processInputFile:

  • lines() : returns a Stream<String> with one element per line of the file; the lines are read lazily as the stream is consumed.

  • skip(1) : skips the first line in the CSV file, which in this case is the header of the file.

  • map(mapToItem) : applies the mapToItem function to each remaining line.

  • collect(Collectors.toList()) : creates a list containing all the items created by the mapToItem function.
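To see the four steps in isolation, here is a minimal sketch that runs the same pipeline over an in-memory string instead of a file. The class StreamPipelineSketch, the method firstColumns, and the sample data are hypothetical, and a simple lambda stands in for mapToItem.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of the original article.
class StreamPipelineSketch {

    // Drops the header line and keeps the first column of every remaining row.
    static List<String> firstColumns(String csvText) {
        try (BufferedReader br = new BufferedReader(new StringReader(csvText))) {
            return br.lines()                          // Stream<String>, one element per line
                     .skip(1)                          // drop the header
                     .map(line -> line.split(",")[0])  // stand-in for mapToItem
                     .collect(Collectors.toList());    // terminal step: actually reads the lines
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String csv = "itemNumber,name\nA-1,first\nB-2,second\n";
        System.out.println(firstColumns(csv)); // prints [A-1, B-2]
    }
}
```

Nothing is read until collect runs, which is why the reader has to stay open until the pipeline has finished.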

The mapToItem function looks like this:

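import java.util.function.Function;

private static final String COMMA = ",";

private Function<String, YourJavaItem> mapToItem = (line) -> {

  // columns are comma separated; this simple split does not handle quoted fields
  String[] p = line.split(COMMA);

  YourJavaItem item = new YourJavaItem();

  item.setItemNumber(p[0]);//<-- this is the first column in the csv file
  // split() never returns null elements, but a short row has fewer columns,
  // so check the array length before reading p[3]
  if (p.length > 3 && p[3].trim().length() > 0) {
    item.setSomeProperty(p[3]);
  }
  //more initialization goes here

  return item;
};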
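One detail worth checking when indexing into the result of split is that String.split drops trailing empty strings, so a row that ends in empty columns produces a shorter array. The sketch below illustrates this with a hypothetical helper, column, and made-up sample rows; it is not code from the article.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the original article.
class SplitSketch {

    // Returns the trimmed column at the given index, or "" when the row is too short.
    static String column(String line, int index) {
        String[] p = line.split(",");
        return index < p.length ? p[index].trim() : "";
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString("1001,foo,bar,baz".split(","))); // [1001, foo, bar, baz]
        System.out.println("1002,foo,,".split(",").length);     // 2: trailing empty fields are dropped
        System.out.println("1002,foo,,".split(",", -1).length); // 4: a negative limit keeps them
        System.out.println(column("1002,foo,,", 3).isEmpty());  // true, and no exception is thrown
    }
}
```

For files with quoted fields or embedded commas, a dedicated CSV parser is likely a better fit than a plain split.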

Performance Consideration

In my testing, reading a 90 MB CSV file this way took around 700 ms when run from inside Eclipse.

It is probably even faster in production.
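For anyone who wants to take a similar measurement, the following is a rough sketch of wall-clock timing with System.nanoTime. It parses synthetic in-memory data rather than a real 90 MB file, so its numbers are not comparable to the figure above; the class TimingSketch, its helpers, and the generated rows are all assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of the original article.
class TimingSketch {

    // Builds a header line followed by the requested number of two-column rows.
    static String syntheticCsv(int rows) {
        StringBuilder sb = new StringBuilder("itemNumber,name\n");
        for (int i = 0; i < rows; i++) {
            sb.append(i).append(",item").append(i).append('\n');
        }
        return sb.toString();
    }

    // Same pipeline shape as the article: skip the header, map each line, collect.
    static List<String[]> parse(String csvText) {
        try (BufferedReader br = new BufferedReader(new StringReader(csvText))) {
            return br.lines().skip(1).map(l -> l.split(",")).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        String csv = syntheticCsv(380_000);
        parse(csv); // warm-up run so JIT compilation is less likely to dominate the measurement
        long ms = elapsedMillis(() -> parse(csv));
        System.out.println("Parsed in " + ms + " ms");
    }
}
```

A single timed run like this is only indicative; a benchmark harness such as JMH gives more trustworthy numbers.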

Not bad. Happy coding!
