
Collections Library for millions of elements

by Peter Lawrey · Aug. 12, 11 · Java Zone · Interview

If you want to efficiently store large collections of data in memory, this library can dramatically reduce Full GC times and reduce memory consumption as well.

When you have a data type which can be represented by an interface and you want a List for this type, you can create one like this:

List<InterfaceType> list = new HugeArrayBuilder<InterfaceType>() {}.create();

The type needs to be described using an interface so that its representation can be changed (using generated byte code).

The HugeArrayBuilder builds generated classes for the InterfaceType on demand. The builder can be configured to change how the HugeArrayList is created.
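To illustrate why an interface makes this possible, here is a minimal sketch in which the data lives in primitive arrays and each element is only a view implementing the interface. It uses java.lang.reflect.Proxy instead of generated byte code, and the names Point, PointColumns and InterfaceViewDemo are hypothetical; none of this is the library's own implementation.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical element type, described purely as an interface.
interface Point {
    int getX();
    void setX(int x);
    double getWeight();
    void setWeight(double weight);
}

// Keeps each attribute in its own primitive array and hands out
// lightweight views that implement the interface.
class PointColumns {
    private final int[] xs;
    private final double[] weights;

    PointColumns(int capacity) {
        xs = new int[capacity];
        weights = new double[capacity];
    }

    // Returns a view of element `index`; the view holds no data of its own.
    Point get(final int index) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getX":      return xs[index];
                case "setX":      xs[index] = (Integer) args[0]; return null;
                case "getWeight": return weights[index];
                case "setWeight": weights[index] = (Double) args[0]; return null;
                default: throw new UnsupportedOperationException(method.getName());
            }
        };
        return (Point) Proxy.newProxyInstance(
                Point.class.getClassLoader(), new Class<?>[] {Point.class}, handler);
    }
}

class InterfaceViewDemo {
    public static void main(String[] args) {
        PointColumns points = new PointColumns(10);
        Point p = points.get(3);
        p.setX(42);
        p.setWeight(1.5);
        // A second view of the same index sees the same stored values.
        System.out.println(points.get(3).getX() + " " + points.get(3).getWeight()); // 42 1.5
    }
}
```

Because callers only ever see the Point interface, the storage behind it can be swapped without changing their code. A reflective proxy is slow compared with generated classes, so this is only meant to show the shape of the idea.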

A more complex example is:
HugeArrayList<InterfaceType> hugeList = new HugeArrayBuilder<InterfaceType>() {{
        allocationSize = 1024*1024;
        classLoader = myClassLoader;
    }}.create();
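The braces after the constructor call create an anonymous subclass, and the inner pair is an instance initializer that sets fields on it. One common reason for this pattern is that a subclass records its generic type argument, so the builder can discover the element type at runtime. The sketch below shows that mechanism with a hypothetical SketchBuilder; whether HugeArrayBuilder works exactly this way is an assumption, and the default value shown is invented for illustration.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical builder, loosely modelled on the shape of the calls above.
// It is abstract so that callers must create a subclass, e.g. with {}.
abstract class SketchBuilder<T> {
    protected int allocationSize = 64 * 1024;   // default is an assumption
    protected final Class<?> elementType;

    protected SketchBuilder() {
        // The anonymous subclass records SketchBuilder<Foo> as its generic superclass.
        Type superType = getClass().getGenericSuperclass();
        elementType = (Class<?>) ((ParameterizedType) superType).getActualTypeArguments()[0];
    }

    String describe() {
        return elementType.getSimpleName() + " x " + allocationSize;
    }
}

class SketchBuilderDemo {
    public static void main(String[] args) {
        // The instance initializer runs after the superclass constructor,
        // so it overrides the default allocationSize.
        SketchBuilder<Runnable> builder = new SketchBuilder<Runnable>() {{
            allocationSize = 1024 * 1024;
        }};
        System.out.println(builder.describe()); // Runnable x 1048576
    }
}
```

This sketch only handles a plain class or interface as the type argument; a parameterized argument such as List<String> would need extra handling.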

How does the library differ?

  • Uses long for sizes and indices.
  • Uses column-based data, which makes the per-element overhead minimal and speeds up scans over a single attribute or a small set of attributes. This reduces memory usage by 2x or more.
  • Stores attributes in direct memory as much as possible, reducing heap usage dramatically, by 10x or more.
  • Allocates blocks of data for consistent add() times, rather than exponentially growing an underlying store (with exponentially increasing delays each time the capacity grows).
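The points above can be sketched for a single int attribute as follows. This is a minimal illustration under stated assumptions, not the library's code: DirectIntColumn and its methods are hypothetical names, and the column is built from fixed-size direct ByteBuffers addressed with a long index.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-attribute column of ints, held outside the Java heap.
class DirectIntColumn {
    private final int blockSize;                              // elements per block
    private final List<ByteBuffer> blocks = new ArrayList<>();
    private long size;                                        // long, so not capped at Integer.MAX_VALUE

    DirectIntColumn(int blockSize) {
        if (blockSize <= 0) throw new IllegalArgumentException("blockSize must be positive");
        this.blockSize = blockSize;
    }

    // Appends a value. When the last block is full, one more fixed-size block
    // is allocated; nothing already stored is copied.
    void add(int value) {
        if (size == (long) blocks.size() * blockSize) {
            blocks.add(ByteBuffer.allocateDirect(blockSize * Integer.BYTES)
                    .order(ByteOrder.nativeOrder()));
        }
        long index = size++;
        write(index, value);
    }

    int get(long index) {
        checkIndex(index);
        return blocks.get((int) (index / blockSize))
                .getInt((int) (index % blockSize) * Integer.BYTES);
    }

    void set(long index, int value) {
        checkIndex(index);
        write(index, value);
    }

    long size() {
        return size;
    }

    private void write(long index, int value) {
        blocks.get((int) (index / blockSize))
                .putInt((int) (index % blockSize) * Integer.BYTES, value);
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index " + index);
    }
}

class DirectIntColumnDemo {
    public static void main(String[] args) {
        DirectIntColumn column = new DirectIntColumn(1024);
        for (int i = 0; i < 2500; i++) column.add(i * 2);
        System.out.println(column.size() + " elements, element 2047 = " + column.get(2047L));
    }
}
```

The garbage collector sees only the small list of buffer references, not one object per element, which is the intuition behind the reduced GC work. A real implementation would hold one such column per attribute and would need to guard against blockSize values large enough to overflow the byte count.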

Performance comparison

This test compares a HugeCollection with an ArrayList of JavaBeans. The class has 12 fields of various types which can be converted to primitives in different ways. Storing these fields as primitives (or as objects encoded as primitives) almost eliminates the time spent performing GCs.


With 20 GB of free memory, only 250 million JavaBeans could be created, whereas a HugeCollection could store 500 million elements.
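Taking these figures at face value, the rough memory budget per element can be worked out directly. The sketch below assumes 1 GB = 10^9 bytes and that all of the free memory was used by the elements, neither of which is stated in the test description, so treat the results as approximate.

```java
class BytesPerElement {
    // Average memory budget per element, in bytes.
    static double bytesPerElement(long totalBytes, long elements) {
        return (double) totalBytes / elements;
    }

    public static void main(String[] args) {
        long freeMemory = 20L * 1_000_000_000L; // assumes 1 GB = 10^9 bytes
        System.out.println(bytesPerElement(freeMemory, 250_000_000L)); // 80.0 (JavaBeans)
        System.out.println(bytesPerElement(freeMemory, 500_000_000L)); // 40.0 (HugeCollection)
    }
}
```

That is roughly 80 bytes per JavaBean against roughly 40 bytes per HugeCollection element for the 12-field class.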

Classes with fields which cannot be encoded as primitives are supported in version 0.1.1 and still show an improvement in GC times.




In both cases, the amount of memory used was halved.

The project web site

HugeCollections on Google Code

The source

The best way to get the whole source, including tests, is to use Subversion.
svn co http://vanilla-java.googlecode.com/svn/trunk/collections/ vanilla-java-collections
cd vanilla-java-collections
mvn test

 

From http://vanillajava.blogspot.com/2011/08/collections-library-for-millions-of.html
