A collection with billions of entries

by Peter Lawrey · Aug. 10, 11 · Interview

Holding a very large number of records in memory causes several problems. One way around them is to use direct memory, but that is too low-level for most developers. Is there a way to make it friendlier?

Limitations of large numbers of objects

  • The overhead per object is between 12 and 16 bytes for 64-bit JVMs. If the object is relatively small, this is significant and could be more than the data itself.
  • The GC pause time increases with the number of objects. Pause times can be around one second per GB of objects.
  • Collections and arrays only support about two billion elements (Integer.MAX_VALUE).
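
To put the first point in numbers, here is a rough back-of-the-envelope sketch. The header sizes follow the 12 to 16 bytes quoted above; the 8-byte alignment and the 4-byte reference are assumptions about a typical 64-bit HotSpot JVM with compressed references, and the class and method names are made up for this illustration.

```java
// Rough estimate only: header size, alignment and reference size are
// assumptions about a typical 64-bit HotSpot JVM, not measurements.
final class ObjectOverheadEstimate {
    static final int ALIGNMENT = 8; // assumed default object alignment

    /** Size of one object on the heap, rounded up to the alignment. */
    static long objectSize(int headerBytes, int fieldBytes) {
        long raw = (long) headerBytes + fieldBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Heap needed for count objects plus one array reference to each. */
    static long totalBytes(long count, int headerBytes, int fieldBytes, int referenceBytes) {
        return count * (objectSize(headerBytes, fieldBytes) + referenceBytes);
    }

    public static void main(String[] args) {
        long count = 1_000_000_000L;
        // One billion objects, each holding a single byte of data.
        long onHeap = totalBytes(count, 12, 1, 4);
        System.out.println("on heap: " + onHeap + " bytes, raw data: " + count + " bytes");
    }
}
```

Under these assumptions a one-byte bean costs 16 bytes as an object plus 4 bytes for the reference to it, so the bookkeeping is roughly twenty times the size of the data.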

Huge collections

One way to store more data and still follow object-oriented principles is to have wrappers for direct ByteBuffers. These can be tedious to write, but they are very efficient.
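
As a minimal sketch of what such a hand-written wrapper might look like, the class below keeps one byte per element in a set of direct ByteBuffers and exposes them through a single reusable cursor object. DirectByteStore, Cursor and the chunk size are names and choices invented for this illustration; this is not the library's code.

```java
import java.nio.ByteBuffer;

/** Hypothetical hand-written store: one byte per element, held off-heap. */
final class DirectByteStore {
    private final ByteBuffer[] chunks;
    private final int chunkSize;
    private final long size;

    DirectByteStore(long size, int chunkSize) {
        this.size = size;
        this.chunkSize = chunkSize;
        // Several buffers are used because one ByteBuffer is limited to an int capacity.
        int count = (int) ((size + chunkSize - 1) / chunkSize);
        chunks = new ByteBuffer[count];
        for (int i = 0; i < count; i++)
            chunks[i] = ByteBuffer.allocateDirect(chunkSize);
    }

    long size() { return size; }

    /** Returns a reusable wrapper; no object is created per element. */
    Cursor cursor() { return new Cursor(); }

    final class Cursor {
        private long index;

        /** Points this wrapper at another element and returns it. */
        Cursor at(long index) {
            if (index < 0 || index >= size)
                throw new IndexOutOfBoundsException("index " + index);
            this.index = index;
            return this;
        }

        byte getByte() {
            return chunks[(int) (index / chunkSize)].get((int) (index % chunkSize));
        }

        void setByte(byte b) {
            chunks[(int) (index / chunkSize)].put((int) (index % chunkSize), b);
        }
    }

    public static void main(String[] args) {
        DirectByteStore store = new DirectByteStore(10_000_000L, 1 << 20);
        Cursor c = store.cursor();
        for (long i = 0; i < store.size(); i++)
            c.at(i).setByte((byte) i);
        System.out.println(c.at(300).getByte()); // (byte) 300 is 44
    }
}
```

The data never becomes a heap object, so the GC has only the store and its cursor to track, and the long index sidesteps the int limit of arrays and collections. The cost is that every field of every bean type needs this kind of offset arithmetic written by hand.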

What would be ideal is to have these wrappers generated automatically.
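
One facility the JDK already has for producing an implementation of an interface at runtime is java.lang.reflect.Proxy. The sketch below uses it to back a bean interface with a ByteBuffer. It only illustrates the idea of generated wrappers: it is not claimed to be how the library generates its own, reflection-based proxies are likely to be much slower than generated code, and ByteCell and GeneratedWrapperSketch are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.nio.ByteBuffer;

/** Hypothetical bean interface used only for this sketch. */
interface ByteCell {
    void setByte(byte b);

    byte getByte();
}

final class GeneratedWrapperSketch {
    /** Builds a ByteCell whose single field lives at the given offset of the buffer. */
    static ByteCell wrap(ByteBuffer buffer, int offset) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getByte":
                    return buffer.get(offset);
                case "setByte":
                    buffer.put(offset, (Byte) args[0]);
                    return null;
                default:
                    // toString, hashCode and equals are not handled in this sketch.
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (ByteCell) Proxy.newProxyInstance(
                ByteCell.class.getClassLoader(), new Class<?>[] {ByteCell.class}, handler);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);
        ByteCell cell = wrap(buffer, 3);
        cell.setByte((byte) 42);
        System.out.println(cell.getByte()); // prints 42
    }
}
```

The caller works with an ordinary interface while the value is stored off-heap, and nobody had to write the implementing class by hand.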

Small JavaBean Example

This is an example of a JavaBean that would have far more overhead than the data it contains.
interface MutableByte {
    public void setByte(byte b);

    public byte getByte();
}

It is also small enough that I can create billions of these on my machine. This example creates a List<MutableByte> with 16 billion elements.

final long length = 16_000_000_000L;
HugeArrayList<MutableByte> hugeList = new HugeArrayBuilder<MutableByte>() {{
    allocationSize = 4 * 1024 * 1024;
    capacity = length;
}}.create();

List<MutableByte> list = hugeList;
assertEquals(0, list.size());

hugeList.setSize(length);

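// Trigger a GC to see what the GC times are like.
System.gc();

// List.size() returns an int, so it is capped at Integer.MAX_VALUE;
// longSize() reports the true size.
assertEquals(Integer.MAX_VALUE, list.size());
assertEquals(length, hugeList.longSize());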

byte b = 0;
for (MutableByte mb : list)
    mb.setByte(b++);

b = 0;
for (MutableByte mb : list) {
    byte b2 = mb.getByte();
    byte expected = b++;
    if (b2 != expected)
        assertEquals(expected, b2);
}
From start to finish, the heap memory used is as follows, with -verbosegc:
0 sec - 3100 KB used
[GC 9671K->1520K(370496K), 0.0020330 secs]
[Full GC 1520K->1407K(370496K), 0.0063500 secs]
10 sec - 3885 KB used
20 sec - 4428 KB used
30 sec - 4428 KB used
  ... deleted ...
1380 sec - 4475 KB used
1390 sec - 4476 KB used
1400 sec - 4476 KB used
1410 sec - 4476 KB used
The only GC is the one triggered explicitly. Without the System.gc() call, no GC logs appear.

After 20 sec, the increase in memory used comes from logging how much memory was used.
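
The logging code itself is not shown here. A minimal sketch of how lines in that format could be produced, assuming the figure comes from Runtime.totalMemory() minus Runtime.freeMemory(), is below. HeapUsageLogger and its methods are hypothetical names.

```java
final class HeapUsageLogger {
    /** Heap currently in use, in KB, as reported by the Runtime. */
    static long usedKB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / 1024;
    }

    /** Formats one log line; building the String itself allocates a little heap. */
    static String line(long startMillis, long nowMillis) {
        return (nowMillis - startMillis) / 1000 + " sec - " + usedKB() + " KB used";
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        for (int i = 0; i < 3; i++) {
            System.out.println(line(start, System.currentTimeMillis()));
            Thread.sleep(10_000); // one sample every 10 seconds
        }
    }
}
```

Each sample creates a few short-lived Strings, which is the kind of small, steady heap growth described above.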

Conclusion

The library is relatively slow. Each get or set takes about 40 ns, which really adds up when there are so many calls to make: 16 billion sets followed by 16 billion gets at 40 ns each comes to about 1,280 seconds, in line with the run of roughly 1,400 seconds logged above. I plan to work on it so it is much faster. ;)

On the upside, it wouldn't be possible to create 16 billion objects with the memory I have, nor could they be put in an ArrayList, so having it a little slow is still better than not working at all.

From http://vanillajava.blogspot.com/2011/08/collection-with-billions-of-entries.html
