DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • High-Performance Java Serialization to Different Formats
  • Did You Know the Fastest Way of Serializing a Java Field Is Not Serializing It at All?
  • What Is Ant, Really?
  • The Complete Guide to Stream API and Collectors in Java 8

Trending

  • Mastering Fluent Bit: Installing and Configuring Fluent Bit on Kubernetes (Part 3)
  • Build Your First AI Model in Python: A Beginner's Guide (1 of 3)
  • Unlocking AI Coding Assistants: Generate Unit Tests
  • A Deep Dive Into Firmware Over the Air for IoT Devices
  1. DZone
  2. Coding
  3. Languages
  4. Java: Why a Set Can Contain Duplicate Elements

Java: Why a Set Can Contain Duplicate Elements

Learn why Sets and Maps can contain duplicate elements and keys and how to avoid the problem in applications with object reuse.

By 
Per-Åke Minborg user avatar
Per-Åke Minborg
·
Dec. 21, 21 · Tutorial
Likes (4)
Comment
Save
Tweet
Share
12.5K Views

Join the DZone community and get the full member experience.

Join For Free

In low-latency applications, the creation of unnecessary objects is often avoided by reusing mutable objects to reduce memory pressure and thus the load on the garbage collector. This makes the application run much more deterministically and with much less jitter. However, care must be taken as to how these reused objects are used, or else unexpected results might manifest themselves, for example in the form of a Set containing duplicate elements such as [B, B].

HashCode and Equals

Java’s built-in ByteBuffer provides direct access to heap and native memory using 32-bit addressing. Chronicle Bytes is a 64-bit addressing open-source drop-in replacement allowing much larger memory segments to be addressed. Both these types provide a hashCode() and an equals() method that depends on the byte contents of the objects’ underlying memory segment. While this can be useful in many situations, mutable objects like these should not be used in most of Java’s built-in Set types and not as a key in most built-in Map types.

Note: In reality, only 31 and 63 bits may be used as an effective address offset (e.g. using int and long offset parameters)

Mutable Keys

Below, a small code example is presented illustrating the problem with reused mutable objects. The code shows the use of Bytes but the very same problem exists for ByteBuffer.

Java
 
Set<CharSequence> set = new HashSet<>();
Bytes<?> bytes = Bytes.from("A");
set.add(bytes);

// Reuse
bytes.writePosition(0);

// This mutates the existing object already
// in the Set
bytes.write("B");

// Adds the same Bytes object again but now under
// another hashCode()
set.add(bytes);

System.out.println(“set = “ + set);


The code above will first add an object with “A” as content meaning that the set contains [A]. Then the content of that existing object will be modified to “B”, which has the side effect of changing the set to contain [B] but will leave the old hash code value and the corresponding hash bucket unchanged (effectively becoming stale). Lastly, the modified object is added to the set again but now under another hash code leading to the previous entry for that very same object will remain!

As a result, rather than the perhaps anticipated [A, B], this will produce the following output:

Plain Text
 
set = [B, B]


ByteBuffer and Bytes Objects as Keys in Maps

When using Java’s ByteBuffer objects or Bytes objects as keys in maps or as elements in sets, one solution is using an IdentityHashMap or Collections.newSetFromMap(new IdentityHashMap<>()) to protect against the mutable object peculiarities described above. This makes the hashing of the objects agnostic to the actual byte content and will instead use the System.identityHashCode() which never changes during the object's life.

Another alternative is to use a read-only version of the objects (for example by invoking ByteBuffer.asReadOnlyBuffer()) and refrain from holding any reference to the original mutable object that could provide a back-door to modifying the supposedly read-only object’s content.

Chronicle Map and Chronicle Queue

Chronicle Map is an open-source library that works a bit differently than the built-in Java Map implementations in the way that objects are serialized and put in off-heap memory, opening up for ultra-large maps that can be larger than the RAM memory allocated to the JVM and allows these maps to be persisted to memory-mapped files so that applications can restart much faster.

The serialization process has another less known advantage in the way that it actually allows reusable mutable objects as keys because the content of the object is copied and is effectively frozen each time a new association is put into the map. Subsequent modifications of the mutable object will therefore not affect the frozen serialized content allowing unrestricted object reuse.

Open-source Chronicle Queue works in a similar fashion and can provide queues that can hold terabytes of data persisted to secondary storage and, for the same reason as Chronicle Map, allows object reuse of mutable elements.

Conclusions

It is dangerous to use mutable objects, such as Bytes and ByteBuffer where the hashCode() depends on the content of the object, in some Map and Set implementations.

An IdentityHashMap protects against the corruption of maps and sets due to object mutation but makes these structures agnostic to the actual byte contents.

Read-only versions of previously modified memory segment objects might provide an alternate solution.

Chronicle Map and Chronicle Queue allow unrestricted use of mutable objects, opening up the path to deterministic low-latency operations.

Resources

Chronicle homepage

Chronicle Bytes on GitHub (open-source)

Chronicle Map on GitHub (open-source)

Chronicle Queue on GitHub (open-source)

Object (computer science) Java (programming language) Element Open source

Opinions expressed by DZone contributors are their own.

Related

  • High-Performance Java Serialization to Different Formats
  • Did You Know the Fastest Way of Serializing a Java Field Is Not Serializing It at All?
  • What Is Ant, Really?
  • The Complete Guide to Stream API and Collectors in Java 8

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!