Repositories Anti-Pattern: Duplicate Nodes

The original Sun blueprints encouraged developers to make a DAO for every entity. When object graphs got large, the code became a huge pile of goo in which various entities were rolled up by build logic that was hard-coded, ugly and error-prone. The move to repositories in the middle of the last decade encouraged a different idea: there are certain key entities that the app manipulates most of the time, so persist those and let cascading persist their subobjects. This works great, but one problem comes up quite often: people start at the top of the graph, turn on cascade for everything, and one of the subobject nodes refers to something that already exists, yet a new instance is created for it each time.

The first thing to wonder is why Hibernate can't detect that these things are dupes. For example, suppose I am persisting Issues and each has a Reporter (which is a User). If I parse the issue information, including the reporter, set it, and then try to persist this instance of Issue, why can't Hibernate detect that the Reporter already exists? The quick answer might be 'how does it know what your concept of equality is?' But then, can't I just implement equals() and provide that answer? Well, it's not that simple, because equals() only works with two objects that are both in memory; it cannot tell you whether a matching row already exists in the database.

What we need is some logic that says 'wait, I have this Reporter, and it might already exist in the database, so let me check and see if there's one in the db like this.'
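As a rough illustration of that check, here is a minimal sketch that uses an in-memory list as a stand-in for the reporters table. Everything in it is an assumption made for the example: the `email` field as the notion of "same reporter", the `ReporterStore` class, and its method names. None of it is a Hibernate API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical entity; treating "email" as the natural key is an assumption.
final class Reporter {
    Long id;                 // null until the reporter has been stored
    final String email;

    Reporter(String email) {
        this.email = email;
    }
}

// In-memory stand-in for a reporters table (illustration only, not Hibernate).
final class ReporterStore {
    private final List<Reporter> rows = new ArrayList<>();
    private long nextId = 1;

    // Naive save: always adds a row, which is what a blind cascade
    // does with a freshly parsed instance.
    Reporter insert(Reporter reporter) {
        reporter.id = nextId++;
        rows.add(reporter);
        return reporter;
    }

    Optional<Reporter> findByEmail(String email) {
        return rows.stream().filter(r -> r.email.equals(email)).findFirst();
    }

    // The check described above: look for an existing row first and
    // insert only when nothing matches.
    Reporter findOrCreate(Reporter candidate) {
        return findByEmail(candidate.email).orElseGet(() -> insert(candidate));
    }

    int count() {
        return rows.size();
    }
}
```

Calling `insert` twice with the same email leaves two rows, while calling `findOrCreate` twice leaves one row and hands back the same stored instance both times.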

Invariably, when this is not done, the table becomes a mess, and then someone has to come in, inject said repository, and perform a lookup rather than creating a new instance. Seems like maybe a better solution is to create an interceptor. The idea would be that any Repository could be outfitted with a duplicateDetector. It would have a simple interface (done through generics): T createOrUpdate(T newInstance);. The only rule would be that we would have to have a get on the outer object. Here's the flow:

  1. IssuesRepository is instantiated.
  2. Reporter duplicate detector is injected.
  3. Interceptor is created so we will get called before the Issue is persisted.
  4. Inside the interceptor, we look to see if we have any dupe detectors. We do, in a map where each is registered by type: Reporter.
  5. We use reflection to get the object from the outer object, e.g. make a method invocation one time to call getReporter on Issue.
  6. Pass that to the detector and call the set to replace it (wait, a set method, that sucks, since it is a second requirement on the outer object; guess I could just dither the props directly).
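The flow above can be sketched in plain Java, with no Hibernate involved. This is only one possible reading of the idea, under several assumptions: the `beforePersist` method stands in for whatever hook a real interceptor would provide, `InMemoryReporterDetector` and the `email` key are hypothetical, and the replacement is written straight to the field rather than through a setter.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The interface proposed in the text.
interface DuplicateDetector<T> {
    T createOrUpdate(T newInstance);
}

class Reporter {
    final String email; // assumed natural key
    Reporter(String email) { this.email = email; }
}

class Issue {
    final String title;
    private Reporter reporter;
    Issue(String title, Reporter reporter) { this.title = title; this.reporter = reporter; }
    Reporter getReporter() { return reporter; } // the "get on the outer object"
}

// Hypothetical detector: a map stands in for a database lookup.
class InMemoryReporterDetector implements DuplicateDetector<Reporter> {
    private final Map<String, Reporter> known = new HashMap<>();
    public Reporter createOrUpdate(Reporter newInstance) {
        Reporter existing = known.putIfAbsent(newInstance.email, newInstance);
        return existing != null ? existing : newInstance;
    }
}

class IssuesRepository {
    private final Map<Class<?>, DuplicateDetector<?>> detectors = new HashMap<>();
    private final List<Issue> saved = new ArrayList<>();

    // Step 2: detectors are registered by the type they deduplicate.
    <T> void register(Class<T> type, DuplicateDetector<T> detector) {
        detectors.put(type, detector);
    }

    void save(Issue issue) {
        beforePersist(issue); // step 3: runs before the issue is stored
        saved.add(issue);
    }

    // Steps 4 to 6: find fields with a registered detector, read the value
    // through its getter, and write the detector's answer back to the field.
    private void beforePersist(Object outer) {
        try {
            for (Field field : outer.getClass().getDeclaredFields()) {
                DuplicateDetector<?> detector = detectors.get(field.getType());
                if (detector == null) continue;
                String name = field.getName();
                String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                Method method = outer.getClass().getDeclaredMethod(getter);
                method.setAccessible(true);
                Object current = method.invoke(outer);
                if (current == null) continue;
                field.setAccessible(true);
                field.set(outer, resolve(detector, current));
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("duplicate detection failed", e);
        }
    }

    @SuppressWarnings("unchecked")
    private static <T> T resolve(DuplicateDetector<T> detector, Object value) {
        return detector.createOrUpdate((T) value);
    }
}
```

With a detector registered for `Reporter.class`, saving two issues whose reporters share an email leaves both issues pointing at the same `Reporter` instance. Writing to the field directly avoids requiring a setter on `Issue`, at the cost of bypassing any logic a setter might carry.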

As good as Hibernate is, isn't it time to get away from wrappers and translators? I am looking forward to an object database. The real death of SQL is at hand from other directions: the rise of analytics is showing us that SQL is really only an operational way station, and basing everything on it because you are scared you won't get performance without it is ultimately not going to be enough to keep it in the pole position.

 

From http://www.jroller.com/robwilliams/entry/repositories_anti_pattern_duplicate_nodes

