Persisting Natural Key Entities With Spring Data JPA
Learn a handy trick to persist your natural key entities using Spring Data. See how that compares to surrogate keys and what other tricks that permits.
Join the DZone community and get the full member experience.
Join For FreeFrom the dawn of time, there has been an ongoing discussion about using surrogate keys vs. natural keys in your database tables. I don’t want to take any side here — just write about one consequence you’ll face when persisting a natural key entity with Spring Data repository.
From the perspective of this article, the difference between a natural key and a surrogate key is that a natural key is assigned by an application (e.g. a personalId for a Worker, which can’t be assigned by any generator strategy), whereas a surrogate key in many cases is assigned by a database (e.g. by a sequence or an identity table). Of course, a surrogate key can be assigned by an application as well (like an application generated UUID) — these cases will also face problems that are mentioned below for natural key entities.
The Difference in the Number of Queries
If we create a Worker with a surrogate key assigned by a database (id):
class SurrogateKeyWorker {
@Id
@GeneratedValue(strategy = IDENTITY)
private Long id;
private String personalId;
}
And try to persist it:
def "should save with one query for new object"() {
given:
SurrogateKeyWorker worker = new SurrogateKeyWorker(PERSONAL_ID);
when:
workerRepository.save(worker)
then:
queryStatistics.nrOfQueries() == 1
}
We’ll end up with one generated query, which is exactly what was expected.
insert into surrogate_key_worker (id, personal_id) values (null, ?)
We might, however, decide that the Worker.id property is not needed and treat personalId as an application-assigned natural key.
@Entity
class NaturalKeyWorker {
@Id
private String personalId;
NaturalKeyWorker(String personalId) {
this.personalId = personalId;
}
}
Then — if we repeat our test — we’ll observe that instead of one query we got two:
select naturalkey0_.personal_id as personal1_0_0_ from natural_key_worker naturalkey0_ where naturalkey0_.personal_id=?
insert into natural_key_worker (personal_id) values (?)
It’s quite easy to find out that the org.springframework.data.jpa.repository.support.SimpleJpaRepository.save()
method decides whether to persist or merge entity by checking if this entity is new or not.
@Transactional
public <S extends T> S save(S entity) {
if (entityInformation.isNew(entity)) {
em.persist(entity);
return entity;
} else {
return em.merge(entity);
}
}
However, entityInformation.isNew(entity)
is as simple as assuming that an entity is not new when it has not null id. This will be true for ids assigned by the database, but not for these assigned by an app. My Worker got his id assigned on creation — long before he was persisted.
Delivering isNew Info
Of course, with a framework like Spring, there is a way to handle it. In the documentation, we can read about options for dtecting whether an entity is new or not:
Since option three is discouraged (though, I don’t think that natural key is that rare of a thing) let’s try to go with the second.
We’ll need to implement a Persistable
interface and somehow decide if an entity is new or not. A very naive implementation might look like the one below — deciding by passing an additional argument. Notice that when Worker is obtained by repo, it is instantiated by a default, no args constructor, so isNew = false
;
@Entity
class PersistableWorker implements Persistable {
@Id
private String personalId;
@Setter
private String surname;
@Transient
private boolean isNew = false;
PersistableWorker(String personalId, boolean isNew) {
this.personalId = personalId;
this.isNew = isNew;
}
@Override
public Serializable getId() {
return personalId;
}
@Override
public boolean isNew() {
return isNew;
}
}
If I rerun my test now, I will get only one query persisting my object.
Controlling the Flow
So why is isNew()
checked in save()
method in the first place? Well , if the object is not new, then you don’t want to insert but update it, so you need to retrieve it first like in the example below
def "should save with two queries for existing object"() {
given:
PersistableWorker worker = new PersistableWorker(PERSONAL_ID, true)
workerRepository.save(worker)
queryStatistics.clear()
when:
worker = new PersistableWorker(PERSONAL_ID, false) // 1
worker.surname = 'Smith' // 2
workerRepository.save(worker) //3
then:
queryStatistics.nrOfQueries() == 2
}
select persistabl0_.personal_id as personal1_1_0_, persistabl0_.surname as surname2_1_0_ from persistable_worker persistabl0_ where persistabl0_.personal_id=?
update persistable_worker set surname=? where personal_id=?
The When section shows scenario when
- Existing worker is created (e.g. by data taken from obtained DTO).
- A name is changed.
- And merged to the database (select + insert).
It works fine, except this is a scenario I never face.
What I would do would be:
def "should save with one query for existing object"() {
given:
PersistableWorker worker = new PersistableWorker(PERSONAL_ID, true);
workerRepository.save(worker)
when:
doInTx {
worker = workerRepository.findOne(PERSONAL_ID)
queryStatistics.clear()
worker.surname = 'Smith'
}
then:
queryStatistics.nrOfQueries() == 1
workerRepository.findOne(PERSONAL_ID).surname == 'Smith'
}
I don’t need to call save()
if I retrieved my Worker in the same transaction that I’m changing it because dirty checking mechanism will do that for me. So I never use save()
for an update operation — but only for an insert. If so, then instead of calling save()
and having the framework decide for me which operation to perform I would like to be able to call persist()
by myself.
There is no persist()
method in Spring Data org.springframework.data.repository.CrudRepository
, but it’s quite easy to add it. What you need to do is instead of extending CrudRepository with your repos, create your own base interface with a simple implementation.
@NoRepositoryBean
public interface BaseRepository<T, ID extends Serializable> extends CrudRepository<T, ID> {
<S extends T> S persist(S entity);
}
public class BaseRepositoryImpl<T, ID extends Serializable> extends SimpleJpaRepository<T, ID> implements BaseRepository<T, ID> {
private final EntityManager entityManager;
@SuppressWarnings("unchecked")
public BaseRepositoryImpl(JpaMetamodelEntityInformation jpaMetamodelEntityInformation, Object o) {
super(jpaMetamodelEntityInformation.getJavaType(), (EntityManager) o);
this.entityManager = (EntityManager) o;
}
@Override
@Transactional
public <S extends T> S persist(S entity) {
entityManager.persist(entity);
return entity;
}
}
Then, register it with:
@EnableJpaRepositories(repositoryBaseClass = BaseRepositoryImpl.class)
Now, extending Persistable with Worker is not needed, as the workerRepository.persist(worker)
operation results only in one insert. All of that just makes me wonder why Spring Data JPA decided to hide the entityManager.persist()
method in the first place?
Full Examples can be found: here.
Published at DZone with permission of Dawid Kublik, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments