Time for another chapter of Enterprise Granny. As I mentioned in chapter 5, there are a few things I intentionally left out while designing the domain model and the JPA-based persistence. Even worse, I included a small but very tricky mistake... why? Because it's a very common error, and hence I believe it's worth discussing in detail!
The problem space I want to discuss today is object-relational mapping (ORM), which is a prime example of the [leaky abstractions](http://www.joelonsoftware.com/articles/LeakyAbstractions.html) topic I brought up in one of my recent blog posts. Looking at our persistence functionality, one notices the following abstraction layers: relational databases use SQL to interact with the DB engine, JDBC is used to connect to databases from Java applications, and JPA sits on top of JDBC and maps our object-oriented data model to a relational DB model. We went one step further and added Spring Data JPA on top. From a developer perspective this hides a lot of the complexity involved, and we are free to focus on our application logic without worrying about the nitty-gritty details.
WRONG!
That's exactly the point I want to make here: we do have to worry about the nitty-gritty details, because if we don't truly understand what each of these abstraction layers does and how it works, we are NOT in control. ORM is far from trivial and there's plenty of room for error if you're not aware of the pitfalls. One of the most obvious differences between relational and object-oriented data models is the way they handle identity. As we all know, a relational database uses primary keys to uniquely identify a row within a table. In Java, however, there's no such concept of a unique identifier per se, and by default (i.e. unless equals() is overridden) an object is only considered equal to another object if both references point to the very same instance in memory.
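To make the default behaviour concrete, here is a minimal sketch. The class PlainContact is made up for illustration and is not part of Granny's domain model; it deliberately does not override equals() or hashCode().

```java
// Hypothetical class for illustration; it inherits equals() and hashCode() from Object.
class PlainContact {
    final String name;

    PlainContact(String name) {
        this.name = name;
    }
}

class ReferenceEqualityDemo {
    public static void main(String[] args) {
        PlainContact first = new PlainContact("Granny");
        PlainContact second = new PlainContact("Granny");

        // Two distinct instances with identical content are not equal by default.
        System.out.println(first.equals(second)); // false
        // An instance is always equal to itself.
        System.out.println(first.equals(first));  // true
    }
}
```

Two rows with the same primary key are the same row for the database, but two Java objects holding the same data are different objects unless we say otherwise.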
So, how do we fix this paradigm mismatch? I'm afraid there's no clear answer, and it is a recurring discussion whether it's good practice to use the primary key as an object's identifier. For example, the documentation of Hibernate (one of the most popular ORM frameworks in Java) suggests:
"Never use the database identifier to implement equality; use a business key, a combination of unique, usually immutable, attributes. The database identifier will change if a transient object is made persistent. If the transient instance (usually together with detached instances) is held in a Set, changing the hashcode breaks the contract of the Set. Attributes for business keys don't have to be as stable as database primary keys, you only have to guarantee stability as long as the objects are in the same Set."
(Emphasis added by me)
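The Set problem mentioned in the quote is easy to reproduce. The following is only a sketch: DbIdEntity is a hypothetical class, and assigning the id field by hand stands in for what a persistence provider would do when a transient object is made persistent.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equality is based on a database-assigned identifier.
class DbIdEntity {
    Long id; // null while the object is transient

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof DbIdEntity)) return false;
        return Objects.equals(id, ((DbIdEntity) other).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // changes as soon as the id is assigned
    }
}

class MutableIdHashSetDemo {
    // Returns whether the set still finds the entity after its id has changed.
    static boolean foundAfterPersist() {
        Set<DbIdEntity> entities = new HashSet<>();
        DbIdEntity entity = new DbIdEntity();
        entities.add(entity); // stored using the hash code of the null id
        entity.id = 42L;      // simulates the database assigning a key on persist
        return entities.contains(entity);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterPersist()); // false: the set "lost" the entity
    }
}
```

The entity is still in the set, but the lookup uses the new hash code and no longer finds it.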
The key take-away here is that if the identifier is provided by the database (e.g. via auto increment), then the ID will change once the object gets persisted. That was one of the reasons why I opted to use a GUID as the primary key, but I made two mistakes:
- the GUID was only created once the object was persisted, hence it still changed within the lifetime of the object
- I did not override the equals() and hashCode() methods accordingly
I have fixed both errors with this commit: https://github.com/SAP/cloud-enterprise-granny/commit/55db327e615066efad03e72c15481c83cfe43678
The new BaseObject now generates a new GUID/UUID as soon as an object is instantiated (hence the ID no longer changes during its life cycle), and it provides a final (!!!) implementation of the equals() and hashCode() methods. I also added a toString() method, mainly for debugging purposes. In case you want more details about this problem space, I can strongly recommend the following blog post:
http://www.onjava.com/pub/a/onjava/2006/09/13/dont-let-hibernate-steal-your-identity.html
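For readers who don't want to open the commit, the idea behind the BaseObject change can be sketched roughly as follows. This is not the actual code from the repository: the names BaseObjectSketch and ContactSketch are made up, the JPA annotations are omitted, and plain JDK calls are used instead of the Apache Commons Lang3 helpers.

```java
import java.util.UUID;

// Sketch only; the real BaseObject in the repository may look different.
abstract class BaseObjectSketch {
    // Generated at instantiation and never reassigned in this sketch,
    // so the identity stays the same before and after persisting.
    private String id = UUID.randomUUID().toString();

    public String getId() {
        return id;
    }

    // final: subclasses cannot redefine what identity means.
    @Override
    public final boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof BaseObjectSketch)) return false;
        return id.equals(((BaseObjectSketch) other).id);
    }

    @Override
    public final int hashCode() {
        return id.hashCode();
    }

    // Mainly useful for debugging.
    @Override
    public String toString() {
        return getClass().getSimpleName() + "[id=" + id + "]";
    }
}

// Hypothetical concrete subclass, used only to instantiate the sketch.
class ContactSketch extends BaseObjectSketch {
}
```

Since the ID exists from the moment the object is created, the hash code is stable and the object can safely live in a Set both before and after it is persisted.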
Note: I used some functionality provided by the Apache Commons Lang3 library for some of the aforementioned methods and thereby created a dependency on this library within my domain model. Usually I'm quite reluctant to introduce dependencies in my domain model, which is why I mention it explicitly here. However, in the case of the Apache Commons Lang library, I came to the conclusion that this functionality really should have been part of the Java language in the first place (as the name suggests) and I wouldn't want to miss it anyway, hence I accepted the trade-off.
Speaking of all these basics, let me close by recommending a book every serious Java developer should have read: Effective Java by Joshua Bloch. It's a must-read, as it addresses some of the most common errors and misconceptions and also reveals and discusses some of the design flaws of the language itself. With that I call it a day - have a great weekend! In the next chapter we'll talk about another evergreen: unit tests! ;)
Have fun coding!