Introduction
Many years ago, I used to work as a team leader, and, one day, the General Manager asked me to take a look at a project that was in big trouble.
The application in question had been developed by a team of software developers for over 9 months, and the client had just tested in a production-like environment.
The client got very upset when realizing that the application was barely crawling. For instance, I was told that a query had been running for 10 hours without showing any signs of stopping.
After analyzing the project, I identified many areas that could have been improved, and that’s how my passion for high-performance data access was born.
JPA and Hibernate
It was 2004 when I first heard of Hibernate. Back then, I was working on a .NET project for my college thesis and wasn’t very happy with ADO.NET at the time. Therefore, I started reading about NHibernatem, which was still in Beta at the time. NHibernate was trying to adapt the Hibernate 2 project from Java to .NET, and even the Beta version at the time was a much better alternative to ADO.NET.
From that moment, Hibernate got really popular. In fact, the Java Persistence API, which emerged in 2006, is very much based on Hibernate.
Thanks to JPA, Hibernate’s popularity grew even larger as most Java EE or Spring projects used it either directly or indirectly. Even today, most Spring Boot projects use Hibernate too, via the Spring Data JPA module.
Logging SQL statements
When using a data access framework where all queries must be stated explicitly, it’s obvious what SQL queries will be executed by the application.
On the other hand, JPA and Hibernate execute SQL statements based on the entity state transitions operated by the data access layer code.
For this reason, it’s very important to always log the SQL statement generated by JPA and Hibernate.
The best way to log SQL statements is to use a JDBCDataSource
orDriver
proxy, as explained in this article.
Domain Model
Let’s consider you are mapping a post
parent table and the post_comment
child table. There is a one-to-many table relationship between the post
and post_comment
tables via the post_id
Foreign Key column in the post_comment
table.
You could map the post
and post_comment
tables as JPA entities in the following way:
@Entity(name = "Post") @Table(name = "post") public class Post { @Id private Long id; private String title; public Long getId() { return id; } public Post setId(Long id) { this.id = id; return this; } public String getTitle() { return title; } public Post setTitle(String title) { this.title = title; return this; } } @Entity(name = "PostComment") @Table(name = "post_comment") public class PostComment { @Id private Long id; @ManyToOne private Post post; private String review; public PostComment setId(Long id) { this.id = id; return this; } public Post getPost() { return post; } public PostComment setPost(Post post) { this.post = post; return this; } public String getReview() { return review; } public PostComment setReview(String review) { this.review = review; return this; } }
Notice that theNow, let’s assume we are adding threePost
andPostComment
use a fluent-style API. For more details about the advantages of using this strategy, check out this article.
Post
entities into our database, each Post
containing three PostComment
child entities:
doInJPA(entityManager -> { long pastId = 1; long commentId = 1; for (long i = 1; i <= 3; i++) { Post post = new Post() .setId(pastId++) .setTitle( String.format( "High-Performance Java Persistence, part %d", i ) ); entityManager.persist(post); for (int j = 0; j < 3; j++) { entityManager.persist( new PostComment() .setId(commentId++) .setPost(post) .setReview( String.format( "The part %d was %s", i, reviews[j] ) ) ); } } });
Fetching data
Let’s assume you want to load aPostComment
from the database. For that, you can call the find
JPA method as follows:
PostComment comment = entityManager.find( PostComment.class, 1L );When executing the
find
method, Hibernate generates the following SQL query:
SELECT pc.id AS id1_1_0_, pc.post_id AS post_id3_1_0_, pc.review AS review2_1_0_, p.id AS id1_0_1_, p.title AS title2_0_1_ FROM post_comment pc LEFT OUTER JOIN post p ON pc.post_id=p.id WHERE pc.id=1
Where did that LEFT OUTER JOIN
come from?
Well, this is because the @ManyToOne
association in PostComment
uses the default fetching strategy, which is FetchType.EAGER
.
So, Hibernate has to do the LEFT OUTER JOIN
as the mapping says it should always initialize the post
association when fetching the PostComment
entity.
Now, look what happens when you execute a JPQL query to fetch the same PostComment
entity:
PostComment comment = entityManager .createQuery( "select pc " + "from PostComment pc " + "where pc.id = :id", PostComment.class) .setParameter("id",1L) .getSingleResult();Instead of a
LEFT OUTER JOIN
, we have a secondary query now:
SELECT pc.id AS id1_1_, pc.post_id AS post_id3_1_, pc.review AS review2_1_ FROM post_comment pc WHERE pc.id = 1 SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 1Now, there was a single extra query this time, but if we fetch all
PostComment
entities associated with a given Post
title:
List comments = entityManager .createQuery( "select pc " + "from PostComment pc " + "join pc.post p " + "where p.title like :titlePatttern", PostComment.class) .setParameter( "titlePatttern", "High-Performance Java Persistence%" ) .getResultList(); assertEquals(9, comments.size());Hibernate will issue 4 queries now:
SELECT pc.id AS id1_1_, pc.post_id AS post_id3_1_, pc.review AS review2_1_ FROM post_comment pc JOIN post p ON pc.post_id=p.id WHERE p.title LIKE 'High-Performance Java Persistence%' SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 1 SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 2 SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 3
There are four SQL queries this time. The first one is for the actual JPQL query that filters the post_comment
table records while the remaining three are for fetching the Post
entity eagerly.
Reviewing and validating all these @ManyToOne
associations and making sure they are always using FetchTYpe.LAZY
will take time. More, you cannot guarantee that one day, someone else will come and change a given association from FetchTYpe.LAZY
to FetchTYpe.EAGER
.
Detecting performance problems automatically
A much better approach to solving this issue is to use Hypersistence Optimizer. After setting the Maven dependency:<dependency> <groupId>io.hypersistence</groupId> <artifactId>hypersistence-optimizer</artifactId> <version>${hypersistence-optimizer.version}</version> </dependency>All you need to do is add the following code to any of your integration tests:
public void testNoPerformanceIssues() { ListEventHandler listEventHandler = new ListEventHandler(); new HypersistenceOptimizer( new JpaConfig(entityManagerFactory()) .addEventHandler(listEventHandler) ).init(); assertTrue(listEventHandler.getEvents().isEmpty()); }That’s it! Now, if you try to run the tests, the suite will fail with the following error:
ERROR [main]: Hypersistence Optimizer - CRITICAL - EagerFetchingEvent - The [post] attribute in the [io.hypersistence.optimizer.config.PostComment] entity uses eager fetching. Consider using a lazy fetching which, not only that is more efficient, but it is way more flexible when it comes to fetching data. For more info about this event, check out this User Guide link - https://vladmihalcea.com/hypersistence-optimizer/docs/user-guide/#EagerFetchingEvent java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at io.hypersistence.optimizer.config.FailFastOnPerformanceIssuesTest.testNoPerformanceIssues(FailFastOnPerformanceIssuesTest.java:41)Awesome, right? You cannot even build the project with performance issues like this one sneaking in your data access code.
Conclusion
Using JPA and Hibernate is very convenient, but you need to pay extra attention to the underlying SQL statements that get generated on your behalf.
With a tool like Hypersistence Optimizer, you can finally spend your time focusing on your application requirements instead of chasing JPA and Hibernate performance issues.