How jOOQ Leverages Generic Type Safety in its DSL

In this year’s Java Advent Calendar, we’re thrilled to have been asked to feature a mini-series showing you a couple of advanced and very interesting topics that we’ve been working on when developing jOOQ.



Few Java developers are aware of this, but SQL is a very type safe language. In the Java ecosystem, if you’re using JDBC, you’re operating on dynamically constructed SQL strings, which are sent to the server for execution – or failure. Some IDEs may have started to be capable of introspecting parts of your static SQL, but often you’re concatenating predicates to form a very dynamic query:

String sql = "SELECT a, b, c FROM table WHERE 1 = 1";

if (someCondition)
    sql += " AND id = 3";

if (someOtherCondition)
    sql += " AND value = 42";

These concatenations quickly turn nasty and are one of the reasons why Java developers don’t really like SQL.

SQL as written via JDBC. Image (c) by Greg Grossmeier. License CC-BY-SA 2.0

But interestingly, PL/SQL or T-SQL developers never complain about SQL in this way. In fact, they feel quite the opposite. Look at how SQL is nicely embedded in a typical PL/SQL block:

BEGIN

  -- The record type of "rec" is inferred by the compiler
  FOR rec IN (

    -- This compiles only when I have matching
    -- degrees and types of both UNION subselects!
    SELECT first_name, last_name FROM customers
    UNION
    SELECT first_name, last_name FROM staff
  )
  LOOP

    -- This compiles only if rec really has
    -- first_name and last_name columns
    INSERT INTO people (first_name, last_name)

    -- Obviously, VALUES must match the above target table
    VALUES (rec.first_name, rec.last_name);
  END LOOP;
END;

Now, we can most certainly discuss syntax. Whether you like SQL’s COBOLesque syntax or not is a matter of taste and a matter of habit, too. But one thing is clear: SQL is absolutely type safe, and most sane people would consider that a very good thing. Read The Inconvenient Truth About Dynamic vs. Static Typing for more details.

The same can be achieved in Java!

JDBC’s lack of type safety is a brilliant feature for the low-level API that JDBC is. At some point, we need an API that can simply send SQL strings over the wire without knowing anything about the wire protocol, and retrieve back cursors of arbitrary / unknown type. However, if we don’t execute our SQL directly via JDBC, but maintain a type safe SQL AST (Abstract Syntax Tree) prior to query execution, then we might actually anticipate the returned type of our statements.

jOOQ’s DSL API (DSL stands for domain-specific language) works exactly like that. When you create SQL statements with jOOQ, you’re implicitly creating an AST both for your Java compiler and for your runtime environment. Here’s how that works:

DSL.using(configuration)
   .select(CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME).from(CUSTOMERS)
   .union(
       select(STAFF.FIRST_NAME, STAFF.LAST_NAME).from(STAFF))
   .fetch();

If we look closely at what the above query really does, we’ll see that we’re calling one of several overloaded select() methods on jOOQ’s DSLContext type, namely DSLContext.select(Field, Field), the one that takes two argument columns.

The whole API looks like this, and we’ll see immediately after why this is so useful:

<T1> SelectSelectStep<Record1<T1>> 
select(Field<T1> field1);
<T1, T2> SelectSelectStep<Record2<T1, T2>>
select(Field<T1> field1, Field<T2> field2);
<T1, T2, T3> SelectSelectStep<Record3<T1, T2, T3>>
select(Field<T1> field1, Field<T2> field2, Field<T3> field3);
// and so on...

So, by explicitly passing two columns to the select() method, you have chosen the second one of the above methods that returns a DSL type that is parameterised with Record2, or more specifically, with Record2<String, String>. Yes, the String parameter bindings are inferred from the very columns that we passed to the select() call, because jOOQ’s code generator reverse-engineers your database schema and generates those classes for you.

The generated Customers class really looks like this (simplified):

// All table references are listed here:
class Tables {
    Customers CUSTOMERS = new Customers();
    Staff STAFF = new Staff();
}

// All tables have an individual class each, with columns inside:
class Customers {
    final Field<String> FIRST_NAME = ...
    final Field<String> LAST_NAME = ...
}

As you can see, all type information is already available to you, automatically, as you have defined those types only once in the database. No need to define them again in Java.
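
To see the mechanism in isolation, here is a minimal, self-contained sketch of arity-based overloading. Every type in it (Col, Rec1, Rec2, Step, MiniDsl) is a hypothetical stand-in invented for this illustration, not one of jOOQ’s generated classes and not its real API. It only aims to show how the compiler can infer a row type from the column arguments:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT jOOQ's real types.

// A column that carries its Java type as a type parameter
final class Col<T> {
    final String name;
    Col(String name) { this.name = name; }
}

// Marker types: the number of type parameters encodes the degree of a row
interface Rec {}
interface Rec1<T1> extends Rec {}
interface Rec2<T1, T2> extends Rec {}

// A query step that remembers its row type R at compile time
final class Step<R extends Rec> {
    private final List<String> columns;
    Step(List<String> columns) { this.columns = columns; }

    String from(String table) {
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }
}

final class MiniDsl {
    // One overload per degree; the argument count selects the overload
    static <T1> Step<Rec1<T1>> select(Col<T1> c1) {
        return new Step<>(Arrays.asList(c1.name));
    }

    static <T1, T2> Step<Rec2<T1, T2>> select(Col<T1> c1, Col<T2> c2) {
        return new Step<>(Arrays.asList(c1.name, c2.name));
    }
}

final class SelectDemo {
    static final Col<String> FIRST_NAME = new Col<>("first_name");
    static final Col<String> LAST_NAME = new Col<>("last_name");

    static String demo() {
        // T1 and T2 are both inferred as String from the Col<String> arguments
        Step<Rec2<String, String>> step = MiniDsl.select(FIRST_NAME, LAST_NAME);

        // Would not compile, because the inferred row type is Rec2<String, String>:
        // Step<Rec2<String, Integer>> wrong = MiniDsl.select(FIRST_NAME, LAST_NAME);

        return step.from("customers");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // SELECT first_name, last_name FROM customers
    }
}
```

The design choice worth noticing is that no type is ever written twice: the String in Col<String> is the single source of truth, and everything downstream is inferred from it.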

Generic type information is ubiquitous

The interesting part is the UNION. The union() method on the DSL API simply looks like this:

public interface SelectUnionStep<R extends Record> {
    SelectOrderByStep<R> union(Select<? extends R> select);
}

If we go back to our statement, we can see that the type of the object upon which we call union() is really this type:

SelectUnionStep<Record2<String, String>>

… thus, the method union() that we’re calling is really expecting an argument of this type:

union(Select<? extends Record2<String, String>> select);

… which essentially means that we’ll get a compilation error if the second subselect doesn’t also provide exactly two string columns:

DSL.using(configuration)
   .select(CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME).from(CUSTOMERS)
   .union(
//  ^^^^^ doesn't compile, wrong argument type!
       select(STAFF.FIRST_NAME).from(STAFF))
   .fetch();

or also:

DSL.using(configuration)
   .select(CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME).from(CUSTOMERS)
   .union(
//  ^^^^^ doesn't compile, wrong argument type!
       select(STAFF.FIRST_NAME, STAFF.DATE_OF_BIRTH).from(STAFF))
   .fetch();
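
The same check can be reproduced without jOOQ. The following is a rough sketch under stated assumptions: Query, Rec1 and Rec2 are hypothetical names made up for this example, and only the shape of the union() signature, with its `? extends R` bound, mirrors what was shown above.

```java
// Hypothetical stand-ins for illustration; these are NOT jOOQ's real types.
interface Rec {}
interface Rec1<T1> extends Rec {}
interface Rec2<T1, T2> extends Rec {}

// A query tagged with the row type R that it produces
final class Query<R extends Rec> {
    final String sql;
    Query(String sql) { this.sql = sql; }

    // The argument must produce the same row type R as the receiver (or a subtype)
    Query<R> union(Query<? extends R> other) {
        return new Query<>(sql + " UNION " + other.sql);
    }
}

final class UnionDemo {
    static String demo() {
        Query<Rec2<String, String>> customers =
            new Query<>("SELECT first_name, last_name FROM customers");
        Query<Rec2<String, String>> staff =
            new Query<>("SELECT first_name, last_name FROM staff");
        Query<Rec1<String>> oneColumn =
            new Query<>("SELECT first_name FROM staff");

        // Would not compile: Rec1<String> is not a Rec2<String, String>
        // customers.union(oneColumn);

        return customers.union(staff).sql;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the row type travels with the Query object, the two halves of the UNION could just as well be built in different methods; the mismatch would still be caught where they are combined.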

Static type checking helps find bugs early

… indeed! All of the above bugs can be found at compile-time because your Java compiler will not accept the wrong SQL statements. When writing dynamic SQL, such mismatches can be incredibly subtle, as the different UNION subselects may not all be created in the same place. You may have a complex DAO that generates the SQL across several methods. With this kind of generic type safety, you can continue to do so, safely.

As mentioned before, this extends through the whole API. Check out…

IN predicates

This compiles:

// Get all customers whose first name corresponds to a staff first name
DSL.using(configuration)
   .select().from(CUSTOMERS)
   .where(CUSTOMERS.FIRST_NAME.in(
       select(STAFF.FIRST_NAME).from(STAFF)
   ))
   .fetch();

This doesn’t compile:

DSL.using(configuration)
   .select().from(CUSTOMERS)
   .where(CUSTOMERS.FIRST_NAME.in(
//                             ^^ wrong argument type!
       select(STAFF.FIRST_NAME, STAFF.LAST_NAME).from(STAFF)
   ))
   .fetch();

But this compiles:

// Get all customers whose first and last names both correspond
// to a staff member's first and last names
DSL.using(configuration)
   .select().from(CUSTOMERS)
   .where(row(CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME).in(
       select(STAFF.FIRST_NAME, STAFF.LAST_NAME).from(STAFF)
   ))
   .fetch();

Notice the use of row() to construct a row value expression, an extremely useful but little-known SQL feature.
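
How can a single column and a row value expression each accept only a subquery of matching degree? One possible way to model it, sketched here with hypothetical types (Col, RowExpr2, Subquery, Rec1 and Rec2 are invented for this example and are not jOOQ’s API), is to give each of them its own in() method with a differently bounded parameter:

```java
// Hypothetical stand-ins for illustration; these are NOT jOOQ's real types.
interface Rec {}
interface Rec1<T1> extends Rec {}
interface Rec2<T1, T2> extends Rec {}

// A subquery tagged with the row type it produces
final class Subquery<R extends Rec> {
    final String sql;
    Subquery(String sql) { this.sql = sql; }
}

final class Col<T> {
    final String name;
    Col(String name) { this.name = name; }

    // A single column only accepts a subquery producing one column of type T
    String in(Subquery<? extends Rec1<T>> sub) {
        return name + " IN (" + sub.sql + ")";
    }
}

// A row value expression of degree 2
final class RowExpr2<T1, T2> {
    private final Col<T1> c1;
    private final Col<T2> c2;

    RowExpr2(Col<T1> c1, Col<T2> c2) {
        this.c1 = c1;
        this.c2 = c2;
    }

    // A pair of columns only accepts a subquery producing a matching pair
    String in(Subquery<? extends Rec2<T1, T2>> sub) {
        return "(" + c1.name + ", " + c2.name + ") IN (" + sub.sql + ")";
    }
}

final class RowInDemo {
    static String demo() {
        Col<String> first = new Col<>("first_name");
        Col<String> last = new Col<>("last_name");
        Subquery<Rec2<String, String>> staffNames =
            new Subquery<>("SELECT first_name, last_name FROM staff");

        // Would not compile: degree 2 subquery where degree 1 is expected
        // first.in(staffNames);

        return new RowExpr2<>(first, last).in(staffNames);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```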

INSERT statements

This compiles:

DSL.using(configuration)
   .insertInto(CUSTOMERS, CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME)
   .values("John", "Doe")
   .execute();

This doesn’t compile:

DSL.using(configuration)
   .insertInto(CUSTOMERS, CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME)
   .values("John")
//  ^^^^^^ Invalid number of arguments
   .execute();
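
The INSERT check follows the same idea: the column list fixes the degree and the types, and values() has exactly that many typed parameters. A minimal sketch, assuming hypothetical types (Col, Insert2 and MiniInsert are made up for this example and are not jOOQ’s API):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT jOOQ's real types.
final class Col<T> {
    final String name;
    Col(String name) { this.name = name; }
}

// An INSERT of degree 2: values() must supply exactly one value per column
final class Insert2<T1, T2> {
    private final String table;
    private final Col<T1> c1;
    private final Col<T2> c2;

    Insert2(String table, Col<T1> c1, Col<T2> c2) {
        this.table = table;
        this.c1 = c1;
        this.c2 = c2;
    }

    // Returns the SQL with bind placeholders, followed by the bind values
    String values(T1 v1, T2 v2) {
        List<Object> binds = Arrays.<Object>asList(v1, v2);
        return "INSERT INTO " + table + " (" + c1.name + ", " + c2.name
             + ") VALUES (?, ?) " + binds;
    }
}

final class MiniInsert {
    // T1 and T2 are inferred from the columns and then constrain values()
    static <T1, T2> Insert2<T1, T2> insertInto(String table, Col<T1> c1, Col<T2> c2) {
        return new Insert2<>(table, c1, c2);
    }
}

final class InsertDemo {
    static String demo() {
        Col<String> first = new Col<>("first_name");
        Col<String> last = new Col<>("last_name");

        // Would not compile: one argument is missing
        // MiniInsert.insertInto("customers", first, last).values("John");

        // Would not compile: 42 is not a String
        // MiniInsert.insertInto("customers", first, last).values("John", 42);

        return MiniInsert.insertInto("customers", first, last).values("John", "Doe");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```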

Conclusion

Internal domain-specific languages can express a lot of type safety in Java, almost as much as the external language really implements. In the case of SQL – which is a very type safe language – this is particularly true and interesting.

jOOQ has been designed to create as little cognitive friction as possible for any Java developer who wants to write embedded SQL in Java, i.e. the Java code will look and feel exactly like the SQL code that it represents. At the same time, jOOQ has been designed to offer as much compile-time type safety as possible in the Java language (or also in Scala, Groovy, etc.).

jOOQ is free and Open Source for use with Open Source databases, and it offers commercial licensing for use with commercial databases.


Stay tuned for tomorrow’s article “How jOOQ allows for fluent functional-relational interactions in Java 8”
This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Type-safing the Observer with mutually recursive bounds

The Observer is known as a behavioral pattern, as it’s used to form relationships between objects at run-time. The definition provided in the original Gang of Four book on Design Patterns states:

   Defines a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically.

The Java library implements a non-generic version of the Subject-Observer pattern in the package java.util with the class Observable and the interface Observer. The Observable class contains methods to register observers (addObserver), to indicate that the observable has changed (setChanged), and to notify all observers of any changes (notifyObservers), among others. The notifyObservers method may accept an arbitrary argument of type Object that is to be broadcast to all the observers. The Observer interface specifies the update method that is called by notifyObservers. This method takes two parameters: the first one, of type Observable, is the subject that has changed; the second, of type Object, is the broadcast argument.
We could, of course, create our own classes and not use those provided by the library, but isn’t it wonderful when someone else takes care of maintaining our code? So we’re going to use the provided library for this example.

Let’s see an example. Suppose we have a family with four members: mom, dad, a baby and a dog. When the little baby cries, mom changes his diaper, so we have the class Baby as an Observable and the class Mother as an Observer of the Baby:

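import java.util.Observable;
import java.util.Observer;

// Subject: notifies its observers whenever the baby cries
public class Baby extends Observable {
    private String name;

    public Baby(String name) {
        this.name = name;
    }

    public void babyIsCrying() {
        setChanged();
        notifyObservers(name); // broadcast the baby's name
    }
}

// Observer: reacts to a crying baby
public class Mother implements Observer {
    @Override
    public void update(Observable subject, Object param) {
        // The argument is untyped, so a cast is required
        String babyName = (String) param;
        System.out.println("Let's change that diaper on " + babyName + "!");
    }
}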


The father takes the dog for a walk when it barks, so the class Dog is another Observable and the class Father its Observer:

import java.util.Observable;
import java.util.Observer;

// Subject: notifies its observers whenever the dog barks
public class Dog extends Observable {
    private String name;

    public Dog(String name) {
        this.name = name;
    }

    public void barks() {
        setChanged();
        notifyObservers(name); // broadcast the dog's name
    }
}

// Observer: reacts to a barking dog
public class Father implements Observer {
    @Override
    public void update(Observable o, Object arg) {
        // The argument is untyped, so a cast is required
        String dogName = (String) arg;
        System.out.println(dogName + ", let's go to a walk!");
    }
}

At first glance, everything seems to work fine. But what if somebody misunderstands the relationships and adds the Mother as an Observer of the Dog? The compiler won’t complain, and at run time the program will silently change the diaper on the dog without ever notifying us of the horrible mistake we made.
Let’s test it:
public class TestingLegacy {
    public static void main(String[] args) {
        Baby david = new Baby("David");
        Mother mom = new Mother();

        Dog rover = new Dog("Rover");
        Father john = new Father();

        // test mother-baby relationship
        david.addObserver(mom);
        david.babyIsCrying();
        // test father-dog relationship
        rover.addObserver(john);
        rover.barks();

        // delete observers to test wrong relationships
        david.deleteObservers();
        rover.deleteObservers();

        System.out.println("Add a wrong relationship and test.");

        // add the father as the baby's observer
        david.addObserver(john);
        // add the mother as the dog's observer
        rover.addObserver(mom);

        david.babyIsCrying();
        rover.barks();
    }
}

The console outputs:
Let’s change that diaper on David!
Rover, let’s go to a walk!
Add a wrong relationship and test.
David, let’s go to a walk!
Let’s change that diaper on Rover!

To ensure that a Subject-Observer relationship is well established, we should use one of the nicest Java 5 features: generics. They add stability to the code by making more of your bugs detectable at compile time. But there is a small problem: because Observable and Observer are part of the Java library, we can’t and shouldn’t change them. So, to be able to add bounds, we will create stub classes with generic signatures but no bodies. We compile the generic client against the generic signatures, but run the code against the legacy class files. This technique is appropriate when the source is not released, or when others are responsible for maintaining the source.
We can replace Observable and Observer with the type parameters S and O (for Subject and Observer). Then, within the update method of the observer, you may call any method supported by the subject S without first requiring a cast. Also, the appearance of Object in a method signature often indicates an opportunity to generify, so we should add a type parameter, A, corresponding to the argument type.
Here is our first version of the stubs, with two generic parameters:
public class Observable<O extends Observer<?, A>, A> {..}
public interface Observer<S extends Observable<?, A>, A> {…}

Now, when creating a Mother-Baby relationship, things work just fine. What about Mother-Dog? The compiler lets us make the link, but this time the runtime throws a ClassCastException:
class WrongBoundedBaby extends Observable<WrongBoundedMother, String>
class WrongBoundedMother implements Observer<Dog, String>

Exception in thread "main" java.lang.ClassCastException: bounds.WrongBoundedBaby cannot be cast to bounds.Dog
   at bounds.WrongBoundedMother.update(WrongBoundedMother.java:1)
   at java.util.Observable.notifyObservers(Observable.java:142)
   at bounds.WrongBoundedBaby.babyIsCrying(WrongBoundedBaby.java:28)
   at bounds.TestingLegacy.main(TestingLegacy.java:38)


That is still not good enough, as the bug won’t be detected until run-time. We need S to be in scope in Observable so that it can be passed as a parameter to Observer, and we need O to be in scope in Observer so that it can be passed as a parameter to Observable:
class Observable<S extends Observable<S, O, A>, O extends Observer<S, O, A>, A>
interface Observer<S extends Observable<S, O, A>, O extends Observer<S, O, A>, A>

This is the final version of our classes:
class Baby extends Observable<Baby, Mother, String>
class Mother implements Observer<Baby, Mother, String>
class Dog extends Observable<Dog, Father, String>
class Father implements Observer<Dog, Father, String>

// Compile error when adding the father as the baby’s observer
// david.addObserver(john);
// Compile error when adding the mother as the dog’s observer
// rover.addObserver(mom);

The Java Generics and Collections book defines this type of bound as mutually recursive.
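
For readers who want to try the bounds without the stub technique, here is a self-contained sketch. It is not the code from this article’s repository: TypedObservable, TypedObserver and the self() helper are hypothetical, hand-written replacements for java.util.Observable and java.util.Observer, introduced only so that the example compiles and runs on its own. The type parameters S, O and A follow the declarations above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written types; the article itself compiles against
// generic stubs of java.util.Observable and java.util.Observer instead.
interface TypedObserver<S extends TypedObservable<S, O, A>,
                        O extends TypedObserver<S, O, A>, A> {
    void update(S subject, A arg);
}

abstract class TypedObservable<S extends TypedObservable<S, O, A>,
                               O extends TypedObserver<S, O, A>, A> {
    private final List<O> observers = new ArrayList<>();

    // Only observers of the matching type O can be registered
    void addObserver(O observer) { observers.add(observer); }

    // Each concrete subject returns itself, so no unchecked cast to S is needed
    protected abstract S self();

    void notifyObservers(A arg) {
        for (O observer : observers) {
            observer.update(self(), arg);
        }
    }
}

final class Baby extends TypedObservable<Baby, Mother, String> {
    final String name;
    Baby(String name) { this.name = name; }
    @Override protected Baby self() { return this; }
    void babyIsCrying() { notifyObservers(name); }
}

final class Mother implements TypedObserver<Baby, Mother, String> {
    final List<String> log = new ArrayList<>();
    @Override public void update(Baby baby, String babyName) {
        // Both parameters are already typed, so no cast is needed
        log.add("Let's change that diaper on " + babyName + "!");
    }
}

final class Dog extends TypedObservable<Dog, Father, String> {
    final String name;
    Dog(String name) { this.name = name; }
    @Override protected Dog self() { return this; }
    void barks() { notifyObservers(name); }
}

final class Father implements TypedObserver<Dog, Father, String> {
    final List<String> log = new ArrayList<>();
    @Override public void update(Dog dog, String dogName) {
        log.add(dogName + ", let's go to a walk!");
    }
}

final class BoundsDemo {
    static List<String> demo() {
        Baby david = new Baby("David");
        Mother mom = new Mother();
        Dog rover = new Dog("Rover");
        Father john = new Father();

        david.addObserver(mom);
        rover.addObserver(john);
        // david.addObserver(john); // would not compile: Father is not a Mother
        // rover.addObserver(mom);  // would not compile: Mother is not a Father

        david.babyIsCrying();
        rover.barks();

        List<String> out = new ArrayList<>(mom.log);
        out.addAll(john.log);
        return out;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The abstract self() method is one design option among several; it trades a line of boilerplate per subject for the absence of an unchecked `(S) this` cast in the base class.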

In conclusion, the Observer is a great design pattern with many implementations, but a junior developer can easily be tricked into using it wrongly. Adding mutually recursive bounds helps us identify the problem at compile time and sleep well at night, knowing that we did a great job.

For full source code, please see https://github.com/CrisIf/generics.

