How jOOQ Allows for Fluent Functional-Relational Interactions in Java 8

In this year’s Java Advent Calendar, we’re thrilled to have been asked to feature a mini-series showing you a couple of advanced and very interesting topics that we’ve been working on when developing jOOQ.

The series consists of:

Don’t miss any of these!

How jOOQ allows for fluent functional-relational interactions in Java 8

In yesterday’s article, we’ve seen How jOOQ Leverages Generic Type Safety in its DSL when constructing SQL statements. Much more interesting than constructing SQL statements, however, is executing them.

Yesterday, we’ve seen a sample PL/SQL block that reads like this:

FOR rec IN (
SELECT first_name, last_name FROM customers
SELECT first_name, last_name FROM staff
INSERT INTO people (first_name, last_name)
VALUES rec.first_name, rec.last_name;

And you won’t be surprised to see that the exact same thing can be written in Java with jOOQ:

for (Record2<String, String> rec :, CUSTOMERS.LAST_NAME).from(CUSTOMERS)
) {
.values(rec.getValue(CUSTOMERS.FIRST_NAME), rec.getValue(CUSTOMERS.LAST_NAME))

This is a classic, imperative-style PL/SQL inspired approach at iterating over result sets and performing actions 1-1.

Java 8 changes everything!

With Java 8, lambdas appeared, and much more importantly, Streams did, and tons of other useful features. The simplest way to migrate the above foreach loop to Java 8’s “callback hell” would be the following, CUSTOMERS.LAST_NAME).from(CUSTOMERS)
.forEach(rec -> {
.values(rec.getValue(CUSTOMERS.FIRST_NAME), rec.getValue(CUSTOMERS.LAST_NAME))

This is still very simple. How about this. Let’s fetch a couple of records from the database, stream them, map them using some sophisticated Java function, reduce them into a batch update statement! Whew… here’s the code:

.where(, 3))
.map(book -> book.setTitle(book.getTitle().toUpperCase()))
dsl.batch(update(BOOK).set(BOOK.TITLE, (String) null).where(BOOK.ID.eq((Integer) null))),
(batch, book) -> batch.bind(book.getTitle(), book.getId()),
(b1, b2) -> b1

Awesome, right? Again, with comments

// Here, we simply select a couple of books from the database
.where(, 3))

// Now, we stream the result as a Java 8 Stream

// Now we map all book titles using the "sophisticated" Java function
.map(book -> book.setTitle(book.getTitle().toUpperCase()))

// Now, we reduce the books into a batch update statement...

// ... which is initialised with empty bind variables
dsl.batch(update(BOOK).set(BOOK.TITLE, (String) null).where(BOOK.ID.eq((Integer) null))),

// ... and then we bind each book's values to the batch statement
(batch, book) -> batch.bind(book.getTitle(), book.getId()),

// ... this is just a dummy combiner function, because we only operate on one batch instance
(b1, b2) -> b1

// Finally, we execute the produced batch statement

Awesome, right? Well, if you’re not too functional-ish, you can still resort to the “old ways” using imperative-style loops. Perhaps, your coworkers might prefer that:

BatchBindStep batch = dsl.batch(update(BOOK).set(BOOK.TITLE, (String) null).where(BOOK.ID.eq((Integer) null))),

for (BookRecord book :
.where(, 3))
) {
batch.bind(book.getTitle(), book.getId());


So, what’s the point of using Java 8 with jOOQ?

Java 8 might change a lot of things. Mainly, it changes the way we reason about functional data transformation algorithms. Some of the above ideas might’ve been a bit over the top. But the principal idea is that whatever is your source of data, if you think about that data in terms of Java 8 Streams, you can very easily transform (map) those streams into other types of streams as we did with the books. And nothing keeps you from collecting books that contain changes into batch update statements for batch execution.

Another example is one where we claimed that Java 8 also changes the way we perceive ORMs. ORMs are very stateful, object-oriented things that help manage database state in an object-graph representation with lots of nice features like optimistic locking, dirty checking, and implementations that support long conversations. But they’re quite terrible at data transformation. First off, they’re much much inferior to SQL in terms of data transformation capabilities. This is topped by the fact that object graphs and functional programming don’t really work well either.

With SQL (and thus with jOOQ), you’ll often stay on a flat tuple level. Tuples are extremely easy to transform. The following example shows how you can use an H2 database to query for INFORMATION_SCHEMA meta information such as table names, column names, and data types, collect those information into a data structure, before mapping that data structure into new CREATE TABLE statements:

.fetch() // jOOQ ends here
.stream() // Streams start here
r -> r.getTableName(),
r -> r,
(table, columns) -> {
// Just emit a CREATE TABLE statement
"CREATE TABLE " + table + " (");

// Map each "Column" type into a String
// containing the column specification,
// and join them using comma and
// newline. Done!
.map(col -> " " + col.getName() +
" " + col.getType())


The above statement will produce something like the following SQL script:


That’s data transformation! If you’re as excited as we are, read on in this article how this example works exactly.


Java 8 has changed everything in the Java ecosystem. Finally, we can implement functional, transformative algorithms easily using Streams and lambda expressions. SQL is also a very functional and transformative language. With jOOQ and Java 8, you can extend data transformation directly from your type safe SQL result into Java data structures, back into SQL. These things aren’t possible with JDBC. These things weren’t possible prior to Java 8.

jOOQ is free and Open Source for use with Open Source databases, and it offers commercial licensing for use with commercial databases.

For more information about jOOQ or jOOQ’s DSL API, consider these resources:

Stay tuned for tomorrow’s article “How jOOQ helps pretend that your stored procedures are a part of Java”
This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Managing Package Dependencies with Degraph

A large part of the art of software development is keeping the complexity of a system as low as possible. But what is complexity anyway? While the exact semantics vary quite a bit, depending on who you ask, probably most agree that it has a lot to do with the number of parts in a system and their interactions.

Consider a marble in space, i.e a planet, moon or star. Without any interaction this is as boring as a system can get. Nothing happens. If the marble moves, it keeps moving in exactly the same way. To be honest there isn’t even a way to determine if it is moving. Boooring.

Add a second marble to the system and let them attract each other, like earth and moon. Now the system is a more interesting. The two objects circle each other if they aren’t too fast. Somewhat interesting.

Now add a third object. In the general case things go so interesting that we can’t even predict what is going to happen. The whole system didn’t just became complex it became chaotic. You now have a three body problem In the general case this problem cannot be solved, i.e. we cannot predict what will happen with the system. But there are some special cases. Especially the case where two of the objects a very close to each other like earth and moon and the third one is so far away that the two first object behave just like one. In this case you approximate the system with two particle systems.

But what has this to do with Java? This sounds more like physics.

I think software development is similar in some aspects. A complete application is way to complicated to be understood as a whole. To fight this complexity we divide the system into parts (classes) that can be understood on their own, and that hide their inner complexity so that when we look at the larger picture we don’t have to worry about every single code line in a class, but only about the class as one entity. This is actually very similar to what physicists do with systems.

But let’s look at the scale of things. The basic building block of software is the code line. And to keep the complexity in check we bundle code lines that work together in methods. How many code lines go into a single method varies, but it is in the order of 10 lines of code.
Next you gather methods into classes. How many methods go into a single class? Typically in the order of 10 methods!

And then? We bundle 100-10000 classes in a single jar! I hope I’m not the only one who thinks something is amiss.

I’m not sure what comes out of project jigsaw, but currently Java only offers packages as a way to bundle classes. Package aren’t a powerful abstraction, yet it is the only one we have, so we better use it.

Most teams do use packages, but not in a very well structured, but ad hoc way. The result is similar to trying to consider moon and sun as on part of the system, and the earth as the other part. The result might work, but it is probably as intuitive as Ptolemy’s planetary model. Instead decide on criteria how you want to differentiate your packages. I personally call them slicings, inspired by an article by Oliver Gierke. Possible slicings in order of importance are:

  • the deployable jar file the class should end up in
  • the use case / feature / part of the business model the class belongs to
  • the technical layer the class belongs to

The packages this results in will look like this: <domain>.<deployable>.<domain part>.<layer>

It should be easy to decide where a class goes. And it should also keep the packages at a reasonable size, even when you don’t use the separation by technical layer.

But what do you gain from this? It is easier to find classes, but that’s about it. You need one more rule to make this really worth while: There must be no cyclic dependencies!

This means, if a class in a package A references a class in package B no class in B may reference A. This also applies if the reference is indirect via multiple other packages. But that is still not enough. Slices should be cycle free as well, so if a domain part X references a different domain part Y, the reverse dependency must not exist!

This will in deed put some rather strict rules on your package and dependency structure. The benefit of this is, that it becomes very flexible.

Without such a structure splitting your project in multiple parts will probably be rather difficult. Ever tried to reuse part of an application in a different one, just to realize that you basically have to include most of the the application in order to get it to compile? Ever tried to deploy different parts of an application to different servers, just to realize you can’t? It certainly happend to me before I used the approach mentioned above. But with this more strict structure, the parts you may want to reuse, will almost on their own end up on the end of the dependency chain so you can take them and bundle them in their own jar, or just copy the code in a different project and have it compile in very short time.

Also while trying to keep your packages and slices cycle free you’ll be forced to think hard, what each package involved is really about. Something that improved my code base considerably in many cases.

So there is one problem left: Dependencies are hard to see. Without a tool, it is very difficult to keep a code base cycle free. Of course there are plenty of tools that check for cycles, but cleaning up these cycles is tough and the way most tools present these cycles doesn’t help very much. I think what one needs are two things:

  1. a simple test, that can run with all your other tests and fails when you create a dependency circle.
  2. a tool that visualizes all the dependencies between classes, while at the same time showing in which slice each class belongs.

Surprise! I can recommend such a great tool: Degraph! (I’m the author, so I might be biased)

You can write tests in JUnit like this:

.withSlicing("module", "de.schauderhaft.(*).*.**")
.withSlicing("layer", "de.schauderhaft.*.(*).**"),

The test will analyze everything in the classpath that starts with de.schauderhaft. It will slice the classes in two ways: By taking the third part of the package name and by taking the forth part of the package name. So a class name de.schauderhaft.customer.persistence.HibernateCustomerRepository ends up in the module customer and in the layer persistence. And it will make sure that modules, layers and packages are cycle free.

And if it finds a dependency circle, it will create a graphml file, which you can open using the free graph editor yed. With a little layouting you get results like the following where the dependencies that result in circular dependencies are marked in red.

Again for more details on how to achieve good usable layouts I have to refer to the documentation of Degraph.

Also note that the graphs are colored mainly green with a little red, which nicely fits the season!

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Thread local storage in Java

One of the rarely known features among developers is Thread-local storage.  The idea is simple and need for it comes in  scenarios where we need data that is … well local for the thread. If we have two threads we that refer to the same global variable but we wanna them to have separate value independently initialized of each other.

Most major programming languages have implementation of the concept. For example C++11 has even the thread_local keyword, Ruby has chosen an API approach .

Java has also an implementation of the concept with  java.lang.ThreadLocal<T> and its subclass java.lang.InheritableThreadLocal<T> since version 1.2, so nothing new and shiny here.
Let’s say that for some reason we need to have an Long specific for our thread. Using Thread local that would simple be

public class ThreadLocalExample {

  public static class SomethingToRun implements Runnable {

    private ThreadLocal threadLocal = new ThreadLocal();

    public void run() {
      System.out.println(Thread.currentThread().getName() + " " + threadLocal.get());

      try {
      } catch (InterruptedException e) {

      System.out.println(Thread.currentThread().getName() + " " + threadLocal.get());

  public static void main(String[] args) {
    SomethingToRun sharedRunnableInstance = new SomethingToRun();

    Thread thread1 = new Thread(sharedRunnableInstance);
    Thread thread2 = new Thread(sharedRunnableInstance);


One possible sample run of the following code will result into :

Thread-0 null

Thread-0 132466384576241

Thread-1 null

Thread-1 132466394296347
At the beginning the value is set to null to both threads, obviously each of them works with separate values since after setting the value to System.nanoTime() on Thread-0 it will not have any effect on the value of Thread-1 exactly as we wanted, a thread scoped long variable.

One nice side effect is a case where the thread calls multiple methods from various classes. They will all be able to use the same thread scoped variable without major API changes. Since the value is not explicitly passed through one might argue it difficult to test and bad for design, but that is a separate topic altogether.

In what areas are popular frameworks using Thread Locals?

Spring being one of the most popular frameworks in Java uses ThreadLocals internally for many parts, easily shown by a simple github search. Most of the usages are related to the current’s user’s actions or information. This is actually one of the main uses for ThreadLocals in JavaEE world, storing information for the current request like in RequestContextHolder :

private static final ThreadLocal requestAttributesHolder = 
    new NamedThreadLocal<RequestAttributes>("Request attributes");
Or the current JDBC connection user credentials in UserCredentialsDataSourceAdapter.

If we get back on RequestContextHolder we can use this class to access all of the current request information for anywhere in our code.
Common use case for this is  LocaleContextHolder that helps us store the current user’s locale.
Mockito uses it to store the current “global” configuration and if we take a look at any framework out there there is a high chance we’ll find it as well.

Thread Locals and Memory Leaks

We learned this awesome little feature so let’s use it all over the place. We can do that but few google searches and we can find out that most out there say ThreadLocal is evil. That’s not exactly true, it is a nice utility but in some contexts it might be easy to create a memory leak.

“Can you cause unintended object retention with thread locals? Sure you can. But you can do this with arrays too. That doesn’t mean that thread locals (or arrays) are bad things. Merely that you have to use them with some care. The use of thread pools demands extreme care. Sloppy use of thread pools in combination with sloppy use of thread locals can cause unintended object retention, as has been noted in many places. But placing the blame on thread locals is unwarranted.” – Joshua Bloch

It is very easy to create a memory leak in your server code using ThreadLocal if it runs on an application server. ThreadLocal context is associated to the thread where it runs, and will be garbaged once the thread is dead. Modern app servers use pool of threads instead of creating new ones on each request meaning you can end up holding large objects indefinitely in your application.  Since the thread pool is from the app server our memory leak could remain even after we unload our application. The fix for this is simple, free up resources you do not need.

One other ThreadLocal misuse is API design. Often I have seen use of RequestContextHolder(that holds ThreadLocal) all over the place, like the DAO layer for example. Later on if one were to call the same DAO methods outside a request like and scheduler for example he would get a very bad surprise.
This create black magic and many maintenance developers who will eventually figure out where you live and pay you a visit. Even though the variables in ThreadLocal are local to the thread they are very much global in your code. Make sure you really need this thread scope before you use it.

More info on the topic

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

CMS Pipelines … for NetRexx on the JVM

This year I want to tell you about a new and exciting addition to NetRexx (which, incidentally just turned 19 years old the day before yesterday). NetRexx, as some of you know, is the first alternative language for the JVM, stems from IBM, and is free and open source since 2011 ( It is a happy marriage of the Rexx Language (Michael Cowlishaw, IBM, 1979) and the JVM. NetRexx can run compiled ahead of time, as .class files for maximum performance, or interpreted, for a quick development cycle, or very dynamic production of code. After the addition of Scripting in version 3.03 last year, the new release (3.04, somewhere at the end of 2014) include Pipes.

We know what pipes are, I hear you say, but what are Pipes? A Pipeline, also called a Hartmann Pipeline, is a concept that extends and improves pipes as they are known from Unix and other operating systems. The name pipe indicates an inter- process communication mechanism, as well as the programming paradigm it has introduced. Compared to Unix pipes, Hartmann Pipelines offer multiple input- and output streams, more complex pipe topologies, and a lot more, too much for this short article but worthy of your study.

Pipelines were first implemented on VM/CMS, one of IBM’s mainframe operating systems. This version was later ported to TSO to run under MVS and has been part of several product configurations. Pipelines are widely used by VM users, in a symbiotic relationship with REXX, the interpreted language that also has its origins on this platform. Pipes in the NetRexx version are compile by a special Pipes Compiler that has been integrated with NetRexx. The resulting code can run on every platform that has a JVM (Java Virtual Machine), including z/VM and z/OS for that matter. This portable version of Pipelines was started by Ed Tomlinson in 1997 under the name of njpipes, when NetRexx was still very new, and was open sourced in 2011, soon after the NetRexx translator itself. It was integrated into the NetRexx translator in 2014 and will be released integrated in the NetRexx distribution for the first time with version 3.04. It answers the eternal question posed to the development team by every z/VM programmer we ever met: “But … Does It Have Pipes?” It also marks the first time that a non-charge Pipelines product runs on z/OS. But of course most of you will be running Linux, Windows or OSX, where NetRexx and Pipes also run splendidly.

NetRexx users are very cautious of code size and peformance – for example because applications also run on limited JVM specifications as JavaME, in Lego Robots and on Androids and Raspberries, and generally are proud and protective of the NetRexx runtime, which weighs in at 37K (yes, 37 kilobytes, it even shrunk a few bytes over the years). For this reason, the Pipes Compiler and the Stages are packaged in the NetRexxF.jar – F is for Full, and this jar also includes the eclipse Java compiler which makes NetRexx a standalone package that only needs a JRE for development. There is a NetRexxC.jar for those who have a working Java SDK and only want to compile NetRexx. So we have NetRexxR.jar at 37K, NetRexxC.jar at 322K, and the full NetRexx kaboodle in 2.8MB – still small compared to some other JVM Languages.

The pipeline terminology is a metaphore derived from plumbing. Fitting two or more pipe segments together yield a pipeline. Water flows in one direction through the pipeline. There is a source, which could be a well or a water tower; water is pumped through the pipe into the first segment, then through the other segments until it reaches a tap, and most of it will end up in the sink. A pipeline can be increased in length with more segments of pipe, and this illustrates the modular concept of the pipeline. When we discuss pipelines in relation to computing we have the same basic structure, but instead of water that passes through the pipeline, data is passed through a series of programs (stages) that act as filters. Data must come from some place and go to some place. Analogous to the well or the water tower there are device drivers that act as a source of the data, where the tap or the sink represents the place the data is going to, for example to some output device as your terminal window or a file on disk, or a network destination. Just as water, data in a pipeline flows in one direction, by convention from the left to the right.

A program that runs in a pipeline is called a stage. A program can run in more than one place in a pipeline – these occurrences function independent of each other. The pipeline specification is processed by the pipeline compiler, and it must be contained in a character string; on the commandline, it needs to be between quotes, while when contained in a file, it needs to be between the delimiters of a NetRexx string. An exclamation mark (!) is used as stage separator, while the solid vertical bar | can be used as an option when specifiying the local option for the pipe, after the pipe name. When looking a two adjaced segments in a pipeline, we call the left stage the producer and the stage on the right the consumer, with the stage separator as the connector.

A device driver reads from a device (for instance a file, the command prompt, a machine console or a network connection) or writes to a device; in some cases it can both read and write. An example of a device drivers are diskr for diskread and diskw for diskwrite; these read and write data from and to files. A pipeline can take data from one input device and write it to a different device. Within the pipeline, data can be modified in almost any way imaginable by the programmer. The simplest process for the pipeline is to read data from the input side and copy it unmodified to the output side. The pipeline compiler connects these programs; it uses one program for each device and connects them together. All pipeline segments run on their own thread and are scheduled by the pipeline scheduler. The inherent characteristic of the pipeline is that any program can be connected to any other program because each obtains data and sends data throug a device independent standard interface. The pipeline usually processes one record (or line) at a time. The pipeline reads a record for the input, processes it and sends it to the output. It continues until the input source is drained.

Until now everything was just theory, but now we are going to show how to compile and run a pipeline. The executable script pipe is included in the NetRexx distribution to specify a pipeline and to compile NetRexx source that contains pipelines. Pipelines can be specified on the command line or in a file, but will always be compiled to a .class file for execution in the JVM.

 pipe ”(hello) literal ”hello world” ! console”

This specifies a pipeline consisting of a source stage literal that puts a string (“hello world”) into the pipeline, and a console sink, that puts the string on the screen. The pipe compiler will echo the source of the pipe to the screen – or issue messages when something was mistyped. The name of the classfile is the name of the pipe, here specified between parentheses. Options also go there. We call execute the pipe by typing:

java hello

Now we have shown the obligatory example, we can make it more interesting by adding a reverse stage in between:

pipe ”(hello) literal ”hello world” ! reverse ! console

When this is executed, it dutifully types “dlrow olleh”. If we replace the string after literal with arg(), we then can start the hello pipeline with a an argument to reverse: and we run it with:

java hello a man a plan a canal panama

and it will respond:

amanap lanac a nalp a nam a

which goes to show that without ignoring space no palindrome is very convincing – which we can remedy with a change to the pipeline: use the change stage to take out the spaces:

pipe”(hello) literal arg() ! change /” ”// ! console”

Now for the interesting parts. Whole pipeline topologies can be added, webservers can be built, relational databases (all with a jdbc driver) can be queried. For people that are familiar with the z/VM CMS Pipelines product, most of its reference manual is relevant for this implementation. We are working on the new documentation to go with NetRexx 3.04.

Pipes for NetRexx are the work of Ed Tomlinson, Jeff Hennick, with contributions by Chuck Moore, myself, and others. Pipes were the first occasion I have laid eyes on NetRexx, and I am happy they now have found their place in the NetRexx open source distribution. To have a look at it, download the NetRexx source from the Kenai site ( ) and build a 3.04 version yourself. Alternatively, wait until the 3.04 package hits
This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

How and Why is Unsafe used in Java?


sun.misc.Unsafe has been in Java from at least as far back as Java 1.4 (2004).  In Java 9, Unsafe will be hidden along with many other, for-internal-use classes. to improve the maintainability of the JVM.  While it is still unclear exactly what will replace Unsafe, and I suspect it will be more than one thing which replaces it, it raises the question, why is it used at all?

Doing things which the Java language doesn’t allow but are still useful.

Java doesn’t allow many of the tricks which are available to lower level languages.  For most developers this is very good thing, and it not only saves you from yourself, it also saves you from your co-workers.  It also makes it easier to import open source code because you know there is limits to how much damage they can do.  Or at least there is limits to how much you can do accidentally. If you try hard enough you can still do damage.

But why would you even try, you might wonder?  When building libraries many (but not all) of the methods in Unsafe are useful and in some cases, there is no other way to do the same thing without using JNI, which is even more dangerous and you lose the “compile once, run anywhere”

Deserialization of objects

When deserializing or building an object using a framework, you make the assumption you want to reconstitute an object which existed before.  You expect that you will use reflection to either call the setters of the class, or more likely set the internal fields directly, even the final fields.  The problem is you want to create an instance of an object, but you don’t really need a constructor as this is likely to only make things more difficult and have side effects.
public class A implements Serializable {
private final int num;
public A(int num) {
System.out.println("Hello Mum");
this.num = num;

public int getNum() {
return num;

In this class, you should be able to rebuild and set the final field, but if you have to call a constructor and it might do things which don’t have anything to do with deserialization.  For these reasons many libraries use Unsafe to create instances without calling a constructor

Unsafe unsafe = getUnsafe();
Class aClass = A.class;
A a = (A) unsafe.allocateInstance(aClass);

Calling allocateInstance avoids the need to call the appropriate constructor, when we don’t need one.

Thread safe access to direct memory

Another use for Unsafe is thread safe access to off heap memory.  ByteBuffer gives you safe access to off heap or direct memory, however it doesn’t have any thread safe operations.  This is particularly useful if you want to share data between processes.

import sun.misc.Unsafe;

import java.lang.reflect.Field;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class PingPongMapMain {
public static void main(String... args) throws IOException {
boolean odd;
switch (args.length < 1 ? "usage" : args[0].toLowerCase()) {
case "odd":
odd = true;
case "even":
odd = false;
System.err.println("Usage: java PingPongMain [odd|even]");
return; }
int runs = 10000000;
long start = 0;
System.out.println("Waiting for the other odd/even");
File counters = new File(System.getProperty(""), "counters.deleteme"); counters.deleteOnExit();

try (FileChannel fc = new RandomAccessFile(counters, "rw").getChannel()) {
MappedByteBuffer mbb =, 0, 1024);
long address = ((DirectBuffer) mbb).address();
for (int i = -1; i < runs; i++) {
for (; ; ) {
long value = UNSAFE.getLongVolatile(null, address);
boolean isOdd = (value & 1) != 0;
if (isOdd != odd)
// wait for the other side.
// make the change atomic, just in case there is more than one odd/even process
if (UNSAFE.compareAndSwapLong(null, address, value, value + 1))
if (i == 0) {
start = System.nanoTime();
System.out.printf("... Finished, average ping/pong took %,d ns%n",
(System.nanoTime() - start) / runs);

static final Unsafe UNSAFE;

static {
try {
Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
UNSAFE = (Unsafe) theUnsafe.get(null);
} catch (Exception e) {
throw new AssertionError(e);

When you run this in two programs, one with odd and the other with even. You can see that each process is changing data via  persisted shared memory.

In each program it maps the same are of the disks cache into the process.  There is actually only one copy of the file in memory.  This means the memory can be shared, provided you use thread safe operations such as the volatile and CAS operations.

The output on an i7-3970X is

Waiting for the other odd/even
… Finished, average ping/pong took 83 ns

That is 83 ns round trip time between two processes. When you consider System V IPC takes around 2,500 ns and IPC volatile instead of persisted, that is pretty quick.

Is using Unsafe suitable for work?

I wouldn’t recommend you use Unsafe directly.  It requires far more testing than natural Java development.  For this reason I suggest you use a library where it’s usage has been tested already.  If you wan to use Unsafe yourself, I suggest you thoughly test it’s usage in a stand alone library.  This limits how Unsafe is used in your application and give syou a safer, Unsafe.


It is interesting that Unsafe exists in Java, and you might to play with it at home.  It has some work applications especially in writing low level libraries, but in general it is better to use a library which uses Unsafe which has been tested than use it directly yourself.

About the Author.

Peter Lawrey has the most Java answers on StackOverflow. He is the founder of the Performance Java User’s Group, and lead developer of Chronicle Queue and Chronicle Map, two libraries which use Unsafe to share persisted data between processes.

Self-healing applications: are they real?

This post is an example about an application where the first solution to each and every IT problem – “have you tried turning it off and on again” – can actually do more harm than good. Instead, we have an application that can literally heal itself: it fails at the beginning, but starts running smoothly after some time. To give an example of such application in action, we recreated it in the most simple form possible, gathering inspiration from what is now a five-year old post from the Heinz Kabutz’ Java Newsletter:

package eu.plumbr.test;

public class HealMe {
private static final int SIZE = (int) (Runtime.getRuntime().maxMemory() * 0.6);

public static void main(String[] args) throws Exception {
for (int i = 0; i < 1000; i++) {

private static void allocateMemory(int i) {
try {
byte[] bytes = new byte[SIZE];

byte[] moreBytes = new byte[SIZE];

System.out.println("I allocated memory successfully " + i);

} catch (OutOfMemoryError e) {
System.out.println("I failed to allocate memory " + i);


The code above is allocating two bulks of memory in a loop. Each of those allocation is equal to 60% of the total available heap size. As the allocations occur sequentially in the same method, then one might expect this code to keep throwing java.lang.OutOfMemoryError: Java heap space errors and never successfully complete the allocateMemory() method.

Let us start with the static analysis of the source code:

  1. From the first fast examination, this code really cannot complete, because we try to allocate more memory than is available to JVM.
  2. If we look closer we can notice that the first allocation takes place in a scoped block, meaning that the variables defined in this block are visible only to this block. This indicates that the bytes should be eligible for GC after the block is completed. And so our code should in fact run fine right from the beginning, as at the time when it tries to allocate moreBytes the previous allocation bytes should be dead.
  3. If we now look into the compiled classfile, we will see the following bytecode:

    private static void allocateMemory(int);
    0: getstatic #3 // Field SIZE:I
    3: newarray byte
    5: astore_1
    6: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
    9: aload_1
    10: arraylength
    11: invokevirtual #5 // Method java/io/PrintStream.println:(I)V
    14: getstatic #3 // Field SIZE:I
    17: newarray byte
    19: astore_1
    20: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
    23: aload_1
    24: arraylength
    25: invokevirtual #5 // Method java/io/PrintStream.println:(I)V
    ---- cut for brevity ----

    Here we see, that on offsets 3-5 first array is allocated and stored into local variable with index 1. Then, on offset 17 another array is going to be allocated. But first array is still referenced by local variable and so the second allocation should always fail with OOM. Bytecode interpreter just cannot let GC clean up first array, because it is still strongly referenced.

Our static code analysis has shown us that for two underlying reasons, the presented code should not run successfully and in one case, it should. Which one out of those three is the correct one? Let us actually run it and see for ourselves. It turns out that both conclusions were correct. First, application fails to allocate memory. But after some time (on my Mac OS X with Java 8 it happens at iteration #255) the allocations start succeeding:

java -Xmx2g eu.plumbr.test.HealMe
I failed to allocate memory 0
I failed to allocate memory 1

… cut for brevity ...

I failed to allocate memory 254
I failed to allocate memory 255
I allocated memory successfully 256
I allocated memory successfully 257

Self-healing code is a reality! Skynet is near…

In order to understand what is really happening we need to think, what changes during program execution? The obvious answer is, of course, Just-In-Time compilation can occur. If you recall, Just-In-Time compilation is a JVM built-in mechanics to optimize code hotspots. For this, the JIT monitors the running code and when a hotspot is detected, JIT compiles your bytecode into native code, performing different optimizations such as method inlining and dead code elimination in the process.

Lets see if this is a case by turning on the following command line options and relaunching the program

-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:+LogCompilation

This will generate a log file, in our case named hotspot_pid38139.log, where 38139 was the PID of your java process. In this file the following line can be found:

<task_queued compile_id='94' method='HealMe allocateMemory (I)V' bytes='83' count='256' iicount='256' level='3' stamp='112.305' comment='tiered' hot_count='256'/>

This means, that after executing “allocateMemory” methods 256 times, C1 compiler has decided to queue this method for C1 tier 3 compilation. You can get more information about tiered compilation’s levels and different thresholds here. And so our first 256 iterations were run in interpreted mode, where bytecode interpreter, being a simple stack machine, cannot know in advance, if some variable, bytes in this case, will be used further on or not. But JIT sees the whole method at once and so can deduce than bytes will not be used anymore and is, in fact, GC eligible. Thus the garbage collection can eventually take place and our program has magically self-healed. Now, I can only hope none of the readers should actually be responsible for debugging such a case in production. But in case you wish to make someone’s life a misery, introducing code like this to production would be a sure way to achieve this.

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Own your heap: Iterate class instances with JVMTI

Today I want to talk about a different Java that most of us don’t see and use every day, to be more exact about lower level bindings, some native code and how to perform some small magic. Albeit we won’t get to the true source of magic on JVM, but performing some small miracles is within a reach of a single post.

I spend my days researching, writing and coding on the RebelLabs team at ZeroTurnaround, a company that creates tools for Java developers that mostly run as javaagents. It’s often the case that if you want to enhance the JVM without rewriting it or get any decent power on the JVM you have to dive into the beautiful world of Java agents. These come in two flavors: Java javaagents and native ones. In this post we’ll concentrate on the latter.

Note, this GeeCON Prague presentation by Anton Arhipov, who is an XRebel product lead, is a good starting point to learn about javaagents written entirely in Java: Having fun with Javassist.

In this post we’ll create a small native JVM agent, explore the possibility of exposing native methods into the Java application and find out how to make use of the Java Virtual Machine Tool Interface.

If you’re looking for a practical takeaway from the post, we’ll be able to, spoiler alert, count how many instances of a given class are present on the heap.

Imagine that you are Santa’s trustworthy hacker elf and the big red has the following challenge for you:
Santa: My dear Hacker Elf, could you write a program that will point out how many Thread objects are currently hidden in the JVM’s heap?
Another elf that doesn’t like to challenge himself would answer: It’s easy and straightforward, right?

return Thread.getAllStackTraces().size();

But what if we want to over-engineer our solution to be able to answer this question about any given class? Say we want to implement the following interface?

public interface HeapInsight {
int countInstances(Class klass);

Yeah, that’s impossible, right? What if you receive String.class as an argument? Have no fear, we’ll just have to go a bit deeper into the internals on the JVM. One thing that is available to JVM library authors is JVMTI, a Java Virtual Machine Tool Interface. It was added ages ago and many tools, that seem magical, make use of it. JVMTI offers two things:

  • a native API
  • an instrumentation API to monitor and transform the bytecode of classes loaded into the JVM.

For the purpose of our example, we’ll need access to the native API. What we want to use is the IterateThroughHeap function, which lets us provide a custom callback to execute for every object of a given class.

First of all, let’s make a native agent that will load and echo something to make sure that our infrastructure works.

A native agent is something written in a C/C++ and compiled into a dynamic library to be loaded before we even start thinking about Java. If you’re not proficient in C++, don’t worry, plenty of elves aren’t, and it won’t be hard. My approach to C++ includes 2 main tactics: programming by coincidence and avoiding segfaults. So since I managed to write and comment the example code for this post, collectively we can go through it. Note: the paragraph above should serve as a disclaimer, don’t put this code into any environment of value to you.

Here’s how you create your first native agent:


using namespace std;

JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *jvm, char *options, void *reserved)
cout << "A message from my SuperAgent!" << endl;
return JNI_OK;

The important part of this declaration is that we declare a function called Agent_OnLoad, which follows the documentation for the dynamically linked agents.

Save the file as, for example a native-agent.cpp and let’s see what we can do about turning into a library.

I’m on OSX, so I use clang to compile it, to save you a bit of googling, here’s the full command:

clang -shared -undefined dynamic_lookup -o -I /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/include/ -I /Library/Java/JavaVirtualMachines/jdk1.8.0.jdk/Contents/Home/include/darwin native-agent.cpp

This creates an file that is a library ready to serve us. To test it, let’s create a dummy hello world Java class.

package org.shelajev;
public class Main {
public static void main(String[] args) {
System.out.println("Hello World!");

When you run it with a correct -agentpath option pointing to the, you should see the following output:

java org.shelajev.Main
A message from my SuperAgent!
Hello World!

Great job! We now have everything in place to make it actually useful. First of all we need an instance of jvmtiEnv, which is available through a JavaVM *jvm when we are in the Agent_OnLoad, but is not available later. So we have to store it somewhere globally accessible. We do it by declaring a global struct to store it.


using namespace std;

typedef struct {
jvmtiEnv *jvmti;
} GlobalAgentData;

static GlobalAgentData *gdata;

JNIEXPORT jint JNICALL Agent_OnLoad(JavaVM *jvm, char *options, void *reserved)
jvmtiEnv *jvmti = NULL;
jvmtiCapabilities capa;
jvmtiError error;

// put a jvmtiEnv instance at jvmti.
jint result = jvm->GetEnv((void **) &jvmti, JVMTI_VERSION_1_1);
if (result != JNI_OK) {
printf("ERROR: Unable to access JVMTI!n");
// add a capability to tag objects
(void)memset(&capa, 0, sizeof(jvmtiCapabilities));
capa.can_tag_objects = 1;
error = (jvmti)->AddCapabilities(&capa);

// store jvmti in a global data
gdata = (GlobalAgentData*) malloc(sizeof(GlobalAgentData));
gdata->jvmti = jvmti;
return JNI_OK;

We also updated the code to add a capability to tag objects, which we’ll need for iterating through the heap. The preparations are done now, we have the JVMTI instance initialized and available for us. Let’s offer it to our Java code via a JNI.

JNI stands for Java Native Interface, a standard way to include native code calls into a Java application. The Java part will be pretty straightforward, add the following countInstances method definition to the Main class:

package org.shelajev;

public class Main {
public static void main(String[] args) {
System.out.println("Hello World!");
int a = countInstances(Thread.class);
System.out.println("There are " + a + " instances of " + Thread.class);

private static native int countInstances(Class klass);

To accommodate the native method, we must change our native agent code. I’ll explain it in a minute, but for now add the following function definitions there:

extern "C"
JNICALL jint objectCountingCallback(jlong class_tag, jlong size, jlong* tag_ptr, jint length, void* user_data)
int* count = (int*) user_data;
*count += 1;

extern "C"
JNIEXPORT jint JNICALL Java_org_shelajev_Main_countInstances(JNIEnv *env, jclass thisClass, jclass klass)
int count = 0;
jvmtiHeapCallbacks callbacks;
(void)memset(&callbacks, 0, sizeof(callbacks));
callbacks.heap_iteration_callback = &objectCountingCallback;
jvmtiError error = gdata->jvmti->IterateThroughHeap(0, klass, &callbacks, &count);
return count;

Java_org_shelajev_Main_countInstances is more interesting here, its name follows the convention, starting with Java_ then the _ separated fully qualified class name, then the method name from the Java code. Also, don’t forget the JNIEXPORT declaration, which says that the function is exported into the Java world.

Inside the Java_org_shelajev_Main_countInstances we specify the objectCountingCallback function as a callback and call IterateThroughHeap with the parameters that came from the Java application.

Note that our native method is static, so the arguments in the C counterpart are:

JNIEnv *env, jclass thisClass, jclass klass

for an instance method they would be a bit different:

JNIEnv *env, jobj thisInstance, jclass klass

where thisInstance points to the this object of the Java method call.

Now the definition of the objectCountingCallback comes directly from the documentation. And the body does nothing more than incrementing an int.

Boom! All done! Thank you for your patience. If you’re still reading this, you’re ready to test all the code above.

Compile the native agent again and run the Main class. This is what I see:

java org.shelajev.Main
Hello World!
There are 7 instances of class java.lang.Thread

If I add a Thread t = new Thread(); line to the main method, I see 8 instances on the heap. Sounds like it actually works. Your thread count will almost certainly be different, don’t worry, it’s normal because it does count JVM bookkeeping threads, that do compilation, GC, etc.

Now, if I want to count the number of String instances on the heap, it’s just a matter of changing the argument class. A truly generic solution, Santa would be happy I hope.

Oh, if you’re interested, it finds 2423 instances of String for me. A pretty high number for such as small application. Also,

return Thread.getAllStackTraces().size();

gives me 5, not 8, because it excludes the bookkeeping threads! Talk about trivial solutions, eh?

Now you’re armed with this knowledge and this tutorial I’m not saying you’re ready to write your own JVM monitoring or enhancing tools, but it is definitely a start.

In this post we went from zero to writing a native Java agent, that compiles, loads and runs successfully. It uses the JVMTI to obtain the insight into the JVM that is not accessible otherwise. The corresponding Java code calls the native library and interprets the result.
This is often the approach the most miraculous JVM tools take and I hope that some of the magic has been demystified for you.

What do you think, does it clarify agents for you? Let me know! Find me and chat with me on twitter: @shelajev.

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Mutation testing in Java with PIT and Gradle

A project might find itself in different stages of unit test line and branch coverage.
But what does that really say about a project? If it’s 100% covered, is it really tested?
Or as wikipedia puts it :

Tests can be created to verify the correctness of the implementation of a given software system, but the creation of tests still poses the question whether the tests are correct and sufficiently cover the requirements that have originated the implementation. 

Whether intentionally doing a poor job, or just being sloppy, coverage percentages just serve as a nice tool to lie bring reassurances to oneself or to the management about quality.
The real measure of quality are the tests quality AND the amount covered.
So this begs the question, besides reviewing the tests alongside the code and giving it some subjective grade, is there such a tool that can speak volumes about the quality of tests ?
The answer is Yes: Mutation testing
I won’t bore you with the details you can find out more on the provided links, but the basic idea is :
If you modify the code in some non-equivalent way and the test still pass, you have bad tests, and some programming error down the line won’t be caught by the tests.
Fortunately some nice folks are working on a really cool project called PIT (github), which is a mutation tester for Java.
I encourage you to read everything on the site including the media links and try it out.
For the purpose of this article I will provide an example project for showcasing mutation testing with PIT, including a gradle build script. This is made possible by the PIT gradle plugin 
Github for this post here : game-of-life-mutation-test

I choose to implement Conway’s Game of Life, inspired by the Global Day of Code Retreat. One of the tests is commented. Let’s assume in a real world scenario we would have written all the tests there, without that one.
Coverage tools report 100% line coverage and branch coverage ( ignoring the equals method )
Eclipse with EclEmma and Intellij Coverage report : 
At this point you might call it a day and a job well done. Test look preety thorough, and there are a lot of them, the code is tested, but are the tests solid?
Turns out they are not. If we run the PIT tester
gradle pitest

reports will be generated on the path

PIT has detected that if we negate the condition test pass all the same.
It would be easy to make this mistake, or some other developer to alter the code in such a subtle way and no one to notice until one day when the code will not work as expected in a production environment.
To solve this uncomment the glider test ( a more complex test for GoL ).
If you trully believe you have quality tests, consider running mutation testing for an extra verification.
This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Recompiling the Java runtime library with debug symbols

The JDK already comes with the sources for the Java runtime library (also known as “rt.jar”) and also the class files are compiled with “LineNumberTable” which makes it possible to set breakpoints on a specified line. However the class files are not compiled with “LocalVariableTable” which means that even when you’re stopped at a breakpoint, you can only look at the method parameters, not at the local variables (hint you can use javap to check if a certain class file contains LineNumberTable, LocalVariableTable, both or neither). Probably this was done to save space, which is understandable in the case of the JRE, but for the JDK it’s annoying.

Fortunately you can fix this! Just use the ant script at the end of this post (or get it here) to build rt_debug.jar. After having built it, you have multiple options to make use of it:

  • Put it in your jre/lib/endorsed directory (create it if doesn’t exists)
  • Specify it using -Xbootclasspath
  • Override jre/lib/rt.jar with it (after making a backup of course!)

I personally used the first method. The only downside is that classes appear twice in the Eclipse type explorer (once for rt.jar and once for rt_debug.jar. Perhaps in the future we will get the rt.jar for JDK compiled with full debug symbols by default.

Final note: the ant script relies on having the JAVA_HOME variable available, so you might need to specify it by hand in case it isn’t set. For example on Ubuntu 14.10 the full command to be run is:

JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 ant -f debug_jdk.xml

Ant script:

Credits for inspiration:

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

2013 – The Java/JVM Year in Review and a Look to the Future!

Hi all, It’s been an amazing year for Java/JVM developers!

For me its been a year of fine tuning the Adopt OpenJDK and Adopt a JSR programmes with the LJC and working hard at jClarity – The Java/JVM performance analysis company that I’ve now been demoted to CEO of ;-).

In terms of the overall ecosystem there’s been some great steps forward:

  1. The language and platform continues to evolve (e.g. Java 8 is looking in great shape!).
  2. The standards body continues to become more inclusive and open (e.g. More Java User Groups have joined the Executive Committee of the JCP).
  3. The community continues to thrive (record numbers of conferences, Java User Groups and GitHub commits).

As is traditional, I’ll try and take you through some of the trends of the past year that was and a small peek into the future as well.

2013 Trends

There were some major trends in the Java/JVM and software development arena in 2013. Mobile Apps have become the norm, virtualised infrastructure is everywhere and of course Java is still growing faster (in absolute terms) than any other language out there except for Javascript. Here are some other trends worth noting.

Functional Programming

Mainstream Java developers have adopted Scala, Groovy, Clojure and other JVM languages to write safer, more concise code for call back handlers, filters, reduces and a host of other operations on collections and event handlers. A massive wave of developers will of course embrace Java 8 and Lambdas when it comes out in early 2014.

Java moving beyond traditional web apps

In 2013 Java firmly entrenched itself into areas of software development outside of the traditional 3-tier web application. It is used realm of NoSQL (e.g. Hadoop), High performance Messaging systems (e.g. LMAX Disruptor), Polyglot Actor-Like Application Frameworks (e.g. Vert.x) and much much more.


2013 has started to see the slow decline of Java/JVM serverside templating systems as the rise of Javascript libraries and micro frameworks (e.g. AngularJS, Twitter Bootstrap) means that developers can finally have a client side tempalting framework that doesn’t suck and has real data binding to JSON/XML messages going to and from the browser. Websockets and asynchronous messaging are also nicely taken care of by the likes of SockJS.

2014+ – The Future

Java 8 – Lambdas

More than enough has been written about these, so I’ll offer up a practical tutorial and the pragmatic Java 8 Lambdas Book for 2014 that you should be buying!

Java 9/10 – VM improvements

Java 9 and 10 are unlikely to bring many changes to the language itself, apart from applying Java 7 and 8 features to the internal APIs, making the overall Java API a much more pleasant experience to use.

So what is coming? Well the first likely major feature will be some sort of packed object or value type representation. This is where you will be able to defined a Java ‘object’ as being purely a data type. What do I mean by this? Well it means that the ‘object’ will be able to be laid out in memory very efficiently and not have the overhead of hashCode, equals or other regalia that comes with being a Java Object. Think of it a little bit like a C/C++ struct. Here’s an example of what it might look like in source code:

final class PackedPoint extends PackedObject {
int x, y;

And the native representation:

// Native representation
struct Point {
int x, y;

There are loads of performance advantages to this, especially for immutable, well defined data structures such as points on a graph or within a 3D model. The JVM will be able to use far more efficient housekeeping mechanisms with respects to Memory Layout, Lookups, JIT and Garbage Collection.

Some sort of Reified generics will also come into the picture, but it’s more likely to be an extension of the work done of the ‘packed object’ idea and removing the need to unnecessary overhead in maintaining and the manipulation of collections that use the Object representation of primitives (e.g. List).

Java 9/10 – ME/SE convergence

It was very clear at JavaOne this year that Oracle are going to try and make Java a player in the IoT space. The opportunuites are endless, but the JVM has some way to go to find the right balance bewtwen being able to work on really small decvices (which is where Java ME and CDLC sit today) and providing a rich language features and a full stack for Java developers (which is where Java SE and EE sit today).

The merger between ME and SE begins in earnest for Java 8 and will hopefully be completed by Java 9. I encourage all of oyu to try out early versions of this on your Raspberry Pi!

But wait there’s more!

The way Java itself is being build and maintained is changing itself with a move towards more open development at OpenJDK. Features such as Tail Call Recursion and Co-routines are being discussed and there is a port on the way to have Java natively work on graphics cards.

There’s plenty to challenge yourself with and also have a lot of fun in 2014! Me? I’m going to try and get Java running on my Pi and maybe build that AngularJS front end for controlling hardware in my house…..

Happy Holidays everyone!

Martijn (@karianna) Java Champion, Speaker, Author, Cat Herder etc

Meta: this post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!