Java Advent 2014 wrap-up

As they say, better late than never, here is the full list of the 2014 posts, by date:

  • December 5: JDK 9 – a letter to Santa?!
  • December 6: Recompiling java runtime library with debug symbols
  • December 7: Mutation testing in java with PIT and gradle
  • December 8: Creating REST API with Spring Boot and MongoDB
  • December 9: Own your heap: Iterate class instances with JVMTI
  • December 10: Self healing applications are they real?
  • December 11: Developers want to be heard! and How and Why is Unsafe used in Java?
  • December 12: Lightweight Integration with Java EE and Camel
  • December 13: A serpentine path to music
  • December 14: CMS pipelines for NetRexx on JVM
  • December 15: Thread local storage in Java
  • December 16: Managing Package Dependencies with Degraph
  • December 17: How jOOQ Leverages Generic Type Safety in its DSL
  • December 18: How jOOQ Allows for Fluent Functional-Relational Interactions in Java 8
  • December 19: How jOOQ Helps Pretend that Your Stored Procedures are a Part of Java
  • December 20: How is Java/JVM built? Adopt OpenJDK is your answer!
  • December 21: Doing microservices with micro-infra
  • December 22: A persistent KeyValue Server in 40 lines and a sad fact
  • December 23: The Java Ecosystem – My top 5 highlights of 2014
  • December 24: A Musical Finale

Merry Christmas everyone!

This is the first year of the Java Advent Project and I am really grateful to all the people that got involved, published articles, tweeted, shared, +1ed and so on.
It was an unbelievable journey and all the glory needs to go to the people who took some time from their loved ones to give us their wisdom. As they say, the Class of 2014 of Java Advent is comprised of (in order of publishing date):

Thank you girls and guys for making it happen once more. And sorry for all the stressing and pushing. Also, last but not least, thanks to Voxxed editors Lucy Carey and Mite Mitreski.

A Musical Finale

What could be more fitting than Christmas music for Christmas Eve?

In this post I want to discuss the joy of making music with Java and why/how I have come to use Python…

But first, let us celebrate the season!

We are all human and irrespective of our beliefs, it seems we all enjoy music of some form. For me some of the most beautiful music of all was written by Johann Sebastian Bach. Between 1708 and 1717 he wrote a set of pieces which are collectively called Orgelbüchlein (Little Organ Book). For this post, and to celebrate the Java Advent Calendar, I tasked Sonic Field with playing this piece of music, modelling the sounds of an 18th century pipe organ. If you did not know: yes, some German organs of about that time really were able to produce huge sounds with reed pipes (listen, for example, to the Passacaglia and Fugue played on the Trost organ). The piece here is a ‘Choral Prelude’, which is based on what in English we would commonly call a Carol, to be sung by an ensemble.

BWV 610 Jesu, meine Freude [Jesus, my joy]
This performance is dedicated to the Java Advent Calendar and was created exclusively on the JVM using pure mathematics.
How was this piece created?
Step one is to transcribe the score into midi. Fortunately, someone else already did this for me using automated score reading software. Not so fortunately, this software makes all sorts of mistakes which have to be fixed. The biggest issue with automatically generated midi files is that they end up with overlapped notes on the same channel; that is strictly impossible in midi and ends up with an ambiguous interpretation of what the sound should be. Midi considers audio as note on, note off. So Note On, Note On, Note Off, Note Off is ambiguous; does it mean:

One note overlapping the next or:
—————–
             —————

One note entirely contained in a longer note?
—————————-
             —-

Fortunately, tricks can be used to try and figure this out based on note length etc. The Java decoder always treats notes as fully contained. The Python method looks for very short notes which are contained in long ones and guesses the real intention was two long notes which ended up overlapped slightly. Here is the python (the Java is here on github).


def repareOverlapMidi(midi, blip=5):
    print "Interpretation Pass"
    mute = True
    while mute:
        endAt = len(midi) - 1
        mute = False
        index = 0
        midiOut = []
        this = []
        next = []
        print "Demerge pass:", endAt
        midi = sorted(midi, key=lambda tup: tup[0])
        midi = sorted(midi, key=lambda tup: tup[3])
        while index < endAt:
            this = midi[index]
            next = midi[index + 1]
            ttickOn, ttickOff, tnote, tkey, tvelocity = this
            ntickOn, ntickOff, nnote, nkey, nvelocity = next

            # Merge interpretation
            finished = False
            dif = (ttickOff - ttickOn)
            if dif < blip and tkey == nkey and ttickOff >= ntickOn and ttickOff <= ntickOff:
                print "Separating: ", this, next, " Diff: ", (ttickOff - ntickOn)
                midiOut.append([ttickOn, ntickOn, tnote, tkey, tvelocity])
                midiOut.append([ttickOff, ntickOff, nnote, nkey, nvelocity])
                index += 1
                mute = True
            elif dif < blip:
                print "Removing blip: ", (ttickOff - ttickOn)
                index += 1
                mute = True
                continue
            else:
                midiOut.append(this)
            # iterate the loop
            index += 1
            if index == endAt:
                midiOut.append(next)
        if not mute:
            return midiOut
        midi = midiOut

[This AGPL code is on Github]

Then comes some real fun. If you know the original piece, you might have noticed that the introduction is not original. I added that in the midi editing software Aria Maestosa. It does not need to be done this way; we do not even need to use midi files. A lot of the music I have created in Sonic Field is coded directly in Python. However, working from midi is how it was done here.

Once we have a clean set of notes they need to be converted into sounds. That is done with ‘voicing’. I will talk a little about that to set the scene then we can get back into more Java code oriented discussion. After all, this is the Java advent calendar!

Voicing is exactly the sort of activity which brings Python to the fore. Java is a wordy language which has a large degree of strictness. It favours well constructed, stable structures. Python relies on its clean syntax rules and layout and the principle of least astonishment. For me, this Pythonic approach really helps with the very human process of making a sound:


def chA():
    global midi, index
    print "##### Channel A #####"
    index += 1
    midi = shorterThanMidi(midis[index], beat, 512)
    midi = dampVelocity(midi, 80, 0.75)
    doMidi(voice=leadDiapason, vCorrect=1.0)
    postProcess()

    midi = longAsMidi(midis[index], beat, 512)
    midi = legatoMidi(midi, beat, 128)
    midi = dampVelocity(midi, 68, 0.75)
    midi = dampVelocity(midi, 80, 0.75)
    doMidi(voice=orchestralOboe, vCorrect=0.35, flatEnv=True, pan=0.2)
    postProcessTremolate(rate=3.5)
    doMidi(voice=orchestralOboe, vCorrect=0.35, flatEnv=True, pan=0.8)
    postProcessTremolate(rate=4.5)

Above is a ‘voice’. Contrary to what one might think, a synthesised sound does not often consist of just one sound source. It consists of many. A piece of music might have many ‘voices’ and each voice will be a composite of several sounds. To create just the one voice above I have split the notes into long notes and short notes. Then the actual notes are created by a call to doMidi. This takes advantage of Python’s ‘named arguments with default values’ feature. Here is the signature for doMidi:


def doMidi(voice,vCorrect,pitchShift=1.0,qFactor=1.0,subBass=False,flatEnv=False,pure=False,pan=-1,rawBass=False,pitchAdd=0.0,decay=False,bend=True):

The most complex (unsurprisingly) voice to create is that of a human singing. I have been working on this for a long time and there is a long way to go; however, here is a spectrogram of a piece of music which does a passable job of sounding like someone singing.

The first argument is actually a reference to a function which will create the basic tone. The rest of the arguments describe how that tone will be manipulated in the note formation. Whilst an approach like this can be mimicked using a builder pattern in Java, that language does not lend itself to the ‘playing around’ nature of Python (at least for me).

For example, I could just run the script, add flatEnv=True to the arguments, run it again and compare the two sounds. It is an intuitive way of working.
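For contrast, here is a minimal sketch of what mimicking those named arguments with a Java builder might look like. All the names here are hypothetical illustrations, not Sonic Field's actual API:

// Hypothetical builder mimicking Python's named arguments with defaults.
public class DoMidiCall {
    private double vCorrect = 1.0;    // defaults mirror the Python signature
    private boolean flatEnv = false;
    private double pan = -1;

    public DoMidiCall vCorrect(double v) { this.vCorrect = v; return this; }
    public DoMidiCall flatEnv(boolean b) { this.flatEnv = b; return this; }
    public DoMidiCall pan(double p)      { this.pan = p;      return this; }

    public void play() { /* create the notes using the accumulated settings */ }
}

// Usage – notably more ceremony than doMidi(voice=..., flatEnv=True):
new DoMidiCall().vCorrect(0.35).flatEnv(true).pan(0.2).play();

It works, but the extra class and methods are exactly the kind of ceremony that gets in the way of quick experimentation.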

Anyhow, once each voice has been composited from many tones and tweaked into the shape and texture we want, they turn up as a huge list of lists of notes which are all mixed together and written out to disk as a flat file format which is basically just a dump of the underlying double data. At this point it sounds terrible! Making the notes is often only half the story.

Voice Synthesis by Sonic Field
played specifically for this post.

You see, real sounds happen in a space. Our Choral is expected to be performed in a church. Notes played without a space around them sound completely artificial and lack any interest. To solve this we use impulse response reverberation. The mathematics behind this is rather complex and so I will not go into it in detail. However in the next section I will start to look at this as a perfect example of why Java is not only necessary but ideal as the back end to Python/Jython.

You seem to like Python, Alex – Why Bother With Java?

My post might seem a bit like a Python sales job so far. What has been happening is simply a justification of using Python when Java is so good as a language (especially when written in a great IDE like Eclipse). Let us look at something Python would be very bad indeed at. Here is the code for performing the Fast Fourier Transform, which is at the heart of putting sounds into a space.


package com.nerdscentral.audio.pitch.algorithm;

public class CacheableFFT
{

    private final int n, m;

    // Lookup tables. Only need to recompute when size of FFT changes.
    private final double[] cos;
    private final double[] sin;
    private final boolean forward;

    public boolean isForward()
    {
        return forward;
    }

    public int size()
    {
        return n;
    }

    public CacheableFFT(int n1, boolean isForward)
    {
        this.forward = isForward;
        this.n = n1;
        this.m = (int) (Math.log(n1) / Math.log(2));

        // Make sure n is a power of 2
        if (n1 != (1 << m)) throw new RuntimeException(Messages.getString("CacheableFFT.0")); //$NON-NLS-1$

        cos = new double[n1 / 2];
        sin = new double[n1 / 2];
        double dir = isForward ? -2 * Math.PI : 2 * Math.PI;

        for (int i = 0; i < n1 / 2; i++)
        {
            cos[i] = Math.cos(dir * i / n1);
            sin[i] = Math.sin(dir * i / n1);
        }
    }

    public void fft(double[] x, double[] y)
    {
        int i, j, k, n1, n2, a;
        double c, s, t1, t2;

        // Bit-reverse
        j = 0;
        n2 = n / 2;
        for (i = 1; i < n - 1; i++)
        {
            n1 = n2;
            while (j >= n1)
            {
                j = j - n1;
                n1 = n1 / 2;
            }
            j = j + n1;

            if (i < j)
            {
                t1 = x[i];
                x[i] = x[j];
                x[j] = t1;
                t1 = y[i];
                y[i] = y[j];
                y[j] = t1;
            }
        }

        // FFT
        n1 = 0;
        n2 = 1;

        for (i = 0; i < m; i++)
        {
            n1 = n2;
            n2 = n2 + n2;
            a = 0;

            for (j = 0; j < n1; j++)
            {
                c = cos[a];
                s = sin[a];
                a += 1 << (m - i - 1);

                for (k = j; k < n; k = k + n2)
                {
                    t1 = c * x[k + n1] - s * y[k + n1];
                    t2 = s * x[k + n1] + c * y[k + n1];
                    x[k + n1] = x[k] - t1;
                    y[k + n1] = y[k] - t2;
                    x[k] = x[k] + t1;
                    y[k] = y[k] + t2;
                }
            }
        }
    }
}

[This AGPL code is on Github]

It would be complete lunacy to implement this mathematics in Jython (dynamic late binding would give unusably bad performance). Java does a great job of running it quickly and efficiently. In Java this runs just about as fast as it could in any language, plus the clean, simple object structure of Java means that using the ‘caching’ system is straightforward. The caching comes from the fact that the cos and sin multipliers of the FFT can be re-used when the transform is the same length. Now, in the creation of reverberation effects (those effects which put sound into a space) FFT lengths are the same over and over again due to windowing. So the speed and object oriented power of Java have both fed into creating a clean, high performance implementation.
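As a sketch of how that reuse looks in practice (assuming 4096-sample windowed grains; CacheableFFT is the class above):

// Build the cos/sin lookup tables once; reuse them for every same-length grain.
CacheableFFT forward = new CacheableFFT(4096, true);   // forward transform
CacheableFFT inverse = new CacheableFFT(4096, false);  // inverse direction

double[] re = new double[4096];
double[] im = new double[4096];
// ... fill re/im with one windowed grain of audio ...
forward.fft(re, im);   // in place; the cached tables are not recomputed
// ... cross multiply with the impulse response spectrum here ...
inverse.fft(re, im);   // back towards the time domain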

But we can go further and make the FFT parallelised:


def reverbInner(signal, convol, grainLength):
    def rii():
        mag = sf.Magnitude(+signal)
        if mag > 0:
            signal_ = sf.Concatenate(signal, sf.Silence(grainLength))
            signal_ = sf.FrequencyDomain(signal_)
            signal_ = sf.CrossMultiply(convol, signal_)
            signal_ = sf.TimeDomain(signal_)
            newMag = sf.Magnitude(+signal_)
            if newMag > 0:
                signal_ = sf.NumericVolume(signal_, mag / newMag)
                # tail out clicks due to amplitude at end of signal
                return sf.Realise(signal_)
            else:
                return sf.Silence(sf.Length(signal_))
        else:
            -convol
            return signal
    return sf_do(rii)

def reverberate(signal, convol):
    def revi():
        grainLength = sf.Length(+convol)
        convol_ = sf.FrequencyDomain(sf.Concatenate(convol, sf.Silence(grainLength)))
        signal_ = sf.Concatenate(signal, sf.Silence(grainLength))
        out = []
        for grain in sf.Granulate(signal_, grainLength):
            (signal_i, at) = grain
            out.append((reverbInner(signal_i, +convol_, grainLength), at))
        -convol_
        return sf.Clean(sf.FixSize(sf.MixAt(out)))
    return sf_do(revi)

Here we have the Python which performs the FFT to produce impulse response reverberation (convolution reverb is another name for this approach). The second function breaks the sound into grains. Each grain is then processed individually, and they all have the same length. This performs that windowing effect I talked about earlier (I use a triangular window, which is not ideal but works well enough due to the long window size). If the grains are long enough, the impact of lots of little FFT calculations is basically the same as the effect of one huge one. However, FFT is an n·log(n) process, so lots of little calculations are a lot faster than one big one: for example, one FFT over 2^20 samples costs around 20·2^20 operations, while 256 FFTs over 4096-sample grains cost around 12·2^20. In effect, windowing makes FFT a linear scaling calculation.

Note that the granulation process is performed in a future. We define a closure called revi and pass it to sf_do(), which executes it at some point in the future based on demand and the number of threads available. Next we can look at the code which performs the FFT on each grain – rii. That again is performed in a future. In other words, the individual windowed FFT calculations are all performed in futures. The expression of a parallel windowed FFT engine in C or FORTRAN ends up very complex and rather intractable, and I have not personally come across one which is integrated into a generalised, thread pooled, future based scheduler. Nevertheless, the combination of Jython and Java makes such a thing very easy to create.
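The actual scheduler is Sonic Field's own, but the shape of the idea can be sketched with the standard java.util.concurrent machinery. This is an illustration of the pattern, not the real Sonic Field code:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class GrainFFTSketch {
    static final ExecutorService POOL =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    // Submit each windowed grain as its own future; grains transform in parallel.
    // One CacheableFFT can be shared: fft() only reads the cached cos/sin tables.
    static List<Future<double[]>> transformGrains(List<double[]> grains, CacheableFFT fft) {
        List<Future<double[]>> futures = new ArrayList<>();
        for (double[] grain : grains) {
            futures.add(POOL.submit(() -> {
                double[] im = new double[grain.length];
                fft.fft(grain, im); // in-place transform of this grain
                return grain;
            }));
        }
        return futures; // callers only block on get() when a result is demanded
    }
}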

How are the two meshed?

Now that I hope I have put a good argument for hybrid programming between a great dynamic language (in this case Python) and a powerful mid level static language (in this case Java) it is time to look at how the two are fused together. There are many ways of doing this but Sonic Field picks a very distinct approach. It does not offer a general interface between the two where lots of intermediate code is generated and each method in Java is exposed separately into Python; rather it uses a uniform single interface with virtual dispatch.

Sonic Field defines a very (aggressively) simple calling convention from Python into Java which initially might look like a major pain in the behind but works out to create a very flexible and powerful approach.

Sonic Field defines ‘operators’ which all implement the following interface:


/* For Copyright and License see LICENSE.txt and COPYING.txt in the root directory */
package com.nerdscentral.sython;

import java.io.Serializable;

/**
 * @author AlexTu
 */
public interface SFPL_Operator extends Serializable
{

    /**
     * <b>Gives the key word which the parser will use for this operator</b>
     *
     * @return the key word
     */
    public String Word();

    /**
     * <b>Operate</b> Whatever this operator does when SFPL is running is done by this method. The execution loop calls this
     * method with the current execution context and the passed forward operand.
     *
     * @param input
     *            the operand passed into this operator
     * @param context
     *            the current execution context
     * @return the operand passed forward from this operator
     * @throws SFPL_RuntimeException
     */
    public Object Interpret(Object input, SFPL_Context context) throws SFPL_RuntimeException;
}
The Word() method returns the name of the operator as it will be expressed in Python. The Interpret() method processes arguments passed to it from Python. As Sonic Field comes up it creates a Jython interpreter and then adds the operators to it. The mechanism for doing this is a little involved, so rather than go into detail here, I will simply give links to the code on github.
The result is that every operator is exposed in Python as sf.xxx where xxx is the return from the Word() method. With clever operator overloading and other syntactical tricks in Python I am sure that the approach could be refined. Right now, there are a lot of sf.xxx calls in Sonic Field Python (I call it Synthon) but I have not gotten around to improving on this simple and effective approach.
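In outline, the kind of wiring involved can be sketched with Jython's embedding API. This is a simplified illustration, not the exact Sonic Field bootstrap (the real code generates the sf.xxx namespace), and 'runner' here is a hypothetical dispatcher standing in for the layer discussed next:

import org.python.util.PythonInterpreter;

PythonInterpreter interp = new PythonInterpreter();
interp.set("sf_runner", runner);  // hypothetical object whose dispatch() ends up calling Interpret()
for (SFPL_Operator op : operators) {
    // generate a small Python shim per operator so it can be called as a function
    interp.exec(
        "def sf_" + op.Word() + "(*args):\n" +
        "    return sf_runner.dispatch('" + op.Word() + "', args)\n");
}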

You might have noticed that everything passed into Java from Python is just ‘Object’. This seems a bit crude at first take. However, as we touched on in the section on futures in the previous post, it offers many advantages because the translation from Jython to Java is orchestrated via the Caster object and a layer of Python which transparently performs many useful translations. For example, the code automatically translates multiple arguments in Jython to a list of objects in Java:


def run(self, word, input, args):
    if len(args) != 0:
        args = list(args)
        args.insert(0, input)
        input = args
    trace = ''.join(traceback.format_stack())
    SFSignal.setPythonStack(trace)
    ret = self.processors.get(word).Interpret(input, self.context)
    return ret

Here we can see how the arguments are processed into a list (which in Jython is implemented as an ArrayList) if there is more than one, but are passed as a single object if there is only one. We can also see how the Python stack trace is passed into a thread local in the Java SFSignal object. Should an SFSignal not be freed or be double collected, this Python stack is displayed to help debug the program.

Is this interface approach a generally good idea for Jython/Java Communication?

Definitely not! It works here because of the nature of the Sonic Field audio processing architecture. We have processors which can be chained. Each processor has a simple input and output. The semantic content passed between Python and Java is quite limited. In more general purpose programming, this simple architecture, rather than being flexible and powerful, would be highly restrictive; there the normal Jython interface with Java would be much more effective. Again, we can see a great example of this simplicity in the previous post when talking about threading (where Python accesses Java Future objects). Another example is the direct interaction of Python with SFData objects in this post on modelling oscillators in Python.


from com.nerdscentral.audio import SFData
...
data = SFData.build(length)
for x in range(0, length):
    s, t, xs = osc.next()
    data.setSample(x, s)

That code violated the programming model of Sonic Field by creating audio samples directly from Jython, but at the same time it illustrates the power of Jython! It also created one of the most unusual soundscapes I have so far achieved with the technology:

Engines of war, sound modelling
from oscillators in Python.

Wrapping It Up

Well, that is all folks. I could ramble on for ever, but I think I have answered most if not all of the questions I set out in the first post. The key ones that really interest me are about creativity and hybrid programming. Naturally, I am obsessed with performance as I am by profession an optimisation consultant, but moving away from my day job, can Jython and Java be a creative environment, and do they offer more creativity than pure Java?

Transition State Analysis using
hybrid programming

Too many years ago I worked on a similar hybrid approach in scientific computing. The GRACE software which I helped develop as part of the team at Bath was able to break new ground because it was easier to explore ideas in the hybrid approach than writing raw FORTRAN constantly. I cannot present in deterministic, reductionist language a consistent argument for why this applied then to science or now to music; nevertheless, experience from myself and others has shown this to be a strong argument.

Whether you agree or disagree, and irrespective of whether you like the music or detest it, I wish you a very merry Christmas indeed.

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

The Java Ecosystem – My top 5 highlights of 2014

1. February the 1st – RedMonk Analyst firm declares that Java is more popular & diverse than ever!

The Java Ecosystem started off with a hiss and a roar in 2014 with the annual meeting of the Free Java room at FOSDEM. As well as the many fine deep technical talks on OpenJDK and related topics there was also a surprise presentation on the industry from Steve O’Grady (RedMonk Analyst). Steve gave a data-led insight into where Java ranked in terms of both popularity and scope at the start of 2014. The analysis of where Java is as a language is repeated on RedMonk’s blog. The fact it remains a top two language didn’t surprise anyone, but it was the other angle that really surprised even those of us heavily involved in the ecosystem. Steve’s talk clearly showed that Java is aggressively diverse, appearing in industries such as social media, messaging, gaming, mobile, virtualisation, build systems and many more, not just the Enterprise apps that people most commonly think about. Steve also showed that Java is being used heavily in new projects (across all of those industry sectors), which certainly killed the myth of Java being a legacy enterprise platform.

2. March the 18th – Java 8 arrives

The arrival of Java 8 ushered in a new Functional/OO hybrid direction for the language, giving it a new lease of life. The adoption rates have been incredible (see Typesafe’s full report on this); it was clearly the release that Java developers were waiting for.

Some extra thoughts around the highlights of this release:

  • Lambdas (JSR 335) – There has been so much written about this topic already, with a ton of fantastic books and tutorials to boot. For me the clear benefit to most Java developers was that they’re finally able to express the correct intent of behaviour with collections without all of the unnecessary boilerplate that imperative/OO constructs forced upon them. It boils down to the old joke that there are only two hard problems in computer science: cache invalidation, naming things, and off-by-one errors. The new streams API for collections in conjunction with Lambdas certainly helps with the last two (see the short example after this list)!
  • Project Nashorn (JSR 223, JEP 174) – The JavaScript runtime which allows developers to embed JavaScript code within their Java applications. Although I personally won’t be using this anytime soon, it was yet another boost to the JVM in terms of first class support for dynamically typed languages. I’m looking forward to this trend continuing!
  • Date and Time API (JSR 310, JEP 150) – This is the sort of bread-and-butter API that a blue collar language like Java just needs to get right, and this time (take 3) they did! It’s been great to finally be able to work with timezones correctly, and it also set a new precedent of Immutable First as a conscious design decision for new APIs in Java.
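As a tiny illustration of the Lambdas point above – declarative collection handling with no index-juggling boilerplate (a generic example, not taken from any particular post):

import java.util.Arrays;
import java.util.List;

public class StreamsDemo {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("ada", "grace", "alan", "edsger");
        // Say what you want done, not how to loop over it.
        names.stream()
             .filter(n -> n.length() > 3)
             .map(String::toUpperCase)
             .forEach(System.out::println);
    }
}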

3. ~July – ARM 64 port (AArch64)

Red Hat have led the effort to get the ARMv8 64-bit architecture supported in Java. This is clearly an important step to keep Java truly “Run anywhere”, and alongside SAP’s PowerPC/AIX port it represents one of two major ports that are primarily maintained by non-Oracle participants in OpenJDK. If you’d like to get involved see the project page for more details.

Java still has a way to go before becoming a major player in the embedded space, but the signs in 2014 were encouraging with Java SE Embedded featuring regularly on the Raspberry Pi and Java ME Embedded getting a much needed feature parity boost with Java SE APIs.

4. Sept/Oct – A Resurgence in the JCP & its 15th Anniversary

The Java Community Process (JCP) is the standards body that defines what goes into Java SE, Java EE and Java ME. It re-invented itself as a much more open, community-based organisation in 2013 and continued that good work in 2014, reversing the dropping membership trend. Most importantly, it now truly represents the incredible diversity that the Java ecosystem has. Looking at the make-up of the existing Executive Committee, you can see institutions like Java User Groups sitting alongside industry and end-user heavyweights such as IBM, Twitter and Goldman Sachs.

5. Community Collaboration at an all-time high & Microsoft joins OpenJDK

The number of new joiners to OpenJDK (see Mani’s excellent post on this) was higher than ever. OpenJDK now represents a huge melting pot of major tech companies such as Red Hat, IBM, Oracle, Twitter and of course the shock entry this year of Microsoft.

The Adopt a JSR and Adopt OpenJDK programmes continue to get more day-to-day developers involved in guiding the future of various APIs, with regular workshops now being organised around the world to test new APIs and ideas out early and feed that back into OpenJDK and the Java EE specifications in particular.

Community conferences & Java User Groups continue to rise in number, with JavaOne in particular having its strongest year in recent memory. It was also heartening to see a large number of community efforts helping kids learn to code, with after-school and weekend programmes such as Devoxx for Kids.

What for 2015?

I expect 2015 to be a little bit quieter in terms of changes to the core language or exciting new features for Java EE or Java ME, as their next major releases aren't due until 2016. On the community front I expect to see Java developers having to firmly embrace web/UI technologies such as AngularJS, more systems/devops toolchains such as Docker, AWS, Puppet etc, and of course migrate to Java 8 and all of the functional goodness it now brings! The community I'm sure will continue to thrive, and the looming spectre of IoT will start to come into the mainstream as well. Java developers will likely have to wait until Java 9 to get a truly first class platform for embedded, but early adopters will want to begin taking a look at early builds throughout 2015.

Java/JVM applications now tend to be complex, with many moving parts and distributed deployments. It can often take poor frustrated developers weeks to fix issues in production. To combat this there is a new wave of interesting analysis tools dealing with Java/JVM based applications and deployments. Oracle's Mission Control is a powerful tool that can give lots of interesting insights into the JVM, and other tools like XRebel from ZeroTurnaround and jClarity's Censum and Illuminate take the next step of applying machine-learned analysis to the raw numbers.

One final important note: Project Jigsaw is the modularisation story for Java 9 that will massively impact tool vendors and day-to-day developers alike. The community at large needs your help to test out early builds of Java 9 and to help OpenJDK developers and tool vendors ensure that IDEs, build tools and applications are ready for this important change. You can join us in the Adoption Group at OpenJDK: http://adoptopenjdk.java.net

I hope everyone has a great holiday break – I look forward to seeing the Twitter feeds and the GitHub commits flying around in 2015 :-).
Cheers,
Martijn (CEO – jClarity, Java Champion & Diabolical Developer)

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

How is Java/JVM built? Adopt OpenJDK is your answer!

Introduction & history
As some of you may already know, starting with Java 7, OpenJDK is the Reference Implementation (RI) of Java. The below timeline gives you an idea of the history of OpenJDK:
OpenJDK history (2006 till date)
If you have wondered about the JDK or JRE binaries that you download from vendors like Oracle, Red Hat, etcetera, then the clue is that these all stem from OpenJDK. Each vendor then adds some extra artefacts that are not (yet) open source, due to security, proprietary or other reasons.


What is OpenJDK made of ?
OpenJDK is made up of a number of repositories, namely corba, hotspot, jaxp, jaxws, jdk, langtools, and nashorn. Between OpenJDK8 and OpenJDK9 there have been no new repositories introduced, but lots of new changes and restructuring, primarily due to Jigsaw – the modularisation of Java itself [2] [3] [4] [5].
repo composition, language breakdown (metrics are estimated)
Recent history
OpenJDK Build Benchmarks – build-infra (Nov 2011) by Fredrik Öhrström, ex-Oracle, OpenJDK hero!

Fredrik Öhrström visited the LJC [16] in November 2011, where he showed us how to build OpenJDK on the three major platforms, and also distributed a four-page leaflet with benchmarks of the various components and how long they took to build. The new build system and the new makefiles are a result of the build system being re-written under the build-infra project.


Below are screen-shots of the leaflets, a good reference to compare our journey:

Build Benchmarks page 2 [26]

How has Java the language and platform been built over the years?

Java is built by bootstrapping an older (previous) version of Java – i.e. Java is built using Java itself as its building block, where older components are put together to create a new component, which in the next phase becomes the building block. A good example of bootstrapping can be found at Scheme from Scratch [6] or even on Wikipedia [7].


OpenJDK8 [8] is compiled and built using JDK7; similarly, OpenJDK9 [9] is compiled and built using JDK8. In theory OpenJDK8 can be compiled using the images created from OpenJDK8, and similarly OpenJDK9 using OpenJDK9, through a process called bootcycle images: a JDK image of OpenJDK is created, and then the same image is used to compile OpenJDK again. This can be accomplished using a make command option:


$ make bootcycle-images       # Build images twice, second time with newly built JDK


make offers a number of options under OpenJDK8 and OpenJDK9; you can build individual components or modules by naming them, i.e.


$ make [component-name] | [module-name]


or even run multiple build processes in parallel, i.e.


$ make JOBS=<n>                 # Run <n> parallel make jobs


Finally install the built artefact using the install option, i.e.


$ make install


Some myths busted
OpenJDK, or Hotspot to be more specific, isn’t completely written in C/C++; a good part of the code-base is good ‘ole Java (see the composition figure above). So you don’t have to be a hard-core developer to contribute to OpenJDK. Even the underlying C/C++ code-base isn’t scary or daunting to look at. For example, here is an extract of a code snippet from vm/memory/universe.cpp in the HotSpot repo –
.
.
.
Universe::initialize_heap()

if (UseParallelGC) {
   #ifndef SERIALGC
   Universe::_collectedHeap = new ParallelScavengeHeap();
   #else // SERIALGC
       fatal("UseParallelGC not supported in this VM.");
   #endif // SERIALGC

} else if (UseG1GC) {
   #ifndef SERIALGC
   G1CollectorPolicy* g1p = new G1CollectorPolicy();
   G1CollectedHeap* g1h = new G1CollectedHeap(g1p);
   Universe::_collectedHeap = g1h;
   #else // SERIALGC
       fatal("UseG1GC not supported in java kernel vm.");
   #endif // SERIALGC

} else {
   GenCollectorPolicy* gc_policy;

   if (UseSerialGC) {
       gc_policy = new MarkSweepPolicy();
   } else if (UseConcMarkSweepGC) {
       #ifndef SERIALGC
       if (UseAdaptiveSizePolicy) {
           gc_policy = new ASConcurrentMarkSweepPolicy();
       } else {
           gc_policy = new ConcurrentMarkSweepPolicy();
       }
       #else // SERIALGC
            fatal("UseConcMarkSweepGC not supported in this VM.");
       #endif // SERIALGC
   } else { // default old generation
       gc_policy = new MarkSweepPolicy();
   }

   Universe::_collectedHeap = new GenCollectedHeap(gc_policy);
}
.
.
.
(please note that the above code snippet might have changed since published here)
The things that appear clear from the above code-block are: we are looking at how pre-compiler notations are used to create Hotspot code that supports a certain type of GC, i.e. Serial GC or Parallel GC. We can also see how the GC policy is selected when one or more GC switches are toggled, i.e. UseAdaptiveSizePolicy, when enabled, selects the adaptive-size variant of the Concurrent Mark and Sweep policy. In the case where neither Use Serial GC nor Use Concurrent Mark Sweep GC is selected, the GC policy chosen is the Mark and Sweep policy. All of this and more is pretty clearly readable and verbose, and not just nicely formatted code that reads like English.
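For example, the switches tested in the snippet correspond directly to HotSpot command-line flags; MyApp is just a placeholder class name:

$ java -XX:+UseSerialGC MyApp          # GenCollectedHeap with MarkSweepPolicy
$ java -XX:+UseParallelGC MyApp        # ParallelScavengeHeap
$ java -XX:+UseConcMarkSweepGC MyApp   # GenCollectedHeap with a CMS policy
$ java -XX:+UseG1GC MyApp              # G1CollectedHeap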


Further commentary can be found in the section called Deep dive Hotspot stuff in the Adopt OpenJDK Intermediate & Advance experiences [12] document.


Steps to build your own JDK or JRE
Earlier we mentioned JDK and JRE images – these are no longer only available to the big players in the Java world; you and I can build such images very easily. The steps for the process have been simplified, and for a quick start see the Adopt OpenJDK Getting Started Kit [11] and Adopt OpenJDK Intermediate & Advance experiences [12] documents. For a detailed version of the same steps, please see the Adopt OpenJDK home page [13]. Basically, building a JDK image from the OpenJDK code-base boils down to the below commands:


(setup steps have been made brief and some commands omitted, see links above for exact steps)

$ hg clone http://hg.openjdk.java.net/jdk8/jdk8 jdk8  (a)…OpenJDK8
or
$ hg clone http://hg.openjdk.java.net/jdk9/jdk9 jdk9  (a)…OpenJDK9

$ ./get_source.sh                                    (b)
$ bash configure                                      (c)
$ make clean images                                   (d)


To explain what is happening at each of the steps above:
(a) We clone the OpenJDK mercurial repo, just like we would using git clone ….
(b) Once step (a) is completed, we change into the folder created and run the get_source.sh command, which is equivalent to a git fetch or a git pull, since step (a) only brings down the base files and not all of the files and folders.
(c) Here we run a script that checks for and creates the configuration needed for the compile and build process.
(d) Once step (c) is successful, we perform a complete compile and build, and create JDK and JRE images from the built artefacts.


As you can see these are dead-easy steps to follow to build an artefact or JDK/JRE images [step (a) needs to be run only once].


Benefits
– contribute to the evolution and improvement of Java the language & platform
– learn about the internals of the language and platform
– learn about the OS platform and other technologies whilst doing the above
– get involved in F/OSS projects
– stay on top of the latest changes in the Java / JVM sphere
– gain knowledge and experience that helps professionally, much of which is not readily available from other sources (i.e. books, training, work-experience, university courses, etcetera)
– advancement in career
– personal development (soft skills and networking)


Contribute
Join the Adopt OpenJDK [13] and Betterrev [15] projects and contribute by giving us feedback about everything Java, including these projects. Join the Adoption Discuss mailing list [14] and other OpenJDK-related mailing lists to start with; these will keep you updated with the latest progress and changes to OpenJDK. Fork any of the projects you see and submit changes via pull requests.


Thanks and support
Adopt OpenJDK [13] and umbrella projects have been supported and progressed with the help of the JCP [21], the OpenJDK team [22], JUGs like the London Java Community [16], SouJava [17] and other JUGs in Brazil, a number of JUGs in Europe i.e. BGJUG (Bulgarian JUG) [18], BeJUG (Belgium JUG) [19], Macedonian JUG [20], and a number of other small JUGs. We hope in the coming time more JUGs and individuals will get involved. If you or your JUG wish to participate please get in touch.

Credits
Special thanks to +Martijn Verburg (incepted Adopt OpenJDK), +Richard Warburton, +Oleg Shelajev, +Mite Mitreski, +Kaushik Chaubal and +Julius G for helping improve the content and quality of this post, and for sharing their OpenJDK experience with us.


How to get started ?
Join the Adoption Discuss mailing list [14], go to the Adopt OpenJDK home page [13] to get started, followed by referring to the Adopt OpenJDK Getting Started Kit [11] and Adopt OpenJDK Intermediate & Advance experiences [12] documents.


Please share your comments here or tweet at @theNeomatrix369.


Resources
[8] OpenJDK8 
[17] SouJava 


This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Managing Package Dependencies with Degraph

A large part of the art of software development is keeping the complexity of a system as low as possible. But what is complexity anyway? While the exact semantics vary quite a bit, depending on who you ask, probably most agree that it has a lot to do with the number of parts in a system and their interactions.

Consider a marble in space, i.e. a planet, moon or star. Without any interaction this is as boring as a system can get. Nothing happens. If the marble moves, it keeps moving in exactly the same way. To be honest, there isn’t even a way to determine if it is moving. Boooring.

Add a second marble to the system and let them attract each other, like earth and moon. Now the system is a little more interesting. The two objects circle each other if they aren’t too fast. Somewhat interesting.

Now add a third object. In the general case things get so interesting that we can’t even predict what is going to happen. The whole system didn’t just become complex, it became chaotic. You now have a three body problem. In the general case this problem cannot be solved, i.e. we cannot predict what will happen with the system. But there are some special cases, especially the case where two of the objects are very close to each other, like earth and moon, and the third one is so far away that the first two objects behave just like one. In this case you can approximate the system with two two-body systems.

But what has this to do with Java? This sounds more like physics.

I think software development is similar in some aspects. A complete application is way too complicated to be understood as a whole. To fight this complexity we divide the system into parts (classes) that can be understood on their own, and that hide their inner complexity so that when we look at the larger picture we don’t have to worry about every single code line in a class, but only about the class as one entity. This is actually very similar to what physicists do with systems.

But let’s look at the scale of things. The basic building block of software is the code line. And to keep the complexity in check we bundle code lines that work together in methods. How many code lines go into a single method varies, but it is in the order of 10 lines of code.
Next you gather methods into classes. How many methods go into a single class? Typically in the order of 10 methods!

And then? We bundle 100-10000 classes in a single jar! I hope I’m not the only one who thinks something is amiss.

I’m not sure what will come out of Project Jigsaw, but currently Java only offers packages as a way to bundle classes. Packages aren’t a powerful abstraction, yet they are the only one we have, so we had better use them.

Most teams do use packages, but not in a well structured way; rather in an ad hoc fashion. The result is similar to trying to consider moon and sun as one part of the system, and the earth as the other part. The result might work, but it is probably as intuitive as Ptolemy’s planetary model. Instead, decide on criteria for how you want to differentiate your packages. I personally call them slicings, inspired by an article by Oliver Gierke. Possible slicings, in order of importance, are:

  • the deployable jar file the class should end up in
  • the use case / feature / part of the business model the class belongs to
  • the technical layer the class belongs to

The packages this results in will look like this: <domain>.<deployable>.<domain part>.<layer>

It should be easy to decide where a class goes. And it should also keep the packages at a reasonable size, even when you don’t use the separation by technical layer.
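As a hypothetical instance of the pattern (all names invented for illustration):

// <domain>.<deployable>.<domain part>.<layer>
package de.schauderhaft.shop.customer.persistence;

public class CustomerRepository {
    // persistence-layer code for the customer domain part of the shop deployable
}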

But what do you gain from this? It is easier to find classes, but that’s about it. You need one more rule to make this really worthwhile: there must be no cyclic dependencies!

This means, if a class in package A references a class in package B, no class in B may reference a class in A. This also applies if the reference is indirect, via multiple other packages. But that is still not enough. Slices should be cycle free as well, so if a domain part X references a different domain part Y, the reverse dependency must not exist!

This will indeed put some rather strict rules on your package and dependency structure. The benefit is that the structure becomes very flexible.

Without such a structure, splitting your project into multiple parts will probably be rather difficult. Ever tried to reuse part of an application in a different one, just to realize that you basically have to include most of the application in order to get it to compile? Ever tried to deploy different parts of an application to different servers, just to realize you can’t? It certainly happened to me before I used the approach mentioned above. But with this more strict structure, the parts you may want to reuse will almost on their own end up at the end of the dependency chain, so you can take them and bundle them in their own jar, or just copy the code into a different project and have it compile in very short time.

Also, while trying to keep your packages and slices cycle free you’ll be forced to think hard about what each package involved is really about. Something that has improved my code base considerably in many cases.

So there is one problem left: Dependencies are hard to see. Without a tool, it is very difficult to keep a code base cycle free. Of course there are plenty of tools that check for cycles, but cleaning up these cycles is tough and the way most tools present these cycles doesn’t help very much. I think what one needs are two things:

  1. a simple test, that can run with all your other tests and fails when you create a dependency circle.
  2. a tool that visualizes all the dependencies between classes, while at the same time showing in which slice each class belongs.

Surprise! I can recommend such a great tool: Degraph! (I’m the author, so I might be biased)

You can write tests in JUnit like this:



assertThat(
    classpath().including("de.schauderhaft.**")
        .printTo("degraphTestResult.graphml")
        .withSlicing("module", "de.schauderhaft.(*).*.**")
        .withSlicing("layer", "de.schauderhaft.*.(*).**"),
    is(violationFree())
);

The test will analyze everything in the classpath that starts with de.schauderhaft. It will slice the classes in two ways: by taking the third part of the package name and by taking the fourth part of the package name. So a class named de.schauderhaft.customer.persistence.HibernateCustomerRepository ends up in the module customer and in the layer persistence. And it will make sure that modules, layers and packages are cycle free.

And if it finds a dependency circle, it will create a graphml file, which you can open using the free graph editor yEd. With a little layouting you get results like the following, where the dependencies that result in circular dependencies are marked in red.

Again for more details on how to achieve good usable layouts I have to refer to the documentation of Degraph.

Also note that the graphs are colored mainly green with a little red, which nicely fits the season!

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

CMS Pipelines … for NetRexx on the JVM

This year I want to tell you about a new and exciting addition to NetRexx (which, incidentally, just turned 19 years old the day before yesterday). NetRexx, as some of you know, is the first alternative language for the JVM, stems from IBM, and has been free and open source since 2011 (http://www.netrexx.org). It is a happy marriage of the Rexx language (Michael Cowlishaw, IBM, 1979) and the JVM. NetRexx can run compiled ahead of time, as .class files for maximum performance, or interpreted, for a quick development cycle or very dynamic production of code. After the addition of scripting in version 3.03 last year, the new release (3.04, somewhere at the end of 2014) includes Pipes.

We know what pipes are, I hear you say, but what are Pipes? A Pipeline, also called a Hartmann Pipeline, is a concept that extends and improves pipes as they are known from Unix and other operating systems. The name pipe indicates an inter-process communication mechanism, as well as the programming paradigm it has introduced. Compared to Unix pipes, Hartmann Pipelines offer multiple input and output streams, more complex pipe topologies, and a lot more – too much for this short article but worthy of your study.

Pipelines were first implemented on VM/CMS, one of IBM’s mainframe operating systems. This version was later ported to TSO to run under MVS and has been part of several product configurations. Pipelines are widely used by VM users, in a symbiotic relationship with REXX, the interpreted language that also has its origins on this platform. Pipes in the NetRexx version are compiled by a special Pipes Compiler that has been integrated with NetRexx. The resulting code can run on every platform that has a JVM (Java Virtual Machine), including z/VM and z/OS for that matter. This portable version of Pipelines was started by Ed Tomlinson in 1997 under the name of njpipes, when NetRexx was still very new, and was open sourced in 2011, soon after the NetRexx translator itself. It was integrated into the NetRexx translator in 2014 and will be released integrated in the NetRexx distribution for the first time with version 3.04. It answers the eternal question posed to the development team by every z/VM programmer we ever met: “But … Does It Have Pipes?” It also marks the first time that a free-of-charge Pipelines product runs on z/OS. But of course most of you will be running Linux, Windows or OSX, where NetRexx and Pipes also run splendidly.

NetRexx users are very conscious of code size and performance – for example because applications also run on limited JVM specifications such as Java ME, in Lego robots and on Androids and Raspberries – and generally are proud and protective of the NetRexx runtime, which weighs in at 37K (yes, 37 kilobytes; it even shrunk a few bytes over the years). For this reason, the Pipes Compiler and the stages are packaged in NetRexxF.jar – F is for Full, and this jar also includes the Eclipse Java compiler, which makes NetRexx a standalone package that only needs a JRE for development. There is a NetRexxC.jar for those who have a working Java SDK and only want to compile NetRexx. So we have NetRexxR.jar at 37K, NetRexxC.jar at 322K, and the full NetRexx kaboodle in 2.8MB – still small compared to some other JVM languages.

The pipeline terminology is a metaphor derived from plumbing. Fitting two or more pipe segments together yields a pipeline. Water flows in one direction through the pipeline. There is a source, which could be a well or a water tower; water is pumped through the pipe into the first segment, then through the other segments until it reaches a tap, and most of it will end up in the sink. A pipeline can be increased in length with more segments of pipe, and this illustrates the modular concept of the pipeline. When we discuss pipelines in relation to computing we have the same basic structure, but instead of water that passes through the pipeline, data is passed through a series of programs (stages) that act as filters. Data must come from some place and go to some place. Analogous to the well or the water tower there are device drivers that act as a source of the data, where the tap or the sink represents the place the data is going to, for example some output device such as your terminal window or a file on disk, or a network destination. Just as water, data in a pipeline flows in one direction, by convention from the left to the right.

A program that runs in a pipeline is called a stage. A program can run in more than one place in a pipeline – these occurrences function independently of each other. The pipeline specification is processed by the pipeline compiler, and it must be contained in a character string; on the command line, it needs to be between quotes, while when contained in a file, it needs to be between the delimiters of a NetRexx string. An exclamation mark (!) is used as the stage separator, while the solid vertical bar | can be used as an option when specifying the local option for the pipe, after the pipe name. When looking at two adjacent segments in a pipeline, we call the left stage the producer and the stage on the right the consumer, with the stage separator as the connector.

A device driver reads from a device (for instance a file, the command prompt, a machine console or a network connection) or writes to a device; in some cases it can both read and write. Examples of device drivers are diskr for disk read and diskw for disk write; these read and write data from and to files. A pipeline can take data from one input device and write it to a different device. Within the pipeline, data can be modified in almost any way imaginable by the programmer. The simplest process for the pipeline is to read data from the input side and copy it unmodified to the output side. The pipeline compiler connects these programs; it uses one program for each device and connects them together. All pipeline segments run on their own thread and are scheduled by the pipeline scheduler. The inherent characteristic of the pipeline is that any program can be connected to any other program because each obtains data and sends data through a device independent standard interface. The pipeline usually processes one record (or line) at a time. The pipeline reads a record from the input, processes it and sends it to the output. It continues until the input source is drained.

Until now everything was just theory, but now we are going to show how to compile and run a pipeline. The executable script pipe is included in the NetRexx distribution to specify a pipeline and to compile NetRexx source that contains pipelines. Pipelines can be specified on the command line or in a file, but will always be compiled to a .class file for execution in the JVM.

pipe "(hello) literal "hello world" ! console"

This specifies a pipeline consisting of a source stage literal that puts a string (“hello world”) into the pipeline, and a console sink, that puts the string on the screen. The pipe compiler will echo the source of the pipe to the screen – or issue messages when something was mistyped. The name of the classfile is the name of the pipe, here specified between parentheses. Options also go there. We then execute the pipe by typing:

java hello

Now we have shown the obligatory example, we can make it more interesting by adding a reverse stage in between:

pipe "(hello) literal "hello world" ! reverse ! console"

When this is executed, it dutifully types “dlrow olleh”. If we replace the string after literal with arg(), we can then start the hello pipeline with an argument to reverse, and we run it with:

java hello a man a plan a canal panama

and it will respond:

amanap lanac a nalp a nam a

which goes to show that without ignoring spaces no palindrome is very convincing – which we can remedy with a change to the pipeline: use the change stage to take out the spaces:

pipe "(hello) literal arg() ! change / // ! console"

Now for the interesting parts. Whole pipeline topologies can be added, webservers can be built, relational databases (all with a JDBC driver) can be queried. For people who are familiar with the z/VM CMS Pipelines product, most of its reference manual is relevant for this implementation. We are working on the new documentation to go with NetRexx 3.04.

Pipes for NetRexx are the work of Ed Tomlinson and Jeff Hennick, with contributions by Chuck Moore, myself, and others. Pipes were the occasion on which I first laid eyes on NetRexx, and I am happy they have now found their place in the NetRexx open source distribution. To have a look at it, download the NetRexx source from the Kenai site (https://kenai.com/projects/netrexx) and build a 3.04 version yourself. Alternatively, wait until the 3.04 package hits http://www.netrexx.org.
This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

How and Why is Unsafe used in Java?

Overview

sun.misc.Unsafe has been in Java from at least as far back as Java 1.4 (2004). In Java 9, Unsafe will be hidden, along with many other for-internal-use classes, to improve the maintainability of the JVM. While it is still unclear exactly what will replace Unsafe – and I suspect it will be more than one thing – it raises the question: why is it used at all?

Doing things which the Java language doesn’t allow but are still useful.

Java doesn’t allow many of the tricks which are available to lower level languages. For most developers this is a very good thing; it not only saves you from yourself, it also saves you from your co-workers. It also makes it easier to import open source code because you know there are limits to how much damage they can do. Or at least there are limits to how much you can do accidentally. If you try hard enough you can still do damage.

But why would you even try, you might wonder? When building libraries, many (but not all) of the methods in Unsafe are useful, and in some cases there is no other way to do the same thing without using JNI, which is even more dangerous and loses the “compile once, run anywhere” promise.

Deserialization of objects

When deserializing, or building an object using a framework, you make the assumption that you want to reconstitute an object which existed before. You expect that you will use reflection to either call the setters of the class or, more likely, set the internal fields directly, even the final fields. The problem is you want to create an instance of an object, but you don’t really need a constructor, as this is likely to only make things more difficult and have side effects.
public class A implements Serializable {
    private final int num;

    public A(int num) {
        System.out.println("Hello Mum");
        this.num = num;
    }

    public int getNum() {
        return num;
    }
}

In this class, you should be able to rebuild it and set the final field, but if you have to call a constructor, it might do things which have nothing to do with deserialization. For these reasons many libraries use Unsafe to create instances without calling a constructor:

Unsafe unsafe = getUnsafe();
Class aClass = A.class;
A a = (A) unsafe.allocateInstance(aClass);

Calling allocateInstance avoids the need to call the appropriate constructor, when we don’t need one.
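The getUnsafe() helper used above isn’t shown; a common way to obtain the instance – the same reflection trick the PingPongMapMain example below uses in its static initialiser – is a sketch like this:

import sun.misc.Unsafe;
import java.lang.reflect.Field;

// Read the private static 'theUnsafe' field. Unsafe.getUnsafe() itself rejects
// callers outside the JDK, so libraries usually reflect the singleton out instead.
public static Unsafe getUnsafe() {
    try {
        Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
        theUnsafe.setAccessible(true);
        return (Unsafe) theUnsafe.get(null);
    } catch (Exception e) {
        throw new AssertionError(e);
    }
}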

Thread safe access to direct memory

Another use for Unsafe is thread-safe access to off-heap memory. ByteBuffer gives you safe access to off-heap or direct memory; however, it doesn’t have any thread-safe operations. This is particularly useful if you want to share data between processes.

import sun.misc.Unsafe;
import sun.nio.ch.DirectBuffer;

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.reflect.Field;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class PingPongMapMain {
    public static void main(String... args) throws IOException {
        boolean odd;
        switch (args.length < 1 ? "usage" : args[0].toLowerCase()) {
            case "odd":
                odd = true;
                break;
            case "even":
                odd = false;
                break;
            default:
                System.err.println("Usage: java PingPongMapMain [odd|even]");
                return;
        }
        int runs = 10000000;
        long start = 0;
        System.out.println("Waiting for the other odd/even");
        File counters = new File(System.getProperty("java.io.tmpdir"), "counters.deleteme");
        counters.deleteOnExit();

        try (FileChannel fc = new RandomAccessFile(counters, "rw").getChannel()) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            long address = ((DirectBuffer) mbb).address();
            for (int i = -1; i < runs; i++) {
                for (; ; ) {
                    long value = UNSAFE.getLongVolatile(null, address);
                    boolean isOdd = (value & 1) != 0;
                    if (isOdd != odd)
                        // wait for the other side.
                        continue;
                    // make the change atomic, just in case there is more than one odd/even process
                    if (UNSAFE.compareAndSwapLong(null, address, value, value + 1))
                        break;
                }
                if (i == 0) {
                    System.out.println("Started");
                    start = System.nanoTime();
                }
            }
        }
        System.out.printf("... Finished, average ping/pong took %,d ns%n",
                (System.nanoTime() - start) / runs);
    }

    static final Unsafe UNSAFE;

    static {
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            UNSAFE = (Unsafe) theUnsafe.get(null);
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    }
}

When you run this as two programs, one with odd and the other with even, you can see that each process is changing data via persisted shared memory.
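For example, assuming PingPongMapMain.class is on the classpath, you might start the two sides like this:

# terminal 1
java PingPongMapMain odd

# terminal 2
java PingPongMapMain even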

Each program maps the same area of the disk's cache into its process.  There is actually only one copy of the file in memory.  This means the memory can be shared, provided you use thread-safe operations such as the volatile and CAS operations.

The output on an i7-3970X is

Waiting for the other odd/even
Started
… Finished, average ping/pong took 83 ns

That is an 83 ns round-trip time between two processes. When you consider that System V IPC takes around 2,500 ns, and is volatile rather than persisted, that is pretty quick.

Is using Unsafe suitable for work?

I wouldn’t recommend you use Unsafe directly.  It requires far more testing than natural Java development.  For this reason I suggest you use a library whose usage of Unsafe has been tested already.  If you want to use Unsafe yourself, I suggest you thoroughly test its usage in a standalone library.  This limits how Unsafe is used in your application and gives you a safer Unsafe.
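As an illustration of that advice, here is a minimal sketch of such a confining wrapper; the class name and API are hypothetical, and the point is only that every Unsafe call lives in one small, independently testable class:

import sun.misc.Unsafe;
import java.lang.reflect.Field;

// Hypothetical example of confining Unsafe to one well-tested class:
// a single off-heap long with volatile reads and CAS updates.
public final class OffHeapCounter {
    private static final Unsafe UNSAFE;

    static {
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            UNSAFE = (Unsafe) theUnsafe.get(null);
        } catch (Exception e) {
            throw new AssertionError(e);
        }
    }

    private final long address;

    public OffHeapCounter() {
        address = UNSAFE.allocateMemory(8); // room for one long
        UNSAFE.putLongVolatile(null, address, 0L);
    }

    public long get() {
        return UNSAFE.getLongVolatile(null, address);
    }

    public boolean compareAndSet(long expected, long update) {
        return UNSAFE.compareAndSwapLong(null, address, expected, update);
    }

    public void free() {
        UNSAFE.freeMemory(address);
    }
}

Callers only ever see get(), compareAndSet() and free(), so whatever eventually replaces Unsafe, only this one class has to change.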

Conclusion

It is interesting that Unsafe exists in Java, and you might want to play with it at home.  It has some work applications, especially in writing low-level libraries, but in general it is better to use a tested library which uses Unsafe than to use it directly yourself.

About the Author.

Peter Lawrey has the most Java answers on StackOverflow. He is the founder of the Performance Java User’s Group, and lead developer of Chronicle Queue and Chronicle Map, two libraries which use Unsafe to share persisted data between processes.

Developers want to be heard

How often have you been in this situation?

You’re in a meeting with the team and you’re all discussing the implementation of a new feature. The group seems to be converging on a design, but there’s something about it that feels off, some sort of “smell”.  You point this out to the team, perhaps outlining the specific areas that make you uncomfortable. Maybe you even have an alternative solution. The team lets you have your say, but assures you their solution is The Way.

Or what about this?

A tech lead asks you to fix a bug, and as you work on your implementation you bounce ideas around periodically just to make sure you’re on the right track. Things seem to be OK, until it comes to getting your code merged. Now it becomes clear that your implementation is not what the lead had in mind, and there’s a frustrating process of back-and-forth while you explain and defend your design decisions whilst trying to incorporate the feedback. At the end, the solution doesn’t feel like your work, and you’re not entirely sure what was wrong with your initial implementation – it fixed the problem, passed the tests, and met the criteria you personally think are important (readability / scalability / performance / stability / time-to-implementation, whatever it is that you value).

When you speak to women developers, you often hear “I feel like I have to work really hard to convince people about my ideas” or “it’s taken me a long time to prove my worth” or “I still don’t know how to be seen as a full member of the team”.

And you hear these a lot from women because we ask women a lot what they don’t like about their work, since we’re (correctly) concerned as an industry about the lack of female developers and the alarming rate at which they leave technical roles.

However, if you ask any developer you’ll hear something similar.  Even very senior, very experienced (very white, very male) developers have a lot of frustration trying to convince others that their ideas have value.

It’s not just a Problem With Women.

I’ve been wondering if our problem is that we don’t listen.  When it comes to exchanging technical ideas, I think overall we’re not good at really listening to each other.  At the very least, I think we’re bad at making people feel heard.

Let’s think about this for a bit: if we don’t listen to developers, if we don’t help them to understand why they’re wrong, or work together to incorporate all ideas into a super-idea that’s the best solution, developers will become frustrated.  We’re knowledge workers, what we bring to the table is our brains, our ideas, our solutions.  If these are persistently not valued, we could go one of two ways:

  1. Do it our way anyway. We still think we’re right, we haven’t been convinced that our idea is not correct, or that someone else’s is correct (maybe because we didn’t listen to them? Maybe because no-one took the time to listen to us and explain why we were wrong? Maybe because we were right and no-one was listening?).
  2. Leave. We might join a team where we feel more valued, or we might leave development altogether.  At least as a business analyst, as a project manager, as a tester, people have to listen to us: by their very definition the output of those jobs is an input to the development team. 

Option one leads to rogue code in our application, often not checked by anyone else in the team let alone understood by them, because of course we were not allowed to implement this.  So it’s done in secret.  If it works, at worst no-one notices.  And at best? You’re held up as a hero for actually Getting Something Done. This can’t be right, we’re rewarding the rebel behaviour, not encouraging honest discussion and making people feel included.

Option two leads to the team (and maybe the industry) losing a developer. Sometimes you might argue “Good Riddance”.  But there’s such a skills shortage, it’s so hard (and expensive) to hire developers, and you must have seen something in that developer to hire them in the first place, that surely it’s cheaper, better, to make them feel welcome, wanted, valued?

What can we do to listen to each other?

  • Retrospectives. Done right, these give the team a safe place to discuss things, to ask questions, to suggest improvements.  It’s not necessarily a place to talk about code or design, but it is a good place to raise issues like the ones above, and to suggest ways to address these problems.
  • You could schedule sessions for sharing technical ideas: maybe regular brown bags to help people understand the technologies or existing designs; maybe sessions where the architecture or design or principles of a particular area are explored and explained; maybe space and time for those who feel unheard to explain in more detail where they’re coming from, principles that are important to them.  It’s important that these sessions are developer-led, so that everyone has an opportunity to share their ideas.
  • Pair programming. When you’re sat together, working together, there’s a flow of ideas, information, designs, experience. It’s not necessarily a case of a more senior person mentoring a less experienced developer, all of us have different skills and value different qualities in our implementation – for example, one of you could be obsessive about the tests, where the other really cares about readability of the code. When you implement something in a pair, you feel ownership of that code but you feel less personally attached to the code – you created it, but you created it from the best of both of you, you had to listen to each other to come to a conclusion and implement it. And if an even better idea comes along, great, it just improves the code. You’re constantly learning from the other people you work with, and can see the effect of them learning from you.
  • We should value, and coach, more skills than simply technology skills. I don’t know why we still seem to have this idea that developers are just typists communing with the computer – the best developers work well in teams and communicate effectively with the business and users; the best leaders make everyone in their team more productive.  In successful organisations, sales people are trained in skills like active listening, like dealing with objections.  More development teams should focus on improving these sorts of communication skills as a productivity tool. 

I’m sure there are loads more options, I just thought of these in ten minutes.  If you read any books aimed at business people, or at growing your career, there are many tried and tested methods for making people feel heard, for playing nicely with others.

So we should work harder to listen to each other. Next time you’re discussing something with your team, or with your boss, try and listen to what they’re saying – ask them to clarify things you don’t understand (you won’t look stupid, and developers love explaining things), and repeat back what you do understand. Request the same respect in return – if you feel your ideas aren’t being heard, make sure you sit down with someone to talk over your ideas or your doubts in more detail, and be firm in making sure the team or that person is hearing what you think you’re saying.  We may be wrong, they may be right, but we need to understand why we’re wrong, or we’ll never learn.

If we all start listening a bit more, maybe we’ll be a bit happier.

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Self-healing applications: are they real?

This post is an example of an application where the first solution to each and every IT problem – “have you tried turning it off and on again” – can actually do more harm than good. Instead, we have an application that can literally heal itself: it fails at the beginning, but starts running smoothly after some time. To give an example of such an application in action, we recreated it in the simplest form possible, gathering inspiration from what is now a five-year-old post from Heinz Kabutz’s Java Newsletter:



package eu.plumbr.test;

public class HealMe {
    private static final int SIZE = (int) (Runtime.getRuntime().maxMemory() * 0.6);

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 1000; i++) {
            allocateMemory(i);
        }
    }

    private static void allocateMemory(int i) {
        try {
            {
                byte[] bytes = new byte[SIZE];
                System.out.println(bytes.length);
            }

            byte[] moreBytes = new byte[SIZE];
            System.out.println(moreBytes.length);

            System.out.println("I allocated memory successfully " + i);
        } catch (OutOfMemoryError e) {
            System.out.println("I failed to allocate memory " + i);
        }
    }
}

The code above allocates two bulks of memory in a loop. Each of those allocations is equal to 60% of the total available heap size. As the allocations occur sequentially in the same method, one might expect this code to keep throwing java.lang.OutOfMemoryError: Java heap space errors and never successfully complete the allocateMemory() method.

Let us start with the static analysis of the source code:

  1. From a first, fast examination, this code really cannot complete, because we try to allocate more memory than is available to the JVM.
  2. If we look closer, we can notice that the first allocation takes place in a scoped block, meaning that the variables defined in this block are visible only inside it. This indicates that bytes should be eligible for GC after the block completes. And so our code should in fact run fine right from the beginning, as by the time it tries to allocate moreBytes, the previous allocation bytes should already be dead.
  3. If we now look into the compiled classfile, we will see the following bytecode:


    private static void allocateMemory(int);
    Code:
    0: getstatic #3 // Field SIZE:I
    3: newarray byte
    5: astore_1
    6: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
    9: aload_1
    10: arraylength
    11: invokevirtual #5 // Method java/io/PrintStream.println:(I)V
    14: getstatic #3 // Field SIZE:I
    17: newarray byte
    19: astore_1
    20: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
    23: aload_1
    24: arraylength
    25: invokevirtual #5 // Method java/io/PrintStream.println:(I)V
    ---- cut for brevity ----

    Here we see that at offsets 3–5 the first array is allocated and stored into the local variable with index 1. Then, at offset 17, another array is about to be allocated. But the first array is still referenced by the local variable, and so the second allocation should always fail with an OOM. The bytecode interpreter just cannot let the GC clean up the first array, because it is still strongly referenced.

Our static analysis has shown us two underlying reasons why the presented code should not run successfully, and one reason why it should. Which of those three is correct? Let us actually run it and see for ourselves. It turns out that both conclusions were correct. At first, the application fails to allocate memory. But after some time (on my Mac OS X with Java 8 it happens at iteration #255) the allocations start succeeding:


java -Xmx2g eu.plumbr.test.HealMe
1145359564
I failed to allocate memory 0
1145359564
I failed to allocate memory 1

… cut for brevity ...

I failed to allocate memory 254
1145359564
I failed to allocate memory 255
1145359564
1145359564
I allocated memory successfully 256
1145359564
1145359564
I allocated memory successfully 257
1145359564
1145359564

Self-healing code is a reality! Skynet is near…

In order to understand what is really happening, we need to think about what changes during program execution. The obvious answer is, of course, that Just-In-Time compilation can occur. If you recall, Just-In-Time compilation is a JVM built-in mechanism for optimizing code hotspots. For this, the JIT monitors the running code and, when a hotspot is detected, compiles your bytecode into native code, performing different optimizations such as method inlining and dead code elimination in the process.

Let's see if this is the case by turning on the following command-line options and relaunching the program:



-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:+LogCompilation

This will generate a log file, in our case named hotspot_pid38139.log, where 38139 is the PID of the java process. In this file the following line can be found:



<task_queued compile_id='94' method='HealMe allocateMemory (I)V' bytes='83' count='256' iicount='256' level='3' stamp='112.305' comment='tiered' hot_count='256'/>

This means that after executing the allocateMemory() method 256 times, the C1 compiler decided to queue this method for tier 3 compilation. You can get more information about tiered compilation’s levels and different thresholds here. And so our first 256 iterations were run in interpreted mode, where the bytecode interpreter, being a simple stack machine, cannot know in advance whether some variable, bytes in this case, will be used further on or not. But the JIT sees the whole method at once, and so can deduce that bytes will not be used anymore and is, in fact, GC eligible. Thus garbage collection can eventually take place and our program has magically self-healed. Now, I can only hope none of the readers will actually be responsible for debugging such a case in production. But in case you wish to make someone's life a misery, introducing code like this to production would be a sure way to achieve it.
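As an aside, if you wanted allocateMemory() to succeed even in interpreted mode, one sketch of a fix (assuming the same SIZE constant as above) is to drop the reference explicitly, so that even the interpreter's local variable slot no longer keeps the first array reachable:

private static void allocateMemory(int i) {
    try {
        byte[] bytes = new byte[SIZE];
        System.out.println(bytes.length);
        // Clear the slot explicitly: the first array is now unreachable
        // even for the interpreter, so GC can reclaim it before the
        // second allocation.
        bytes = null;

        byte[] moreBytes = new byte[SIZE];
        System.out.println(moreBytes.length);
        System.out.println("I allocated memory successfully " + i);
    } catch (OutOfMemoryError e) {
        System.out.println("I failed to allocate memory " + i);
    }
}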

This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!