In our first review, The Atlassian guide to GC tuning is an extensive post covering the methodology and the things to keep in mind when tuning GC; practical examples are worked through and references to important resources are made along the way. In the next one, How NOT to measure latency, Gil Tene discusses some common pitfalls encountered in measuring and characterizing latency, demonstrating false assumptions and measurement techniques that lead to dramatically incorrect results, and covers simple ways to sanity-check and correct these situations. Finally, Kirk Pepperdine, in his post Poorly chosen Java HotSpot Garbage Collection Flags and how to fix them!, throws light on JVM flags – he starts with some 700 flags and boils them down to merely 7. He also cautions you not to draw conclusions or take action on a whim, but to consult and examine – i.e. measure, don't guess!
Background
Some of the items of interest are: load the application with work (apply load to the application as it would receive in production), turn on GC logging, set data sampling windows, determine the memory footprint, apply rules of thumb for generation sizes, and determine systemic requirements.
java -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:<file> …
java -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<file> …
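Once enough log data has been collected, the footprint and generation sizes can be set explicitly. A purely illustrative sketch follows – the sizes below are placeholders, not recommendations from the guide; the right values must come from your own GC log analysis:
java -Xms2G -Xmx2G -Xmn512M -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<file> …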
- Determine desired behaviour.
- Measure behaviour before change.
- Determine indicated change.
- Make change.
- Measure behaviour after change.
- If the desired behaviour is not met, try again.
The author starts by saying these are “some of the mistakes others, including the author, have made” – there are plenty of wrong ways to do things, and the talk covers what to avoid.
WHY and HOW – the HOW of measuring latency only makes sense if you understand the WHY. There is public-domain code and there are tools available that can be used to measure latency.
Don’t use statistics to measure latency!
The classic way of looking at latency: response time (latency) is a function of load! It is important to know what this response time represents – is it the average, the worst case, the 99th percentile, etc., and which types of behaviour is it based on?
Hiccups – some sort of accumulated work that gets performed in bursts (spikes). It is important to look at these behaviours when looking at response time and average response time. They are not evenly spread out; they may look like periodic freezes, or shifts from one mode/behaviour to another (with no particular sequence).
Common fallacies:
– computers run application code continuously
– response time can be measured as work units over time
– response time exhibits a normal distribution
– “glitches” or “semi-random omissions” in measurement don’t have a big effect
Hiccups are hidden, knowingly or unknowingly – averages and standard deviations only help hide them!
Many compute the 99th percentile or another figure from averages and assumed distributions. A better way is to measure it from the data – e.g. plot a histogram. You need to have an SLA for the entire spread and plot the distribution across the 0 to 100th percentile.
Latency tells us how long something should take! True real-time is being slow and steady!
A very important way of establishing requirements is an extensive interview process, to help the client come up with realistic figures for the SLA requirements. We need to find out the worst the client is willing to accept, and for how long or how often that is acceptable, to help construct a percentile table.
The question is how fast the system can run smoothly, without hitting obstacles or crashing! The author further compares the performance of the HotSpot JVM with that of C4.
The coordinated omission problem – what is it? Data that is not omitted randomly but in a coordinated way, which skews the results. Delays in request times do not get measured because of the manner in which response time is recorded from within the system – the dropped measurements are not random and uncoordinated, but are precisely the ones affected by the delay. The maximum value (time) in the data is usually the key value to start with: compute the other percentiles from it and check whether they match those computed with coordinated omission present. Percentiles are useful – compute them.
Graphs with sudden bumps followed by smoother lines tend to suffer from coordinated omission.
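To make the problem concrete, here is a minimal sketch (not from the talk) of the kind of naive measurement loop that produces coordinated omission; sendRequest() is a hypothetical stand-in for the system under test:

import java.util.ArrayList;
import java.util.List;

public class NaiveLoadTester {
    public static void main(String[] args) throws Exception {
        List<Long> samples = new ArrayList<>();
        long intendedIntervalMillis = 1; // we intend to send one request per millisecond

        for (int i = 0; i < 10_000; i++) {
            long start = System.nanoTime();
            sendRequest();                          // may stall for hundreds of milliseconds
            samples.add(System.nanoTime() - start); // only this one stalled sample is kept

            // Because we wait for the response before sending the next request,
            // every request that *should* have been sent during a stall is silently
            // omitted – its (high) latency never appears in 'samples'.
            Thread.sleep(intendedIntervalMillis);
        }
        // Percentiles computed from 'samples' will look far better than what users experienced.
    }

    private static void sendRequest() { /* placeholder for the system under test */ }
}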
HdrHistogram helps plot histograms – it is a configurable-precision system: you tell it the range and precision to cover. It has built-in compensation for coordinated omission. It is open source and available on GitHub under the CC0 license.
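A minimal sketch of how HdrHistogram might be used – the library and its coordinated-omission compensation are real, but doWork(), the value range and the 1 ms expected interval below are illustrative assumptions:

import org.HdrHistogram.Histogram;

public class LatencyHistogramExample {
    public static void main(String[] args) {
        // Track values from 1 ns up to 1 hour with 3 significant decimal digits.
        Histogram histogram = new Histogram(3_600_000_000_000L, 3);
        long expectedIntervalNanos = 1_000_000; // assume one operation every 1 ms

        for (int i = 0; i < 100_000; i++) {
            long start = System.nanoTime();
            doWork();                                   // the operation being measured
            long latency = System.nanoTime() - start;

            // Back-fills the samples a stall would have prevented us from issuing
            // (coordinated omission compensation).
            histogram.recordValueWithExpectedInterval(latency, expectedIntervalNanos);
        }

        // Print the full 0-100 percentile spread rather than a single average (values in ms).
        histogram.outputPercentileDistribution(System.out, 1_000_000.0);
        System.out.println("99.9th percentile (ms): "
                + histogram.getValueAtPercentile(99.9) / 1_000_000.0);
    }

    private static void doWork() { /* placeholder workload */ }
}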
jHiccup also plots histograms to show how your computer system has been behaving (run time and freeze time), and records various percentiles, maximum and minimum values, distributions, good modes, bad modes, etc. It is also open source under the CC0 license.
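A typical way of attaching jHiccup is as a Java agent alongside your normal startup flags – the jar path, heap size and application name below are placeholders:
java -javaagent:jHiccup.jar -Xmx4G YourApp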
Measuring throughput without a latency requirement yields meaningless information. Mistakes in measurement and analysis can lead to orders-of-magnitude errors and to bad business decisions.
Poorly chosen Java HotSpot Garbage Collection Flags and how to fix them! by Kirk Pepperdine
PermSize, MaxPermSize
UseConcMarkSweepGC, CMSParallelRemarkEnabled, UseCMSInitiatingOccupancyOnly, CMSInitiatingOccupancyFraction
CMSClassUnloadingEnabled
AggressiveOpts
-ea
-mx4G
-XX:MaxPermSize=300M
-XX:+UseConcMarkSweepGC
-XX:+CMSClassUnloadingEnabled
-XX:SurvivorRatio=4
-XX:+DisableExplicitGC
Again, examine the GC logs to see whether any of the above flags should be used at all and, if they are, what values are appropriate. Quite often the JVM gets it right out of the box, and hence getting it wrong yourself can be detrimental to your application's performance. It is easier to mess up than to get it right when you have so many flags with dependent and overlapping functionality.
Conclusion: refer to benchmarks and GC logs when deciding which flags to enable or disable and what values to set when enabled. There are lots of flags and it is easy to get them wrong; only use them if they are absolutely needed. Remember that the HotSpot JVM has built-in machinery to adapt (adaptive policies), and consult an expert if you still want to change them.
Using Intel Performance Counters To Tune Garbage Collection
How NOT to measure latency
Useful resources
- Are your GC logs speaking to you, the G1GC edition by Kirk Pepperdine – Slides – Video
- Performance Special Interest Group discussion – moderated by Richard Warburton (video)
- “Caching in: understand, measure and use your CPU Cache more effectively” by @RichardWarburto – (video & slides)
- Article on Atomic I/O operations (Linux) by Jonathan Corbet
- Articles and Presentations about Azul Zing, Low Latency GC & OpenJDK by Gil Tene (videos & slides)
- Lock-Free Algorithms For Ultimate Performance by Martin Thompson
- Performance Java User’s Group – “For expert Java developers who want to push their systems to the next level”
- Tuning the Size of your thread pool by Kirk Pepperdine
- How NOT to measure Latency by Gil Tene
- Understanding Java Garbage Collection and What You Can Do about It by Gil Tene
- Vanilla #Java Understanding how Core Java really works can help you write simpler, faster applications by Peter Lawrey
- Profiling Java In Production – by Kaushik Srenevasan
- HotSpot JVM garbage collection options cheat sheet (v3) by Alexey Ragozin
- Optimizing Google’s Warehouse Scale Computers: The NUMA Experience – authors from Univ. of Cal (SD) & Google!
- MegaPipe: A New Programming Interface for Scalable Network I/O by several authors!
- What Every Programmer Should Know About Memory by Ulrich Drepper
- Memory Barriers: a Hardware View for Software Hackers – Paul E. McKenney (Linux Technology Center – IBM Beaverton)
This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!