Reflections on Log4Shell
by The Diabolical Developer
Introduction
Thanks to Log4Shell, the past two weeks have been some of the most intense of my career! The industry response has been immense, with IT staff working non-stop across the globe to patch systems, improve detections, and provide support. Within Microsoft, I witnessed a masterclass in security incident management. Despite the challenging conditions, it was exhilarating to be working alongside so many world-class experts in their field. Weirdly, it has been one of the best professional experiences of my life :-).
For me, it also continued to demonstrate the maturity of the Java industry. The response was fast, we had accurate information, and vendors and individuals collaborated 24/7 to protect end-users.
This article will focus primarily on lessons learned and how I think the software industry can do better going forwards. I don’t claim any unique insights here. I suspect the words I’m typing now have been put to virtual paper, in slightly different forms, any number of times across the internet. But that’s OK! One of the things we need to change in the industry is that we talk about software security more often. We should talk about security daily just as we do about TDD, containers, DevOps, Agile, and the other topics that are now part of everyday discourse.
NOTE: This post does not necessarily represent the views of my employer.
A Summary of Log4Shell and Mitigations
I’m the Principal Group Manager for Java at Microsoft. I am therefore biased, but I think you’d be hard-pressed to find better resources on Log4Shell and what you should do to mitigate it than these:
- Microsoft Security Response Center Advisory.
- Microsoft Guidance on Preventing, Detecting, and Hunting Log4Shell.
Timeline
It’s always helpful to have a timeline to put context around an incident like this.
- 2021-11-24 – JNDILookup string substitution leading to a Remote Code Execution (RCE) discovered by Chen Zhaojun of the Alibaba Cloud Security Team and reported to the Apache Software Foundation.
- 2021-12-01 – Earliest known exploit attempt 2021-12-01 04:36:50 UTC reported by Cloudflare.
- 2021-12-06 – Log4j released version 2.15.0 (mitigated the known attack vectors at the time).
- 2021-12-09 – Issue was made public.
- 2021-12-10 – CVE-2021-44228 published with a CVSS score of 10 (the highest possible). The world starts patching.
- 2021-12-13 – Log4j 2.16.0 is released, removing the vulnerable JNDILookup class, negating a follow-up DOS issue (CVE-2021-45046).
- 2021-12-14 – CVE-2021-45046 published with a CVSS score of 3.7 (low).
- 2021-12-17 – CVE-2021-45046 is upgraded from CVSS 3.7 to 9.0, as an RCE vulnerability was still possible in 2.15.0.
- 2021-12-17 – Log4j 2.17.0 is released to fix CVE-2021-45105.
- 2021-12-18 – CVE-2021-45105 published, log4j 2.16.0 and below vulnerable to a DOS (but only with a rare configuration).
One related takeaway from this timeline, if you’re an SRE, developer, or system administrator:
“The first reported CVE for vulnerable software is likely not the last that week.”
In practice, this means that security responses aren’t a sprint. They’re a long-distance race, and you need to plan your technical and human resourcing appropriately. People tend to tire on days 5 and 6, so you need to factor that into your rota.
Lessons Learned – How Can We Improve?
At Microsoft, we actively practice having a Growth Mindset, so instead of looking back and playing the blame game, I will focus on lessons learned from the past two weeks. With the same collaboration the industry showed when introducing TDD, DevOps, et al., I’m confident that we can also ensure safer software stacks for the future!
No Blame Game
It was fascinating to watch drive-by commentators on social media claiming:
- “I would never write such terrible code.”
- “It’s all Java’s fault because it’s java!”
The list went on. There was arrogance, and a lack of empathy and self-awareness, on display by these folks. However, there was one small bonus: those commentators made a convenient list of engineers that hiring managers like myself will never hire :-). Conversely, there was an outpouring of support for the core engineers around Log4j 2 and impacted stacks, along with cross-industry collaboration to detect, patch, and mitigate the issue.
The reality is that software engineers are human, and humans aren’t infallible. So with that premise in mind, what can we do to help us humans be more secure with our software stacks?
Security as a 1st class citizen for the next generation
Security as a first-class citizen in our computing curriculum is a generational change that needs to happen.
Most of us never had any sort of formal education in designing secure software architectures or developing code with security as a first-class citizen. Today, we teach our undergraduates and apprentices concepts such as TDD, Agile, SOLID principles, and more. We also need to put security in as part of that curriculum!
I was fortunate enough to have some fantastic mentors who put some Red Team thinking into my brain in my early career. For example, whenever I’m writing code that communicates over a network, I just automatically think about:
- Adding encryption.
- Adding authentication and authorization (Auth/Auth).
- Sanitizing data that goes over the wire.
- Sanitizing input that could execute.
- Adding DoS protection – backoff strategies and more.
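To make two of those items a little more concrete, here is a minimal Java sketch of neutralizing untrusted input before it reaches a log line and of a capped exponential backoff with jitter. The class and method names (NetworkHygieneSketch, neutralizeForLog, maxBackoffMillis, jitteredBackoffMillis) are hypothetical and only for illustration. Replacing control characters is a defence against forged log lines; it is an assumption of this sketch, not a fix for Log4Shell, for which patching remains the answer.

```java
import java.util.concurrent.ThreadLocalRandom;

final class NetworkHygieneSketch {

    // Replace control characters (including CR and LF) so untrusted input
    // cannot forge extra log lines, and truncate overly long values.
    static String neutralizeForLog(String untrusted, int maxLength) {
        if (untrusted == null) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < untrusted.length() && out.length() < maxLength; i++) {
            char c = untrusted.charAt(i);
            out.append(Character.isISOControl(c) ? '_' : c);
        }
        return out.toString();
    }

    // Upper bound of the delay before retry number `attempt` (0-based):
    // baseMillis doubled `attempt` times, capped at capMillis.
    // Assumes baseMillis and capMillis are positive.
    static long maxBackoffMillis(int attempt, long baseMillis, long capMillis) {
        long delay = Math.min(baseMillis, capMillis);
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            // Compare against cap / 2 so the doubling cannot overflow.
            delay = (delay > capMillis / 2) ? capMillis : delay * 2;
        }
        return delay;
    }

    // "Full jitter": pick a random delay between 0 and the upper bound so
    // that many clients retrying at once do not all hit the server together.
    static long jitteredBackoffMillis(int attempt, long baseMillis, long capMillis) {
        long max = maxBackoffMillis(attempt, baseMillis, capMillis);
        return ThreadLocalRandom.current().nextLong(max + 1);
    }

    public static void main(String[] args) {
        // prints "user__FAKE ENTRY"
        System.out.println(neutralizeForLog("user\r\nFAKE ENTRY", 100));
        // prints 800
        System.out.println(maxBackoffMillis(3, 100, 5_000));
        // prints some value between 0 and 800 inclusive
        System.out.println(jitteredBackoffMillis(3, 100, 5_000));
    }
}
```

The cap and the jitter matter as much as the doubling: without a cap the delays grow without limit, and without jitter every client retries on the same schedule.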
Now I might sit here smugly in the knowledge that I know a little bit more than others. However, the reality is that my understanding is incomplete. I’m an enthusiastic amateur or expert beginner, which makes me even more dangerous :-).
Countless experts can provide the material, but coordination and standardization still need to occur at scale. This rollout is the significant generational change that needs to happen.
Secure the supply chain and Component Governance
Several organizations have been talking about securing the software supply chain. In short, the whole industry needs to mature and start ensuring they know exactly where they get their bits and where they’ve deployed those bits.
A famous quote about TDD is that “It gives you the confidence to refactor without fear.” I believe that securing your supply chain lets you deploy without fear.
Securing your supply chain is the sort of gritty, low-level plumbing engineering that’s not glamorous but will pay you back time and time again. When 0-days like Log4Shell come along, you’ll thank yourself!
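As a rough sketch of what "knowing where you've deployed those bits" can look like, the Java example below scans a list of Maven-style coordinates (group:artifact:version) and flags log4j-core versions below a threshold. The class name ComponentInventoryCheck, its methods, and the flat list of coordinates are assumptions made for illustration. The threshold of 2.17.0 is simply the last release in the timeline above, so check the current Apache advisories for the real minimum. A real inventory would come from your build tool or an SBOM, and would also need to look inside shaded or repackaged JARs, which this sketch does not do.

```java
import java.util.ArrayList;
import java.util.List;

final class ComponentInventoryCheck {

    // Assumed threshold: the last release named in the timeline above.
    static final int[] MIN_VERSION = {2, 17, 0};

    // Parse "2.14.1" into {2, 14, 1}. Qualifiers such as "-beta9" are
    // ignored and missing parts default to 0, which is a simplification.
    static int[] parseVersion(String version) {
        String[] parts = version.split("[.-]");
        int[] numbers = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            try {
                numbers[i] = Integer.parseInt(parts[i]);
            } catch (NumberFormatException e) {
                break; // stop at the first non-numeric part
            }
        }
        return numbers;
    }

    // True if version v sorts strictly before min (major, minor, patch).
    static boolean isBelow(int[] v, int[] min) {
        for (int i = 0; i < 3; i++) {
            if (v[i] != min[i]) {
                return v[i] < min[i];
            }
        }
        return false;
    }

    // Return every log4j-core coordinate whose version is below MIN_VERSION.
    static List<String> flagOutdatedLog4j(List<String> coordinates) {
        List<String> flagged = new ArrayList<>();
        for (String coordinate : coordinates) {
            String[] gav = coordinate.split(":");
            if (gav.length == 3
                    && gav[0].equals("org.apache.logging.log4j")
                    && gav[1].equals("log4j-core")
                    && isBelow(parseVersion(gav[2]), MIN_VERSION)) {
                flagged.add(coordinate);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<String> inventory = List.of(
                "org.apache.logging.log4j:log4j-core:2.14.1",
                "org.apache.logging.log4j:log4j-core:2.17.0",
                "org.slf4j:slf4j-api:1.7.32");
        // prints [org.apache.logging.log4j:log4j-core:2.14.1]
        System.out.println(flagOutdatedLog4j(inventory));
    }
}
```

The check itself is trivial. The hard part, and the point of component governance, is having an inventory that is complete and current when the next 0-day lands.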
OSS needs some Tender Loving Care (TLC)
It seems the industry didn’t quite learn its lessons from previous vulnerabilities like Shellshock. Many open-source projects are well funded and staffed by paid full-time engineers to maintain the software. However, there’s also a host of libraries and frameworks that the industry relies on, which aren’t. Apache Log4j 2 is a classic example of this type of project, and going forwards we’re going to have to support those pillars properly.
There are many options to support these OSS libraries and frameworks, ranging from hiring the maintainers to donating to a common fund. However, the critical factor goes back to component governance: organizations need to genuinely understand what they rely on and fund it accordingly.
Conclusion
Security is one of the areas of our industry that needs the most work. You can do your part:
- Don’t blame people, blame systems. Have empathy, and help those around you improve for next time.
- Educate yourself and the next generation of engineers.
- Ensure that you have a secure supply chain and component governance.
- Work with your org to support OSS, with funding to support those building blocks.
- Lastly, smile – remember that in the darkest of times, things like https://log4jmemes.com are born.
Yours Most Seriously (for a change),
The Diabolical Developer
Author: Martijn Verburg
The Diabolical Developer, CEO – jClarity, London JUG co-leader (LJC), Speaker, Author, Javaranch Mod, PCGen & Adopt OpenJDK / A-JSR Cat herder, Java Champion