Log4j: Between a rock and a hard place

There seems to be a misunderstanding here. We have on the one side a garbage feature that should never have been implemented - but if you want to keep it for backwards compatibility, sure. But then we have log4j scanning all values instead of only format strings - I think it can be argued that this behavior is a critical bug and was never intended to begin with. It seems to have only come about because whoever implemented the JNDI stuff lost their bearing in the absurd class hierarchies and abstractions in log4j.

Of course the last part holds the solution for our backwards compatibility issue. Remove the JNDI nonsense from the default package and move it into an extension package. Whoever wants to keep it can just add that to their dependencies and continue to enjoy logging functions that sometimes also make network connections and block your program.

jameshart · 4 years ago

Indeed - as evidence for this, I would submit that slf4j and logback were created to offer a drop in replacement for log4j (slf4j literally provides alternative implementations of the org.apache.log4j.Logger class), but I have never seen anybody complain that "I switched to logback and slf4j and my jndi substitutions stopped working."

Nobody thought this was how log4j worked; log4j's documentation for format syntax only covers {} placeholders - the same format that slf4j has grandfathered in from log4j.

I agree this feels like a case where they got confused about their internal terminology. Log4j refers to messages with {} placeholders as 'FormattedMessages'; it refers to the log pattern syntax as 'Patterns' in code - but it seems to refer to them as 'log formats' in documentation.

Somewhere in this mess, someone hooked up the pattern capabilities into the formatting system.

unscaled · 4 years ago

> but I have never seen anybody complain that "I switched to logback and slf4j and my jndi substitutions stopped working."

SLF4J was created to replace Apache Commons Logging and Logback was created to replace Log4j 1.x. Both were created by Ceki Gülcü, the original author of Log4j 1.x [1].

Logback came out in 2006. The first beta version of Log4j 2.x was only released 6 years later in 2012, and the JNDI lookup feature was added in 2.0-beta9[2] in 2013!

Obviously nobody complained when switching from Log4j 1.x to SLF4J+Logback that a feature from a completely different library (with the same name) that would be created 7 years into the future was not supported.

> Somewhere in this mess, someone hooked up the pattern capabilities into the formatting system.

That's not what happened. The lookup mechanism (which includes "${jndi:}" lookups) is completely unrelated to the message formatting subsystem.

The way formatting and pattern lookups work in log4j2 is:

1. logger.info("Hello {}", "world") creates a FormattedMessage instance with the "Hello {}" format string and a single parameter, "world".

2. The FormattedMessage is wrapped in a LogEvent and routed to the correct appender(s).

3. Most appenders will format the LogEvent with a Layout. In our case, it's PatternLayout we care about[3].

4. PatternLayout will pre-calculate a set of PatternConverters based on your pattern, so it doesn't have to keep parsing the pattern on every invocation. "%m" will map to MessagePatternConverter.

5. (grossly simplifying zero-garbage and streaming optimizations) Each pattern converter is executed and appends to the final layout text's StringBuilder.

6. (grossly simplifying oh so many things) MessagePatternConverter will first call event.getMessage().getFormattedMessage(). The logic for formatting the message is entirely encapsulated by Message and its subclasses. MessagePatternConverter has no way to distinguish the format string from the user-provided parameters!

7. MessagePatternConverter finally applies the pattern lookups to the formatted message text. The pattern lookup mechanism is completely separate from and orthogonal to the message formatting mechanism.
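
To make steps 6 and 7 concrete, here is a toy model in plain Java. It is only a sketch of the data flow described above, not Log4j's code: `formatMessage`, `applyLookups`, and the `LOOKUPS` table are hypothetical stand-ins I made up for illustration. The point it shows is that by the time lookups run, the format string and the parameters have already been merged into one string.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookupPipelineSketch {
    // Hypothetical lookup table; real Log4j resolves prefixes such as "jndi:" or "env:".
    static final Map<String, String> LOOKUPS = Map.of("env:USER", "alice");
    static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Steps 1 and 6: parameters are substituted into "{}" and one flat string comes out.
    static String formatMessage(String format, Object... params) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        for (Object p : params) {
            int at = format.indexOf("{}", from);
            if (at < 0) break;
            out.append(format, from, at).append(p);
            from = at + 2;
        }
        return out.append(format.substring(from)).toString();
    }

    // Step 7: lookups scan the already-formatted text, so parameter content is scanned too.
    static String applyLookups(String formatted) {
        Matcher m = VAR.matcher(formatted);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // Unknown keys are left as the original "${...}" text.
            String value = LOOKUPS.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String userInput = "${env:USER}";
        // Parameterized and concatenated calls end up as the same text before lookups run.
        System.out.println(applyLookups(formatMessage("Hello {}", userInput)));
        System.out.println(applyLookups(formatMessage("Hello " + userInput)));
    }
}
```

In this model both calls print the same resolved text, which is the whole problem: the lookup stage has no way to know which characters came from the developer and which came from the user.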

---

That was long-winded, but I had to fight this annoying misconception about "log4j not implementing format strings properly".

Now, there are several things I'm not saying here:

1. I don't think more than a handful of people ever relied on lookups working on the log message (formatted or otherwise), as opposed to the pattern in the configuration file.

2. I don't think Log4j should have kept compatibility here. The moment the maintainers implemented "%m{nolookups}" (in version 2.7), they should have made it the default. That being said, I know this is very hard to do in the Java ecosystem. But I think it is time that the Java developer community changes its extremist position regarding compatibility at all costs.

3. I don't think that Log4j should have implemented pattern lookups for text messages to begin with. Even if it was just the format string part (which is impossible to do with Log4j's current architecture anyway).

4. I don't think any kind of string formatting should be included in a logging library. If you want to format log messages, use an external formatting function or string interpolation (if you're lucky enough to be using Kotlin or Scala). If it is added, it should only be used as a convenience, and shouldn't do anything more than formatting (like lookups). Relying on developers to always remember that log.info("Hello {}", world) is safe and log.info("Hello {}" + world) gives the entire internet full control of your server is beyond stupid. Even if Log4j went with this silly distinction, I would say it was a horrible design.

[1] https://techblog.bozho.net/the-logging-mess/

[2] https://logging.apache.org/log4j/2.x/changes-report.html#a2....

[3] It seems like PatternLayout is the only layout vulnerable to this bug in log4j2, but it is hard to tell, the implementation being a classic Java mess of deep class hierarchy, liberal use of reflection to control everything and some heroic attempts to break SOLID principles at least 4 times on a single line of code. Take my analysis with a grain of salt. It's a gross simplification of what is unfortunately par for the course in many Java libraries.

lultimouomo · 4 years ago

> But then we have log4j scanning all values instead of only format strings - I think it can be argued that this behavior is a critical bug and was never intended to begin with.

It was actually intended behavior, and this is what really boggles the mind! Javadoc says explicitly that variable replacement is recursive, with cycle detection (which will throw! What happens to the log line in this case?) [0].

[0] https://logging.apache.org/log4j/2.x/log4j-core/apidocs/org/...

culturedsystems · 4 years ago

That link is about variable replacement in config strings, which is intentionally recursive. It doesn't mention the use of the variable replacement mechanism when interpolating values into log messages, which is what makes this vulnerability so bad, and as far as I can see was not intentional.
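
For anyone unsure what "recursive with cycle detection" means in practice, here is a minimal sketch in plain Java. It is a toy, not the Log4j substitutor: the `resolve` helper, the variable names, and the choice of `IllegalStateException` are all assumptions made for illustration. Each resolved value is resolved again, and a variable that is still being resolved when it is encountered a second time is treated as a cycle.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

public class RecursiveSubstitutionSketch {
    // Resolves ${name} references; resolved values are themselves resolved again.
    static String resolve(String text, Map<String, String> vars, Deque<String> inProgress) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = text.indexOf("${", pos);
            int end = start < 0 ? -1 : text.indexOf('}', start);
            if (start < 0 || end < 0) {
                // No further complete reference: copy the rest and finish.
                return out.append(text.substring(pos)).toString();
            }
            String name = text.substring(start + 2, end);
            out.append(text, pos, start);
            String value = vars.get(name);
            if (value == null) {
                out.append(text, start, end + 1); // unknown variable: leave it untouched
            } else {
                if (inProgress.contains(name)) {
                    throw new IllegalStateException("cycle while resolving: " + name);
                }
                inProgress.push(name);
                out.append(resolve(value, vars, inProgress)); // the recursive step
                inProgress.pop();
            }
            pos = end + 1;
        }
    }

    static String resolve(String text, Map<String, String> vars) {
        return resolve(text, vars, new ArrayDeque<>());
    }

    public static void main(String[] args) {
        Map<String, String> vars = Map.of("dir", "${base}/logs", "base", "/var/app");
        System.out.println(resolve("file=${dir}/app.log", vars)); // file=/var/app/logs/app.log
        try {
            resolve("${a}", Map.of("a", "${b}", "b", "${a}"));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This kind of recursion is reasonable for a config file the operator controls. The trouble only starts when the same mechanism is pointed at text that contains user input.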

ehsankia · 4 years ago

Right, I was also confused by the blame on backward compatibility. You can keep things backward compatible without necessarily making it on by default. There is no reason why `formatMsgNoLookups` shouldn't be the default. If it is indeed an obscure and hacky feature for backward compatibility, just make it opt-in. People who really care about it will enable it, most people won't have to carry that baggage and we wouldn't be in a situation like this.
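
For reference, this is roughly what turning the switch on looks like today. A sketch, assuming Log4j 2.10 or later, where the `log4j2.formatMsgNoLookups` system property is recognized; the class name is made up, and as far as I understand the property has to be set before Log4j initializes.

```java
public class DisableMessageLookups {
    public static void main(String[] args) {
        // Set this before the first Logger is created; the assumption here is that
        // Log4j reads the property once, during its own initialization.
        System.setProperty("log4j2.formatMsgNoLookups", "true");
        System.out.println(System.getProperty("log4j2.formatMsgNoLookups"));
    }
}
```

The same thing can be passed on the JVM command line as `-Dlog4j2.formatMsgNoLookups=true`, which avoids any ordering concerns inside the application.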

Thorrez · 4 years ago

>for a feature we all dislike yet needed to keep due to backward compatibility concerns.

If they really dislike the feature that much, they likely dislike the code and want to completely delete it. I'm not sure if making it opt-in would make them as happy as fully deleting it, so they are less motivated to make it opt-in than they would be to fully delete it.

iratewizard · 4 years ago

Hindsight is always 20-20.

thanatos519 · 4 years ago

"lost their bearing in the absurd class hierarchies and abstractions" sounds familiar. Java app stack traces are like Neal Stephenson epics, but less entertaining.

x0x0 · 4 years ago

And enabled by default. That's the most mind-blowing bit of this feature. The backcompat argument is a deflection for shipping a time bomb into people's codebases.

closewith · 4 years ago

To be fair to the maintainers, they didn't ship anything into people's codebases. People chose Log4j and pulled it into their code. FOSS contributors aren't responsible for downstream use of their projects.

I'm still flabbergasted that the original maintainers are rushing around trying to patch these problems. Unless their specific personal/professional projects are at risk they have no responsibility to hurry and fix a thing.

You'd think, in the spirit of open source, these multi-billion dollar companies--like Apple and Google and Amazon--would recognize the danger and immediately divert the best engineers they had to help this team identify and mitigate the problems. They should have been buried in useful pull requests.

For that matter, they should have really picked them all up in private jets and flown them to neutral working space with those engineers for a one or two week hackathon/code sprint to clean up the outstanding issues and set the project on a sustainable path. To get those maintainers there they should offer a six figure consulting fee and negotiate with their current employers to secure their temporary help.

I can't believe these folks just get abandoned like this while CEOs/CTOs from rich companies wring their hands wailing about the problems and not offering solutions.

wpietri · 4 years ago

> I'm still flabbergasted that the original maintainers are rushing around trying to patch these problems. Unless their specific personal/professional projects are at risk they have no responsibility to hurry and fix a thing.

Sorry, but what's the hard part to understand? Open source maintainers end up in this position because they are nice, helpful people who like using computers to solve problems for others. People who spend years on a project and then see a bigger problem arise don't suddenly turn that off. With the bigger problem, they'll want to work harder, not just hoist a middle finger and go binge Netflix without a care in the world.

But I totally agree with you on the CTOs, etc. I don't expect random programmers who like working on logging to also be good at solving complicated sociotechnological problems around paying for global infrastructure. But it boggles my mind that none of these richly rewarded, supposedly brilliant experts at organizing engineers has gotten out in front of this. If not out of community spirit or social responsibility, then out of pure self interest.

bshipp · 4 years ago

> none of these richly rewarded, supposedly brilliant experts at organizing engineers has gotten out in front of this

Indeed. Each of them has had to spend the last few days madly trying to fix this problem to avoid exposing their infrastructure. Each has been, in some way, reinventing the wheel to do so. I'm curious how many will actually submit their findings to the original OSS so others can learn from their experience?

There are always resources to put out a fire but rarely enough to install a sprinkler system.

ratww · 4 years ago

> they'll want to work harder, not just hoist a middle finger.

There is a perfectly healthy, acceptable, middle ground between those two extremes, however.

throw_log4jfang · 4 years ago

> You'd think, in the spirit of open source, these multi-billion dollar companies--like Apple and Google and Amazon--would (...) mitigate the problems.

FAANG engineer here, and one who had to work extra hours to redeploy services with the log4j vulnerability fix. I'm not sure you understand the scope and constraints of this sort of problem. Log4j's maintainers have a far more difficult and challenging job than FANGs or any other consumer of a FLOSS package, who only need to consider their own personal internal constraints, and if push comes to shove can even force backwards-incompatible changes. The priority of any company, FANG or not, is to plug their own security holes ASAP. Until that's addressed the thought of diverting resources to fix someone else's security issues doesn't even register on the radar. I mean, are you willing to spend your weekend working around the clock to fix my problems? Why do you expect others like me to do that, then? Instead I'm spending a relaxing weekend with my family with the comfort of knowing my service is safe. Why wouldn't I?

bshipp · 4 years ago

I'm not saying you, as an engineer for those companies, should be the one to donate your time and energy toward the problem. We all have competing priorities, as do the maintainers of those FLOSS packages.

I'm saying that your company's CTO, especially one at a very large company, could likely identify two or three engineers who they pull into a meeting and say "reach out to these guys and get them whatever they need. Here's my cell, call me the moment you need the plane or additional resources."

Seriously, if a CTO has a budget of a few hundred million dollars and thousands of dedicated employees, how hard is it to throw a few crumbs to the open source community to change this situation from being one of a burden on a volunteer effort to, instead, one where they feel like they're in the middle of an international event where their knowledge and services are vital to keeping the internet alive?

Again, I'm exaggerating, but you see where I'm going with this. It's a missed opportunity for some seriously great PR out of a seriously bad situation.

2muchcoffeeman · 4 years ago

> I mean, are you willing to spend your weekend working around the clock to fix my problems?

Surely the difference is you are getting paid, and if your boss says, help these guys out, you can do it? As opposed to some guys with jobs who have a project on the side. The big guys could even do something like offer to pay the maintainers and maybe they can take leave or something.

I agree with both sentiments. The big guys are under no obligation to fix an issue in some library they happen to use. But the log4j guys are under even less obligation when they do it in their spare time.

Everyone should enjoy their weekends.

rhizome · 4 years ago

> You'd think, in the spirit of open source, these multi-billion dollar companies--like Apple and Google and Amazon--would (...) mitigate the problems.

Your "(...)" elides the word "help," which completely changes the meaning of the quote, and your reply is constructed uncharitably as if that word wasn't in the original statement.

ihatecookies · 4 years ago

Somehow, I find what you are saying here to be totally implausible.

> Log4j's maintainers have a far more difficult and challenging job than FANGs

You are saying that the companies that built advanced ML-based Chess/Go engines like Alpha Zero/Go can't solve a simple logging bug involving string substitution?

If your company ends up using the product in all your teams/project and products wouldn't it be in the company's interest to keep the product safe?

How do we know you're not a CTO/C--/manager in your 'faang' just taking this opportunity to bitch about how bad and unreliable open source is? You do have a track record when it comes to this.

> I mean, are you willing to spend your weekend working around the clock to fix my problems?

Wow, that's cynical even for a 'faang' dude.

wnolens · 4 years ago

same. my evenings and weekend are totally gone to put out this fire. which I wouldn't do if i wasn't obligated to

jollybean · 4 years ago

Speaking as an individual, of course you want to sit by the pool this weekend.

But as a professional representative of your org, surely you'll recognize the unsustainability of the situation and that it's far from ideal even in the pure self-interest of the company in question.

phkahler · 4 years ago

>> I'm still flabbergasted that the original maintainers are rushing around trying to patch these problems.

Agreed, while reading it I also disagreed at this point:

>> the maintainers of log4j would have loved to remove this bad feature long ago, but could not because of the backwards compatibility promises they are held to.

Nobody is holding them to anything. If they want to remove an old feature, go right ahead. If those using it think it's that important they can fork the project and maintain it themselves. Oh right, that would take effort or money.

omegalulw · 4 years ago

> Nobody is holding them to anything

I don't get this argument. Part of sharing your work is making sure what you put out is actually helpful to people. If they remove features people really like, then the library won't be as helpful - so it's perfectly fine for the OG devs to maintain this feature. The same thing with "scrambling" to fix - that could be because a sword is hanging over your head, or because you care about the people who use your work. Thinking this way, I can perfectly see them working hard to fix this bug.

AmpsterMan · 4 years ago

I understand it perfectly. Log4j is used in many Enterprise systems. Java is a fairly conservative language. Combine both together and you get much hesitancy to break backwards compatibility ingrained in the Java world.

duxup · 4 years ago

Did they want to remove it because of security concerns?

If so, I really wouldn’t hold someone to any backwards compatibility promise if security is a concern.

MattGaiser · 4 years ago

Are data breaches actually treated all that seriously? For all the talk about cyber security, there seems to generally be little investment. It appears to be viewed as more of a reputational concern than an operational one.

A past organization of mine had a data breach (the kind that ended up making the news everywhere). A few people left (probably making it worse with all the turnover there), but I would be surprised if anything really changed in that organization.

twunde · 4 years ago

If the company is in healthcare or finance, yes. Otherwise the typical answer is no. Most companies just load up on cyber insurance and call it a day. That said, reputational concern is a big thing for companies. Take Dropbox for example. Early on they suffered several security breaches, and had a bad reputation around security. They've since built out a fairly large security program, in part because bad security can block deals, especially in the enterprise space.

I'll note that there's been more investment in security the last 4-5 years. Most B2B companies do a SOC2, and early on, so there tends to be a baseline of competence.

willis936 · 4 years ago

A data breach isn't the primary concern here. This exploit allows full pwnage of a system and could take down entire networks for as long as it takes to rebuild them.

PeterisP · 4 years ago

This is not really about data breaches. The first widely spread automated attacks seem to drop cryptominers; however, we should expect that (if it hasn't already happened) within a week or so this will be used as the entry point for ransomware attacks, since it gives attackers a solid way of getting code execution on the servers of any company that has not fixed this issue.

ralph84 · 4 years ago

> I'm still flabbergasted that the original maintainers are rushing around trying to patch these problems.

If the RCE had been responsibly disclosed instead of via tweets and PR comments, maybe there wouldn't have had to be so much scrambling. And indeed maybe ASF could have found corporate OSPOs to help with remediation.

There are lots of pixels being spilled on how the users of open source software should be paying for it (?), but I haven't seen much criticism of the vulnerability not being responsibly disclosed.

weaksauce · 4 years ago

to the best of my knowledge it was discovered via a minecraft exploit and I don't think minecraft players are generally the "responsible disclosure" kinda people.

flatiron · 4 years ago

There’s no hiding something this easily exploitable. This isn’t rowhammer or spectre where you need a degree to understand it. Copy and paste this in and that’s it. It would have never survived “responsible disclosure”

sorry_outta_gas · 4 years ago

I'm not sure about Amazon but Google's Project Zero and OSS-Fuzz teams seem to be doing a lot of good work when it comes to open-source security -- more would be nice always

Personally I'd like something like a security health card/metric on open-source libraries that we could tie into CI systems/pull requests or something

in the past there were so few libraries it wasn't as daunting

I'd be able to reason about stuff like libpng, libttf ..etc and think about them or even support them but now some projects are massive hodgepodges of thousands upon thousands of packages

thrdbndndn · 4 years ago

I know you're deliberately exaggerating, but isn't it a little bit over the top?

That ("...private jets..") doesn't happen because the solution isn't exactly the hard part, and the unpaid original maintainers are doing them anyway.

bshipp · 4 years ago

I admit to a certain level of exaggeration but, at the same time, we are talking literal peanuts to a large company. They could spend a million dollars and it'd be a rounding error on their balance sheet.

In all seriousness, taking actions like I identified above would cost the companies virtually nothing but result in huge long-term benefits by signaling to the rest of the open source world that "we love your work and will be right beside you helping if the chips are down."

This is, of course, not a suitable compensation model for popular open source projects. That's a separate conversation.

But it would at least be something.

joshjdr · 4 years ago

For argument’s sake, at least, I don’t consider anything suggested here as definitively “over-the-top”. It may seem (or be) unrealistic in practice (for reasons I don’t know), but the suggestion is far from unconscionable - it may, in fact, be the lowest cost solution to what could cost mega-corps billions in current (and potential future) fines/liabilities. To the extent it sounds like an exaggeration, I think that embodies the point of the comment - there are some (almost irreconcilable) concerns that impact the interplay of corporations and open source development.

gilgad13 · 4 years ago

As you said, the solution isn't the hard part. The reason that large companies aren't deploying their own solutions for this issue isn't that their engineers are incapable of developing their own solutions, but because then they would have to carry that patch forever, and if a problem was found with their particular solution they would be on the hook for it.

And yes, I do think this, "but everyone else is doing the same thing so it isn't really our fault" attitude is a problem.

kccqzy · 4 years ago

> You'd think, in the spirit of open source, these multi-billion dollar companies--like Apple and Google and Amazon--would recognize the danger and immediately divert the best engineers they had to help this team identify and mitigate the problems.

Google doesn't even use log4j. What are you talking about? The spirit of open source does not dictate that the richest companies automatically shoulder the burden of maintenance of projects they do not even use. Google already has initiatives like Summer of Code that help open source projects it does not use, and I think it's perfectly fine to draw the line there.

> divert the best engineers they had

So the lessons from the mythical man-month are forgotten here. At this point I don't think adding more manpower helps.

xena · 4 years ago

Google voice was vulnerable to this, so I think this means they use log4j somewhere but I'm not an expert.

pabs3 · 4 years ago

The Android SDK depends on log4j, so they definitely do use it.

phendrenad2 · 4 years ago

What? It was already fixed. You just need to update. There's no need for the world's top fintech programmers to hack it out on a mountaintop somewhere.

Also, the reason the maintainers are rushing to fix it is: they're worried about losing "market share". Having been in open-source circles for a long time, maintainers care GREATLY about how many users they have. They just like watching their download stats go up every year. Even if it brings them no financial rewards. It's a sort of addiction.

bshipp · 4 years ago

> maintainers care GREATLY about how many users they have.

They do. Until they don't.

That inevitable day when they get yelled at in a github issue thread by a user who didn't bother reading the documentation, while staring at their kid in the living room playing video games and start wondering to themselves "why am I doing this hobby in my spare time again?"

Mild dopamine hits to affirmation-addicted programmers is not the sturdiest foundation upon which to build enterprise-grade software libraries.

dagmx · 4 years ago

An influx of pull requests is also equally difficult for open source projects.

Anything sufficiently at scale needs a set of maintainers that the commercial tech companies would then collaborate with to get the PRs going.

Otherwise if everyone's just panicking and rushing to submit PRs, they'll inundate the maintainer. There's also no guarantee that even the best engineers at these companies are intimately familiar with the project, and might introduce regressions or other vulnerabilities in the process.

Anyway I do agree companies should be working with OSS devs, but it shouldn't be rushed or knee jerk. It should be collaborative and measured.

iJohnDoe · 4 years ago

Great comment. I think this speaks to overall what’s happening in the world today.

dehrmann · 4 years ago

> You'd think...these multi-billion dollar companies...would recognize the danger and immediately divert the best engineers they had to help this team identify and mitigate the problems.

For the general case, the problem is that a reporter might report the vulnerability to the open source project, then the project needs to keep it a secret while they make a fix. There isn't a great way to leverage these stakeholders. It's obviously different for something like Android that is open source, but clearly Google's.

ohazi · 4 years ago

> They should have been buried in useful pull requests.

Drive-by pull requests during a highly visible emergency are rarely useful.

manquer · 4 years ago

True, but that thoughtfulness is not what is stopping major companies from contributing, is it?

rackjack · 4 years ago

This is a problem in open source: everybody wants the fruits of labor without paying for it. The log4j vulnerability is what happens when you don't pay for it.

reidrac · 4 years ago

That's right. It is open source and, when it breaks, you get to keep both pieces.

skybrian · 4 years ago

On the other hand, I expect that most people running log4j didn't know about or want this feature. Why should they pay for something they don't want?

Maybe it makes more sense to fund system-wide efforts?

coliveira · 4 years ago

That's why I don't contribute to open source that is used by big corps. I don't like the idea of working for free for the benefit of billionaires.

mbrodersen · 4 years ago

You give away your work with a price tag of $0. So people/companies pay $0 for it. What’s hard to understand about that?
