A few years ago, Naomi did an internship with Initech. Before her first day, she was very clear on what her responsibilities would be: she'd be on a team modernizing an older product called "Gem" (no relation to Ruby libraries).
By the time her first day rolled around, however, Initech had new priorities. There were a collection of fires on some hyperspecific internal enterprise tool, and everyone was running around and screaming about the apocalypse while dealing with that. Except Naomi, because nobody had any time to bring the intern up to speed on this disaster. Instead, she was given a new priority: just maintain Gem. And no, she wouldn't have a mentor. For the next six months, Naomi was the Gem support team.
"Start by looking at the code quality metrics," was the advice she was given.
It was bad advice. First, while Initech had installed an automated code review tool in their source control system, they weren't using the tool. It had started crashing instead of outputting a report six years ago. Nobody had noticed, or perhaps nobody had cared. Or maybe they just didn't like getting bad news, because once Naomi had the tool running again, the report was full of bad news.
A huge mass of the code was reimplemented copies of the standard library, "tuned for performance", which meant instead of a sensible implementation it was a pile of 4,000 line functions wrapping around massive switch statements. The linter didn't catch that they were parsing XML using regular expressions, but Naomi spotted that and wisely decided not to touch that bit.
What she did find, and fix, was this pattern:
private Boolean isSided;
// dozens more properties
public GemGeometryEntryPoint(GemGeometryEntryPoint gemGeometryEntryPoint) {
this.isSided = gemGeometryEntryPoint.isSided == null ? null : new Boolean(gemGeometryEntryPoint.isSided);
// and so on, for those dozens of properties
}
Java has two boolean types. The Boolean
reference type, and boolean
primitive type. The boolean
is not a full-fledged object, and thus is smaller in memory and faster to allocate. The Boolean
is a full class implementation, with all the overhead contained within. A Java developer will generally need to use both, as if you want a list of boolean values, you need to "box" any primitives into Boolean
objects.
I say generally need both, because Naomi's predecessors decided that worrying about boxing was complicated, so they only used the reference types. There wasn't a boolean
or an int
to be found, just Boolean
s and Integer
s. Maybe they just thought "primitive" meant "legacy"?
You can't convert a null
into a boxed type, so new Boolean(null)
would throw an exception. Thus, the ternary check in the code above. At no point did anyone think that "hey, we're doing a null check on pretty much every variable access" mean that there was something wrong in the code.
The bright side to this whole thing was that the unit tests were exemplary. A few hours with sed
meant that Naomi was able to switch most everything to primitive types, confirm that she hadn't introduced any regressions in the process, and even demonstrated that using primitives greatly improved performance, as it cut down on heap memory allocations. The downside was replacing all those ternaries with lines like this.isSided = other.gemGeometryEntryPoint.isSided
didn't look nearly as impressive.
Of course, changing that many lines of code in a single commit triggered some alarms, which precipitated a mini-crisis as no one knew what to do when the intern submitted a 15,000 line commit.
Naomi adds: "Mabye null was supposed to represent FILE_NOT_FOUND
?"