Big Number One

Patsy's company has a product that's "number one" in their particular industry. It generates lots of revenue, but a lot of the code dates back to the far-off year of… 2017. Coding practices were… uh… different, four years ago? Especially in the rapidly maturing language of… Java?

Within Patsy's company, they do refer to their 2017 code as "legacy" code, and it has left quite the legacy. For example, here's how they parse numbers:

    public BigDecimal parseNumberInString(String content) {
        // variables
            for (int i = 0; i < content.length(); i++) {
                if (Character.isDigit(content.charAt(i))) {
                    numbers.append(content.charAt(i));
                } else if ((content.charAt(i) == DOT || content.charAt(i) == SPACE || content.charAt(i) == COMMA) && numbers.length() > 0) {
                    //is it a decimal separator? if not, ignore, if yes append "."
                    if (
                            (content.length() == (i + 2) && (Character.isDigit(content.charAt(i + 1))))
                                    || (content.length() >= (i + 2) && (Character.isDigit(content.charAt(i + 1))) && lastDecimalFound!=null && !lastDecimalFound.equals(content.charAt(i)))
                                    || (content.length() == (i + 3) && (Character.isDigit(content.charAt(i + 1)) && Character.isDigit(content.charAt(i + 2))))
                                    || (content.length() > (i + 3) && (Character.isDigit(content.charAt(i + 1)) && Character.isDigit(content.charAt(i + 2)) && !Character.isDigit(content.charAt(i + 3))))
                                    || (content.length() > (i + 2) && (Character.isDigit(content.charAt(i + 1)) && !Character.isDigit(content.charAt(i + 2))))
                                    || (content.length() > (i + 4) && (Character.isDigit(content.charAt(i + 1)) && Character.isDigit(content.charAt(i + 2)) && Character.isDigit(content.charAt(i + 3)) && Character.isDigit(content.charAt(i + 4))))
                                    || (lastDecimalFound==null && content.length() == (i + 4) && (Character.isDigit(content.charAt(i + 1)) && Character.isDigit(content.charAt(i + 2)) && Character.isDigit(content.charAt(i + 3)) && content.charAt(i)==decimalMark))
                                    || (lastDecimalFound==null && content.length() > (i + 4) && Character.isDigit(content.charAt(i + 1)) && Character.isDigit(content.charAt(i + 2)) && Character.isDigit(content.charAt(i + 3)) && content.charAt(i)==decimalMark && !Character.isDigit(content.charAt(i + 4)) && (content.charAt(i+4) != COMMA) && (content.charAt(i+4) != DOT))
                            ) {
                        numbers.append('.');
                    }
                    lastDecimalFound=content.charAt(i);
                } else if (content.charAt(i) == DASH && numbers.length() == 0) {
                    //probably a negative sign
                    numbers.append(content.charAt(i));
                } else if (numbers.length() > 0) {
                    //we found some other characters and already have a number parsed, so quit
                    break;
                }
            }
            // … snip
    }

Obviously, as this is "legacy" code, you couldn't rely on Java having a built-in method to parse numbers. And this worked, at least, though comments like "probably a negative sign" aren't exactly going to inspire confidence.

It's ugly, it's complicated, and it only supports two significant figures behind the decimal place, which is where the trouble actually starts. Customers started to want to use more precision, going out to three decimal places.

Well, that's a reasonable feature, and since this was legacy code anyway, it was a great opportunity to rewrite the function. Which is exactly what one of Patsy's co-workers did.

    public BigDecimal parseNumberInString(String content) {
    // variables
        for (int i = 0; i < content.length(); i++) {
            if (Character.isDigit(content.charAt(i))) {
                numbers.append(content.charAt(i));
            }
            else if (content.charAt(i) == DOT && lastDecimalFound == null){
                //parse first dot encountered, ignore rest
                lastDecimalFound=content.charAt(i);
                numbers.append('.');
            } else if ((content.charAt(i) == DOT || content.charAt(i) == SPACE || content.charAt(i) == COMMA) && numbers.length() > 0) {
                continue;
            } else if (content.charAt(i) == DASH && numbers.length() == 0) {
                //probably a negative sign
                numbers.append(content.charAt(i));
            } else if (numbers.length() > 0) {
                //we found some other characters and already have a number parsed, so quit
                break;
            }
            //… snip
        }
    }

This fixed the three decimal places problem, but wouldn't you know it, their users in Europe keep complaining about it incorrectly parsing numbers. Maybe someday they'll find a way to fix that issue too, if it's even possible to parse numbers in locale-specific formats. If only such a thing were possible.

[Advertisement] ProGet’s got you covered with security and access controls on your NuGet feeds. Learn more.

Featured Comments

Mr. TA (nodebb) 2021-06-29

They'll fix it, eventually. They just need 4 more years!
The Shadow Knows (unregistered) 2021-06-29

Legacy code often means what was written under a previous management structure - "that was written by IT Business Financials, but you're now Business Financials (IT) and company policy is that each team is only responsible for their own code as there's different practices in use"