Dan B is working on software which interacts with a bank. We'll get the REAL WTF out of the way right at the top: "The bank has requested that we send them an email containing the total of all the transactions…"
Yes, core financial business functions being handled over email. I imagine some readers are probably thinking about drinking something stronger than coffee at the thought of it. A lot of readers, on the other hand, are already onto something stronger than coffee and going, "Oh, yeah, seen that before. Hell, I'm pretty sure that EDI explicitly supports email as a delivery mechanism."
Let's dig into Dan's more specific challenge. The program he was supporting needed to take an input from their backend- a mainframe style, flat-file with fixed-width fields, and convert it into the format the bank wanted. Their input file tracked the value of each transaction as a zero-padded number in cents. E.g., a ten dollar transaction would be "0001000". The bank wanted a roll-up of all the line items, where the currency was represented as dollars and cents, e.g. "10.00". The email alert that needed to go out simply needed to be a summary of the file processed.
Now, there's an obvious way to do this, even when we're worried about things like floating point rounding errors. There's also a wrong way to do this. Let's look at that one.
def create_message_body_from_file(path):
def to_dec(s): return Decimal('{}.{}'.format(s[:-2], s[len(s)-2:]))
with path.open() as f:
# 28:39 contains the transaction amount
trans_amount_lines = [line[28:39] for line in f][2:-2]
decimal_converted_lines = list()
for line in trans_amount_lines:
s = line.lstrip('0')
decimal_converted_lines.append(to_dec(s))
summed_lines = sum(decimal_converted_lines)
todays_date = date.today()
body_format = {
'file_name': path.name,
'summed_lines': str(summed_lines),
'todays_date': str(todays_date)
}
body = '''
The ACH file has been successfully transmitted the file {file_name} with a
total ACH amount of {summed_lines} to the bank on {todays_date}.
'''.format(**body_format)
return body
We start with the to_dec
inner function. Here, we take a string and split it up, then reformat it and then convert it to a decimal type. Because string munging is always the best way to separate dollars and cents.
Then we pull out the field we care about- all the characters from 28:39 in every line. For every line in the file, we make a list of those fields, and then we use [2:-2]
to skip the first two and last two lines in the file (headers and footers, I assume?).
Once we've got the lines parsed out, we then strip the leading zeroes, then run it through the to_dec
function. Which, it's worth noting, converts the string to a numeric value- thus stripping the leading zeroes without doing string munging.
Finally, we sum it up, build our format, and output our body.
In total, we iterate across every row three times (once to read from the file, once to convert to decimal, once more to sum). For each row, generate a new string once when we strip and once again when we convert to decimal. Then two more, just for fun, when we create our body_format
and format the string.
As Dan puts it: "While not incorrect, this is just… convoluted, hard to read, and a memory hog." The only good news is that "we're not processing too many transaction, yet…"
Ah, the best news for any business with bad code: "at least we're not generating a lot of sales". I'm sure this will be fine.