There's a very specific brand of bad code that I see from time to time, which I think of as "Oh, this poor person was permanently damaged by exposure to C." They're not writing C, but there's something about their "accent" which tells you: they learned C and didn't recover from the experience. Every reference variable can be treated like a pointer if you're insistent enough.
There are other, similarly brain-breaking languages. COBOL. PL/SQL. VBA. Programmers learn the quirks of these languages, fail to understand them, and then start writing weirdly formal, structured code in languages that aren't supposed to work that way.
Kerry has a co-worker like that. They're doing C#, but their code includes weird little tics that are a clear sign of some other language trauma. My guess is that it's a mix of PL/SQL and Visual Basic (not .NET), based on my experience with other people who have suffered through learning PL/SQL and VB as their first languages.
We're going to take a look at some of it. In the interests of anonymization, I'm not going to include all of the code, which is 632 lines of tedium in one method, but I will include the important highlights. Like, for starters, the method signature:
public long? saveClaim(CreateClaimViewModel startClaim, out long? IncidentNumber, out int ClaimSameIncidentCount, string claimOrigination)
As you might gather, the purpose of this method is to take an insurance claim, entered in the UI, and save it into the database. As part of that operation, three fields get their values set: the claim ID (the long? return value), the IncidentNumber, and the ClaimSameIncidentCount, all of which are also fields on the claim's model object.
Why doesn't this return the claim model type, instead of using a return value and two out parameters to return that information? Because if you come from one of those traumatized backgrounds, output parameters might feel like the most natural way to pass data around.
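Since all three of those values already live on the claim's model object, the conventional shape is to just return that object. A minimal sketch, assuming the model type is named Claim:

public Claim SaveClaim(CreateClaimViewModel startClaim, string claimOrigination)
{
    // ... build and persist the claim ...
    // Claim already carries ClaimID, IncidentNumber, and ClaimSameIncidentCount,
    // so no out parameters are needed.
    return claim;
}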
The next weird thing you'll notice is:
string ProcessStep = "";
long claimId = 0;
try
{
    ProcessStep = "Step 1";
    int? modelId;
    int? makeId;
    fetchVehicleMakeModelIds(startClaim.VehicleMake, startClaim.VehicleModel, out makeId, out modelId);
    ProcessStep = "Step 2";
    int? customerId = fetchCustomer(startClaim.CustFirstName, startClaim.CustLastName);
    int? claimOriginationId = fetchClaimOriginationId(claimOrigination);
    // skip about 600 lines…
    claimId = claim.ClaimID;
    ProcessStep = "Step 21";
    _log.Debug($"Claim {claim.ClaimID} successfully saved");
ProcessStep is a variable that gets set at the start of each functional block. There are, in total, 21 "process step" blocks. Or, as a normal programmer might think of them, things that should each be their own procedure.
For the longest time, I thought this ProcessStep variable was just an overglorified comment, but as it turns out, when you scroll all the way down to the one giant exception handler at the bottom, you see that it gets used:
catch (Exception ex)
{
    _log.Debug($"Error during save claim Process Step = {ProcessStep} in exception {ex.Message}");
    _log.Fatal("ruh roh", ex);
    …
}
That's… almost clever? I mean, it lets you easily know which processing step failed, which, if you're not using procedures, can be hard to track. Actually smart would be to not write a 632-line function that needs to be divided into "process steps" at all, but this shows a kind of cunning, nonetheless.
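For contrast, here's a minimal sketch of the conventional structure, with each step as its own method (names like persistClaim are illustrative helpers, based on the steps shown above); the stack trace then does the bookkeeping for free:

private long SaveClaimInternal(CreateClaimViewModel startClaim, string claimOrigination)
{
    fetchVehicleMakeModelIds(startClaim.VehicleMake, startClaim.VehicleModel,
        out int? makeId, out int? modelId);
    int? customerId = fetchCustomer(startClaim.CustFirstName, startClaim.CustLastName);
    int? claimOriginationId = fetchClaimOriginationId(claimOrigination);
    // ...each remaining "process step" becomes another small, named method;
    // when one throws, the exception's stack trace identifies it.
    long claimId = persistClaim(startClaim, customerId, claimOriginationId); // hypothetical helper
    return claimId;
}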
Starting in Process Step 5, this code opens a database context. From there, we mix and match, sometimes doing LINQ-style queries:
Customer customer = null;
var q1 = from t1 in ctx.Customers
         where t1.FirstName == startClaim.CustFirstName &&
               t1.LastName == startClaim.CustLastName &&
               t1.Phone == startClaim.CustPhone
         select t1;
and sometimes doing what our developer is clearly more comfortable with, calling a procedure with output parameters:
var incidentNumberId = new SqlParameter
{
    ParameterName = "IDName",
    Value = "IncidentNumber"
};
var nextId = new SqlParameter
{
    ParameterName = "NextID",
    SqlDbType = SqlDbType.Int,
    Direction = ParameterDirection.Output
};
ctx.Database.ExecuteSqlCommand("spGetNextIdentifier @IDName, @NextID out", incidentNumberId, nextId);
var incidentNumber = (int)nextId.Value;
IncidentNumber = incidentNumber; // output value
That's not wrong to do, per se; an important feature of a decent ORM is that it lets you talk straight to the database when you need to. But mixing and matching approaches in the same method is just awkward and confusing.
Since we are using an ORM, we also have some flexibility in how we trigger persistence and interact with transactions. We can, for example, build the entire update in memory and then send it over to the database in one go. Or, if you care about locking, you can write basically the same code but tell the ORM to lock the records while you're working. Or any variety of middle ground.
You'd think, for example, that we'd want to perform this entire update as a single transaction. You'd think that, but each process step that updates a row looks something like:
// insert customer record
customer = new Customer
{
    FirstName = startClaim.CustFirstName, // what came from the API may have been overwritten from RHM
    LastName = startClaim.CustLastName,
    …
};
ctx.Customers.Add(customer);
ctx.SaveChanges();
That call to SaveChanges does what it says on the tin: it saves your changes to the database. Doing it at every step has a few disadvantages: you increase the number of database round trips; you ensure the ORM never gets a real chance to batch and optimize your queries; and you forfeit the protection of transactions, so if a statement fails, you're left with an inconsistent state in the database.
On the plus side, if a later step does fail, you've only lost a small part of your work thus far.
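A minimal sketch of the single-transaction alternative, using the same EF6-era API the rest of this code targets:

using (var transaction = ctx.Database.BeginTransaction())
{
    ctx.Customers.Add(customer);
    // ...build up the rest of the object graph across all the steps...
    ctx.SaveChanges();    // one batched round trip for the whole unit of work
    transaction.Commit(); // an exception before this line rolls everything back
}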
Many of those 632 lines of code are object constructions. They're mostly just big blocks of Property = Value, OtherProperty = OtherValue, but the block that initializes a contract introduces a few bonus ternaries:
contract = new Contract
{
    ContractNumber = startClaim.StampNumber,
    ProductID = (int)startClaim.ProductID,
    ProgramID = (int)startClaim.ProgramID,
    …
    CustomerID = customer.CustomerID,
    Status = (startClaim.ContractStatus == null || startClaim.ContractStatus == 0) ? 1 : startClaim.ContractStatus, // default to eligible QRP 1/24/2020
    ContractType = (startClaim.ContractType == null || startClaim.ContractType == 0) ? 4 : startClaim.ContractType, // default to Inboarded QRP 1/24/2020
    PurchaseDate = startClaim.ContractPurchaseDate,
    MilesAtPurchase = startClaim.MilesAtPurchase,
    TreadAtPurchase = startClaim.TreadAtPurchase,
    InvoiceNumber = startClaim.ContractInvoiceNumber == null ? startClaim.OriginalInvoiceNumber : startClaim.ContractInvoiceNumber,
    …
};
Again, this isn't wrong, but in the context of a gigantic uber-method, it's a nasty trick. Also, the InvoiceNumber ternary is an explicit null check, but since most of this code deals in nullable values, most of the rest of the code uses the ?? null-coalescing operator. It's odd that they forgot it on just this one line.
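For comparison, the same assignment written with the coalescing operator the rest of the code already uses:

InvoiceNumber = startClaim.ContractInvoiceNumber ?? startClaim.OriginalInvoiceNumber,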
Oh, there is one thing that's wrong, though you wouldn't know it from looking at this code. startClaim.ProductID and startClaim.ProgramID are both defined as int? (nullable ints). The cast is an attempt to make them actual ints, which is fine, unless they're actually null, in which case the cast throws an InvalidOperationException. Can they be null? Maybe not! But someone clearly doesn't fully understand what nullable types are for. Maybe they're just lacking confidence: they don't want an int, but an int?
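If null genuinely is possible, the explicit options look like this (the fallback values here are illustrative):

// Casting throws InvalidOperationException when the value is null:
int productId = (int)startClaim.ProductID;

// Explicit alternatives:
int safeProductId = startClaim.ProductID ?? 0;                // fall back to a default
int safeProgramId = startClaim.ProgramID.GetValueOrDefault(); // same idea, via the struct API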
Well, let's go back to the exception handling, shall we? I elided a bit before, just because I wanted to highlight the one case where ProcessStep was used.
catch (DbEntityValidationException dbex)
{
    foreach (DbEntityValidationResult validationError in dbex.EntityValidationErrors)
    {
        _log.Debug("Entity: " + validationError.Entry.Entity.GetType());
        foreach (DbValidationError error in validationError.ValidationErrors)
        {
            _log.DebugFormat("Property: {0} Error: {1}", error.PropertyName, error.ErrorMessage);
        }
    }
    _log.Debug($"Error during save claim Process Step = {ProcessStep} in exception {dbex.Message}");
    throw new SaveClaimException("unable to save claim, see log for errors");
}
catch (Exception ex)
{
    _log.Debug($"Error during save claim Process Step = {ProcessStep} in exception {ex.Message}");
    _log.Fatal("ruh roh", ex);
    Exception myEx = ex.InnerException;
    while (myEx != null)
    {
        _log.Debug($"Inner Exception = {ex.InnerException.Message}");
        myEx = myEx.InnerException;
    }
    throw new SaveClaimException("unable to save claim, see log for errors");
}
As you can see, there are actually two catch blocks. They both basically do the same thing: log the exception and convert it into a SaveClaimException, with no inner exception. (Note, too, that the inner-exception loop walks myEx but logs ex.InnerException.Message each time, so it prints the same message over and over.) This means that the calling code has no idea what the actual failure is: the only thing anyone can do is go back to the log. Whether the user entered invalid data or the database is down, it's anybody's guess. Whether the process failed at processing step 5 or processing step 17, also just a guess, unless you go back to the log. And since we're saving changes at every step, we have no idea what the internal database state might look like, without looking at the log.
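A minimal fix sketch, assuming SaveClaimException has (or gains) the standard (string, Exception) constructor, so the original cause survives the rethrow:

catch (Exception ex)
{
    _log.Error($"Error during save claim, Process Step = {ProcessStep}", ex);
    // Pass ex along as the inner exception so callers can see the real cause:
    throw new SaveClaimException($"Unable to save claim (failed at {ProcessStep})", ex);
}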
Management is puzzled. They know that every time they assign this developer to a project, the project quickly spirals into chaos. The other developers constantly complain that this person writes terrible, unmaintainable code, but to management's eyes, that's just a judgement call. There must be some other reason.