In modern times, there's almost no reason to use Assembly, outside of highly specific and limited cases. For example, I recently worked on a project that uses a PRU, and while you can program that in C, I wanted to be able to count instructions so that I could get extremely precise timings to control LEDs.

In modern times, there's also no reason to use Delphi, but Andre found this code a few years ago, and has been puzzling over it ever since.

procedure tvAdd(var a,b:timevectortype; Range: Integer); register; var i: Integer; pa,pb: PDouble; begin i:=succ(LongRec(Range).Lo-LongRec(Range).Hi); pa:=@a[LongRec(Range).Hi]; pb:=@b[LongRec(Range).Hi]; asm mov ecx, i mov eax, [pa] mov edx, [pb] @loop: fld qword ptr [eax] fadd qword ptr [edx] fstp qword ptr [eax] add eax,8 add edx,8 dec ecx jnz @loop wait end; { for i:=starts to ends do a[i] := a[i] + b[i]; } end;

The curly brackets at the end are a comment- they're telling us what the original Delphi code looked like, and it's pretty straightforward: loop across two lists, add them and store the result in the first list. The Assembly code was used to replace that to "boost performance". This code is as optimized as it can possibly be… if you ignore that it's not.

Now, at its core, the real problem is that we've replaced something fairly readable with something nigh incomprehensible for what is likely to be a very minor speedup. But this is actually worse: the assembly version is between 2-5 times slower.

The Assembly version also has a pretty serious bug. If i, the length of the range we want to add across, is zero, we'll load that into the register ecx. We'll still attempt to add values from lists a and b together, even though we probably shouldn't, and then we'll decrement the contents of ecx. So now it's -1. The jnz, or "jump non-zero" will check that register, and since it's not zero, it'll pop back up to the @loop label, and keep looping until ecx wraps all the way around and eventually hits zero again.

Talk about a buffer overrun.

Now, as it turns out, playing with the Range object did turn out to be kind of expensive, so Andre did fix the code with an optimization: he used integers intsead.

procedure tvAdd(var a,b:timevectortype; afrom, ato: Integer); register; var i: Integer; begin for i := afrom to ato do a[i] := a[i]+b[i]; end;