• Registered (unregistered)

    As part of the project to remove 20,000 lines of code, mountain coder frist had to write a program to precheck the code and be sure all the numbers existed in the loop - that column 23 wasn't kept around, for example.

  • (nodebb)

    If this developer was paid by the line of code, I can see how (and what) has been "extremely optimized."

  • (nodebb)

    There is a common rule by which you can easily evaluate the skill level of a developer:

    A good developer is lazy but never sloppy.

    Now, this developer wrote n lines of code instead of one line with an iteration. That makes him very, very diligent - or, in other words, a very, very bad developer.
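
    (For reference, the one-line iteration this comment has in mind would be something like the sketch below. list and DeleteColumn are the names from the article's code; counting down is the safe order if deleting a column renumbers the ones after it.)

    for (var i = 60; i >= 0; i--)
    {
        list.DeleteColumn(i);
    }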

  • (nodebb) in reply to Domin Abbus

    There's nothing about this code that is optimized, simply because the bigger code size means you run against the memory barrier the whole time. And hitting the memory barrier (aka cache misses) is the fastest way to slow down a CPU. So no, nothing optimized here at all.

    Besides, https://en.wikipedia.org/wiki/Don%27t_repeat_yourself ;-)

    Addendum 2024-02-26 07:07: If you are interested in the reasoning behind why and when loop unrolling is done by the .net JIT, here is a nice discussion thread that goes a bit into the details behind the scenes: https://github.com/dotnet/runtime/issues/8107
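
    (To make the term concrete: loop unrolling means the compiler rewriting a loop into fewer iterations that each do several steps. A sketch of the transformation, not actual JIT output:)

    // Original loop:
    for (var i = 0; i < 64; i++) sum += data[i];

    // Unrolled by a factor of 4: fewer branches, at the cost of bigger code.
    for (var i = 0; i < 64; i += 4)
    {
        sum += data[i];
        sum += data[i + 1];
        sum += data[i + 2];
        sum += data[i + 3];
    }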

  • (nodebb) in reply to MaxiTB

    I think whether there is an impact on instruction caching or not is somewhat uncertain, since we do not know what happens in list.DeleteColumn(n). As an extreme possibility, consider that it might delete a record from a file. If it is doing IO, loop unrolling is not going to make any measurable difference one way or the other.

  • (nodebb)

    You guys all missed it. Obviously the developer knew how to make a for loop which goes 0 to 60 from whatever programming 101 tutorial he read, but he didn't know how to go 60 to 0. Therefore, he just wrote out the lines by hand.

  • (nodebb) in reply to jeremypnet

    No, I 100% agree. DeleteColumn seems to be a high-level method to begin with, so even if you only had four of those statements it would be a pointless endeavor. And I started my argument from a .net perspective here; if you go with Java, for example, it makes literally no sense, because most platform implementations of Java are still interpreters - it would make the performance way, way worse, since at least some interpreters only hotspot-compile hot paths like simple iterations. So no matter how you put it, it's neither clean code nor optimized code. And that's why 'A good developer is lazy but never sloppy' is universally true.

  • Argle (unregistered)

    I once delivered a standard lecture in my C class about the 'toupper' function. Then I gave a simple programming assignment that begged to use it. One student (who failed the class) wrote something like the following:

    switch (c) {
        case 'a': c = 'A'; break;
        case 'b': c = 'B'; break;
        /* ...continues ad nauseam */
    }

    I think I know where he went after my class.
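
    (For contrast, the call the switch reinvents - shown in C# for consistency with the other sketches in this thread; in Argle's C class it would be toupper() from <ctype.h>:)

    // The whole assignment, more or less: let the standard library do it.
    c = char.ToUpper(c);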

  • (nodebb) in reply to Mr. TA

    Even if you don't know how to count down, you can count up and do list.DeleteColumn(60-i)

  • Deeseearr (unregistered)

    Unfortunately, The Company treated lines of code produced as a KPI, so Personal Mountains was promptly fired for abysmal performance at their next review.

  • Officer Johnny Holzkopf (unregistered) in reply to Barry Margolin

    It would also be possible to start with something like 'i = 61;' and then repeat 'list.DeleteColumn(i--);' 61 times. Multiple off-by-one problems can be included, depending on the start value, the handling of sequence points, and the correct number of 'list.DeleteColumn(i--);' statements. However, thinking about how (in)efficient a -- operation is, especially when performed "inside" a method call, is very important.
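
    (A sketch of the arithmetic, since the joke turns on it: post-decrement passes the value from before the decrement, so covering columns 60 down to 0 means starting at 60, not 61.)

    var i = 60;               // start at 60, not 61
    list.DeleteColumn(i--);   // deletes column 60, i becomes 59
    list.DeleteColumn(i--);   // deletes column 59, i becomes 58
    // ...59 more identical statements; the 61st and last deletes column 0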

  • (nodebb)

    I vote for list.DeleteColumn(0:60). Vectorize, vectorize, vectorize!
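
    (That 0:60 slice syntax is from languages like R or MATLAB; the article's list API is hypothetical, but the closest C# idiom might be a sketch like this, with Reverse() keeping the deletions index-safe:)

    using System.Linq;

    foreach (var i in Enumerable.Range(0, 61).Reverse())
    {
        list.DeleteColumn(i);
    }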

  • Steve (unregistered) in reply to cellocgw

    Ah, but the developer probably knew that deleteColumn was more efficient if it deleted from the end of the list towards the front. At least I hope they knew that!

  • (nodebb) in reply to Barry Margolin

    What makes you think the developer knew how to subtract? Or that the 0-60 loop with subtraction would get him the results he's looking for? 🤣

  • (nodebb) in reply to MaxiTB

    Huh? Java interpreted its byte code, like, how many years ago? 15?

    I'm no Java expert, but even I know that they've been enjoying JIT compilation for a while, and it actually sometimes produces faster code than the equivalent .NET program JITted by the .NET JITter. (Albeit by like 1%, but still.)

    "The JDK implementation by Oracle is based on the open-source OpenJDK project. This includes the HotSpot virtual machine, available since Java version 1.3. It contains two conventional JIT-compilers: the client compiler, also called C1 and the server compiler, called opto or C2."

    Addendum 2024-02-26 17:14: Employing, not enjoying

  • John (unregistered)

    How often do you have to delete this many columns?

  • (nodebb)

    What I find hard to believe is that, after removing 20,000 lines of code, there was "no measurable impact on its behavior or performance". I would have expected significant improvements with a lot of nonsense being removed. I mean, if the original coder was producing this, what else would they have in their codebase?

  • (nodebb)

    Choose your algorithms carefully and let the optimizer do its job. It's better than you are.

    And I will second the notion that this sort of stuff causes cache misses and thus will, if anything, be slower. Consider what I'm sure a lot of us saw in school, the standard Sieve of Eratosthenes. Some years back I did a little experiment: standard sieve vs. brute-force prime testing. As I expected, on a modern processor, brute force won.

    You can get in a lot of divisions for the cost of one memory reference, and in filling the sieve you're going to end up kicking your own outer loop out of the cache and thus take the hits for reading it back in. And, while I didn't attempt it, the brute-force routine could be parallelized across however many cores you have, but the sieve is memory-bound and won't benefit from additional cores.
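
    (For the curious, a minimal sketch of the two contestants in that experiment - not the commenter's original benchmark, and in C# for consistency with the rest of the thread:)

    // Brute force: trial division up to sqrt(n). No big arrays, so it stays
    // cache-friendly, and each candidate can be tested on a separate core.
    static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // Sieve of Eratosthenes: fewer arithmetic operations in total, but it
    // walks an n-sized array, so for large n the memory traffic dominates.
    static bool[] Sieve(int n)
    {
        var composite = new bool[n + 1];
        for (int p = 2; (long)p * p <= n; p++)
            if (!composite[p])
                for (long m = (long)p * p; m <= n; m += p)
                    composite[m] = true;
        return composite; // composite[k] == false means k is prime, for k >= 2
    }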

  • (nodebb) in reply to Mr. TA

    Not sure what you mean; I think you're confusing what I'm referring to. And yes, I don't blame you - it's a bit of a mess, like everything else that comes with Java :D

    === .net ===

    .net generates IL code from the original language (C#, C++/CLI, F#, VB.net, Ada#, etc.), and this binary code is packed up in assemblies, which are binary files with IL + type info that historically could be strongly cryptographically signed to allow global deployment without man-in-the-middle attacks (hence the name .net, BTW). There is a second option; I'll get to it later.

    On the deployment platform, the assembly is compiled into machine code either right after loading or on demand; there is never an interpreter involved. This is also the reason why Emit doesn't produce similarly performant code - dynamically generated code doesn't get all the optimizations - and it's likewise the reason ASP.net takes a while to serve a new request when you hot-change a cshtml file. The compiled code is then linked JIT, which is obviously required for dynamic code (because you always run multiple assemblies in the process).

    Now, the second option I was mentioning is ngen/AOT compilation. This just means that you pack precompiled native machine code into the assembly as well. Obviously that has serious drawbacks: you have to do regular patches to keep the code up to date for every little system change, etc., and the code is no longer platform independent.
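
    (For concreteness, the two flavors described above - the Framework-era ngen tool, and the PublishAot switch in modern .net (7 and later). A sketch, assuming a standard project layout:)

    # .NET Framework: pre-generate a native image for an installed assembly
    ngen install MyAssembly.dll

    # Modern .NET: publish a fully AOT-compiled, platform-specific binary
    dotnet publish -c Release -r win-x64 -p:PublishAot=true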

  • (nodebb) in reply to Mr. TA

    (cont)

    === Java ===

    Java is a byproduct of a failed Sun OS; it was basically their version of the packed-in BASIC that was common in the old days. Like BASIC, it translated the source code into bytecode, called this process compilation, and then interpreted the bytecode in an interpreter called a virtual machine. So there was no JIT linker required, and no compilation to machine code at all. This has a major benefit: interpreters are easy to make, no matter what you call them, so Java could spread everywhere really fast, like BASIC.

    Obviously this made Java super slow, like any other interpreted language, so along the road they added something they called "hotspot compilation", which is, as I wrote before, compilation of tiny parts of the code into machine language when they are executed on a critical path. Now you might wonder: why didn't they make a full compiler? Well, there were many issues, but basically Java itself was never designed to be a complete language, and because it had already spread so far outside of Sun's control, every breaking change would have killed Java. And that's basically also the reason why Java didn't improve at all: Sun feared for its market dominance.

    These days there are versions of Java around which even allow AOT compilation, where you don't need the interpreter at all, but they have serious drawbacks similar to .net AOT. There are also versions out there which do more or less HotSpot compilation, so as I said, Java is not a standardized, open-sourced framework like .net; you never know what you get on whatever platform you are targeting.
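
    (The best-known such route today is GraalVM's native-image tool - an assumption about which versions the comment means. A minimal invocation, given a GraalVM JDK with the native-image component installed:)

    # Compile a runnable jar ahead-of-time into a standalone native binary
    native-image -jar myapp.jar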

  • (nodebb) in reply to Mr. TA

    How does HotSpot compilation in Java work? Well, that's a good question, because again there is no standard.

    But the old version (and I guess the open-source version as well) just remembered whether it had processed steps before, and if it encountered them again, it didn't interpret the bytecode again but reused what it had in cache. Hence the limitation on code size, because caching obviously isn't free either.

    This is, BTW, the reason why in unbiased benchmarks .net is about twice as fast as Java even with HotSpot compilation, and only slightly slower than C++. Obviously C++ has the downside that you have to maintain the code all the time to keep peak performance, because OSes change internals all the time, hardware changes, libs change, etc. You don't really need to bother about that in .net, as long as you don't use AOT compilation as mentioned before. So when you do end up in a scenario like that, with faster .net code compared to C++, I call that a biased benchmark - and I hated MS for using that back in those .net BETA days.

  • (nodebb) in reply to MaxiTB

    If my initial premise was true then something was optimized here: it was the paycheck...

  • Duke of New York (unregistered)

    Looks like our old enemy Jed is still on the loose.

    https://thedailywtf.com/articles/Jed-Code

  • Jaloopa (unregistered) in reply to Mr. TA

    You guys all missed it. Obviously the developer knew how to make a for loop which goes 0 to 60 from whatever programming 101 tutorial he read, but he didn't know how to go 60 to 0. Therefore, he just wrote out the lines by hand.

    What an idiot. It's simple

    for (var i = 0; i <= 60; i++)
    {
        var ii = 60 - i; //TODO check if there's an off by one error
        //list.DeleteColumn(i);
        list.DeleteColumn(ii);
    }
    