• ray10k (unregistered)

    First off, isn't there some language builtin to fetch the largest value in the array? DRY, people! Also I agree with the rest of the article. It's amazing how much worse the code gets with the context. Can we have more subtle WTFs like this please?

  • A nonny mouse (unregistered)

    adding "const" to the parameter would be good practice, but would not reduce copying at all. Adding & (making it a reference) would remove copying.

    The real wtf is that the sorted property of the vectors isn't expressed in the type system. If it were, then the implementation of max would indeed be trivial on any sorted collection.
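    A minimal sketch of the suggestion above (the function name is mine, not from the article): passing by const reference avoids the copy and forbids mutation, whereas "const" alone without the "&" would still copy the whole vector on every call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical fixed signature: const reference means no copy is made
// and the callee cannot modify the caller's vector.
double maxOf(const std::vector<double>& list) {
    if (list.empty()) return 0.0;   // sentinel, as in the original code
    double max = list.front();      // seed with a real element, so
    for (double v : list) {         // all-negative inputs still work
        if (v > max) max = v;
    }
    return max;
}
```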

  • someone (unregistered)

    I, too, object to the 0.03333 frames-per-second performance.

  • Greg (unregistered)

    To cut down on copy operations you want to use a const reference; simply adding const will only prevent accidental modifications to your vector.

  • (nodebb)

    Wtfs:

    • no const qualifier on the parameter (for signalling non mutation)
    • no reference qualifier on the parameter (to prevent copying)
    • misleading parameter name ("list")
    • overly specific function (could be made generic to work on any collection type via passing first and last iterators)
    • there is ALREADY a standard library function for this (std::max_element)
    • the function returns incorrect results on empty vectors
    • the function returns incorrect results on vectors containing all negatives
    • the parameter's type doesn't reflect that its elements are stored in sorted order
    • if the caller knew that this was a sorted type, caller could just call myVector.back() rather than this function (though would still be undefined on an empty vector)

    Not bad for such a small chunk of code. About the only wtf it missed was returning a reference to the element.
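    Several of those points can be addressed at once. A hedged sketch (names and error-handling choice are mine, not from the article):

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Hypothetical replacement: generic over any iterator pair, no copy of
// the container, uses the standard library, and fails loudly on empty
// input instead of returning a misleading 0.
template <typename It>
auto maxValue(It first, It last) {
    if (first == last) throw std::invalid_argument("empty range");
    return *std::max_element(first, last);
}
```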

  • (nodebb) in reply to philipcraig

    ITYM "returning a reference to the local variable", but yes. A reference to the actual max element? No problem, although it's a bit strange, but a reference to the local variable "max"?

    And you missed one: Using array indexing rather than dragging an iterator across the elements. The code is sufficiently clear that the optimiser can probably save N-1 calls to list.size(), but...

  • djingis1 (unregistered) in reply to philipcraig

    You missed one:

    Type mismatch between int i and the size_t returned by std::vector::size()

  • Or do it correctly? (unregistered)

    The method name and parameter assume an unsorted list. Changing the method's behavior to choose the first or last value based on assumed order is a mistake unless the method is simply no longer called.

    Of course, add a model that smartly updates whenever the "list" of data changes so that it knows the min/max value at all times .... that's the way to go.

  • Or do it correctly? (unregistered) in reply to A nonny mouse

    Since it's an array, isn't it already passed in by reference (via pointer)?

    Sorry if wrong. My C/C++ knowledge is deprecated to 2001.

  • Or do it correctly? (unregistered) in reply to Steve_The_Cynic

    Indexing into the vector is no slower than an iterator. It's probably faster, since all you are really doing is jumping to a location in memory offset from index 0's location.

  • djingis1 (unregistered) in reply to Or do it correctly?

    You'd be correct if it were indeed a native array. But a std::vector<T> is not a native array.

  • Parallel Thinker (unregistered)

    Two more WTFs if parallel threads modify the "list" array:

    1.) Code duplication: list[i] is accessed twice, once in the if-condition and once in the max assignment.
    2.) Instead of storing list.size() in a temp variable and using that, the size is re-evaluated on every iteration of the for-loop.

    What happens if somebody deletes the first element in a parallel thread? For example, list = [0,1,2] becomes [1,2]. You are at i = 1 and your max is 1. Now you re-evaluate: size = 2, you exit the loop, and max stays at 1, which is wrong. Second error: you are at i = 2 and the list changes after the if-statement but before the assignment max = list[i]. That assignment now reads out of bounds: undefined behaviour, so possibly a crash or a garbage value.
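    The two duplications can be removed by reading into locals, sketched below. Note this alone does NOT make the loop safe against concurrent modification; a std::vector mutated by another thread is a data race regardless, and needs external synchronization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: size() evaluated once, each element read once. The 0.0 seed
// deliberately mirrors the original code (including its all-negatives
// bug) so that only the two duplications change.
double maxSingleRead(const std::vector<double>& list) {
    double max = 0.0;
    const std::size_t n = list.size();   // evaluated a single time
    for (std::size_t i = 0; i < n; ++i) {
        const double value = list[i];    // single read per element
        if (value > max) max = value;
    }
    return max;
}
```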

  • Peter (unregistered) in reply to ray10k

    @ray10k: Absolutely!

    double SomeClass::getMaxKeyValue(std::vector<double> const& keys) {
        std::vector<double>::const_iterator element =
            std::max_element(keys.begin(), keys.end(), std::less<double>());
        return element != keys.end() ? *element : 0;
    }

  • Greg (unregistered) in reply to Parallel Thinker

    In the current code your parallel thread "issues" are moot, as the function works on a copy of the vector.

  • Erwin Smout (unregistered)

    The real WTF is that it returns 0 if the array contains negative numbers exclusively, no ?

  • Erwin Smout (unregistered) in reply to Erwin Smout

    Managed to miss the commentary in the article itself. Sorry.

  • Debra (unregistered) in reply to Or do it correctly?

    It actually makes zero difference. Have a look at https://godbolt.org/g/ToiWpU : the loop code is exactly the same in every case. Of course, compiling with optimisations disabled would probably have vastly different results depending on how much iterator debugging the standard library implementation does, but the final version isn't (and shouldn't be) any different.
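    For reference, the two loop styles under discussion look like this (a sketch; with optimizations enabled, compilers typically emit identical code for both):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index-based loop, as in the original code.
double maxIndexed(const std::vector<double>& v) {
    double max = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] > max) max = v[i];
    return max;
}

// Iterator-based loop; same behaviour, same generated code under -O2.
double maxIterated(const std::vector<double>& v) {
    double max = 0.0;
    for (auto it = v.begin(); it != v.end(); ++it)
        if (*it > max) max = *it;
    return max;
}
```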

  • M (unregistered)

    So a one line change in this method increased the framerate to the desired result. Sounds like a great code base where such minimal changes were needed.

  • Zenith (unregistered) in reply to ray10k

    You say that, but there are a lot of side effects to doing so.

    1. In .NET, it chains you to a specific framework version.
    2. You don't know what library code could be doing. Reading parts of the .NET reference for WinForms made it clear that Microsoft's developers aren't any smarter than I am.
    3. Some of the functions do stupid things on corner cases. Both max() and min() might return 0 on an empty list but crash on a null list. Both compare 0 elements, so what's the difference? Why is 0 even a reasonable default anyway? And how is that default 0 distinguishable from a list where 0 is a legitimate max or min? The right answer here is to return null if there's nothing to compare and then have the caller recognize that and act accordingly. Unless you're in some dopey language that does truthy/falsy conflation of NULL, and 0, and '', and who knows what else.
    4. Sometimes you're pulling in a ton of dependencies for a single call you really could've written yourself. JavaScript "developers" are notorious for this. It happens on the desktop too, though. You'd be surprised how much performance you can wring out of hardware if you aren't offloading anything and everything to a pile of libraries.
    5. Somebody else controls your support policies. They drop support, you drop support. Some framework developers get it in their head that it's their right to force users to think/act a certain way and use software as the club to beat that into them.

    It's not that I don't like libraries but some of these arguments reek of non-programming programmerism.
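    In C++ terms, the "return nothing and make the caller decide" approach maps to std::optional (C++17). A sketch, with a hypothetical function name:

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Empty input yields an empty optional rather than an ambiguous 0,
// which is indistinguishable from a genuine maximum of 0.
std::optional<double> maxKey(const std::vector<double>& keys) {
    auto it = std::max_element(keys.begin(), keys.end());
    if (it == keys.end()) return std::nullopt;
    return *it;
}
```

    A caller that genuinely wants a default can write maxKey(keys).value_or(0.0), the C++ analogue of GetValueOrDefault().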

  • richP (unregistered)

    raised after the fix was committed: Bug #58822: Slow motion mode no longer operates properly.

  • siciac (unregistered) in reply to Zenith

    The right answer here is to return null if there's nothing to compare and then have the caller recognize that and act accordingly.

    You may not recognize it until the code is in prod already, and the null is showing up mysteriously in another module entirely.

  • Zenith (unregistered) in reply to siciac

    As if a DivideByZeroException or ArgumentOutOfRangeException is any better than a NullReferenceException? At least the null forces a compile-time check. Places that absolutely need a number can call GetValueOrDefault() with a reasonable default (or 0) and override the implicit warning that the null represents.

  • S. (unregistered) in reply to Or do it correctly?

    Nope. C++ standard containers have value semantics: when you pass them "as-is", the contents are copied. If you want reference semantics, you have to ask for them explicitly (i.e. add a trailing & to the type). On one hand that gives a lot of control (when you know what you're doing); on the other hand it makes it easier to shoot yourself in the foot...

  • Tim! (unregistered) in reply to Parallel Thinker

    You're correct that those are WTFs when dealing with array mutation in a concurrent context. But the "parallel arrays" anti-pattern has nothing to do with concurrency. This pattern is where you keep multiple arrays of simple data types and associate the data by the accident of having the same index, instead of keeping a single array of a meaningful complex data type.

  • Fernando (unregistered)

    One more: this function doesn't use anything from its class so shouldn't be a member of the class.

  • Sole Purpose of Visit (unregistered)

    Devil's Advocate: This (awful, however "subtly" if you don't know C++) code isn't really a WTF.

    It was slow as Funk, got called out, and was fixed. I'm not even sure it ever went into production.

    It's the ones that are never subjected to peer review, enter production, and are immortalized because "everything else depends upon this method of sorting!" that are actually the WTFs.

    Although, I'd agree: the lack of a passed-in comparison functor (by reference, natch) and what, although we are only given this as a sideline hint, appears to be a lack of generics ... is a WTF. Quite frankly, you fix those two things at the design stage, and all the pass-by-reference-to-const and what-does-zero-really-mean and the rather peculiar return-by-ref ... just vanishes.

  • Sole Purpose of Visit (unregistered) in reply to A nonny mouse

    Expressing the type concept of a sorted collection is not a very common thing. I agree: it would be useful, if quite frankly not as commonly useful as you might think.

    Scanning through my mental db of languages, I can only come up with PHP7. No, wait, bad drugs! Actually ML in general, I would think ... and I'm not even sure of that. In Haskell, for example, you could most certainly assert the requirement that the type passed in exhibits the property of "sortedness," because Haskell doesn't do classes (eg collections), it does type classes. Which are a higher-level concept.

    But then again, in real life, this hardly matters. If I want to throw the billion items in my sorted list or array or whatever into a Max function, the actual programming language is totally irrelevant. I just need to pick the right function.

    (This one ain't it.)
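    For what it's worth, C++ can approximate "sortedness in the type system" with a wrapper whose invariant is established at construction. A hypothetical sketch (type name and interface are mine):

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Sorts on construction and exposes no mutators, so "sorted" becomes
// an invariant the type guarantees -- and max() really is trivial.
class SortedKeys {
public:
    explicit SortedKeys(std::vector<double> data) : data_(std::move(data)) {
        std::sort(data_.begin(), data_.end());
    }
    double max() const {
        if (data_.empty()) throw std::out_of_range("empty SortedKeys");
        return data_.back();   // last element of a sorted vector
    }
private:
    std::vector<double> data_;
};
```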

  • P (unregistered)

    The real wtf: the article talks about a WTF from an algorithmic and design perspective; the comments instead go to great lengths framing said WTF in terms of C++ language features. C++ circlejerk much?

    I bet if the code is in Java/JS the comments would've been very different.

  • Supersonic Tumbleweed (unregistered) in reply to Zenith

    Re: 3. You don't want to put extra null checks into core library functions. Hey, we lost enough performance to the latest microcode patches for Spectre.

  • Garrett (unregistered)

    This looks like a C# dev decided that C++ works pretty much the same. C# List ~= C++ Vector. The Vector doesn't absolutely need to be const (although there are probably more wtfs in the codebase around this), it needs to be a reference or a pointer.

    double SomeClass::getMaxKeyValue(std::vector<double> &list)
    or
    double SomeClass::getMaxKeyValue(std::vector<double> *list)

    The result in cases where the whole list is negative is funny though. I wonder if that just never happens or they gave up before finding out about numeric_limits or -DBL_MAX.

  • (nodebb) in reply to Zenith

    std::max_element has been in the language standard for 20 years. Its use does not tie you to any .NET version. I also never heard about any implementation bugs that were released to the public. The only improvement suggested to it (and other sequence algorithms) is to use a range based on iterator plus sentinel instead of two iterators for greater genericity and performance (Ranges TS). But that change would be source-compatible with almost all existing code.

    Bottom line is that there are about two reasons not to use it:

    • ignorance that it exists
    • terminal stages of NIH syndrome

    Of course, as has been mentioned, using it still requires a non-empty check to avoid undefined behavior, as opposed to stupidly returning zero as the "maximum of nothing".

  • MiiNiPaa (unregistered)

    I have issue with claiming "parallel arrays antipattern" without providing context.

    As soon as you start hitting memory throughput issues, you will want all the data processed at once to be gathered locally. It will be more cache friendly, and it will simplify SIMD if you want to use it.

    At this time all your arrays of structures will go back to N parallel arrays.
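    The trade-off described above, sketched with hypothetical types:

```cpp
#include <vector>

// Array of structures: one meaningful type per element, but the x and
// y components interleave in memory.
struct ParticleAoS { double x, y; };
using ParticlesAoS = std::vector<ParticleAoS>;

// Structure of arrays (the "parallel arrays" layout): each component
// is contiguous, which is friendlier to caches and SIMD when a loop
// touches only one component at a time.
struct ParticlesSoA {
    std::vector<double> x;
    std::vector<double> y;
};
```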

Leave a comment on “Maximum Performance”
