The Daily WTF: Curious Perversions in Information Technology

dkf · 2022-02-07

Is there no simple way to say that an integer must be non-negative to be valid in C#? Because that's what that code is doing (in an excessively complicated way).

Jeremy Pereira · 2022-02-07

Came here to point out that a minus sign causes the regex to fail.

Is there no simple way to say that an integer must be non-negative to be valid in C#?

Off the top of my head as somebody who hasn't programmed in C# for more than 10 years:

p.Quantity >= 0

2022-02-07

Since this is FluentValidation, there should be something like

RuleFor((p) => p.Quantity).GreaterThanOrEqualTo(0);

Or maybe InclusiveBetween(0, SomeReasonableUpperBoundary);

2022-02-07

I am pretty sure Quantity was originally defined as a string, and then somebody realized it could be an int and refactored the API. When restoring the codebase to a compilable state, they did not pay much attention to refactoring the validation rule into something saner. Not really a WTF, these things happen all the time on projects under active development. And even having Quantity declared as string is not a WTF per se because who knows what requirements there were in the past - like, supporting the ability to specify it as a range - and then simplified to only support numbers.

Steve_The_Cynic · 2022-02-07

Not really a WTF

So you don't think that it's a WTF that whoever did this didn't bother thinking about what he or she was doing?

2022-02-07

Is there no simple way to say that an integer must be non-negative to be valid in C#?

Sure, just declare it as uint. IANA C# developer, but I think it works just like unsigned int in C++.

Off the top of my head as somebody who hasn't programmed in C# for more than 10 years: p.Quantity >= 0

They're asking to do it at the data type level, not at runtime.

And even having Quantity declared as string is not a WTF per se because who knows what requirements there were in the past - like, supporting the ability to specify it as a range - and then simplified to only support numbers.

I think that's still a WTF. If you need a range of numbers, get a NumberRange class. (Whatever JSON library you're using has a way to send them as strings if you want to.)

2022-02-07

Agree w @Steve.

If you're refactoring the type of a variable, you damn sure need to visit every single reference to that variable to make whatever adjustments are necessary. Simply clicking "build" and then chasing down the errors (not even the warnings) is exactly the kind of WTF (anti-)workmanship that keeps software the running joke of the engineering world.

Sadly, with the advent of http-mediated client server architectures, JSON, etc., damn near every data element is stringified somewhere along the chain. So in effect "string" becomes the default datatype of everything, and only in special cases for short segments of the data's journey is it typed differently. This mindset ends up permeating far more of the codebase than it properly should. Heck, even the original BASIC differentiated between numeric and string data, although it freely and sometimes confusingly interconverted them.

2022-02-07

Oh Cthulhu, this is my former colleagues' kind of code, except they wrote it in SQL and PHP. Get an id from a database field that you know to be INT 11,0 NOT NULL? Better wrap it in mysql_real_escape_string() in the next query. Even when it came from user input, they'd apparently never heard of is_numeric(). Nope, put manual quotes around it and wrap it in mysql_real_escape_string().

2022-02-07

The .ToString() could cause the RegEx to fail valid integer values because it will format the number using the default thread locale - which could include thousand-separators like spaces or commas.

2022-02-07

<!-- Easy Reader Version: Validate that this is an HTML comment -->

That's a fairly non-trivial task, since as (I hope) we all know, you can't parse HTML with regex.

2022-02-07

Sadly, with the advent of http-mediated client server architectures, JSON, etc., damn near every data element is stringified somewhere along the chain.

Yeah but the fact that he's doing an implicit .ToString() says he knew the data type was already not a string at that point in the process.

2022-02-07

I sometimes forget how much our worlds differ. In my programming language a WORD and an INT are separate entities. Both are 16-bits, but the WORD does not have ordinality. You have to explicitly convert them. You cannot compare a WORD, you cannot do bitmasks on an INT (because it's an ordinal number.) In safety systems, you cannot perform an addition without explicitly evaluating the overflow bit. It may be limiting, but at least I do not have to deal with this crap. ;-)

2022-02-07

Isn't that regex searching for things which consist entirely of non-digits, rather than things which contain at least one non-digit?

2022-02-07

This looks like FluentValidation, and the answer to your question of whether there is a simpler way to check if something is a number is yes :-)

https://fluentvalidation.net/

RuleFor(obj => obj.Property).Must(x => int.TryParse(x, out _) ).WithMessage("Invalid Number.");

2022-02-07

which could include thousand-separators like spaces or commas.

Indeed the regexp check looks like a fairly normal defensive programming countermeasure against someone setting a locale in the wrong place, and eventually breaking a wire protocol (e.g. JSON, or even CSV).

(whether that makes sense in this specific context is a separate question)

2022-02-07

Isn't that regex searching for things which consist entirely of non-digits, rather than things which contain at least one non-digit?

Entirely non-digits would be ^[^\d]+$
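
To make the distinction concrete, here is a minimal sketch in Python (not the C# from the article; the function names are made up for illustration). It assumes the quantifier is meant to sit outside the character class, so "entirely non-digits" is an anchored one-or-more match, while "contains at least one non-digit" is an unanchored search for a single character.

```python
import re

# "Contains at least one non-digit": unanchored search for a single non-digit.
HAS_NON_DIGIT = re.compile(r"[^\d]")

# "Consists entirely of non-digits": anchored, quantifier outside the class.
ALL_NON_DIGITS = re.compile(r"^[^\d]+$")


def has_non_digit(s: str) -> bool:
    """True if any character in s is not a digit."""
    return HAS_NON_DIGIT.search(s) is not None


def all_non_digits(s: str) -> bool:
    """True if s is non-empty and no character in s is a digit."""
    return ALL_NON_DIGITS.search(s) is not None
```

With these, "12a4" contains a non-digit but is not entirely non-digits, and "-5" is flagged by the first check because of the minus sign.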

2022-02-07

Using \d to validate whether a string contains an integer is fundamentally wrong. \d matches any Unicode digit, of which there are hundreds. And I doubt C# is able to do arithmetic with tokens consisting of digits from different scripts (or even with non-Latin digits).
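
A small sketch of the difference, in Python rather than C#. Python's \d on str patterns is Unicode-aware by default, and as far as I know .NET behaves the same way unless RegexOptions.ECMAScript is set. The helper names are hypothetical.

```python
import re

UNICODE_DIGITS = re.compile(r"\d+")    # \d matches any Unicode decimal digit
ASCII_DIGITS = re.compile(r"[0-9]+")   # only the ten ASCII digits


def is_unicode_digits(s: str) -> bool:
    """True if s consists solely of characters matched by \\d."""
    return UNICODE_DIGITS.fullmatch(s) is not None


def is_ascii_digits(s: str) -> bool:
    """True if s consists solely of the characters 0 through 9."""
    return ASCII_DIGITS.fullmatch(s) is not None


# ARABIC-INDIC DIGIT ONE, TWO, THREE
arabic_indic = "\u0661\u0662\u0663"
# A token mixing an ASCII digit with an Arabic-Indic one
mixed = "1\u0662"
```

Both arabic_indic and mixed pass the \d check and fail the [0-9] check, so spelling out [0-9] is the safer choice when the string is headed for an integer parser.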

2022-02-08

Recognising whether a string is an HTML comment absolutely can be done with a regex. Citing that post is a WTF in itself, because not only was it responding to a question about a task that can be done with regex, but both in the original context and here it's tantamount to "no subset of a context-free language is regular", which is demonstrably false.

For one thing, the definition of a comment in the HTML spec is regular. 

Jenda · 2022-02-08

Whoever decided \d ought to match anything any writing system might consider a digit ought to be hunted down and hung by reproductive organs or mammaries (whichever the individual possesses) till pronounced dead. I imagine there might exist a person or two that once in a long while actually want such a character class, but in 99.99999999 % of cases where \d was ever used, only "arabic" numerals, aka something the computer will happily convert to something it can do maths with, were expected and meant.

2022-02-08

The fact regexes predate Unicode by decades may have something to do with the issue you raise.

2022-02-09

Oh dear. That's about the worst regexp you can use to recognize an HTML comment.

The current spec says the following (https://html.spec.whatwg.org/#comments):

Comments must have the following format:

The string "<!--".

Optionally, text, with the additional restriction that the text must not start with the string ">", nor start with the string "->", nor contain the strings "<!--", "-->", or "--!>", nor end with the string "<!-".

The string "-->".

(I wonder if all this appears, or whether this gets eaten by something interpreting the strings as HTML comments...)
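
The format rules quoted above transcribe almost line for line into code. A minimal sketch in Python, using plain string checks rather than one big regex so each rule stays visible; the function name is made up, and it only judges a string that is supposed to be exactly one comment.

```python
def is_html_comment(s: str) -> bool:
    """Check s against the comment format quoted from the HTML spec."""
    opener, closer = "<!--", "-->"

    # The length check rejects "<!-->" and "<!--->", where the opener and
    # closer would otherwise overlap and both prefix/suffix tests pass.
    if len(s) < len(opener) + len(closer):
        return False
    if not (s.startswith(opener) and s.endswith(closer)):
        return False

    text = s[len(opener):-len(closer)]

    # The text must not start with ">" or "->".
    if text.startswith(">") or text.startswith("->"):
        return False
    # The text must not contain "<!--", "-->", or "--!>".
    if any(bad in text for bad in ("<!--", "-->", "--!>")):
        return False
    # The text must not end with "<!-".
    return not text.endswith("<!-")
```

Note that a bare "--" inside the text is fine under these rules, and the empty comment "<!---->" is valid.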

2022-03-07

"This value should never be negative" is (unintuitively) a terrible reason to use an unsigned integer type.

It will cause errors in edge cases:

In C/C++, unsigned is contagious. -1/1u results in a very large unsigned value.
In any language, you cannot reliably detect the error case where the value underflows and becomes unreasonably large.
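
A rough illustration of the second point. Python integers don't wrap, so this sketch simulates a 32-bit unsigned type by reducing modulo 2**32; the helper and variable names are hypothetical.

```python
UINT32_MODULUS = 2**32


def sub_uint32(a: int, b: int) -> int:
    """Subtract the way a 32-bit unsigned type does: wrap modulo 2**32."""
    return (a - b) % UINT32_MODULUS


stock, requested = 3, 5

# Signed arithmetic: the result is negative, so a ">= 0" check catches it.
signed_result = stock - requested

# Simulated unsigned arithmetic: the result wraps to a huge positive value,
# which sails straight through a ">= 0" check.
unsigned_result = sub_uint32(stock, requested)
```

Here signed_result is -2 and unsigned_result is 4294967294, so only the signed version leaves evidence that something went wrong.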

Good reasons to use unsigned:

the value encodes a bitfield not a numeric value
you're on an embedded system where every bit counts; the ability to encode 2x the number of positive values outweighs the risk of error

In general, encode "this variable holds only non-negative values" in the name not the type.

https://jacobegner.blogspot.com/2019/11/unsigned-integers-are-dangerous.html

Validly Numeric
