The Daily WTF: Curious Perversions in Information Technology

2021-07-21 Reply Admin

The docs tell us to do what the code today does: https://docs.microsoft.com/en-us/dotnet/api/system.net.ipaddress.tryparse?view=net-5.0

2021-07-21 Reply Admin

The WTF is in C# standard library that parses “1” as a valid IP address (and interprets it as 0.0.0.1). RTFM: https://docs.microsoft.com/en-us/dotnet/api/system.net.ipaddress.tryparse?view=net-5.0#System_Net_IPAddress_TryParse_System_String_System_Net_IPAddress__

2021-07-21 Reply Admin

There is no such place as ::1 or any other valid form of a IPv6 address

2021-07-21 Reply Admin

Hmm. The OP solution is close to "usable:" it just needs to add groups and check each group for 0 < 255. Which will still of course leave "IP Addresses" in the right format, but still not legit.

Expecting IPAddress.TryParse() to check for actual validity (as opposed to format validity) is to be excessively nominalist. It's clearly intended to do what browsers do when faced with something that might be an IP Address. The next step in a browser, of course, is to try the address an fail in a very Postel-like way.

Really, what you'd need to do in .NET would be to call the TryParse() first in order to get an IPAddress object. Then you'd switch on the AddressFamily -- which as a bonus will allow IPv6 -- and discard if you don't get an acceptable family. And if you truly, really, want nothing but quads, then the regex is almost correct.

So, not a bad effort, just a bit sloppy.

Mr. TA · 2021-07-21 Reply Admin

This is definitely a good candidate for a breaking change with a global flag to revert to legacy behavior.

2021-07-21 Reply Admin

Why the hell isn't this deprecated and replaced by a functioning TryParse2, TryParseEx, or whatever?!

2021-07-21 Reply Admin

The IP address format allows for (at least) two different representations, either as four 8bit unsigned decimal integers separated by a dot, or as a single 32bit unsigned decimal integer, and this isn't just an oddity of .NET. For instance, try directing your browser to http ://2130706433/ and see where it takes you.

2021-07-21 Reply Admin

For an IPv4 address you should never use or accept anything other than four decimal numbers separated by dots. The BSD alternative forms (such as the one you mention) are of no practical use to normal users and merely provide a way for attackers to fool users and (low-quality) tools. Any program or tool which accepts a wider format should be fixed.

Jaime · 2021-07-21 Reply Admin

This isn't unique to IPAddress. For example, try validating a Date in javascript by seeing if it runs through the constructor properly. All kinds of things are accepted that probably aren't valid in your use case, like "123". Often these "see if it works" approaches don't quite do what you think they do. That's why .Net has DateTime.ParseExact.

If you are building a UI that requires an IPv4 dotted decimal address - such as a tool that does subnetting - then parse for your exact requirements. If you are building a UI that asks for an endpoint to connect to - then accept whatever IPAddress accepts. Sure, someone might get surprising results when they accidentally types a valid address in Int64 format, but someone else might get pleasantly surprised when your application supports IPv6 as soon as the framework does.

TheCPUWizard · 2021-07-21 Reply Admin

" unless the vast majority of strings this application works with "..... Wrong scope..... it can very well be ... "the vast majority of strings used WHERE THE EXTENSION METHOD IS IN SCOPE are", at which point it makes perfect sense.

2021-07-21 Reply Admin

It gets even weirder than that!

The 4.2BSD implementation of the inet_aton() function for converting strings to IP addresses has been widely copied and accepts strings of the following formats:

8.8.8.8 8.8.16 8.24 32

where 8, 16, 24, and 32 represent an integer of that width in network byte order, and they can each be in decimal, octal, or hexadecimal. So all of the following are valid IP addresses referring to the same IP:

10.11.12.13 10.11.3085 10.723981 168496141 0xa0b0c0d 01202606015 0xA.0xb.0xC.0xd 0xA.11.06015

2021-07-21 Reply Admin

Legacy support isn't a WTF, it's an unstated requirement of your project. This requirement is always there in any but green field development.

dkf · 2021-07-21 Reply Admin

The WTF is in C# standard library that parses “1” as a valid IP address

That's correct. Alas. IPv4 addresses (which are actually 32-bit unsigned big-endian integers) can be given as a single number (full 32-bits), or a pair of numbers separated by a period (that's interpreted as an 8-bit followed by a 24-bit number), or the most common case of four numbers separated by periods (all 8 bit numbers). And any of those numbers may be given as decimal, hex or octal. It's awful. Restricting to a dotted quad of decimals is limiting things a lot, but that's OK; that's the non-crazy case and simultaneously the case that almost every legit use in the wild is.

2021-07-21 Reply Admin

0xA.11.06015

And I don't think you can even be sure whether the implementation you have will parse that last number as octal or decimal. I recall that within the past year or so there was an exploit that obfuscated an IP address by having it decoded as octal. Nobody ever asked for that, but you know how some programmers can be.

2021-07-21 Reply Admin

99.0.0.99 is an IP address. 99.0.0.099 is a host name, because 9 is not a valid octal digit.

If anyone is ever foolish enough to allow numeric TLDs, there's gonna be a line around the block to buy one.

2021-07-21 Reply Admin

I love these WTF's where everyone gets the slime dumped on them.

And even in the end, when everyone perfectly understands what's going on, there's still this icky feeling that things are horribly wrong.

2021-07-21 Reply Admin

It seems the unfortunate reality is that this code is what it should be.

According to the previously linked documentation, you're responsible for making sure you pass it a string in dotted decimal notation. So there's a built-in function to validate an IP, but you need to pre-validate the input. Which is what the regex accomplishes. And then the whole routine is dumped in a function because why would you constantly re-write this?

C# clearly scores a WTF point with this one.

2021-07-21 Reply Admin

Mostly because the .NET method does exactly what it is supposed to do: take a string from a browser and turn it into something that might, or might not, be a valid IP address.

That, I am afraid, is what backward compatibility is all about. If a browser (say, Edge) relies on this method for say 10% of its lookups -- and trust me, I've worked for Bing, it really does -- then you'd have to be insane to break backward compatibility.

What TryParse() does in this case is exactly what you would expect. It returns an object (IPAddress) which is, you know, an object. This object has a method called AddressFamily. You can look it up, as I did: It returns InterNetwork for IPv4, or InterNetworkV6 for IPv6.

It also returns CCITT for X.25 network addresses (yes, I am sad and I remember X.25).. It handles DecNet, IPX, OSI, and more.

It does what it is specified to do.

Now, if you want some strange filter like ddd.ddd.ddd.ddd (and remember to qualify the ddd as the decimal representation of a 2^8 number, then sure, add that filter.

But the fact is, the damn thing not only does what it says on the tin -- it gives you a tin-opener to find out what is inside.

Ya still have a problem wid dat?

2021-07-21 Reply Admin

None of those formats is weird. Back in the old days, we didn't have CIDR - all IP addresses were allocated as /8, /16 or /24s, also known as Class A, Class B and Class C addresses.

Thus the numeric entry mode of 8.24 was useful for Class A, 8.8.16 for Class B, and 8.8.8.8 for Class C - the first numerical bit was fixed based your allocation, and you can start with 1 being the router, 2 being the file server, 10 being the printers, 1000 being Admin staff, 2000 being production staff, 10000 being engineering staff, etc. So you'd access the router by prefix.1, print to prefix.10 prefix.11, etc.

And that's what the "C" in CIDR meant - CIDR allows finer grained subnetting - instead of 16M (2^24 - 2), 65534 or 254 hosts on a network, you can pick any bitrange you want, up to /30. It was "classless" - no more Class A, Class B or Class C.

Octal also made a lot of sense, since early machines were octal - none of this hexadecimal thing. After all, 9 bit machines were common, as were 36 bit machines.

You can abuse thus stuff today, after all, connecting to localhost is much easier as 127.1. Saves a lot of typing, no?

2021-07-21 Reply Admin

Despite what Microsoft think, "Legacy support" doesn't necessarily mean "don't fix anything that's broken". Their .NET RNG (System.Random) has an error in the initialisation that is known to them, and spoils some of the statistical properties. They won't fix it for "legacy" reasons. Whereas PHP had an error in their RNG (mt_rand). They fixed it, but if you still need the old broken behaviour (e.g. if you have unit tests that rely on a repeatable sequence from the RNG), there's a flag you can use for that.

2021-07-22 Reply Admin

Ugh, and I've been bitten by that more than once. Because $foolWithATool insisted on fine-grained IP-based access control. At least I can tidy up and line up the long list of IPs, right? Wrong. Sigh.

2021-07-22 Reply Admin

I don't know about the C# specifics, but in many languages, the given regexes matches more than just the well known dotted ASCII decimal representation of IP addresses. Which means that /\b\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\b/ matches strings like "૭໒.೩୨.൫፬.៦໗" which uses digits from many different scripts.

2021-07-22 Reply Admin

Their .NET RNG (System.Random) has an error in the initialisation that is known to them

What error? Is there a GitHub issue around that?

2021-07-22 Reply Admin

Mostly MS doesn't fix it because it actually doesn't need fixing.

Want a cryptographic RNG? Call a cryptographic RNG.

MS doesn't get to magically guess what variety of RNG a user wants, or indeed needs. It's just an API call. It's documented. If you want a better API call, go find it -- or even create it yourself.

"They should get it done right" is a classic complaint against backward compatibility. No, they shouldn't, in this case. The original "promise" has not changed. And there are very likely tens of millions of unit tests out there that rely on that "promise." I will agree -- a substantial number of those unit tests are WTFs, but that's just the way that unit tests go. You bork the requirements, you bork the unit test.

A large part of me agrees with the multi-threading issue on MS RNGs, though. I'll still bring up the "unit test" argument here, but actually, and with personal experience, I would absolutely agree that some blatantly lax and time-related implementation of the seed is going to create Unit Test Havoc, since unit tests are run in parallel.

That's a slightly different issue, though. I could probably build a multi-threaded UT RNG with a guaranteed Mock output (which of course is what you want) in about 50 lines of C#.

Or I could just sit back and moan about how MS always gets it wrong. Sort of depends upon whether I want a paycheck or not, I suppose.

2021-07-26 Reply Admin

This thread starts off referencing the bug (link provided is now dead), but then mostly turns into a discussion about replacing the algorithm entirely. https://github.com/dotnet/runtime/issues/6203 https://github.com/dotnet/runtime/issues/18996

2021-07-26 Reply Admin

The actual bug is an incorrect constant being used (the difference between indexes into an array). Because of that, the RNG fails some statistical tests that it would otherwise pass if they used the correct constant. A simple error, 2 second fix, but it could break many unit tests. As I said, they could easily fix it, but provide an optional flag to make it use the wrong constant. Thus everyone can save their unit tests with 2 seconds of their own effort.

Validate Freely

Leave a comment on “Validate Freely”