The Daily WTF: Curious Perversions in Information Technology

2018-08-28 Reply Admin

\frist!\

2018-08-28 Reply Admin

regex with 42 lookaheads can parse answer to life, the universe and everything

2018-08-28 Reply Admin

Um, (?:...) is just a non-capturing group. (?=...) is a lookahead assertion.

Non-capturing groups are generally a good idea in regexes, since capturing groups need to allocate and copy the captured strings.
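For anyone following along, here is a minimal sketch of the difference in plain JavaScript. The sample strings and variable names are made up for illustration; they are not taken from the article's regex.

```javascript
// (?:...) groups without capturing. The group still consumes input,
// but the match array gets no extra slot for it.
const nonCapturing = /(?:foo)bar/.exec("foobar");

// (...) captures, so the match array gains a second entry ("foo").
const capturing = /(foo)bar/.exec("foobar");

// (?=...) is a lookahead: it asserts that "bar" follows,
// but "bar" is not part of the match.
const lookahead = /foo(?=bar)/.exec("foobar");

console.log(nonCapturing.length, capturing.length, lookahead[0]);
// prints: 1 2 foo
```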

2018-08-28 Reply Admin

Then when some company threatens to sue due to trademark infringement, P's peer will take it down, 75% of the web will automatically update their dependencies for some reason, and the npm people will have to put it back so everything doesn't break.

TRWTF is npm and almost everything about it, especially its users. The number of dependencies that are installed (400 subfolders for one project in particular) the few times I've run npm install is astounding.

2018-08-28 Reply Admin

This is why we can't have nice things anymore, such as URLs for FTP: ftp://mysite....

2018-08-28 Reply Admin

And now tell me how this expression makes sure that the URL is valid in the way that a fetch succeeds with 100% accuracy.

2018-08-28 Reply Admin

Just decided to have a look through those 400 folders; there's some quality stuff.

assert-plus: This library is a super small wrapper over node's assert module beeper: Make your terminal beep caseless: wrap an object to set and get property with caseless semantics but also preserve caseing. decamelize: Convert a camelized string into a lowercased one with a custom separator detect-newline: Detect the dominant newline character of a string getpass: Get a password from the terminal. isarray: Array#isArray for older browsers. jsonify: This module provides Douglas Crockford's JSON implementation without modifying any globals. once: Only call a function once. randomatic: Generate randomized strings of a specified length, fast. slash: Convert Windows backslash paths to slash paths

I could do this all day. And of course each of these has its own README, LICENSE, package.json and even test cases in some instances. WTF? These people almost make me ashamed to say I write JavaScript.

(and is it just me, or do yanks have this really odd obsession with appending "-ize" to everything they can?)

2018-08-28 Reply Admin

crap, those were meant to be on separate lines. Sorry. Is TRWTF me or the lack of a preview button?

2018-08-28 Reply Admin

No, TRWTF is Markdown.

Watson · 2018-08-28 Reply Admin

I wonder how many registered TLDs (https://www.iana.org/domains/root/db) it's failing on so far?

(And I agree: (?:...) isn't a lookahead).

Watson · 2018-08-28 Reply Admin

"Make your terminal beep caseless" Do upper-case beeps sound different to lower-case ones?

Applied Mediocrity · 2018-08-28 Reply Admin

When Wiley drops a crate of Acme anvils on the beep-beep, it will sound flat.

2018-08-28 Reply Admin

As Nope already said, qualifier ?: just means non-capturing group.

2018-08-28 Reply Admin

Remy is TRWTF. Nope is right. Read the docs: (?:x) is not a lookahead. (?=x) is.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp

2018-08-28 Reply Admin

And people think APL is a write-only language.

2018-08-28 Reply Admin

When he attempts to drop a crate of Acme anvils on the beep-beep, HE will sound flat.

2018-08-28 Reply Admin

You don't tend to get those 400 subfolder deep dependencies any more because it uses a flat structure now. But your point still stands, far too many dependencies.

2018-08-28 Reply Admin

It is possible to write readable useful regex. Mostly by not overusing it. But APL?

2018-08-28 Reply Admin

They do in the assert module beeper.

2018-08-28 Reply Admin

This story seems familiar. There have been many repeats of this story on here. Some time ago I needed a regex to validate an IP and optionally a port. I decided to have some fun with it and spent an extra half hour trolling in the code to make a regex that would ensure the integers are in the right range. This creates some unpleasant regex. The next step was to avoid repeating it for each IP address segment. I ended up doing this with lookaheads and encountered some nasty quirks and peculiarities of the regex implementation along the way. I submitted this with the comment "This is ZALGO." expecting to generate some amusement later from other developers, but apparently it went through. I've made another version that's ever so slightly simpler. I'm going to make one more, then have the code run all of them and choose whether it matches by sum(matches) > 1, and see if that makes a stir. I guess the problem is that no matter what I do, it works.
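A rough sketch of what a range-checked IPv4-plus-port regex can look like in JavaScript. This is not the commenter's regex (which used lookaheads); it builds the octet pattern once as a string and reuses it, and all names here are hypothetical. Rejecting leading zeros is an assumption, since some parsers accept them.

```javascript
// One octet, 0-255, without leading zeros (an assumption of this sketch).
const octet = "(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";

// Port 0-65535, spelled out range by range.
const port =
  "(?:6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3}|0)";

// Four octets separated by dots, then an optional ":port".
const ipv4WithPort = new RegExp(`^${octet}(?:\\.${octet}){3}(?::${port})?$`);

function isIpv4WithOptionalPort(text) {
  return ipv4WithPort.test(text);
}

console.log(isIpv4WithOptionalPort("192.168.1.1:8080")); // true
console.log(isIpv4WithOptionalPort("256.1.1.1"));        // false
```

Parsing the numbers and comparing them as integers is usually easier to read than encoding the ranges in the pattern, which is rather the point of the story.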

2018-08-28 Reply Admin

Similar to the difference between a girl beep and a boy beep.

Nutster · 2018-08-28 Reply Admin

But the cactus that Wiley will inevitably run into will feel sharp. Oh look, a Mach 2 bird.

Steve_The_Cynic · 2018-08-28 Reply Admin

Do any of them cope with IPs like 2002:abef:abef:abef:abef:abef:abef:192.168.1.1 or ::FFFF:192.168.1.1 or FF02::1 ?

2018-08-28 Reply Admin

I was visiting this page from my phone, and I was like: "wow, that's a huge regex!"

And then I realized the code snippet scrolls horizontally...

2018-08-28 Reply Admin

What kind of world do we live in if .pron is not valid?

Bananafish · 2018-08-28 Reply Admin

When Wiley drops a crate of Acme anvils on the beep-beep, it will sound flat.

This deserves a retweet!

2018-08-28 Reply Admin

Here it is on Debuggex

2018-08-28 Reply Admin

So I was learning how to program, and I was using PHP. I made a link shortener, and I'm sure that there are many WTFs in my code, but I had one particular idea to check if a URL was valid. Make a request myself and see what gets returned. So I used cURL, and only got the headers (thus invalidating links that didn't direct to a web server).

Then I generate a short ID with some random base 36 data (first checking to see if that ID is in the database) and a simple lookup when presented with either the short ID or the original URL.

Surprisingly robust, as it still works despite having moved around many servers, database types (it's now Redis-powered instead of MySQL), and even PHP versions. XD
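The ID scheme described above can be sketched roughly like this. The original was PHP with a database; this version is JavaScript to match the rest of the thread, and it uses in-memory Maps as a stand-in for the database. Every name here (`shorten`, `resolve`, the ID length of 6) is an assumption for illustration.

```javascript
// In-memory stand-ins for the database tables (assumption of this sketch).
const idToUrl = new Map();
const urlToId = new Map();

// Build a random string of base 36 digits (0-9, a-z).
function randomBase36(length) {
  let id = "";
  for (let i = 0; i < length; i++) {
    id += Math.floor(Math.random() * 36).toString(36);
  }
  return id;
}

// Return the existing short ID for a URL, or create a new unused one.
function shorten(url, length = 6) {
  if (urlToId.has(url)) return urlToId.get(url);
  let id;
  do {
    id = randomBase36(length);
  } while (idToUrl.has(id)); // retry if the ID is already taken
  idToUrl.set(id, url);
  urlToId.set(url, id);
  return id;
}

// Look up the original URL for a short ID (undefined if unknown).
function resolve(id) {
  return idToUrl.get(id);
}
```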

2018-08-28 Reply Admin

A related question: do upper case numerals compute the same as lower case numerals?

2018-08-28 Reply Admin

Much better to have spaghetti code and write all the code yourself than to rely on modular dependencies, am I right? I mean, just look at Linux -- a little utility for almost everything you can imagine.

2018-08-29 Reply Admin

Original poster here. I tried to talk them out of it. They do it anyway. The codebase is also a mess with bad overall design, the practices used in various places are not consistent, and there isn't even proper error handling (error handling in JS isn't just try/catch; Promise rejections have to be handled with .catch, and stream errors with .on('error', handler)). Once an error occurs the bot just crashes, often without any logs. Their solution? Run a .bat file that runs the bot in an infinite loop. As for this WTF, I did some research after they showed me this monstrosity, and it is indeed the most comprehensive of the publicly available regexes of this kind. I guess you can say they really out-performed the competition.
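To make the error-handling point concrete, here is a minimal Node.js sketch, not the bot's actual code. With standard Promises the method is `.catch()`. The `log` array, `failingTask`, and `fakeStream` are hypothetical stand-ins; a plain EventEmitter plays the stream, since Node streams are EventEmitters.

```javascript
const { EventEmitter } = require("events");

const log = []; // stand-in for a real logger

// Promise rejections are not caught by a surrounding try/catch
// (unless you await them); attach .catch() instead.
function failingTask() {
  return Promise.reject(new Error("request failed"));
}
const handled = failingTask().catch((err) => {
  log.push(`promise: ${err.message}`);
  return null; // fall back to a default so the bot keeps running
});

// An 'error' event with no listener is thrown and kills the process,
// so register a handler on every stream.
const fakeStream = new EventEmitter();
fakeStream.on("error", (err) => log.push(`stream: ${err.message}`));
fakeStream.emit("error", new Error("connection reset"));

// Last resort: at least log rejections nobody handled.
process.on("unhandledRejection", (reason) => {
  console.error("unhandled rejection:", reason);
  process.exitCode = 1;
});
```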

Unfortunately, I think this is the reality we face these days: Reddit and Discord are filled with teens who write tiny pet projects but have no awareness of good practices, or the discipline to look for and follow them. Their code is dubious, badly designed, and often clueless. Unlike typical CS students, they have the time and passion to go very, very far down the wrong path (like this WTF). They treat anyone who can write barely functional code as their "peer teacher". The list goes on. If you think junior devs/interns are clueless, think again.

At least in the era of IRC bots in Perl, we didn't have a place where everyone could post all the bad code collectively for others to see. Enter GitHub: piles upon piles of such code are posted there every day, with lots of people copying and following this code because, well, they don't know better either.

(Meanwhile, at the opposite end we have Minecraft modding: the ecosystem really dislikes open-sourcing mods for some reason. However! Since Minecraft is an obfuscated Java build, every mod release only works for one version of Minecraft, and now of Forge, because nobody writes non-Forge mods anymore. Once the author abandons the mod or is gone, the mod is dead. Dead mods are everywhere. But that's for another WTF.)

2018-08-29 Reply Admin

Then when some company threatens to sue due to trademark infringement, P's peer will take it down, 75% of the web will automatically update their dependencies for some reason, and the npm people will have to put it back so everything doesn't break. TRWTF is npm and almost everything about it, especially its users. The number of dependencies that are installed (400 subfolders for one project in particular) the few times I've run npm install is astounding.

You mean the snowflake who decided to unpublish all his popular npm modules because one of them had the same name as an IM app and hence got a claim against it? The WTF is hardly npm; it's authors who can't be bothered to just rename their modules or anything. Instead of trying to communicate like a normal person, they have to scream oppression and dirty capitalism with their ears covered, of course. Just like how they treat Microsoft all the time.

But to be fair, on the other side of the coin, lots of JS devs are very clueless and they contribute to the huge dependencies of npm modules too. They can't be bothered to solve even the most trivial problems that can be done with basic knowledge of JS features, they have to add another module that happens to be on the top of google search results. It's how you get the likes of isArray.

2018-08-29 Reply Admin

The most WTF thing is the article itself mistaking a non-capturing group for a lookahead.

The regex is still crappy, though. They haven't tested it against more recent TLDs or internationalized domain names. It might be understandable if the regex was used years ago, when support for validating and parsing URLs was scant, but in a modern browser a URL can be validated with the JavaScript URL API https://developer.mozilla.org/en-US/docs/Web/API/URL/URL or a
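A minimal sketch of the URL-API approach, assuming you only want http and https links. The `parseHttpUrl` helper is a made-up name for illustration. The `URL` constructor throws a `TypeError` when the input cannot be parsed.

```javascript
// Returns a URL object for a parseable http(s) URL, or null otherwise.
function parseHttpUrl(text) {
  try {
    const url = new URL(text); // throws TypeError on unparseable input
    const isWeb = url.protocol === "http:" || url.protocol === "https:";
    return isWeb ? url : null;
  } catch {
    return null;
  }
}

console.log(parseHttpUrl("https://example.museum/exhibits") !== null); // true
console.log(parseHttpUrl("not a url"));                                // null
```

This only says the string parses as a URL, with no TLD list to keep up to date. It says nothing about whether fetching it will succeed.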

2018-08-29 Reply Admin

Frenk is watching you

2018-08-29 Reply Admin

Actually, there is a whole Perl class. See the beauty: https://metacpan.org/source/ABIGAIL/Regexp-Common-2017060201/lib/Regexp/Common/URI It takes the RFCs into account. It may not be small or whatever, but it is accurate.

urkerab · 2018-08-29 Reply Admin

Or better still, my favourite IP address, 2130706433
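For anyone who doesn't read IPs in decimal: that number is the 32-bit integer form of a dotted quad. A small sketch of the conversion, with made-up helper names:

```javascript
// Split a 32-bit integer into four bytes, most significant first.
function intToIpv4(n) {
  return [24, 16, 8, 0].map((shift) => (n >>> shift) & 0xff).join(".");
}

// Fold the four octets back into one integer (base 256).
function ipv4ToInt(text) {
  return text.split(".").reduce((acc, part) => acc * 256 + Number(part), 0);
}

console.log(intToIpv4(2130706433)); // 127.0.0.1
console.log(ipv4ToInt("127.0.0.1")); // 2130706433
```

The arithmetic: 127 × 2^24 = 2130706432, plus 1 for the last octet.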

2018-08-30 Reply Admin

Yeah, one goes beep and the other one goes BEEP.

2018-08-31 Reply Admin

I also agree with the two commenters that "(?:" is just a non-capturing group. It lets you specify exactly which data you want extracted. So IMHO this posted WTF is a WTF itself.

Look Ahead. Look Out!
