The Daily WTF: Curious Perversions in Information Technology

Tsaukpaetra · 2015-11-01

I imagine this is what movie producers would like to see on hackers' screens...

rc4 · 2015-11-01

I saw regexes and bailed. (Although I do seem to love regex abuse, but it's only okay when I do it).

PleegWat · 2015-11-01

Those variable names smell decompiled. Or obfuscated.

LB_ · 2015-11-01

Just be glad it wasn't regex replaces transforming XSLT to HTML

JBert · 2015-11-01

PleegWat:
Those variable names smell decompiled. Or obfuscated.

A decompiler would likely not prefix nearly every name with `tempX` while leaving others alone, or mix short names like `obj3` with ones like `pattern1`.

All signs point to exit stage left.

boomzilla · 2015-11-01

There's a person on my team who likes to join subqueries using aliases like TEMP, TEMP1, TEMP2, etc.

Fox · 2015-11-01

I'm not sure which was a better read, the article, or the Stack Overflow comment linked in it. I'm leaning towards the SO comment.

Tsaukpaetra · 2015-11-02

boomzilla:
join subqueries using aliases like TEMP, TEMP1, TEMP2, etc

Ah, I must have met his padawan, who uses T1, T2, T3, etc.

Filed under: But it saves three bytes for each query! That's optimized!

aliceif · 2015-11-02

Fox:
I'm not sure which was a better read, the article, or the Stack Overflow comment linked in it. I'm leaning towards the SO comment.

Apart from the fact that it has been linked here pretty much every time regexes or parsing HTML were mentioned ...

Fox · 2015-11-02

Well I'm newish here, so sue me for not having seen it before.

aliceif · 2015-11-02

It has been linked all over the internet in every regex discussion.

Better now?

Just look for HTML Regex Zalgo.

Oh and: http://blog.codinghorror.com/parsing-html-the-cthulhu-way/

And: http://meta.stackoverflow.com/questions/261561/please-stop-linking-to-the-zalgo-anti-cthulhu-regex-rant

Fox · 2015-11-02

Okay well I haven't had the ~~mis~~fortune of seeing or being involved with anything regex+html related, so it's new to me.

Better?

Jaloopa · 2015-11-02

boomzilla:
There's a person on my team who likes to join subqueries using aliases like TEMP, TEMP1, TEMP2, etc

I used to work with someone who aliased tables in joins by going through the alphabet

select columns
from table1 a
join table2 b
on things
join table3 c
on things

Trying to work out where g.column was coming from in a complicated join was a nightmare

giammin · 2015-11-02

can't believe this is real...

Vault_Dweller · 2015-11-02

tmepList7

You sure about that?

Vault_Dweller · 2015-11-02

You must work for one of our clients

foxyshadis · 2015-11-02

giammin:
can't believe this is real...

I rewrote much worse than this when I was starting out in independent contracting in the early 2000s. LOTS of VB6 and VBA that wasn't even up to this level. Some of those were fodder for the early days of TDWTF, when it was still an MSFN blog! (And MSFN was weblogs.asp.net, for that matter.)

PleegWat · 2015-11-02

Mental self defence, I guess - multiple arguments have been brought in suggesting someone actually typed this garbage manually.

gleemonk · 2015-11-02

I woke up unhappy about the state of things. Now I have a seething hatred for this clumsy world.

hifi · 2015-11-02

I still maintain a project that parses HTML with regexes. It just scrapes a few bits of text so it's not that bad. CSS selector syntax like with jQuery would be a bit easier to use though.

WernerCD · 2015-11-02

Congratulations! You are one of the Ten Thousand!!!! Ignore anyone who bitches and moans that it's new to you.

gleemonk · 2015-11-02

Just last week I had to extract some attributes from a collection of HTML markup documents. My initial reaction was to use a regex because that's the quickest solution, right? In this case I would not have cared about the two problems and all that. The markup was somewhat predictable.

It turned out that the regex solution was harder. After failing for half an hour to get the regexes working, I threw in a DOM parser and was done in ten minutes. The reason I didn't try that in the first place was that I was worried about it rejecting some documents for malformedness. Yet the tolerant parser didn't fail for any document. As a bonus it will be much more robust.
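For anyone who wants to see roughly what that looks like, here is a minimal sketch in Python. It is not the code described above: it swaps in the standard library's event-based `html.parser` rather than a DOM parser, and the names `AttrCollector` and `extract_attr` as well as the sample markup are made up for illustration. The point is only that a tolerant parser copes with unclosed tags, unquoted values, and mixed-case attribute names without complaint.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collect the values of one attribute on one kind of tag (hypothetical helper)."""

    def __init__(self, tag, attr):
        super().__init__()
        self.tag = tag
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != self.tag:
            return
        for name, value in attrs:
            if name == self.attr and value is not None:
                self.values.append(value)


def extract_attr(markup, tag, attr):
    """Return every value of `attr` found on `tag` elements in `markup`."""
    collector = AttrCollector(tag, attr)
    collector.feed(markup)
    collector.close()
    return collector.values


# Deliberately sloppy markup: unclosed <a>, uppercase attribute, unquoted values.
SAMPLE = "<p><a href=\"/one\">one<a HREF='/two' class=x>two</p><img src=pic.png><a name=\"no-href\">"
print(extract_attr(SAMPLE, "a", "href"))
```

A real DOM library would also give you the tree to navigate, but for pulling out a few attributes the event callbacks are enough.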

I realize that most of my reservations about using proper parsers were that they were too strict. Also, early DOM interfaces wanted to look like a C interface, which made them very cumbersome to use. (That part has not been true for ten years now, but it still poisons my experience.)

hifi:
CSS selector syntax like with jQuery would be a bit easier to use though.

That's what xpath is for. I know xpath syntax is dense like a regex, but at the same time it's more powerful than a CSS selector.
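A small illustration of the "more powerful" part, as a sketch only. It uses the limited XPath subset in Python's `xml.etree.ElementTree`, and the `<orders>` document and function names are invented for the example. The predicate picks a parent element based on the text of one of its children, which classic CSS selectors have no way to express.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not taken from the article.
ORDERS_XML = """
<orders>
  <order id="1"><status>shipped</status><item>book</item></order>
  <order id="2"><status>pending</status><item>lamp</item></order>
  <order id="3"><status>shipped</status><item>desk</item></order>
</orders>
"""


def shipped_order_ids(xml_text):
    """IDs of <order> elements whose <status> child has the text 'shipped'."""
    root = ET.fromstring(xml_text)
    return [order.get("id") for order in root.findall("./order[status='shipped']")]


def shipped_items(xml_text):
    """Text of the <item> children of those same orders."""
    root = ET.fromstring(xml_text)
    return [item.text for item in root.findall("./order[status='shipped']/item")]


print(shipped_order_ids(ORDERS_XML))
print(shipped_items(ORDERS_XML))
```

Full XPath engines go much further (axes, functions, arithmetic), which is also where implementations start to disagree with each other.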

Mikerad1979 · 2015-11-02

I'm another of the 10,000. And, I threw up in my mouth, a little.

PleegWat · 2015-11-02

gleemonk:

hifi:
CSS selector syntax like with jQuery would be a bit easier to use though.

That's what xpath is for. I know xpath syntax is dense like a regex, but at the same time it's more powerful than a CSS selector.

We used to have a rudimentary xpath parser implemented in a regex-based ruling engine. I'm glad we're rid of it, though the xpath library has its own problems.

Quite · 2015-11-02

The three hardest things in computer software development are naming things.

Developer_Dude · 2015-11-02

https://texaslynn.files.wordpress.com/2014/02/concept-welcome-to-my-world.jpg

izzion · 2015-11-02

Wait, so @accalia is the one who committed this monstrosity?

ufmace · 2015-11-02

Man, that thing is like fractally bad. The more I look at it, the worse it seems. Right off the bat, I had a feeling that it could be replaced by a couple dozen lines in any decent XML parser. Then when I started looking, I also noticed that I can't tell what it's supposed to be doing. Where is the output of this thing? I can't see... wait a second, is that... yes, yes they really are. The only output appears to be messing with the XML in the StringBuilder via StringBuilder.Replace. And that was when I noticed the GOTOs. Yup, there are liberally sprinkled GOTOs. I think I'd better stop trying to read that thing before I really lose my sanity.

gleemonk · 2015-11-02

PleegWat:
We used to have a rudimentary xpath parser implemented in a regex-based ruling engine.

DOES NOT COMPUTE

PleegWat:
I'm glad we're rid of it

EMPATHY

PleegWat:
though the xpath library has its own problems.

I wouldn't want to implement xpath myself. After years of occasional use of xpath with various libraries I'm still not sure whether differing results are my incomplete understanding of the standard or incomplete implementation of it. I've always decided to reword the query until results matched my expectation :smiley:

PleegWat · 2015-11-02

Back when that xpath parser was implemented, there had to be an absolute cap on parsing state, including buffered document contents. You cannot implement full xpath under those conditions, and indeed our current parser requires building the full XML/DOM tree before it'll start applying the xpath expression.

The old parser only accepted patterns of the form /foo/bar/baz, but it handled them on an input stream with less than 100 bytes of parsing state.
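To make the idea concrete, here is a rough sketch of that kind of matcher in Python. It is emphatically not the old parser: it leans on `xml.parsers.expat` for tokenizing, the function name `stream_match` is made up, and unlike the original it buffers the text of a matched element without any cap. What it does show is that a pattern of the form /foo/bar/baz needs only two counters of matching state (current depth, and how many leading path steps are currently matched), with no tree built at all.

```python
import xml.parsers.expat


def stream_match(chunks, path):
    """Return the text of every element matching an absolute path like /foo/bar/baz.

    `chunks` is an iterable of bytes fed to the parser one piece at a time.
    """
    steps = [step for step in path.split("/") if step]
    state = {"depth": 0, "matched": 0}  # the entire matching state
    text = []      # text of the element currently being matched
    results = []

    def start(name, attrs):
        state["depth"] += 1
        depth = state["depth"]
        # Extend the match only if every ancestor matched and this step fits.
        if state["matched"] == depth - 1 and depth <= len(steps) and steps[depth - 1] == name:
            state["matched"] = depth

    def end(name):
        depth = state["depth"]
        if state["matched"] == depth:
            if depth == len(steps):
                results.append("".join(text))
                text.clear()
            state["matched"] -= 1
        state["depth"] -= 1

    def chars(data):
        # Only keep text sitting directly inside a fully matched element.
        if state["matched"] == len(steps) == state["depth"]:
            text.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    for chunk in chunks:
        parser.Parse(chunk, False)
    parser.Parse(b"", True)
    return results


DOC = b"<foo><bar><baz>1</baz><qux><baz>no</baz></qux></bar><bar><baz>2</baz></bar></foo>"
# Feed the document in 7-byte pieces to mimic an input stream.
print(stream_match([DOC[i:i + 7] for i in range(0, len(DOC), 7)], "/foo/bar/baz"))
```

Predicates, `//`, or anything that looks backwards in the document is where this approach stops working, which is why full xpath wants the whole tree first.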

AlexMedia · 2015-11-02

ufmace:
And that was when I noticed the GOTOs. Yup, there are liberally sprinkled GOTOs

Holy crap, I didn't even notice that before!


John_Imrie · 2015-11-02

regex abuse, like all abuse, is only fun when it's consensual :stuck_out_tongue_winking_eye:

DCRoss · 2015-11-02

If this is also your first time seeing that particular strip, then you're doubly lucky. That would make you one of today's 0.318, or something like that.

Tsaukpaetra · 2015-11-02

DCRoss:
seeing that particular strip

:giggity:. Also, what is a particular? Can we just noun a word like that?

Fox · 2015-11-02

*insert appropriate xkcd comic here*

tufty · 2015-11-02

aliceif:
Oh and: http://blog.codinghorror.com/parsing-html-the-cthulhu-way/

Is it time to mention how the bbhtmarkcode parser used in pissforce handles HTML?

PWolff · 2015-11-02

ufmace:
before I really lose my sanity

YMBNH. I'd like to know how you can believe you had any left long before your arrival at TDWTF, though.

gohere:
// stuff
goto gohere;

Looks like a clumsy workaround for the lack of ComeFrom.

And here we have entry 3+7i

Yay for Gaussian indices!

And that predecessor must be a regex newbie - (s)he uses the string methods IndexOf and Replace instead of using the far more appropriate RegEx methods, and I didn't find any groups in the pattern, not even simple ones.

Without that, and maybe Lookahead and Lookbehind, how could this code ever hope to make it into a book about code maintainability as a positive example?

Shadow Over XML
