Admin
I imagine this is what movie producers would like to see on hackers' screens...
Admin
I saw regexes and bailed. (Although I do seem to love regex abuse, but it's only okay when I do it).
Admin
Those variable names smell decompiled. Or obfuscated.
Admin
Just be glad it wasn't regex replaces transforming XSLT to HTML
Admin
All signs point to exit stage left.
Admin
There's a person on my team who likes to join subqueries using aliases like TEMP, TEMP1, TEMP2, etc.
Admin
I'm not sure which was a better read, the article, or the Stack Overflow comment linked in it. I'm leaning towards the SO comment.
Admin
Filed under: But it saves three bytes for each query! That's optimized!
Admin
Apart from the fact that it has been linked here pretty much every time regexes or parsing HTML were mentioned ...
Admin
Well I'm newish here, so sue me for not having seen it before.
Admin
It has been linked all over the internet in every regex discussion.
Better now?
Just look for "HTML Regex Zalgo".
Oh, and: http://blog.codinghorror.com/parsing-html-the-cthulhu-way/
And: http://meta.stackoverflow.com/questions/261561/please-stop-linking-to-the-zalgo-anti-cthulhu-regex-rant
Admin
Okay, well, I haven't had the misfortune of seeing or being involved with anything regex+html related, so it's new to me. Better?
Admin
I used to work with someone who aliased tables in joins by going through the alphabet. Trying to work out where g.column was coming from in a complicated join was a nightmare.
Admin
can't believe this is real...
Admin
You sure about that?
Admin
You must work for one of our clients
Admin
I rewrote much worse than this when I was starting out in independent contracting in the early 2000s. LOTS of VB6 and VBA that wasn't even up to this level. Some of those were fodder for the early days of TDWTF, when it was still an MSFN blog! (And MSFN was weblogs.asp.net, for that matter.)
Admin
Mental self defence, I guess - multiple arguments have been brought in suggesting someone actually typed this garbage manually.
Admin
I woke up unhappy about the state of things. Now I have a seething hatred for this clumsy world.
Admin
I still maintain a project that parses HTML with regexes. It just scrapes a few bits of text so it's not that bad. CSS selector syntax like with jQuery would be a bit easier to use though.
Admin
Congratulations! You are one of the Ten Thousand!!!! Ignore anyone who bitches and moans that it's new to you.
Admin
Just last week I had to extract some attributes from a collection of HTML markup documents. My initial reaction was to use a regex because that's the quickest solution, right? In this case I would not have cared about the two problems and all that. The markup was somewhat predictable.
It turned out that the regex solution was harder. After failing for half an hour to get the regexes working, I threw in a DOM parser and was done in ten minutes. The reason I didn't try that in the first place was that I was worried about it rejecting some documents for malformedness. Yet the tolerant parser didn't fail for any document. As a bonus it will be much more robust.
I realize that my main reservation about using proper parsers was that they were too strict. Also, early DOM interfaces wanted to look like a C interface, which made them very cumbersome to use. (That part has not been true for ten years now, but it poisons my experience.)
That's what xpath is for. I know xpath syntax is dense like a regex, but at the same time it's more powerful than a CSS selector.
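For anyone curious what the tolerant-parser route looks like, here is a minimal sketch in Python. It is not the poster's code: it assumes Python's standard-library html.parser (an event-based tolerant parser rather than a DOM), and the names AttributeCollector and collect_attribute are made up for illustration.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collects the values of one attribute from every occurrence of one tag.

    Hypothetical helper for illustration; html.parser lowercases tag and
    attribute names and does not raise on unclosed or badly nested tags.
    """

    def __init__(self, tag, attribute):
        super().__init__()
        self.tag = tag
        self.attribute = attribute
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for
        # attributes written without a value.
        if tag == self.tag:
            for name, value in attrs:
                if name == self.attribute and value is not None:
                    self.values.append(value)


def collect_attribute(markup, tag, attribute):
    """Return all values of `attribute` found on `tag` elements in `markup`."""
    collector = AttributeCollector(tag, attribute)
    collector.feed(markup)
    collector.close()
    return collector.values


if __name__ == "__main__":
    sloppy = "<p><a href=\"/one\">first<a HREF='/two' class=x>second</p>"
    print(collect_attribute(sloppy, "a", "href"))
```

The sample markup is deliberately malformed (unclosed anchors, mixed-case attribute name, unquoted value), which is the kind of input where a hand-rolled regex tends to need one more special case each time.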
Admin
I'm another of the 10,000. And, I threw up in my mouth, a little.
Admin
We used to have a rudimentary xpath parser implemented in a regex-based ruling engine. I'm glad we're rid of it, though the xpath library has its own problems.
Admin
The three hardest things in computer software development are naming things.
Admin
https://texaslynn.files.wordpress.com/2014/02/concept-welcome-to-my-world.jpg
Admin
Wait, so @accalia is the one who committed this monstrosity?
Admin
Man, that thing is like fractally bad. The more I look at it, the worse it seems. Right off the bat, I had a feeling that it could be replaced by a couple dozen lines in any decent XML parser. Then when I started looking, I also noticed that I can't tell what it's supposed to be doing. Where's the output of this thing? I can't see... wait a second, is that... yes, yes they really are. The only output appears to be messing with the XML in the StringBuilder via StringBuilder.Replace. And that was when I noticed the GOTOs. Yup, there are liberally sprinkled GOTOs. I think I better stop trying to read that thing before I really lose my sanity.
Admin
DOES NOT COMPUTE
EMPATHY
I wouldn't want to implement xpath myself. After years of occasional use of xpath with various libraries I'm still not sure whether differing results are my incomplete understanding of the standard or incomplete implementation of it. I've always decided to reword the query until results matched my expectation :smiley:
Admin
Back when that xpath parser was implemented, there had to be an absolute cap on parsing state, including buffered document contents. You cannot implement full xpath under those conditions, and indeed our current parser requires building the full XML/DOM tree before it'll start applying the xpath expression.
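To illustrate the bounded-state idea (this is a sketch, not the parser being described), a matcher restricted to simple absolute paths only needs two counters on top of the event stream. The example assumes Python's standard xml.sax module; PathMatcher and match_path are hypothetical names, and the text buffer for the element currently being collected is the one piece of state that is not fixed-size.

```python
import xml.sax


class PathMatcher(xml.sax.ContentHandler):
    """Matches an absolute path of plain element names against SAX events.

    Apart from the collected text, the only parsing state is two integers:
    the current depth and how many leading path steps the open ancestors match.
    """

    def __init__(self, path):
        super().__init__()
        self.steps = path.strip("/").split("/")
        self.depth = 0      # number of currently open elements
        self.matched = 0    # number of path steps matched by open ancestors
        self.buffer = []    # text of the matching element being read
        self.results = []

    def startElement(self, name, attrs):
        on_path = self.matched == self.depth
        if on_path and self.depth < len(self.steps) and name == self.steps[self.depth]:
            self.matched += 1
        self.depth += 1

    def characters(self, content):
        # Only keep text while inside an element that matches the full path.
        if self.matched == len(self.steps):
            self.buffer.append(content)

    def endElement(self, name):
        if self.matched == self.depth:
            # The element being closed was itself a matched step.
            if self.matched == len(self.steps):
                self.results.append("".join(self.buffer))
                self.buffer.clear()
            self.matched -= 1
        self.depth -= 1


def match_path(xml_text, path):
    """Return the text of every element reached by `path`, in document order."""
    handler = PathMatcher(path)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.results
```

Anything beyond plain steps (predicates, `//`, sibling axes) needs lookahead or backtracking over the document, which is where the fixed budget stops being enough.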
The old parser only accepted patterns of the form /foo/bar/baz, but it handled them on an input stream with less than 100 bytes of parsing state.
Admin
Holy crap, I didn't even notice that before!
Admin
regex abuse, like all abuse, is only fun when it's consensual :stuck_out_tongue_winking_eye:
Admin
If this is also your first time seeing that particular strip, then you're doubly lucky. That would make you one of today's 0.318, or something like that.
Admin
:giggity:. Also, what is a "particular"? Can we just noun a word like that?
Admin
*insert appropriate xkcd comic here*
Admin
YMBNH. I'd like to know how you can believe you had any left long before your arrival at TDWTF, though.
Looks like a clumsy workaround for the lack of ComeFrom.
Yay for Gaussian indices!
And that predecessor must be a regex newbie - (s)he uses the string methods IndexOf and Replace instead of using the far more appropriate RegEx methods, and I didn't find any groups in the pattern, not even simple ones.
Without that, and maybe Lookahead and Lookbehind, how could this code ever hope to make it into a book about code maintainability as a positive example?
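For readers who haven't met groups yet, a minimal sketch of what they buy over chained index-and-replace calls, shown in Python rather than the article's language. The date pattern and the function name to_day_first are made up for illustration and have nothing to do with the submitted code.

```python
import re

# Named groups capture the pieces of each match so the replacement can
# reorder them in a single pass, with no manual index arithmetic.
ISO_DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def to_day_first(text):
    """Rewrite every YYYY-MM-DD date in `text` as DD/MM/YYYY."""
    return ISO_DATE.sub(r"\g<day>/\g<month>/\g<year>", text)


if __name__ == "__main__":
    print(to_day_first("Released 2015-03-09, patched 2015-04-01."))
```

This is plain text rewriting; none of it makes regexes a good fit for XML or HTML.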