• LCrawford (unregistered)

    But wait - if he'd have fixed the translation class frist, it would have continued to work!

  • doubting_poster (unregistered)

    moral of the story: don't reinvent the wheel.

  • Wheel Inventor (unregistered) in reply to doubting_poster

    But then I'll be out of a job D:

  • my name is missing (unregistered)

    I like to think of this as leading to Java classes like AbstractAbstractAbstractFactoryFactory

  • bvs23bkv33 (unregistered)

    abstracturbation? abshitraction!

  • Yazeran (unregistered)

    Ok, this was like a horror movie where you think 'now it can't get worse' and then Boom

    The Boom thing here was that little line about 'So he did the next best thing- he wrote a “translation” module that would, using regular expressions, convert the new-style XML files back into the old-style XML files.' The horror

    And I even work with Perl daily and by and large like it (yes I know, just look for my horns and all :-)

    But even I know enough that you should never parse XML with regexes! (Obligatory stackoverflow link: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags)

    Yazeran.

    Plan: To go to Mars one day with a hammer.

  • Thomas (unregistered)

    Sometimes, when confronted with a problem, you think “I know, I'll use regular expressions.” Now you have two problems!

  • Irregular (unregistered) in reply to Thomas

    Mostly because your regular expressions become irregular expressions :P

  • Ath Athanasius (google)

    Ah, one of those cases where abstraction actually just leads to obfuscation and confusion.

  • Andrew A. Gill (unregistered) in reply to Yazeran

    It's been a while since I looked at it, but I believe if you only use a subset of HTML and only use a well-formed version of that subset, you can create an HTML parser with regex. The issue is that HTML was standardized with tags that can nest out of order or in ways that can trip up a regex. That's not a bad thing, mind. Since HTML is a markup language, it works well for what it does.

    If you write HTML that follows strict self-imposed rules created with regex parsing in mind, you can use a regex to parse it.

    I also thought I read that XML could be parsed using regex, but the better question for either XML or HTML is: Why on earth would you want that?

  • (nodebb) in reply to Yazeran

    using regular expressions

    I think I'd be using some expressions regularly when considering this code…

    “WTF?!” “WTF?!” “WTF?!” “WTF?!” “WTF?!” “WTF?!”

  • Chronomium (unregistered) in reply to doubting_poster

    This guy didn't reinvent the wheel; he reinvented the whole Ford.

    He also made it out of Lego because then you can upgrade aerodynamics on the fly, or maybe replace the wheels with a better shape than round just in case something better is ever found.

  • Rafa Larios (unregistered) in reply to Yazeran

    That is one great StackOverflow answer

  • Dave (unregistered) in reply to Chronomium

    "This guy didn't reinvent the wheel; he reinvented the whole Ford."

    I think this guy actually invented a car transporter to put a Model T Ford on so it could keep up with modern traffic.

  • Carl Witthoft (google) in reply to Yazeran

    ahhhh, dammitol, I was going to post that SO page. Well, maybe we can write a combination XML_and_datestring regex converter now.

  • Peter Schott (google)

    I've actually worked with someone who did that for a data import process. Instead of tweaking processes to match the format(s) provided so we'd have ones that worked with those specific processes, the data was imported, then the table structures were massaged to match one specific format so we could just import everything from that process. That kind of ignored the way different systems could handle more information in some cases so that was just lost. I was a bit sad when I saw how all of that data conversion work was being handled, but ... it was being handled and completely outside of my domain. It was the whole "when all you (think you) have is a hammer, everything's a nail" mindset.

  • r̷͓̠̠̠͗̚͝ã̷̡̟̌̽̋ṯ̷͖̥̥̖̊͑͂̎ (unregistered) in reply to Rafa Larios

    b̵̨͎̞̬̂̓̚̚͠l̶̜̤͔̊a̵̯̍͛̊k̵̨̥͓̮̦̅̚͝ȅ̵̠̳̭̅̽ỷ̷̢̹̪͝r̴̹̙͖̘̼͌́̽̋̏a̸͈͐̾̂t̷͈͛͑̑̽ ̷̺͍͇̦̿k̶̨̥͉͖̑̓̌̐͝n̷̢̩̰͔͛̒̈͊̂o̴͈͇͋͛̔͌c̴̡̳̮͈̓̍͌̒̚ḳ̴̨͇̮̔͛͗͝s̷̩̦̗̥̉̽̂̈́ ̸̨̛̺̱̋̈̂͘ạ̵̛͎͚̀̿t̵̯̹̮͂ ̴̰̹̭̟̓̀͆̈̓t̷̪̱̬̱̾̇̃h̸̭͐é̸̪͕̌ ̸̟̰̇͝d̶̨̤̲̊͌ǒ̷̡̝̱̆͠ȍ̵̭̫͙͉̜̇̇́̕r̶̠̳̤̱̈́̚ ̶̢̭͍̙̯̓̅͑y̸͇̋̂̓͝ȍ̴͕̙̟̙͕̊͝ȕ̸͎̲̮ ̶͇͖̊̋̔d̷͚̺̾̿͛̐ờ̶̹̜̀ ̷̨̤̺̎̽̈́n̵̮̹͙̔̀ơ̵̯͔͔̐͌t̷̢̗̟̒ͅͅ ̵̝̹̗͗͐ằ̵͙̹̰̏̃͌n̶̛̫̮̣̍̑̾̈́s̸͉̓͌̿͆w̶͎͊ͅe̷̥̳̔r̴̨̗̯̫͊́ ̶̧͖́b̷̡̹̞̭̈́̈́u̴̳͙̤͇͚̇͛͠t̴̹̹̲̽͗̅̏̈ ̵̱̟̰͚̱̂ḧ̷̪̤́͛͘͝ȩ̷͖̜̇̈́̍̕ ̶͚̱͍̞̍̉͌̅͒k̷͈̠̟͙͐̍͊͠ͅṉ̴̡̅͆͘ͅo̶̦͎̍w̸͈̲̄̽̽͂̄s̸̰̟̄̈̆̉͝ ̷̨̣͙̈̈̈͘ú̸̲̒̕͘͝n̶̠̘̱̱̳̄̓͐̎d̸͓̣̟̣̝̋̀̋̎e̶̼̬̱̊̿̀͗̽r̷̨̟͓͎͂̈́̂͛ ̶̘̯̝͗t̵̛̩̩̙̝̮̋̓͌̉ẖ̴̆e̶̝̍̀̂͘ ̶͍̈́̽̈́͘͘d̴̜̻̼͒̇̊̀͝ͅo̵̢̲̜̲̺̔͋ǒ̶̼̘̘̈́̄r̴̹̥̝̀̕ ̴̫̠̀͌͌͛͠a̶̟͇͈̬̠̎̈̈́̎ ̷͔̀͌͂̀m̸̟͊̔̑̽a̷̲̫̜̓͋ŝ̷̳̱̾͊͊s̷̹̈́̚͠ȉ̶̧̨̢͖̣͂͑v̴̢̻̹̺͂ê̸̛̬͍̳̟͈̇̄̓ ̴͓̺̙̓͐̂r̴̠̮̙̘̅͝͠a̵̢͓̙̒͌̾̚ͅt̸̪͍̀̀̕ ̵̭̔͛̓͘͜ť̶̗͔̖̯ḁ̴̻̤̰͎̓͗͂̇̕ỉ̴̫̃͝l̷̬̝̓̄ ̵̨̤̌̈͛̄͒ǐ̵͎̘̤̌̉n̴̡͎̾̂̀̄t̴̢̟͍̼͇̄͒r̶͎͎̐̏̚ͅų̵̱̲̬̦́͑̓̈̕d̵̮͖͖̏͜͝ȩ̵͑͐s̸͍͓̼̳̀ ̶̙̯̻̳̓̽̉́̕g̵̨̠̭̋̉̅͋͠r̴̡̡̦̰̉̿͒ͅa̸̙̠͎̅s̷̮̫̮̍̄́̊͘͜p̶̜͈̩̔͐̀̅͘i̵̡̛̱͌͗̾͝ͅn̴̖͛g̷̤͛͛͠ ̷͕͑̌͒̽̃t̶͈̘̲̦̃ḣ̸̘̣é̶́͝ͅ ̷̹̪̯̗̕͝a̷̖͇͛ͅḯ̴̲̭̩̊̌̽͝ŗ̴̧̫̝̏͐̔ ̶̹̥͍̗͖̉ṷ̵̡̣̃̍̀͋͜͠ͅṉ̶̀͝͝ṫ̴̻̳̝̤͗̐͗i̴̘̖̣̭̠̇́̉͠l̶͉͔̺͆̈ ̷̩͈̬̹͛͝c̷̢̮͕̻͐̿̊͛̀ĺ̶̫̖̱̯͌͊͝ȉ̷͔ͅc̶̛̞͕̬̈́͛k̴̛̯̪͋̽͠ ̸̱̅͘g̵̢̠̠̺͆̿͐ͅo̷̩̩̐̚̚͝͝ĕ̷̹͎̘ś̸̛͙̼͎̈́͑ ̵̺̔̎̇̿y̸̺̅̆̾̈́̈́o̶̳͔͂̋̈̕͠ų̷̪̟̻͙̍͒ṛ̸̱̟̥̅͜ ̴͕̹͍̪͎̀̏͝d̶̪͕̩̒̚͜ò̶̳̙̀́̚o̵̥̞͌̄̅͝r̷̦͆ ̵̧̭̝́̓͊̐͘a̷̛̟̭̝̰̓͂̚ń̴̜̼̽̈̈͘d̶̼̬̳̟̳̓̽͋̆ ̴̦̘̍̄͐̿͐į̵͔̫̬͔̉͒͌n̵̘̪̊̽ ̵̥̲͉̏͜č̷̣̳́o̸͉̪͘m̴͎͓̥̓e̸̤̠͒͛̍̕s̶͔̹͘ ̶͙͕̒̄̊t̶̡͖͙̝̺͆h̵̦͍͛͒̈́̔͘ȩ̷̭̈́ ̵̖̭͖̙̟͑̊̅̓͝ŕ̸͙͚̝͕̥̔̉a̷̻͖͌́͐t̴̡̻͙̠͇͆͗͝ ̷̦̆̾͌̀n̵̦̘͙͆́́̋̿o̷̳̔ẁ̸̩̇̔ ̴͓̝̻͛͋i̶͉̻̣̟̍̃͂̒ͅt̴̹͒̀̚͘s̸͙̣̞̍̅̊ ̸̻̎̅̃͒i̴̘̓̄s̴̺̖͔̑ ̶̳̯͔͙̫̊͊̇̀̚ö̴͙́v̵͎̘̫̱̔̏̌̈́e̶͉̝̾̿̌̾̚r̴̨̨͙̈́̎͊ ̴̥̩̫͘t̷͎̅̕h̵̠̹̫͍̉́̈́͘͠e̷͕̹͔̬̅ ̷͚̭͇̙̭͝e̸̡̲͓͔̅̈̔́ǹ̵̛̩͚̖d̴̞̟͍̩̕

  • David C. (unregistered)

    When I got to the phrase "...it was a bit of a crapshoot...", my brain read that as "crashpoot". Which I have decided I really like and will have to start using in the future for WTF code that randomly flakes out.

  • Clive Broadwell (unregistered) in reply to Thomas

    Sometimes, when confronted with a problem, you think “I know, I'll use regular expressions.” Now you have two problems!

    Sometimes, when trying to make a joke, you think, "I know, I'll quote XKCD." Now everyone knows you have no sense of humor or creativity.

  • ZB (unregistered) in reply to Clive Broadwell

    Sometimes, when trying to attribute a quote, you think, "I know, I'll attribute XKCD." Now everyone and Jamie Zawinski knows you're an ignoramus.

  • spaceman (unregistered)

    https://stevehartken.wordpress.com/

    "For all we know, Steven Hawkin, British Emperialist, holds back the very essence of time itself. Don't trust anyone named Steven Hawkin. He could easily be Hitler's Genes!"

  • eric bloedow (unregistered) in reply to Chronomium

    oh, you made me think of a story about Regina Vacuum cleaners: they told their engineers to ignore the fact that Plastic melts at high temperatures, thus producing plastic motors that wore out VERY quickly, THEN tried to lie about their sales figures by NOT counting the numerous returns on their quarterly reports...when they got found out, that was pretty much IT for them!

  • (nodebb)

    Did Steven change the name of the system as well? How many people would want to keep working on a system with that name? How about "Pearl Utility for Systematic Survey and Investigation - Enhanced Synergy"?

    Addendum 2018-02-16 00:14: I meant "Perl", but my fingers settled on "Pearl".

  • 🤷 (unregistered)

    Huh-huh. DICS. Huh-huh.

  • spaceman (unregistered) in reply to LCrawford

    https://twitter.com/urgentProgram

  • PenguinF (unregistered)

    Inner Platform Effect anyone?

    Also, best comment thread ever.

  • RLB (unregistered) in reply to Andrew A. Gill

    It's been a while since I looked at it, but I believe if you only use a subset of HTML and only use a well-formed version of that subset, you can create an HTML parser with regex

    So... if you yourself create the HTML and only Netscape 2 HTML at that, in other words, if you have no reason to parse your HTML using regexes since you already know exactly what went into it in the first place, then and only then you can parse HTML using regexes. Meanwhile, I can write haiku in Finnish, provided someone write one Finnish haiku for me and I stick to copying that haiku.

  • spaceman (unregistered) in reply to PenguinF

    It is only do you find that life is an inner platform of continually death until you really find peace. Do no harm, listen to Jesus, and you will live forever. Just a serious of doors opening.

    That doesn't mean you plan your day, or write code like you are writing the Bible!

  • isthisunique (unregistered)

    @Andrew A. Gill

    You sort of can and you can't represent everything in regex. It depends on what you mean by regex. Regex as it's defined theoretically doesn't have any memory other than its current state and current byte then the next combination of bytes. It's common for people to get this wrong either when they are using regex with advanced features or multiple regexes and doing what regex can't in their language of choice.

    You can however make a regex representing all permutations that'll be able to match a lot of things you can't match with the normal thinking that comes with regex as long as your domain isn't "infinite". Regex can sometimes represent infinite sequences non-infinitely but not all. For finite sequences such as HTML to a depth of 10 you can represent all possible sequences with that in regex as literals. If you don't want to represent them literally you need something more powerful.

    I once saw someone use regex in such a way using webscale architecture: essentially generating nearly all valid permutations, which took over a petabyte, then storing it on the cloud. You see that's how the cloud works. It doesn't matter how inefficient your solution is anymore. If you need a brazillian gigabytes the cloud's got your back. It brings a new meaning to overhead. I noticed the generator already had the logic needed to validate the text being processed. Five hours and a brand spanking new 100 lines of code later, company operating expenses dropped by several million a year. I was then fired for making things all automatic. The CTO insisted that my solution had destroyed the company because it was no longer possible to change something in only one sequence. I learnt that day just how important webscale is and how empty our lives would be without it. I moved on to be a chef.
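
The bounded-depth point above can be sketched without enumerating every permutation as a literal. This is only an illustration in Python: the function name `nested_tag_pattern`, the single attribute-free tag, and the depth limit are all assumptions made for the example, not anything from the story.

```python
import re

def nested_tag_pattern(tag: str, depth: int) -> str:
    """Build regex source matching text with <tag>...</tag> nested at most `depth` deep.

    Hypothetical helper: it handles one tag name, no attributes, no comments.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    t = re.escape(tag)
    text = r"[^<>]*"   # plain text containing no markup
    inner = text       # depth 0: text only
    for _ in range(depth):
        # Each pass allows one more layer of tags around the previous level.
        inner = rf"{text}(?:<{t}>{inner}</{t}>{text})*"
    return inner

two_deep = re.compile(nested_tag_pattern("b", 2))
print(bool(two_deep.fullmatch("<b>a<b>c</b></b>")))        # nesting depth 2
print(bool(two_deep.fullmatch("<b><b><b>x</b></b></b>")))  # nesting depth 3
```

The pattern is an ordinary regular expression with no recursion or look-ahead. It works only because the depth is fixed in advance, and unbounded nesting still needs something more powerful.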

  • code_goddess (unregistered) in reply to 🤷

    I would so have been tempted to create a Binary Alphanumeric Getter Of Files to incorporate into DICS.

  • Sole Purpose of Visit (unregistered) in reply to isthisunique

    So, what you are saying is that the important thing here is to pre-select a non-infinite domain, and then choose a flavor of "regexp" that extends the traditional Finite State Automaton by incorporating a theoretically unlimited look-ahead mechanism? Yes, that would work.

    However, I beg leave to doubt that you have ever worked for a company that transfers petabytes of data to and from Teh Cloudz whenever such a regexp requires it. Best wishes in your new career as a chef!

  • spaceman (unregistered) in reply to code_goddess

    http://peace4patience.s3-website-us-east-1.amazonaws.com/

  • Andrew A. Gill (unregistered) in reply to RLB

    There are some cases where it comes in useful to use such a subset. For example, imagine a program that finds prime numbers and prints them, one prime to a paragraph. Each time it detects a prime, it appends the new prime with p tags before and after to the already existing file and then tacks on the closing body and html tags in a footer. You know what goes in the file, and you're using a well formed subset. Other people on the internet can download your file and easily parse it with regular expressions.

    It still doesn't explain WHY you'd want that in HTML as opposed to CSV or something else more useful, but the name of the game is Technically Possible.

    Now if you'll excuse me, I have some soup waiting for me in a flour strainer.
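
The prime-number page described above can be sketched in a few lines. The helper names `build_page` and `read_primes` are made up for illustration, and the approach is only safe because the writer and the reader agree on the exact subset of HTML in use.

```python
import re

def build_page(primes):
    """Write one prime per paragraph, then the closing body and html tags."""
    body = "\n".join(f"<p>{p}</p>" for p in primes)
    return f"<html>\n<body>\n{body}\n</body>\n</html>\n"

# Works only for this strict, self-imposed subset: digits inside bare <p> tags.
PRIME_PARAGRAPH = re.compile(r"<p>(\d+)</p>")

def read_primes(page):
    """Pull the primes back out of a page produced by build_page."""
    return [int(digits) for digits in PRIME_PARAGRAPH.findall(page)]

page = build_page([2, 3, 5, 7, 11])
print(read_primes(page))
```

As soon as the file gains attributes, nested markup, or anything else outside the agreed subset, the pattern silently misses data, which is the usual argument for a real parser.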

  • foxyshadis (unregistered) in reply to RLB

    HTML 2 (and pre-standard HTML) was far, far more of an unparseable abomination than XHTML, so I dunno where you're going with that.

  • Quite (unregistered) in reply to spaceman

    And the Programmer spake unto his Computer, Three times shalt thou loop, and three times correspondingly shalt thou test thine exit condition. And on the First loop, He created the Factory and the XML Object And on the Second loop, He Tested the Return from that Factory, and saw that verily, it was very True. And on the Third loop, He Rested, and entered a Sleep, and threescore thrice times ten Milliseconds slept He.

    And on the Fourth loop, exited He the Loop, and looked upon his works, and saw that indeed, it was Object Oriented.

  • Dave3of5 (unregistered)

    Worked with a contractor who wrote code like this. It's SOLID you see, as in everything becomes a separate class and everything is abstracted such that it is meaningless. He basically admitted to me that the reason for doing so was to keep himself in a job. Something that could be done in a few lines would suddenly become 20 classes and 10 design patterns of abstraction away such that you couldn't understand what was going on. The worst part was trying to help him fix bugs as no simple fix was ever enough; every simple fix to be applied involved creating some other class and using some other obscure design pattern that didn't really fit. The biggest problem was he was badly mismanaged and the project's requirements were constantly in flux, meaning he had plenty of excuses to keep writing code in this manner. To quote Dijkstra:

    "The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise."

    Abstraction != Generic.

  • (nodebb) in reply to isthisunique

    Perl has recursive regexes which can e.g. ensure XML tags match, and .NET has balancing groups which can be leveraged to a similar end, and there are probably other extensions around that give you enhanced capabilities beyond the usual regex.
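
For contrast, Python's standard `re` module has neither recursion nor balancing groups, so the same tag-matching check needs an explicit stack next to the regex. A rough sketch follows. The `TAG` pattern and `tags_balanced` name are illustrative assumptions, and it ignores comments, CDATA, and attribute values containing `>`.

```python
import re

# Tokenizer only: finds opening, closing, and self-closing tags.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>")

def tags_balanced(text):
    """Return True if every opening tag is closed in the right order."""
    stack = []
    for match in TAG.finditer(text):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue                    # <br/> opens and closes at once
        if closing:
            if not stack or stack.pop() != name:
                return False            # closed something that was not open
        else:
            stack.append(name)
    return not stack                    # leftovers mean unclosed tags

print(tags_balanced("<a><b/>text</a>"))
print(tags_balanced("<a><b></a></b>"))
```

The stack is the memory that a theoretical regular expression lacks. Recursive patterns and balancing groups add that memory inside the regex engine instead.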

  • Obstreperous objectification (unregistered) in reply to Ath Athanasius

    Obfuscation-oriented programming at its finest!

  • fa (unregistered)

    Does this method return the IP address? Or a string in the form “$ip_addr:$port”? Or maybe even an array, like [$ip_addr, $port].

    Why, a class of course!

  • Mike (unregistered)

    I literally gasped when I read the punchline.

Leave a comment on “It's Called Abstraction, and It's a Good Thing”
