  • Harold Finch (unregistered)

    Frist

  • Just this guy, you know? (unregistered)

    The much lower-voted reply on that Stack Exchange question says how this should be done: try to initialise a regex object with the string in question; if it throws, it's not a regex.
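    A minimal sketch of that compile-and-catch approach, assuming Python's standard `re` module as the engine (the thread does not say which engine the original code targets). The function name `is_valid_regex` is made up for illustration, and the check only says whether this particular engine accepts the syntax:

```python
import re


def is_valid_regex(pattern: str) -> bool:
    """Return True if Python's `re` module can compile `pattern`."""
    try:
        re.compile(pattern)
    except re.error:
        # re.error is raised for patterns the engine rejects as malformed.
        return False
    return True
```

    This answers "does it parse here?" and nothing more; a pattern that compiles can still match the wrong things or run for a very long time.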

  • Geoff (unregistered) in reply to Just this guy, you know?

    I would also agree. Unless the application is something for single users to run locally, doing recursive regex processing on arbitrary strings has a name: "Denial of Service Vulnerability." While just slapping the old try block around your call to your regex class probably looks inelegant, it is probably the best option most of the time.

  • Matt Westwood (unregistered)

    Meh. Probably a placeholder method which merely ensures that the regex passes some sort of test but the dev never got round to tidying it up.

  • Nicholas "LB" Braden (github)

    Machine learning regex golf...I feel like Randall Munroe has already thought of it.

  • PWolff (nodebb)

    I knew regex patterns are yet another write-only language.

    Which just corroborates the enterpriseyness of regexes.

    (Time for PHB to hire another Ted to maintain that naturally-grown tropical regex jungle.)

  • cTor (unregistered)

    I would argue that the method does exactly what it says it does in the inline comment... TRWTF would be the naming, and the fact that encapsulating a one-liner in a function declaration seldom brings clarity to the codebase.

  • dkf (nodebb)

    /^((?:(?:[^?+*{}()[\]\\|]+|\\.|\[(?:\^?\\.|\^[^\\]|[^\\^])(?:[^\]\\]+|\\.)*\]|\((?:\?[:=!]|\?<[=!]|\?>)?(?1)??\)|\(\?(?:R|[+-]?\d+)\))(?:(?:[?+*]|\{\d+(?:,\d*)?\})[?+]?)?|\|)*)$/

    How do we know that that's a valid regex? Not so smart now, Remy…

  • jkshapiro (nodebb) in reply to dkf

    Well at least this regex does seem to validate itself.

  • BatConley (nodebb)

    Great, we have a regex that validates regexen; even itself. What possible good can come of it?

  • Bill T (unregistered) in reply to dkf

    Everyone knows it's regexes all the way down. They're competing with the turtles.

  • Carl Witthoft (google)

    +1 for "R’gexyleh."
    Meanwhile, validate a regex pattern ... how? I'd consider it "valid" if and only if it actually extracts exactly what I want to extract. Plus all the weedy little corner cases I forgot about. Somehow I think this borders on Turing's laws about never figuring out what a program will do.

    Meanwhile, there's worse things than valid regexpressions. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

  • Herby (unregistered)

    So this is how the phrase "Now you've got two problems" comes about.

    But users inputting REs is the real WTF that nobody is looking at.

  • foxyshadis (unregistered) in reply to Geoff

    While just slapping the old try block around your call to your regex class probably looks inelegant, it is probably the best option most of the time.

    Which could ALSO result in a denial of service. It's trivial to create an infinitely recursive regex, and not even terribly difficult to create a non-recursive one that spends an eternity on every character. Welcome to regular expressions.
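    To illustrate the non-recursive case, here is a small sketch assuming Python's backtracking `re` engine. The pattern `(a+)+$` is a commonly cited nested-quantifier example, not one taken from the article, and `time_failed_match` is a hypothetical helper. The pattern compiles without complaint, yet a failed match gets rapidly more expensive as the input grows:

```python
import re
import time

# Nested quantifiers: a commonly cited shape that backtracks heavily.
PATTERN = re.compile(r"(a+)+$")


def time_failed_match(n: int) -> float:
    """Time a match attempt against n 'a' characters followed by one 'b'."""
    subject = "a" * n + "b"  # the trailing 'b' forces the overall match to fail
    start = time.perf_counter()
    result = PATTERN.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Keep n small: each extra character makes the failed match much slower.
    for n in (10, 15, 20):
        print(n, f"{time_failed_match(n):.4f}s")
```

    Exact timings depend on the machine and the Python version, so none are quoted here; the point is only that a try block around the constructor does not catch this.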

  • RegexMan - the unwanted superhero (unregistered)

    /^((?:(?:[^?+*{}()[\]\\|]+... I read that as if it's allowing an unescaped ?, +, *, { and } as starting characters of a regex. Doesn't seem valid to me, as the target of those operators would be unspecified.

  • Tyrannosaurus Regex (unregistered)

    chomp(my $validation = <STDIN>) # ask mikey, he'll eat anything, including himself

  • Olivier (unregistered) in reply to dkf

    The Regex Coach says it is not valid, reporting that character | cannot follow character ? at position 111. And a quick test in Perl fails to validate a regex that I was using yesterday and know is valid.

    Instead of that useless bashing on regex (it's a powerful tool and, like any powerful tool, it should only be handled by trained professionals), it'd be better to make sure sensible stuff is written on this blog.

  • löchlein deluxe (unregistered)

    Just so somebody has said it: if your "regular expression" engine supports recursion, it is not a regular expression engine but by rough estimate a context-free language parser with a weird input syntax.

  • Anon (unregistered)

    How can a regex validate itself using regex? Isn't it fundamentally not possible for a system to prove itself?

  • linepro (unregistered)

    But is it also valid brainf*ck?

  • G (unregistered)

    public bool isSaneRegex(string expression) { return false; }

  • Ron Fox (google)

    TRWTF is regular expression syntax, which looks like someone sneezed punctuation on the page and then tried, without success, to wipe it up.

  • Alex (unregistered)

    public static bool IsLegalRegex(string pattern) { try { new Regex(pattern); } catch (ArgumentException) { return false; } return true; }

  • Sumireko (nodebb) in reply to dkf

    It's recursive, so feed it into itself, and if it works, then it 100% passes

  • TimothyB (unregistered)

    This reminds me of the punchline of an old UserFriendly comic strip - something like: "Your Perl is strong - I can't tell if that's a regular expression or line noise." (Though I didn't find the strip - the site would be blocked here.)

  • dkf (nodebb) in reply to Olivier

    Instead of that useless bashing on regex

    The point is that it is insanely difficult to validate regexes other than through actually testing whether they match what they should and don't match what they shouldn't. I'm mostly OK with this; replacing regexes with other sorts of string manipulations is often annoyingly difficult. But validating them with a regex? Ugh. You don't actually know whether the code is right or whether it's just a ship of fools.

    And it's a recursive expression, which is not portable between engines anyway.

    “Validating” by checking if the string is non-empty (and not null) looks quite attractive. It's a punt, but it's a cheap and obvious punt.
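    The "test whether they match what they should and don't match what they shouldn't" idea can be sketched as a small example-driven check. This assumes Python's `re` module and whole-string matching via `fullmatch`; the helper name `check_pattern` and the sample strings are invented for illustration:

```python
import re
from typing import Iterable, List


def check_pattern(
    pattern: str,
    should_match: Iterable[str],
    should_not_match: Iterable[str],
) -> List[str]:
    """Return failure descriptions; an empty list means every example behaved."""
    compiled = re.compile(pattern)
    failures = []
    for text in should_match:
        if compiled.fullmatch(text) is None:
            failures.append(f"expected match: {text!r}")
    for text in should_not_match:
        if compiled.fullmatch(text) is not None:
            failures.append(f"unexpected match: {text!r}")
    return failures
```

    For instance, `check_pattern(r"\d{4}-\d{2}-\d{2}", ["2024-01-31"], ["2024-1-31"])` returns an empty list. Such a check is only as good as the examples fed to it, which is rather the point being made above.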

  • You all suck (unregistered)

    TRWTF is that an empty string is a valid regular expression.

  • Beep boop boop (unregistered) in reply to foxyshadis

    So then you do the validation in a separate thread, use a watchdog timer, and kill it if it takes more than <x> microseconds to process.

    Alternatively, do that with every request you service, at the top-most level possible, and watch your DOS concerns melt away.
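    A rough sketch of the watchdog idea, with one deliberate substitution: Python threads cannot be forcibly killed, so this version runs the match in a separate process and relies on the `timeout` argument of `subprocess.run`, which kills the child when the limit expires. The helper `search_with_timeout` and the child script are hypothetical, and starting a process per match is far too slow for microsecond budgets:

```python
import subprocess
import sys
from typing import Optional

# Child script: compile the pattern and print whether it matches the text.
_CHILD = (
    "import re, sys\n"
    "pattern, text = sys.argv[1], sys.argv[2]\n"
    "print('1' if re.search(pattern, text) else '0')\n"
)


def search_with_timeout(pattern: str, text: str, seconds: float) -> Optional[bool]:
    """Return True/False for match/no match, or None on timeout or error."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            capture_output=True,
            text=True,
            timeout=seconds,  # the child is killed if this limit expires
        )
    except subprocess.TimeoutExpired:
        return None
    if done.returncode != 0:
        return None  # e.g. the pattern failed to compile in the child
    return done.stdout.strip() == "1"
```

    A caller would treat `None` as "reject this pattern or request". Whether a per-request process is affordable depends entirely on the service.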

  • Rich Hendricks (unregistered)

    I don't always use RegEx's, but when I do, I validate them using https://regex101.com/

    :)

  • Anonymous (unregistered) in reply to Geoff

    doing recursive regex processing on arbitrary strings has a name: "Denial of Service Vulnerability." While just slapping the old try block around your call to your regex class probably looks inelegant, it is probably the best option most of the time.

    And how's running an arbitrary regex in an engine that allows recursion any safer?
