• newt0311 (unregistered) in reply to Gareth

    Thats actually what I was thinking about. if the $ isn't there then even a partial match at the start will do. either way, using a regex for something like that is idiotic. They should only be used where complex formats are used and even then, only if those formats can not reasonably be abstracted out to data structures due to circumstances.

  • newt0311 (unregistered) in reply to Coditor

    If the array is ordered (I wonder, is it?) they could even go with a binary search. either way, almost anything will be faster than reading the entire array, going through several type conversions, and then remaking a string... to let a regexp read the entire array again.

  • sp0n9e (unregistered) in reply to Sad Developer
    Sad Developer:
    Gareth:
    Unlike PHP, Ruby provides a full stack trace by default with every exception and not just a line number, so you'll always know exactly which method call failed.

    hmmm, actually PHP can do that too : http://uk2.php.net/debug_backtrace (oh and better, you can actually recover programmatically from an error if you're clever enough, because you get an array of the stack trace!)

    Anyway, next time someone talks to me about how great Ruby is and I'll point it to that WTF... who would ever want to use a language that you can't even read out loud (let a lone understand a single line of it) after 15 years of programming experience in C, C++, VB, Java and PHP (which is quite a wide choice, isn't it?)

    Also, the backtrace is IN THE EXCEPTION as well.
    $> php -a
    <?php
    Reflection::Export(new ReflectionClass('Exception'));
    Class [ <internal> class Exception ] {
    
    • Constants [0] { }

    • Static properties [0] { }

    • Static methods [0] { }

    • Properties [6] { Property [ <default> protected $message ] Property [ <default> private $string ] Property [ <default> protected $code ] Property [ <default> protected $file ] Property [ <default> protected $line ] Property [ <default> private $trace ] }

    • Methods [9] { Method [ <internal> final private method __clone ] { }

      Method [ <internal, ctor> public method __construct ] {

      • Parameters [2] { Parameter #0 [ <optional> $message ] Parameter #1 [ <optional> $code ] } }

      Method [ <internal> final public method getMessage ] { }

      Method [ <internal> final public method getCode ] { }

      Method [ <internal> final public method getFile ] { }

      Method [ <internal> final public method getLine ] { }

      Method [ <internal> final public method getTrace ] { }

      Method [ <internal> final public method getTraceAsString ] { }

      Method [ <internal> public method __toString ] { } } }

  • (cs)

    The real WTF is that he tried to concatenate a long, orange vegetable root to a string instead of a caret.

    Yeah, I know it's been mentioned already, but come on...

  • bramster (unregistered) in reply to ted
    ted:
    I understand nothing of this, but the attaching monstrosities to carrots part made me laugh.. haha. Don't people learn from previous WTFs? As I recall the carrot/caret thing has come up in at least 2 previous WTFs.

    captcha: dubya.. what?

    . . . and there I was, waiting for the canonized path to explode. . .

  • Zachery (unregistered) in reply to pete
    pete:
    james:
    • your plugin is great - period.
    • on a wtf-scale between 1 (low wtf) and 10 (max wtf) yours is at most a 0.5
    • even if it's a wtf at least it's still working
    • if you wanna see real wtf, there's better around (windows source, loads of php webapps, java senior architects, cobol legacy stuff, ...)

    since when was 0.5 between 1 and 10.

    there is a wtf :p

  • poopdeville (unregistered) in reply to rmr
    rmr:
    I know exactly 0 ruby - can someone explain what p is and what |p| means? Is that one of those mystical closures I've been hearing about?

    Sure, but I'll use some simpler code to explain. Let

    a = [1,2,3,4,5,]
    be an array. The Array class defines an 'each' method, that allows constructs like

    a.each {|p| print p}
    . 'each' is an iterator, and just returns elements from the array.

    The fragment

    {|p| print p}
    is called a block, and in this case is acting like an anonymous subroutine. The
    |p|
    syntax declares that p is a formal parameter for the block -- that is, p is bound to the block.

    You can implement closures, coroutines, continuations, etc using blocks like this..

  • Dark (unregistered)

    One problem I haven't seen anyone mention yet is that by stuffing the pathnames into a regexp, you interpret them as regexps. This means, for example, that the dot (.) is treated as a wildcard character and will match any letter in the base path.

    The code could even die horribly if a pathname contained something awful such as an unmatched parenthesis.

  • Freaky (unregistered)

    Roughly this code in PHP, in case anyone's from that side of the fence is still having trouble following it:

    return(preg_match('/^$' . join('|', array_map('realpath', ActionController_Routing::controller_paths())) . '/', $base));
    

    Note caveats; array_map() only works on arrays (Ruby will just call the each() method if you've mixed in Enumerable) and only accepts named functions as arguments (create_function() returns one if you want to specify it inline, but that will make a new one each execution as well as being fugly).

    If the array_map() confuses you (heh), it could also be written in the arguably more "phpish":

    $paths = array();
    foreach (ActionController_Routing::controller_paths() as $path)
    {
      $paths[] = realpath($path);
    }
    return(preg_match('#^' . join('|', $paths) . '#', $base));
    

    Note the lack of preg_escape() on paths, and the missing subexpression or need for '|^' in the join() so each potential match is anchored correctly. Also note the regexp could get rather large. Low on the WTF scale really; haven't you guys seen plain old marginally broken poor code before?

  • Freaky (unregistered) in reply to Freaky

    (And yes, the other caveat is "/" normally appears in paths, making the lack of escaping metacharacters more of a problem, hence my uncompleted replacing of // with ##).

  • Jx (unregistered)

    Okay, maybe I am a bit rusty on Regular Expressions (or too lazy to decipher them for the purpose of reading a WTF posting) and inexperienced on Ruby, but I don't see the WTF.

    I understand that the guy's code wasn't the best way to do this, but rolling your own instead of using a built in method is a pretty run-of-the mill programmer mistake. It doesn't seem to rise to the level of a WTF like using using HD files instead of variables because you don't trust them.

  • (cs) in reply to cout
    cout:
    I think that's better, but the code still isn't clear. IMO it should be split into multiple self-documenting lines:
      controller_paths = ActionController::Routing.controller_paths
      controller_full_paths = controller_paths.map { |p| File.expand_path(p) }
      is_allowed = controller_full_paths.any? { |p| p.starts_with?(base) }

    Now if an exception is raised at any point, it is clear which operation caused the exception.

    Also note that the code will still match '/tmp/foobar' to a base of '/tmp/foo', which might not be desirable. Without knowing what the original code was supposed to do it's hard to know whether any of our changes is correct.

    When I write like this, people accuse me of being a Pascal hack, which I am <G>

    I do not understand people who are proud of obtuse and cryptic code (Perl, Lisp, etc.)

    There are two ways of constructing software: one way is to make it so simple that that there are obviously no deficiencies; the other way is to make it so complicated that there are no obvious deficiencies. -- Charles Hoare
  • Jon (unregistered) in reply to Sad Developer
    Sad Developer:
    who would ever want to use a language that you can't even read out loud (let a lone understand a single line of it) after 15 years of programming experience in C, C++, VB, Java and PHP (which is quite a wide choice, isn't it?)
    If it helps, I'll go over the syntax you need to know to understand (and even pronounce!) the code. I'll explain it from the innermost part to the outermost part. Here's the code again:

    base.match(/^#{ActionController::Routing.controller_paths.map { |p| File.expand_path(p) } * '|'}/)

    The map function has the same purpose as PHP's array_map function. In Ruby syntax, { |foo| ... } is an anonymous function (or "block", in Ruby terms), where |foo| is the parameter list. So, it takes ActionController::Routing.controller_paths, calls File.expand_path on each element, and builds a new array of the results.

    Next, the multiplication operator is applied to that array and '|', performing the same job as PHP's join() or implode() function.

    The #{} syntax can appear in strings or regex literals. Ruby evaluates the code inside the #{} and inserts it into the string or regex.

    That leaves base.match(/^some regex/), which needs no explanation.

  • (cs) in reply to James
    James:
    Being the guy responsible for that monstrosity, this is kinda funny.

    ...

    Certainly you're right about the RegExp not providing the intended behaviour. Thankfully, its misbehaviour only restricts Rails' loading permissions further, and more importantly none of that patchwork is even necessary with the next release, due "any day soon", as I understand it.

    So, you got me. I assumed the path should be fixed, not a prefix. But the RegExp will still let /usr/good/path/../../bad/path through, right? (Or is this path canonized before it gets to the check function?)

    Essentially, lazily compiling the expression (and adding parentheses) will just produce the equivalent of a prefix-tree lookup, which should run O(log n).

  • Rubyist (unregistered) in reply to Dark
    Dark:
    One problem I haven't seen anyone mention yet is that by stuffing the pathnames into a regexp, you interpret them as regexps. This means, for example, that the dot (.) is treated as a wildcard character and will match any letter in the base path.

    The code could even die horribly if a pathname contained something awful such as an unmatched parenthesis.

    Yes, a period could be a problem. Ruby is, however, smart enough to escape the / character for you.

    They should have used Regexp#escape though.

  • D (unregistered)

    Does Ruby boast about its TIMTOWTDI luxury(?!) like Perl does?!

    IMHO, more often than not, people stop trying to make their code more readable - forget about optimizing - simply because they can always attribute it to tim-toady.

  • Jon (unregistered) in reply to D
    D:
    Does Ruby boast about its TIMTOWTDI luxury(?!) like Perl does?!

    IMHO, more often than not, people stop trying to make their code more readable - forget about optimizing - simply because they can always attribute it to tim-toady.

    I don't know much about the Ruby community, but intuitively, TIMTOWTDI is MORE of a reason to go for readability and optimisation.

  • Fabian (unregistered)

    Hmm, many people are arguing about the use of "any?" or "starts_with?", but that is wrong. As I mentioned in the submission, "match" does a full match, not a search!

    'foo'.match(/foobar/) => nil 'foo'.match(/foo/) => #MatchData:0xb753fc94

    Don't be confused with the caret (only in the first alternative) and the missing dollar at the end. It's just a hasty copy from the original function.

    You can see that also from the intended functionality: base should be one of those controller_paths, not just start with one.

    And as for the "quality" of this WTF:

    1. I didn't intend it as an insult. I write these little WTFs myself and am not ashamed to admit that. They are rewritten the next take my head hurts from banging it somewhere after seeing it again.
    2. Of course it cannot compete with the VirtuaDyne Saga and I'm sorry people don't get it.
    3. Still it's funny and the comments (half of them justifying it, half of them rewriting it wrong) are IMO just as funny.
  • :o (unregistered) in reply to D
    D:
    Does Ruby boast about its TIMTOWTDI luxury(?!) like Perl does?!

    yes, but the community usually also say that there is a best way to do it as well.

    D:
    IMHO, more often than not, people stop trying to make their code more readable - forget about optimizing - simply because they can always attribute it to tim-toady.

    that's not really my experience with ruby. more often than not I find it easy to understand, except for when someone else has been OTT clever with their code.

  • Nova (unregistered) in reply to Cthulhu

    Anybody or any language that uses a 'goto' should be shot on sight. Not to mention using a 'for' loop when you do not necessarily know if you want to iterate through the entire data structure. I'm not a great programmer, but please people, can we not at least try to be civilized?

  • (cs) in reply to Nova
    Nova:
    Anybody or any language that uses a 'goto' should be shot on sight. Not to mention using a 'for' loop when you do not necessarily know if you want to iterate through the entire data structure. I'm not a great programmer, but please people, can we not at least try to be civilized?

    I don't need for that code to have gotos. It's in VB, so I'd shoot it anyway. ;) (And remember, if you're doing assembler, JMPs are GOTOs in microcode!!!)

    I've used C, C++, Foxpro (ah, the old days), BASIC (before M$ crapped on it and produced VB), LISP, APL, PHP, Java, Prolog and x86 assembly code.

    LISP, APL and Prolog have their specific uses, so I wouldn't diss that much on them; APL kind of makes sense (and I love to do complex operations on it, I can do entire complex operations with one line!), so does Prolog, but LISP just makes me dizzy.

    For real-work languages, I prefer something I can actually read, so I'd go for something similar to C (which is why I do C++ or Java mostly these days).

    I got lost with that small snippet for Ruby, so I doubt I'll use that in the near future.

  • aberant (unregistered) in reply to danixdefcon5
    danixdefcon5:
    I got lost with that small snippet for Ruby, so I doubt I'll use that in the near future.

    that's probably some of the most esoteric ruby code you will ever see... as they say, you can write fortran in any language... but most ruby is significantly more human readable...

  • regeya (unregistered)

    Good Lord, that's ugly. This is why it took me so long to catch on to Ruby: While it is possible to write nice, clean Ruby, the practice is discouraged. wink

    Seriously, I'm beginning to feel comfortable with the language, and I had to grab my Pickaxe book and puzzle through it. Talk about going to a lot of trouble to make a one-liner...

  • Ian (unregistered) in reply to cout
    cout:
    I think that's better, but the code still isn't clear. IMO it should be split into multiple self-documenting lines:
      controller_paths = ActionController::Routing.controller_paths
      controller_full_paths = controller_paths.map { |p| File.expand_path(p) }
      is_allowed = controller_full_paths.any? { |p| p.starts_with?(base) }

    Now if an exception is raised at any point, it is clear which operation caused the exception.

    Also note that the code will still match '/tmp/foobar' to a base of '/tmp/foo', which might not be desirable. Without knowing what the original code was supposed to do it's hard to know whether any of our changes is correct.

    Right on! Just because you can do it on one line doesn't mean you should. I thought I was the only one who thought code should be easy to read, even at the expense of being 1337 efficient and kewl.

  • bleh (unregistered) in reply to Sad Developer
    Sad Developer:

    hmmm, actually PHP can do that too

    he said by default, you monkey.

  • :o (unregistered) in reply to Ian
    Ian:
    Right on! Just because you can do it on one line doesn't mean you should. I thought I was the only one who thought code should be easy to read, even at the expense of being 1337 efficient and kewl.

    Why, no, not at all: "It's OK to figure out murder mysteries, but you shouldn't need to figure out code. You should be able to read it." - Steve C McConnell

  • Ben Yogman (unregistered) in reply to Baggy McBagster

    Or in LISP, a complex tree of conses.
    On the plus side, that abuse was easy to spot: c(a|d){3,}r

  • Anonymous (unregistered) in reply to Ben Yogman

    Someone should submit a fix to the Rails team, get some recognition for yourself!

  • darwin (unregistered) in reply to James
    James:
    Being the guy responsible for that monstrosity, this is kinda funny.

    To appreciate its raison-d'etre, you'll need to look back at the major security issue that Rails had around 1.1.4-1.1.6, where arbitrary Ruby files could be loaded from the browser, wiping out the database. The resulting patches to fix it were developed in something of a state of panic, which might explain that particular line since fixing their security hole is its only purpose.

    Certainly you're right about the RegExp not providing the intended behaviour. Thankfully, its misbehaviour only restricts Rails' loading permissions further, and more importantly none of that patchwork is even necessary with the next release, due "any day soon", as I understand it.

    Can the paths contain any characters that would be meaningful in a regular expression? Because they will be interpreted as such, not taken literally:

    x = "." => "." "blah".match(/#{x}/) => #MatchData:0x3ad630

    (That would have answered 'nil' if it hadn't matched.)

    Do the paths come from the environment? Of course, even if they did, there would be no way for a remote web user to inject anything, but still, it's not the right way to go....

Leave a comment on “RubyOnRegexp”

Log In or post as a guest

Replying to comment #112412:

« Return to Article