• KattMan (cs)

    Sounds like the perfect example of people doing something simply so they can say "I used RE's!" Even a simple loop over a list to compare to the base would be better than this.

    It just goes to show, just because you CAN do something doesn't mean you SHOULD do something. In this case, it doesn't even work and many people would not know why.

  • anonymous (unregistered) in reply to KattMan
    *==-- .. <= joke
    
     O
    /|\
     |  <== me
    / \
    

    Anyway, thanks for code wtf!

  • Ohnonymous (unregistered)

    Truly a WTF. I have no idea what any of it does.

  • savar (cs) in reply to anonymous
    anonymous:
    *==-- .. <= joke
    

    O /|
    | <== me / \

    Anyway, thanks for code wtf!

    Instant winner for best ASCII art on TDWTF. Let the competition begin for round 2.

    Alex: this forum NEVER remembers me, no matter how many times I check the "Remember me next time" box. Is that box just a tease? I'm in IE6, default settings, all cookie and other security settings essentially set to "wide open, come on in and look around, please don't steal stuff". (It's my work machine, go figure.)

  • InfiniteBlue (unregistered)

    and the monstrosity is appended to a carrot...

    Have we actually reached the point in programming where we can attach strings to vegetables? :)

    I think the author means "caret."

    CAPTCHA: was tastey, now it's bathe.

  • not so sure (unregistered) in reply to InfiniteBlue

    Given that I know nothing about Ruby (well, I can spell it), I can see how someone new to the language (applies to any language/technology) might not know the right way to do something, so they invent something.

    Of course the Real WTF is that they don't bother to search to see if someone else has already done it...

    Captcha: doom (how appropriate)

  • not so sure (unregistered) in reply to not so sure
    not so sure:
    Given that I know *nothing* about Ruby (well, I can spell it), I can see how someone new (assuming they are new here) to the language (applies to any language/technology) might not know the right way to do something, so they invent something.

    Of course the Real WTF is that they don't bother to search to see if someone else has already done it...

    Captcha: doom (how appropriate)

    rats - forgot to log in - can't edit my own post

    captcha: cognac: mmmm... doom and cognac at 9:27 AM :)

  • Gareth (unregistered)

    Ok, I haven't looked at this for too long, but isn't it relevant that only the ^ is used, and not a $?

    It seems that a base of '/abc/def' should match a controller_path of '/abc' (If that's the right kind of path to be talking about)

    In which case, the regex solution is still baloney, and a nicer solution would be

      ActionController::Routing \
          .controller_paths \
          .map { |p| File.expand_path(p) } \
          .any? { |path| path.starts_with? base }
    
  • pete (unregistered)

    this is a quite common behaviour of coders in any language...

    If someone can't handle a datastructure properly (in this case an array) he'll reduce it to a string. especially the not-statically-typed-language-guys (not to be confused with typeless as ruby is strictly typed) coming from the vb and php world do that damn pretty often. you should introduce a string-less day once a year so they'll learn there's more datatypes than strings.

  • rmr (cs)

    I know exactly 0 ruby - can someone explain what p is and what |p| means? Is that one of those mystical closures I've been hearing about?

  • Baggy McBagster (unregistered)

    I agree that people from purely 'dynamically typed' backgrounds often tend to pathologically reduce everything to a string.

    But the idea of 'stringless days' is new and brilliant!

    Why, on 'stringless Friday' nobody will be able to convert between European and American date formats by expressing the date as a string and using a regex to twiddle the string -- that in itself would probably be a 50% increase in productivity on some projects I could mention.

  • Gareth (unregistered) in reply to rmr
    rmr:
    I know exactly 0 ruby - can someone explain what p is and what |p| means? Is that one of those mystical closures I've been hearing about?

    'map' is a nice way of turning an array into a different array.

    [1,2,3].map { |i| i*3 }

    [1,2,3] is a 3 element array.

    map passes each element to the block of code specified with { ... }, where |i| is a parameter list for this anonymous function (a.k.a. 'block'). you can also use do...end instead of {...}

    Whatever is returned from the block is the value used in the new array.

    So,

    [1,2,3].map { |i| i*3 } # => [3,6,9]

  • Coditor (cs)
    Fabian:
    One should appreciate the "^" at the beginning of the RE. It is probably supposed to ensure that base does not match only a part of a path entry. Unfortunately it would do that only for the first entry, if not the match function would already ensure a complete match from beginning to end.

    Without knowing any ruby, just reading thru the explanation, it seems that a match is being done like this:

    a check for "/some/path" against "^/a/path|/another/path|/some/path" will return true

    a check for "/some/path/as/well" against "^/a/path|/another/path|/some/path" will return true

    a check for "this/is/some/path" against "^/a/path|/another/path|/some/path/" will also return true if the caret is only used for the first pattern.

    "if not the match function would already ensure a complete match from beginning to end" sounds strange behaviour to me. Depending on the implementation of the caret it would match any string that starts with one of the options, or any string that either starts with the first option or contains some of the other options.

    As said for a full match, also the $ should be used and preferably the options should be enclosed by brackets to ensure the effect of ^ and $ are clear:

    a check for "/some/path" against "^(/a/path|/another/path|/some/path)$" will return true

    a check for "/some/path/as/well" against ^(/a/path|/another/path|/some/path)$" will now return false.

    a check for "this/is/some/path" against ^(/a/path|/another/path|/some/path)$" will now return false.

    It would have been better looking to itterate thru the array and match every occurrance but I can imagine that joining the array and doing only 1 match would be better performancewise.

    I don't know what include? does but I can imagine you can include a file that's not in the file system paths.

    Apart from the regexp not working properly - what's the big WTF about this one? Am I really that blonde?

    Coditor

  • cout (cs) in reply to Gareth

    I think that's better, but the code still isn't clear. IMO it should be split into multiple self-documenting lines:

      controller_paths = ActionController::Routing.controller_paths
      controller_full_paths = controller_paths.map { |p| File.expand_path(p) }
      is_allowed = controller_full_paths.any? { |p| p.starts_with?(base) }

    Now if an exception is raised at any point, it is clear which operation caused the exception.

    Also note that the code will still match '/tmp/foobar' to a base of '/tmp/foo', which might not be desirable. Without knowing what the original code was supposed to do it's hard to know whether any of our changes is correct.

  • Dan (unregistered)
    Comment held for moderation.
  • Jim (unregistered) in reply to KattMan
    Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.
    - attributed to JWZ
  • Gareth (unregistered) in reply to Dan
    Comment held for moderation.
  • Dan (unregistered)
    Comment held for moderation.
  • El Guapo (unregistered)

    Wow... if it needs that much explanation, its probably not a WTF. I'm as lost as the guy who wrote it.

  • Hoxha (unregistered)

    Seems we already have the most stupid 'WTF' submission of the year.

  • D2orus (unregistered) in reply to El Guapo
    El Guapo:
    Wow... if it needs that much explanation, its probably not a WTF. I'm as lost as the guy who wrote it.
    Something with Regexp...or something...

    I'm sure it's all very stupid and therefore hilarious:) With the laughing and the pointing.

  • James (unregistered)

    Being the guy responsible for that monstrosity, this is kinda funny.

    To appreciate its raison-d'etre, you'll need to look back at the major security issue that Rails had around 1.1.4-1.1.6, where arbitrary Ruby files could be loaded from the browser, wiping out the database. The resulting patches to fix it were developed in something of a state of panic, which might explain that particular line since fixing their security hole is its only purpose.

    Certainly you're right about the RegExp not providing the intended behaviour. Thankfully, its misbehaviour only restricts Rails' loading permissions further, and more importantly none of that patchwork is even necessary with the next release, due "any day soon", as I understand it.

  • Sam (unregistered) in reply to Hoxha

    Because he forgot a pair of parens? You've gotta be kidding me.

    As far as the earlier "reducing everything to strings" goes... Both sides of the expressions are strings, so that's just silly. Claiming a "fear of iteration" when the one-liner that is the topic actually contains an iterator is... well, just stupid.

    Agree with the other posts that a String#starts_with would probably be the cleaner solution though. The solutions posted so far do the opposite of what seems to be the intent of the regex though. So here you go:

    ActionController::Routing \ .controller_paths \ .any? { |path| base.starts_with? File.expand_path(p) }

    And yeah, breaking into multiple lines for "exception" catching strikes me as very UnRuby. It ignores the beauty of method-chaining, but more to the point, you're unlikely to throw an Error expanding a path unless your routes are messed up. Which is going to be pretty obvious anyways when your links blow up with a big fat "Rails Application Error".

  • pete (unregistered)

    Both sides of the expressions are strings

    nope: one is an array of strings, the other is a string

  • Sam (unregistered) in reply to pete

    You got me there. Of course your comment is still BS considering the same iteration is occurring either way. The real difference is where (and how) the comparison takes place.

    String =~ Regexp vs. String#starts_with(String)

  • pete (unregistered)

    james:

    • your plugin is great - period.
    • on a wtf-scale between 1 (low wtf) and 10 (max wtf) yours is at most a 0.5
    • even if it's a wtf at least it's still working
    • if you wanna see real wtf, there's better around (windows source, loads of php webapps, java senior architects, cobol legacy stuff, ...)
  • Karl von L. (unregistered)

    Okay, for those who don't understand Ruby, or don't get the WTF, let me explain:

    There is an array of paths, and the code wants to know if a specific path exists as an element in the array. So the original code takes all the paths in the array and concatenates them, delimited by "|" characters, into a single regular expression. Then it tries to match the given path against that regular expression.

    The correct way to do this is to pass the given path into the array's include?() method, which is a nifty method that simply asks the array whether it contains the item passed in.

  • pete (unregistered)
    Comment held for moderation.
  • quamaretto (cs) in reply to Dan
    Dan:
    Ruby and/or Rails make me think back to the good ol' days of APL. If I can write a whole program on one line using wingding characters it has to be good.

    Well, that isn't so much a past thing...

  • diaphanein (unregistered) in reply to Baggy McBagster
    Baggy McBagster:
    I agree that people from purely 'dynamically typed' backgrounds often tend to pathologically reduce everything to a string.

    But the idea of 'stringless days' is new and brilliant!

    Why, on 'stringless Friday' nobody will be able to convert between European and American date formats by expressing the date as a string and using a regex to twiddle the string -- that in itself would probably be a 50% increase in productivity on some projects I could mention.

    How nice that you agree with an unfounded supposition. Myself, I code about 75% in python (dynamic, yet strictly typed, same as Ruby), and I never do this sort of nonsense. My other 25% of time is spent in C++, where again, I would never do this sort of thing.

    Regexes have there place. I'm not sure this is one of them. It would help if I understood Ruby, or knew clearly the intent of the original author. I will admit to using regexs to match a whole literal string, but its usually when I'm writing generic code and won't always be matching a literal string and want the flexibility.

  • Sam (unregistered) in reply to pete

    Karl: Enumerable#include? is not the solution since it only tests straight equality.

    Pete: Your post said nothing about expense. I don't disagree that Regexps aren't the ideal solution here (despite Ruby having a very fast regex implementation, and the fact that Regexp literals are cached). Frankly I just took offense at your characterization (sp?) of Dynamic Language programmers as being afraid of iteration, when the example put forth actually contains an iterator. Now come on, that's just silly.

  • Gareth (unregistered) in reply to cout
    cout:
    Now if an exception is raised at any point, it is clear which operation caused the exception.

    Unlike PHP, Ruby provides a full stack trace by default with every exception and not just a line number, so you'll always know exactly which method call failed.

  • Cthulhu (cs)

    VB makes a whole lot more sense than that. In VB you can egelently just say:

    For i = 1 To pathsize step 1

    if path(i) == thepath then goto getout endif

    Next i print "the path doesnt match" goto jumpovr getout: print "the path matches" jumpover

  • James (unregistered) in reply to Karl von L.
    The correct way to do this is to pass the given path into the array's include?() method, which is a nifty method that simply asks the array whether it contains the item passed in.

    Not exactly. What the code really wants to do is see if any of the elements of the array starts_with the given string - hence the

    ^
    in the code, which is actually the bug that seems to have inspired this 'wtf'.

    Sam's correction above seems to hit the spot, though is probably a bit slower. Whether or not this is issue enough to improve the readability is pretty much a personal choice.

  • Sad Developer (unregistered) in reply to Gareth
    Comment held for moderation.
  • Shareware (unregistered) in reply to savar
    savar:
    Instant winner for best ASCII art on TDWTF. Let the competition begin for round 2.

    I ask for a bonus, because seems monospace break the page layout and push the banners outside the visible area.

  • Gareth (unregistered) in reply to Sad Developer
    Comment held for moderation.
  • cout (cs) in reply to Gareth
    Gareth:
    cout:
    Now if an exception is raised at any point, it is clear which operation caused the exception.

    Unlike PHP, Ruby provides a full stack trace by default with every exception and not just a line number, so you'll always know exactly which method call failed.

    I didn't know PHP had that behavior.

    Regardless, even with the method name in the stack trace, it's often not clear where the exception is coming from. Though not the case here, this is especially true when there are multiple chained calls to the same method in the same line; in that case it's often impossible to tell which operation caused the exception.

    Often chained methods return nil or some other unexpected value. In that case, a NoMethodError will result, and there is no information in the stack trace to indicate which method returned nil.

    In other cases chained methods still make things difficult. The reader must first figure out the sequence of events, noting that chained method calls (foo.bar.baz()) are evaluated left-to-right but nested method calls (foo(bar(baz()))) are evaluated right-to-left. Only after that analysis is done can the sequence of events leading up to the error be determined.

  • markerstore (unregistered) in reply to Cthulhu
    Cthulhu:
    VB makes a whole lot more sense than that. In VB you can egelently just say:

    For i = 1 To pathsize step 1

    if path(i) == thepath then goto getout endif

    Next i print "the path doesnt match" goto jumpovr getout: print "the path matches" jumpover

    Ah, gotos. How egelent.

  • Scott (unregistered) in reply to Sad Developer

    Are you kidding me? This DailyWTF is an example of a poor choice that someone made, not a language limitation. In fact, the solution that the author offers does read nicely...

    ActionController::Routing.controller_paths.map { |p| File.expand_path(p) }.include?(base)

    Inside the ActionController::Routing module, there is an array called "controller_paths", map (or collect) the results after calling File.expand_path on each element, and check to see if the value stored in the variable "base" matches any element in the array. It could also be written like this:

    ActionController::Routing.controller_paths.any? { |p| base == File.expand_path(p) }

    The largest part of that code is drilling down to access the variable you want to use "ActionController::Routing.controller_paths", the code to determine a match is almost self explanitory. I'm suprised to see someone of your supposed expereince dissing a great langauge based on a silly porrly-coded example.

  • Scott (unregistered) in reply to Scott

    .... My comment above is directed at "Sad Developer", I forgot to click "Quote"....

  • cout (cs) in reply to pete
    pete:
    so regexp - comparison is by definition more expensive in terms of processing time and memory

    It's the combination of compilation and comparison that is expensive, not the comparison itself. Often the regexp can be compiled ahead of time, and in that case can even be faster than a sequence of string comparisons.

    irb(main):006:0> require 'benchmark'
    => true
    irb(main):046:0> Benchmark.bm(20) { |x|
    irb(main):047:1*   x.report('string compare') { N.times { [ 'apple', 'ape', 'apogee', 'apprentice'].include?('foo') } }
    irb(main):048:1>   x.report('regexp precompiled') { r = /^(apple|ape|apogee|apprentice)$/; N.times { r.match('foo') } }
    irb(main):049:1>   x.report('regexp not precomp') { N.times { r = /^(apple|ape|apogee|apprentice)$/; r.match('foo') } }
    irb(main):050:1> }
                              user     system      total        real
    string compare        8.660000   0.000000   8.660000 (  8.706296)
    regexp precompiled    3.260000   0.000000   3.260000 (  3.258284)
    regexp not precomp    3.870000   0.010000   3.880000 (  3.905989)
    => true
  • Mitch Marcus (unregistered) in reply to InfiniteBlue

    What, you never heard of string beans? They go great with Java!

    CAPTCHA: cognac (my favorite: hope it's ***)

  • David Vallner (unregistered) in reply to Sad Developer
    Comment held for moderation.
  • David Vallner (unregistered) in reply to quamaretto
    Comment held for moderation.
  • ted (unregistered)

    I understand nothing of this, but the attaching monstrosities to carrots part made me laugh.. haha. Don't people learn from previous WTFs? As I recall the carrot/caret thing has come up in at least 2 previous WTFs.

    captcha: dubya.. what?

  • JTK (unregistered) in reply to Cthulhu
    Cthulhu:
    VB makes a whole lot more sense than that. In VB you can egelently just say:

    For i = 1 To pathsize step 1

    if path(i) == thepath then goto getout endif

    Next i print "the path doesnt match" goto jumpovr getout: print "the path matches" jumpover

    Nice use of the double goto. Makes me feel young again!

  • Rubyist (unregistered)

    Actually as far as performance goes the Ruby Regexp Engine is pretty damn fast, though Array#include? is definitely faster in this case. There's actually a bigger wtf here though that you didn't mention...

    From the actual source:

    paths = $LOAD_PATH.select do |base|
        base = File.expand_path(base)
        # Check that the path matches one of the allowed paths in controller_paths
        base.match(/^#{ActionController::Routing.controller_paths.map { |p| 
          File.expand_path(p) } * '|'}/)
    end

    Array#select returns a new array with all elements for which the block returned true. Yes, that's right, its going to map, expand and compile this regexp all over again for every element in the $LOAD_PATH!

    A better way to do this would be:

    expanded_load_paths = ActionController::Routing.controller_paths.map { |p| 
      File.expand_path(p) }
    paths = $LOAD_PATH.select do |base|
        # Check that the path matches one of the allowed paths in controller_paths
        expanded_load_paths.include? File::expand_path(base)
    end

    Even that is doing a little much though since we get to search, and in turn iterate the array for every element again.

    Since duplicates are impossible how about using a Set operation?

    #expand load and routing paths
    routing_paths = ActionController::Routing.controller_paths.map { |path| 
      File.expand_path(path) }
    load_paths = $LOAD_PATH.map {|path| File::expand_path(path) }
    # find all routing paths in load paths using a set intersection
    paths = routing_paths & load_paths

    :)

  • Sam (unregistered) in reply to cout

    Re: cout:

    Well that's certainly interesting. It's a tad unfair as a comparison since the topic code wouldn't be instantiating an Array each time though. That's a lot of extra object creation going on. Here's my result:

    irb(main):001:0> require 'benchmark'
    => true
    irb(main):002:0> a = %w{ apple ape apogee apprentice }
    => ["apple", "ape", "apogee", "apprentice"]
    irb(main):003:0> N = 100000
    => 100000
    irb(main):004:0> Benchmark::bm(25) do |x|
    irb(main):005:1*   x.report("string compare") { N.times { a.include? 'foo' } }
    irb(main):006:1>   x.report("regexp literal (cached)") { N.times { 'foo' =~ /^(apple|ape|apogee|apprentice)$/ } }
    irb(main):007:1>   x.report("regexp new (not cached)") { N.times { 'foo' =~ Regexp.new('^(apple|ape|apogee|apprentice)$') } }
    irb(main):008:1> end
                                   user     system      total        real
    string compare             0.172000   0.000000   0.172000 (  0.172000)
    regexp literal (cached)    0.172000   0.000000   0.172000 (  0.172000)
    regexp new (not cached)    1.703000   0.046000   1.749000 (  1.766000)
    => true

    So it's not quite the difference, but it does reveal that you were on-track. It also demonstrates the difference in performance between Regexp literals, and new Regexp instances. Really Ruby has hands down the fastest regex engine I've used, which is kinda surprising to be honest. (Not saying it's the fastest, just the fastest I've used among VBScrpt, JavaScript, Ruby, Python, c#.)

  • RobertB (unregistered) in reply to JTK
    JTK:
    Cthulhu:
    VB makes a whole lot more sense than that. In VB you can egelently just say:

    For i = 1 To pathsize step 1

    if path(i) == thepath then goto getout endif

    Next i print "the path doesnt match" goto jumpovr getout: print "the path matches" jumpover

    Nice use of the double goto. Makes me feel young again!

    In defense of the poor guy, I remember that TRS-80 BASIC didn't have "Exit For". But it did at least have "If i > pathsize Then"!

    Refactoring in the TRS-80 days was great, BTW. Lots of cut and paste. As in scissors. And tape. I took a printout of my spaghetti code full of gotos, cut it into pieces, rearranged them, and typed it back in. Then CSAVE.

    Captcha: ewww. Also my reflexive reaction to those GOTOs.

Leave a comment on “RubyOnRegexp”

Log In or post as a guest

Replying to comment #:

« Return to Article