The Daily WTF: Curious Perversions in Information Technology

KattMan · 2007-01-17 Reply Admin

Sounds like the perfect example of people doing something simply so they can say "I used RE's!" Even a simple loop over a list to compare to the base would be better than this.

It just goes to show, just because you CAN do something doesn't mean you SHOULD do something. In this case, it doesn't even work and many people would not know why.

2007-01-17 Reply Admin

*==-- .. <= joke

 O
/|\
 |  <== me
/ \

Anyway, thanks for code wtf!

2007-01-17 Reply Admin

Truly a WTF. I have no idea what any of it does.

savar · 2007-01-17 Reply Admin

anonymous:
*==-- .. <= joke
O
/|

|  <== me
/ \
Anyway, thanks for code wtf!

Instant winner for best ASCII art on TDWTF. Let the competition begin for round 2.

Alex: this forum NEVER remembers me, no matter how many times I check the "Remember me next time" box. Is that box just a tease? I'm in IE6, default settings, all cookie and other security settings essentially set to "wide open, come on in and look around, please don't steal stuff". (It's my work machine, go figure.)

2007-01-17 Reply Admin

and the monstrosity is appended to a carrot...

Have we actually reached the point in programming where we can attach strings to vegetables? :)

I think the author means "caret."

CAPTCHA: was tastey, now it's bathe.

2007-01-17 Reply Admin

Given that I know nothing about Ruby (well, I can spell it), I can see how someone new to the language (applies to any language/technology) might not know the right way to do something, so they invent something.

Of course the Real WTF is that they don't bother to search to see if someone else has already done it...

Captcha: doom (how appropriate)

2007-01-17 Reply Admin

not so sure:
Given that I know *nothing* about Ruby (well, I can spell it), I can see how someone new (assuming they are new here) to the language (applies to any language/technology) might not know the right way to do something, so they invent something.
Of course the Real WTF is that they don't bother to search to see if someone else has already done it...

Captcha: doom (how appropriate)

rats - forgot to log in - can't edit my own post

captcha: cognac: mmmm... doom and cognac at 9:27 AM :)

2007-01-17 Reply Admin

Ok, I haven't looked at this for too long, but isn't it relevant that only the ^ is used, and not a $?

It seems that a base of '/abc/def' should match a controller_path of '/abc' (If that's the right kind of path to be talking about)

In which case, the regex solution is still baloney, and a nicer solution would be

  ActionController::Routing \
      .controller_paths \
      .map { |p| File.expand_path(p) } \
      .any? { |path| path.starts_with? base }

2007-01-17 Reply Admin

this is a quite common behaviour of coders in any language...

If someone can't handle a datastructure properly (in this case an array) he'll reduce it to a string. especially the not-statically-typed-language-guys (not to be confused with typeless as ruby is strictly typed) coming from the vb and php world do that damn pretty often. you should introduce a string-less day once a year so they'll learn there's more datatypes than strings.

rmr · 2007-01-17 Reply Admin

I know exactly 0 ruby - can someone explain what p is and what |p| means? Is that one of those mystical closures I've been hearing about?

2007-01-17 Reply Admin

I agree that people from purely 'dynamically typed' backgrounds often tend to pathologically reduce everything to a string.

But the idea of 'stringless days' is new and brilliant!

Why, on 'stringless Friday' nobody will be able to convert between European and American date formats by expressing the date as a string and using a regex to twiddle the string -- that in itself would probably be a 50% increase in productivity on some projects I could mention.

2007-01-17 Reply Admin

rmr:
I know exactly 0 ruby - can someone explain what p is and what |p| means? Is that one of those mystical closures I've been hearing about?

'map' is a nice way of turning an array into a different array.

[1,2,3].map { |i| i*3 }

[1,2,3] is a 3 element array.

map passes each element to the block of code specified with { ... }, where |i| is a parameter list for this anonymous function (a.k.a. 'block'). you can also use do...end instead of {...}

Whatever is returned from the block is the value used in the new array.

So,

[1,2,3].map { |i| i*3 } # => [3,6,9]

Coditor · 2007-01-17 Reply Admin

Fabian:
One should appreciate the "^" at the beginning of the RE. It is probably supposed to ensure that base does not match only a part of a path entry. Unfortunately it would do that only for the first entry, if not the match function would already ensure a complete match from beginning to end.

Without knowing any ruby, just reading thru the explanation, it seems that a match is being done like this:

a check for "/some/path" against "^/a/path|/another/path|/some/path" will return true

a check for "/some/path/as/well" against "^/a/path|/another/path|/some/path" will return true

a check for "this/is/some/path" against "^/a/path|/another/path|/some/path/" will also return true if the caret is only used for the first pattern.

"if not the match function would already ensure a complete match from beginning to end" sounds strange behaviour to me. Depending on the implementation of the caret it would match any string that starts with one of the options, or any string that either starts with the first option or contains some of the other options.

As said for a full match, also the $ should be used and preferably the options should be enclosed by brackets to ensure the effect of ^ and $ are clear:

a check for "/some/path" against "^(/a/path|/another/path|/some/path)$" will return true

a check for "/some/path/as/well" against ^(/a/path|/another/path|/some/path)$" will now return false.

a check for "this/is/some/path" against ^(/a/path|/another/path|/some/path)$" will now return false.

It would have been better looking to itterate thru the array and match every occurrance but I can imagine that joining the array and doing only 1 match would be better performancewise.

I don't know what include? does but I can imagine you can include a file that's not in the file system paths.

Apart from the regexp not working properly - what's the big WTF about this one? Am I really that blonde?

Coditor

cout · 2007-01-17 Reply Admin

I think that's better, but the code still isn't clear. IMO it should be split into multiple self-documenting lines:

  controller_paths = ActionController::Routing.controller_paths
  controller_full_paths = controller_paths.map { |p| File.expand_path(p) }
  is_allowed = controller_full_paths.any? { |p| p.starts_with?(base) }

Now if an exception is raised at any point, it is clear which operation caused the exception.

Also note that the code will still match '/tmp/foobar' to a base of '/tmp/foo', which might not be desirable. Without knowing what the original code was supposed to do it's hard to know whether any of our changes is correct.

2007-01-17 Reply Admin

I checked to see if that line is still there - it is: http://svn.rails-engines.org/plugins/engines/lib/engines/routing_extensions.rb

2007-01-17 Reply Admin

Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.

- attributed to JWZ

2007-01-17 Reply Admin

Dan:
I checked to see if that line is still there - it is: http://svn.rails-engines.org/plugins/engines/lib/engines/routing_extensions.rb

Ha! So much for anonymization!

2007-01-17 Reply Admin

Ruby and/or Rails make me think back to the good ol' days of APL. If I can write a whole program on one line using wingding characters it has to be good.

2007-01-17 Reply Admin

Wow... if it needs that much explanation, its probably not a WTF. I'm as lost as the guy who wrote it.

2007-01-17 Reply Admin

Seems we already have the most stupid 'WTF' submission of the year.

2007-01-17 Reply Admin

El Guapo:
Wow... if it needs that much explanation, its probably not a WTF. I'm as lost as the guy who wrote it.

Something with Regexp...or something...

I'm sure it's all very stupid and therefore hilarious:) With the laughing and the pointing.

2007-01-17 Reply Admin

Being the guy responsible for that monstrosity, this is kinda funny.

To appreciate its raison-d'etre, you'll need to look back at the major security issue that Rails had around 1.1.4-1.1.6, where arbitrary Ruby files could be loaded from the browser, wiping out the database. The resulting patches to fix it were developed in something of a state of panic, which might explain that particular line since fixing their security hole is its only purpose.

Certainly you're right about the RegExp not providing the intended behaviour. Thankfully, its misbehaviour only restricts Rails' loading permissions further, and more importantly none of that patchwork is even necessary with the next release, due "any day soon", as I understand it.

2007-01-17 Reply Admin

Because he forgot a pair of parens? You've gotta be kidding me.

As far as the earlier "reducing everything to strings" goes... Both sides of the expressions are strings, so that's just silly. Claiming a "fear of iteration" when the one-liner that is the topic actually contains an iterator is... well, just stupid.

Agree with the other posts that a String#starts_with would probably be the cleaner solution though. The solutions posted so far do the opposite of what seems to be the intent of the regex though. So here you go:


ActionController::Routing \
      .controller_paths \
      .any? { |path| base.starts_with? File.expand_path(p) }

And yeah, breaking into multiple lines for "exception" catching strikes me as very UnRuby. It ignores the beauty of method-chaining, but more to the point, you're unlikely to throw an Error expanding a path unless your routes are messed up. Which is going to be pretty obvious anyways when your links blow up with a big fat "Rails Application Error".

2007-01-17 Reply Admin

Both sides of the expressions are strings

nope: one is an array of strings, the other is a string

2007-01-17 Reply Admin

You got me there. Of course your comment is still BS considering the same iteration is occurring either way. The real difference is where (and how) the comparison takes place.

String =~ Regexp vs. String#starts_with(String)

2007-01-17 Reply Admin

james:

your plugin is great - period.
on a wtf-scale between 1 (low wtf) and 10 (max wtf) yours is at most a 0.5
even if it's a wtf at least it's still working
if you wanna see real wtf, there's better around (windows source, loads of php webapps, java senior architects, cobol legacy stuff, ...)

2007-01-17 Reply Admin

Okay, for those who don't understand Ruby, or don't get the WTF, let me explain:

There is an array of paths, and the code wants to know if a specific path exists as an element in the array. So the original code takes all the paths in the array and concatenates them, delimited by "|" characters, into a single regular expression. Then it tries to match the given path against that regular expression.

The correct way to do this is to pass the given path into the array's include?() method, which is a nifty method that simply asks the array whether it contains the item passed in.

2007-01-17 Reply Admin

comparing two strings using starts_with?:

compare character by character until difference occurs

comparing two strings using regexp:

compile regexp into a DFA or even NFA see: http://www.softec.st/en/OpenSource/DevelopersCorner/RegularExpressions/RegularExpressionEngines.html
run your input with it

so regexp - comparison is by definition more expensive in terms of processing time and memory

quamaretto · 2007-01-17 Reply Admin

Dan:
Ruby and/or Rails make me think back to the good ol' days of APL. If I can write a whole program on one line using wingding characters it has to be good.

Well, that isn't so much a past thing...

2007-01-17 Reply Admin

Baggy McBagster:
I agree that people from purely 'dynamically typed' backgrounds often tend to pathologically reduce everything to a string.
But the idea of 'stringless days' is new and brilliant!

Why, on 'stringless Friday' nobody will be able to convert between European and American date formats by expressing the date as a string and using a regex to twiddle the string -- that in itself would probably be a 50% increase in productivity on some projects I could mention.

How nice that you agree with an unfounded supposition. Myself, I code about 75% in python (dynamic, yet strictly typed, same as Ruby), and I never do this sort of nonsense. My other 25% of time is spent in C++, where again, I would never do this sort of thing.

Regexes have there place. I'm not sure this is one of them. It would help if I understood Ruby, or knew clearly the intent of the original author. I will admit to using regexs to match a whole literal string, but its usually when I'm writing generic code and won't always be matching a literal string and want the flexibility.

2007-01-17 Reply Admin

Karl: Enumerable#include? is not the solution since it only tests straight equality.

Pete: Your post said nothing about expense. I don't disagree that Regexps aren't the ideal solution here (despite Ruby having a very fast regex implementation, and the fact that Regexp literals are cached). Frankly I just took offense at your characterization (sp?) of Dynamic Language programmers as being afraid of iteration, when the example put forth actually contains an iterator. Now come on, that's just silly.

2007-01-17 Reply Admin

cout:
Now if an exception is raised at any point, it is clear which operation caused the exception.

Unlike PHP, Ruby provides a full stack trace by default with every exception and not just a line number, so you'll always know exactly which method call failed.

Cthulhu · 2007-01-17 Reply Admin

VB makes a whole lot more sense than that. In VB you can egelently just say:

For i = 1 To pathsize step 1

if path(i) == thepath then goto getout endif

Next i print "the path doesnt match" goto jumpovr getout: print "the path matches" jumpover

2007-01-17 Reply Admin

The correct way to do this is to pass the given path into the array's include?() method, which is a nifty method that simply asks the array whether it contains the item passed in.

Not exactly. What the code really wants to do is see if any of the elements of the array starts_with the given string - hence the

in the code, which is actually the bug that seems to have inspired this 'wtf'.

Sam's correction above seems to hit the spot, though is probably a bit slower. Whether or not this is issue enough to improve the readability is pretty much a personal choice.

2007-01-17 Reply Admin

Gareth:
Unlike PHP, Ruby provides a full stack trace by default with every exception and not just a line number, so you'll always know exactly which method call failed.

hmmm, actually PHP can do that too : http://uk2.php.net/debug_backtrace (oh and better, you can actually recover programmatically from an error if you're clever enough, because you get an array of the stack trace!)

Anyway, next time someone talks to me about how great Ruby is and I'll point it to that WTF... who would ever want to use a language that you can't even read out loud (let a lone understand a single line of it) after 15 years of programming experience in C, C++, VB, Java and PHP (which is quite a wide choice, isn't it?)

2007-01-17 Reply Admin

savar:
Instant winner for best ASCII art on TDWTF. Let the competition begin for round 2.

I ask for a bonus, because seems monospace break the page layout and push the banners outside the visible area.

2007-01-17 Reply Admin

Sad Developer:
Gareth:
Unlike PHP, Ruby provides a full stack trace by default with every exception and not just a line number, so you'll always know exactly which method call failed.

hmmm, actually PHP can do that too : http://uk2.php.net/debug_backtrace

c Yes, my subtle emphasis on the phrase 'by default' was meant to show that Ruby realises that it's quite a good idea, rather than waiting for the 4th major release to implement error handling.

In any case, it wasn't meant to be a big PHP-poke, just to point out to someone that Ruby does have this function (and I was assuming they were familiar with PHP as it's the only mainstream language I know which doesn't provide decent error details by default). Obviously it's good that some people know about PHPs backtrace, but everyone who's seen a Ruby exception knows about the stack trace.

Sad Developer:
(oh and better, you can actually recover programmatically from an error if you're clever enough, because you get an array of the stak trace!)

Rather than throwing different exceptions based on the error and using the type of exception to identify the problem? Is that not possible/good enough for PHP?

(okay, that was a poke at PHP and PHP developers)

cout · 2007-01-17 Reply Admin

Gareth:
cout:
Now if an exception is raised at any point, it is clear which operation caused the exception.

Unlike PHP, Ruby provides a full stack trace by default with every exception and not just a line number, so you'll always know exactly which method call failed.

I didn't know PHP had that behavior.

Regardless, even with the method name in the stack trace, it's often not clear where the exception is coming from. Though not the case here, this is especially true when there are multiple chained calls to the same method in the same line; in that case it's often impossible to tell which operation caused the exception.

Often chained methods return nil or some other unexpected value. In that case, a NoMethodError will result, and there is no information in the stack trace to indicate which method returned nil.

In other cases chained methods still make things difficult. The reader must first figure out the sequence of events, noting that chained method calls (foo.bar.baz()) are evaluated left-to-right but nested method calls (foo(bar(baz()))) are evaluated right-to-left. Only after that analysis is done can the sequence of events leading up to the error be determined.

2007-01-17 Reply Admin

Cthulhu:
VB makes a whole lot more sense than that. In VB you can egelently just say:
For i = 1 To pathsize step 1

if path(i) == thepath then goto getout endif

Next i print "the path doesnt match" goto jumpovr getout: print "the path matches" jumpover

Ah, gotos. How egelent.

2007-01-17 Reply Admin

Are you kidding me? This DailyWTF is an example of a poor choice that someone made, not a language limitation. In fact, the solution that the author offers does read nicely...

ActionController::Routing.controller_paths.map { |p| File.expand_path(p) }.include?(base)

Inside the ActionController::Routing module, there is an array called "controller_paths", map (or collect) the results after calling File.expand_path on each element, and check to see if the value stored in the variable "base" matches any element in the array. It could also be written like this:

ActionController::Routing.controller_paths.any? { |p| base == File.expand_path(p) }

The largest part of that code is drilling down to access the variable you want to use "ActionController::Routing.controller_paths", the code to determine a match is almost self explanitory. I'm suprised to see someone of your supposed expereince dissing a great langauge based on a silly porrly-coded example.

2007-01-17 Reply Admin

.... My comment above is directed at "Sad Developer", I forgot to click "Quote"....

cout · 2007-01-17 Reply Admin

pete:
so regexp - comparison is by definition more expensive in terms of processing time and memory

It's the combination of compilation and comparison that is expensive, not the comparison itself. Often the regexp can be compiled ahead of time, and in that case can even be faster than a sequence of string comparisons.

irb(main):006:0> require 'benchmark'
=> true
irb(main):046:0> Benchmark.bm(20) { |x|
irb(main):047:1*   x.report('string compare') { N.times { [ 'apple', 'ape', 'apogee', 'apprentice'].include?('foo') } }
irb(main):048:1>   x.report('regexp precompiled') { r = /^(apple|ape|apogee|apprentice)$/; N.times { r.match('foo') } }
irb(main):049:1>   x.report('regexp not precomp') { N.times { r = /^(apple|ape|apogee|apprentice)$/; r.match('foo') } }
irb(main):050:1> }
                          user     system      total        real
string compare        8.660000   0.000000   8.660000 (  8.706296)
regexp precompiled    3.260000   0.000000   3.260000 (  3.258284)
regexp not precomp    3.870000   0.010000   3.880000 (  3.905989)
=> true

2007-01-17 Reply Admin

What, you never heard of string beans? They go great with Java!

CAPTCHA: cognac (my favorite: hope it's ***)

2007-01-17 Reply Admin

Sad Developer:
(oh and better, you can actually recover programmatically from an error if you're clever enough, because you get an array of the stack trace!)

http://ruby-doc.org/core/classes/Kernel.html#M005725

Sad Developer:
Anyway, next time someone talks to me about how great Ruby is and I'll point it to that WTF... who would ever want to use a language that you can't even read out loud (let a lone understand a single line of it) after 15 years of programming experience in C, C++, VB, Java and PHP (which is quite a wide choice, isn't it?)

No, that's about 1.5 actually different programming languages in total, with VB being a weird notation for an old version of C. Not really a wide choice, throw in functional Lisp / Scheme, maybe Prolog, advanced SQL, and a ML, and you'll get close to a wide choice. Appeal to own authority failed. (The fact you can't read a non-C language is arguably your own problem here.)

Oh, also, while we're at the topic of missing the point, that's a WTF with a library in Ruby, not the language. You can do the same class of screwup in any language.

Gods how I hate amateurish programming language trolling.

2007-01-17 Reply Admin

quamaretto:
Well, that isn't so much a past thing...

The goggles! They do nothing!

Captcha: slashbot. I take offense at the accusation.

2007-01-17 Reply Admin

I understand nothing of this, but the attaching monstrosities to carrots part made me laugh.. haha. Don't people learn from previous WTFs? As I recall the carrot/caret thing has come up in at least 2 previous WTFs.

captcha: dubya.. what?

2007-01-17 Reply Admin

Cthulhu:
VB makes a whole lot more sense than that. In VB you can egelently just say:
For i = 1 To pathsize step 1

if path(i) == thepath then goto getout endif

Next i print "the path doesnt match" goto jumpovr getout: print "the path matches" jumpover

Nice use of the double goto. Makes me feel young again!

2007-01-17 Reply Admin

Actually as far as performance goes the Ruby Regexp Engine is pretty damn fast, though Array#include? is definitely faster in this case. There's actually a bigger wtf here though that you didn't mention...

From the actual source:

paths = $LOAD_PATH.select do |base|
    base = File.expand_path(base)
    # Check that the path matches one of the allowed paths in controller_paths
    base.match(/^#{ActionController::Routing.controller_paths.map { |p| 
      File.expand_path(p) } * '|'}/)
end

Array#select returns a new array with all elements for which the block returned true. Yes, that's right, its going to map, expand and compile this regexp all over again for every element in the $LOAD_PATH!

A better way to do this would be:

expanded_load_paths = ActionController::Routing.controller_paths.map { |p| 
  File.expand_path(p) }
paths = $LOAD_PATH.select do |base|
    # Check that the path matches one of the allowed paths in controller_paths
    expanded_load_paths.include? File::expand_path(base)
end

Even that is doing a little much though since we get to search, and in turn iterate the array for every element again.

Since duplicates are impossible how about using a Set operation?

#expand load and routing paths
routing_paths = ActionController::Routing.controller_paths.map { |path| 
  File.expand_path(path) }
load_paths = $LOAD_PATH.map {|path| File::expand_path(path) }
# find all routing paths in load paths using a set intersection
paths = routing_paths & load_paths

:)

2007-01-17 Reply Admin

Re: cout:

Well that's certainly interesting. It's a tad unfair as a comparison since the topic code wouldn't be instantiating an Array each time though. That's a lot of extra object creation going on. Here's my result:

irb(main):001:0> require 'benchmark'
=> true
irb(main):002:0> a = %w{ apple ape apogee apprentice }
=> ["apple", "ape", "apogee", "apprentice"]
irb(main):003:0> N = 100000
=> 100000
irb(main):004:0> Benchmark::bm(25) do |x|
irb(main):005:1*   x.report("string compare") { N.times { a.include? 'foo' } }
irb(main):006:1>   x.report("regexp literal (cached)") { N.times { 'foo' =~ /^(apple|ape|apogee|apprentice)$/ } }
irb(main):007:1>   x.report("regexp new (not cached)") { N.times { 'foo' =~ Regexp.new('^(apple|ape|apogee|apprentice)$') } }
irb(main):008:1> end
                               user     system      total        real
string compare             0.172000   0.000000   0.172000 (  0.172000)
regexp literal (cached)    0.172000   0.000000   0.172000 (  0.172000)
regexp new (not cached)    1.703000   0.046000   1.749000 (  1.766000)
=> true

So it's not quite the difference, but it does reveal that you were on-track. It also demonstrates the difference in performance between Regexp literals, and new Regexp instances. Really Ruby has hands down the fastest regex engine I've used, which is kinda surprising to be honest. (Not saying it's the fastest, just the fastest I've used among VBScrpt, JavaScript, Ruby, Python, c#.)

2007-01-17 Reply Admin

JTK:
Cthulhu:
VB makes a whole lot more sense than that. In VB you can egelently just say:
For i = 1 To pathsize step 1

if path(i) == thepath then goto getout endif

Next i print "the path doesnt match" goto jumpovr getout: print "the path matches" jumpover

Nice use of the double goto. Makes me feel young again!

In defense of the poor guy, I remember that TRS-80 BASIC didn't have "Exit For". But it did at least have "If i > pathsize Then"!

Refactoring in the TRS-80 days was great, BTW. Lots of cut and paste. As in scissors. And tape. I took a printout of my spaghetti code full of gotos, cut it into pieces, rearranged them, and typed it back in. Then CSAVE.

Captcha: ewww. Also my reflexive reaction to those GOTOs.

RubyOnRegexp

Leave a comment on “RubyOnRegexp”