• (cs)

    "Let me tell you a story of a man named Jed. Poor mountaineer, but he kept his family fed..."

  • yoda (unregistered) in reply to ParkinT
    ParkinT:
    "Let me tell you a story of a man named Jed. Poor mountaineer, but he kept his family fed..."
    I believe the proper wording should have been: "FRIST MOTHAF**k*s!!!!"
  • (cs)

    My reverse polish notation calculator has a customary complaint about vbnet being trwtf...

    And go with the code that works better.

  • Ben Jammin (unregistered) in reply to ParkinT
    ParkinT:
    "Let me tell you a story of a man named Jed. Poor mountaineer, but he kept his family fed..."
    Then one day, to Bob he showed, a better version of Bob's current code. (Revision 2.0 that is)
  • AB (unregistered)

    This really p!sses me off:

    If matches.Count > 0 Then Return True Else Return False End If

    Its so pointless. Why do sheeple do it? Can they not see the total redundancy of 5 lines of code? Do they really think its "more readable"?

    Return matches.Count > 0

  • (cs)

    Wait, is that regex solution supposed to elegant?

  • Will Murnane (unregistered)

    ... except Jed's solution doesn't work. Suppose I want to search for "$10". Not to mention all the string manipulations, use a StringBuilder.

  • Pro Coder (unregistered)

    If you do didn't immediately think that this should be done with a hash, a sort, and a binary search, kill yourself now and save us all future aggravation.

  • Jan de Vos (unregistered) in reply to AB
    If matches.Count > 0 Then Return True Else Return False End If

    Its so pointless. Why do sheeple do it? Can they not see the total redundancy of 5 lines of code? Do they really think its "more readable"?

    I believe this is one of those cases where, once you've made a certain mental step, you simply can't go back to thinking the way you did before.

    In this case, the mental step is that 'a < b' is really something that results in a boolean value, and is not 'some special syntax that can be used in if and while constructions'.

  • Iscu (unregistered)

    Management can always 1up you.

  • (cs) in reply to ParkinT
    ParkinT:
    "Let me tell you a story of a man named Jed. Poor mountaineer, but he kept his family fed..."
    It's been years since I've seen that show, but I always heard that line as "...barely kept his family fed."
  • Martin Milan (unregistered)

    There - fixed that for you...

    Public Function DocContainsAtLeastOneTerm( _ ByVal searchTerms() As String, _ ByVal searchText As String) As Boolean

    	Dim searchTextWords As String() = searchText.ToLower().Split({" "}, StringSplitOptions.RemoveEmptyEntries)
    	Return searchTextWords.Any(Function (x)
    	                           	Return searchTerms.Any(Function (y)
    	                           	                       	
    	                           	                       	Return y.ToLower() = x
    	                           	                       	End function
    	                           	End Function
    End Function
    
  • Martin Milan (unregistered)

    That looked a lot more elegant in SharpDevelop lol...

  • Fixed you say? (unregistered)

    Can anyone say N*M complexity?

  • Hoffmann (unregistered)

    minuses for Jed:

    • is looking case sensitive
    • regEx is expensive
    • not stable for regEx-search-tearms
    • pointless "if (boolen) then true else false"

    In the end Jed should just have shut up!

  • Philipp (unregistered)

    The second solution doesn't look much better than the first one to me.

    Creating a regular expression by concatenating strings without escaping them is dangerous. When the strings contain any characters which have a special meaning in regular expressions, they will break everything.

    Also, I don't know much about VB, but why are they both passing the arguments by value ("ByVal")? They seem to be pretty large (an array of strings and a very long string) and aren't changed in the code. Wouldn't it be better to pass them by reference?

  • (cs) in reply to Martin Milan
    Martin Milan:
    There - fixed that for you...

    Public Function DocContainsAtLeastOneTerm( _ ByVal searchTerms() As String, _ ByVal searchText As String) As Boolean

    	Dim searchTextWords As String() = searchText.ToLower().Split({" "}, StringSplitOptions.RemoveEmptyEntries)
    	Return searchTextWords.Any(Function (x)
    	                           	Return searchTerms.Any(Function (y)
    	                           	                       	
    	                           	                       	Return y.ToLower() = x
    	                           	                       	End function
    	                           	End Function
    End Function</div></BLOCKQUOTE>
    

    You are working too hard. This function simply needs to return a boolean. If you return matching search results you will confuse the other developers. You have to add a check for no results and return false, else true. Then go shoot bob in the face.

  • (cs)
    I don't believe that your experience working on the Web places you in a position to critique my work

    Anyone who responds like that needs to be fired swiftly and brutally. This industry has no place for people who think their code is infallible and refuse to take critiques.

  • (cs) in reply to Philipp
    Phillipp:
    I don't know much about VB, but why are they both passing the arguments by value ("ByVal")? They seem to be pretty large (an array of strings and a very long string) and aren't changed in the code. Wouldn't it be better to pass them by reference?
    As far as performance, it doesn't make much of a difference because it's just copying a pointer (basically). However, from an interface perspective, ByRef implies the parameter could be pointing to a new object after calling the method.
  • anonym (unregistered)

    need to push this update

    If searchTerms.Length >= 1 Then For i As Integer = 1 To searchTerms.Length - 1 Step 1 regexString += "|" + searchTerms(i) Next End If

    or

    If searchTerms.Length > 0 Then For i As Integer = 1 To searchTerms.Length - 1 Step 1 regexString += "|" + searchTerms(i) Next End If

  • (cs) in reply to Pro Coder
    Pro Coder:
    If you do didn't immediately think that this should be done with a hash, a sort, and a binary search, kill yourself now and save us all future aggravation.
    Killing oneself seems to be a little overkill. Anyways... In the case of only one term, doing all of the extra work of hashing and or sorting would negate any time savings of a binary search. So your solution might not be the best, depending on the business need.
    Hoffmann:
    minuses for Jed: - is looking case sensitive - regEx is expensive - not stable for regEx-search-tearms - pointless "if (boolen) then true else false"

    In the end Jed should just have shut up!

    I could not agree more.

  • Dilbertino (unregistered)

    A regex wtf, and nobody mentioned the perl oneliner yet? :)

    return grep { $searchText =~ $_ } @searchTerms;

    It continues to amaze me how verbose other languages are for the most basic operations...

  • Martin Milan (unregistered)

    PiisAWheeL - read it again... I think you'll find I actually do return a boolean, and not matches.

    Martin.

  • Melnorme (unregistered)

    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.

  • (cs)

    The documented requirements:

    Write a function that returns true if at least one string in the /search terms/ array is found in /bigstring/, otherwise returns false.

    So, my version, written in pseudocode because I'm lazy:

    func containsAtLeastOneOf( array-of-strings searchTerms, string bigString ) returns truefalse
    
    for each searchTerm in searchTerms:
      if bigstring contains searchTerm:
        return true
    
    return false

    No regex, no stacks, no hash tables, nothing. Just enumerate the search terms and return true as soon as you find one that's in the big string. If you don't find one, return false.

    Jed is an idiot, as is Bob. So are most of you.

  • Herr Otto Flick (unregistered)

    TRWTF is that Jed thought that his solution was right.

    Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.

  • (cs)

    Is this a WTFer competition? Both solutions are awful

  • (cs) in reply to Melnorme
    Melnorme:
    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.
    That's fine if you know you are looking for whole words, and never fragments of words, nor short groups of words. Not so hot if I want to find "not so hot" in the text...
  • Black Bart (unregistered) in reply to Pro Coder
    Pro Coder:
    If you do didn't immediately think that this should be done with a hash, a sort, and a binary search, kill yourself now and save us all future aggravation.

    And don't forget the Linq query

  • Melnorme (unregistered) in reply to Steve The Cynic
    Steve The Cynic:
    Melnorme:
    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.
    That's fine if you know you are looking for whole words, and never fragments of words, nor short groups of words. Not so hot if I want to find "not so hot" in the text...

    Ah, that's true. It's not clear how you can divide the text into discrete elements to be hashed, since the terms you're looking for can be pretty much anything.

  • Pro Coder (unregistered) in reply to Melnorme
    Melnorme:
    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.
    Because Contains() is O(n) and a binary search is O(log n). If you work for me, I demand mandatory suicide from all team members who can't find a solution that is at least as good as O(log n). \m/

    This illustrates the point as to why using built-in library functions is considered harmful. It short-circuits critical thought and analysis.

  • Melnorme (unregistered) in reply to Pro Coder
    Pro Coder:
    Melnorme:
    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.
    Because Contains() is O(n) and a binary search is O(log n). If you work for me, I demand mandatory suicide from all team members who can't find a solution that is at least as good as O(log n). \m/

    This illustrates the point as to why using built-in library functions is considered harmful. It short-circuits critical thought and analysis.

    Please read up on what a hash set is.

  • Melnorme (unregistered) in reply to Melnorme
    Melnorme:
    Pro Coder:
    Melnorme:
    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.
    Because Contains() is O(n) and a binary search is O(log n). If you work for me, I demand mandatory suicide from all team members who can't find a solution that is at least as good as O(log n). \m/

    This illustrates the point as to why using built-in library functions is considered harmful. It short-circuits critical thought and analysis.

    Please read up on what a hash set is.

    Wait a minute, I'm being trolled, right?

  • pjt33 (unregistered) in reply to Steve The Cynic
    Steve The Cynic:
    Melnorme:
    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.
    That's fine if you know you are looking for whole words, and never fragments of words, nor short groups of words. Not so hot if I want to find "not so hot" in the text...
    It's functionally equivalent to Bob's code, but much more efficient.

    TRWTF is that Bob thinks splitting the big string m*n times (where m is the number of search terms and n is the number of words in the big string) isn't insanely stupid. If he pulled that out of the loop then his code wouldn't be great, but it would probably be perfectly adequate for expected usage.

  • airdrik (unregistered) in reply to pjt33

    Oh, but he isn't splitting it nm times, he's splitting it nn times. Not that it makes that much of a difference if you have at least as many search terms as words in your document, and regardless of the number of search terms it is still insanely stupid.

  • Anon (unregistered) in reply to Herr Otto Flick
    Herr Otto Flick:
    TRWTF is that Jed thought that his solution was right.

    Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.

    +1

  • bedroom (unregistered)

    For me, TRWTF in the second solution was the building of the regex.

    Dim regexString As String = "(" + String.Join("|", searchTerms) + ")"

    Look ma, no hands!

  • DGM (unregistered) in reply to Dilbertino
    Dilbertino:
    A regex wtf, and nobody mentioned the perl oneliner yet? :)

    return grep { $searchText =~ $_ } @searchTerms;

    It continues to amaze me how verbose other languages are for the most basic operations...

    Hear hear! I about choked trying to figure out what that verbose code was trying to do, but this one line makes sense!

  • NN (unregistered) in reply to AB
    AB:
    This really p!sses me off:

    If matches.Count > 0 Then Return True Else Return False End If

    Its so pointless. Why do sheeple do it? Can they not see the total redundancy of 5 lines of code? Do they really think its "more readable"?

    Return matches.Count > 0

    better:

    Return (New Regex(regexString)).Matches(searchText) > 0

  • Martin Milan (unregistered) in reply to Fixed you say?

    Nope - but I can say "quick, simple, and exits on first match"...

  • (cs)

    Sorry, but I got stuck on the fact that the other developers inboxes were on the floor. Did I miss a wild party?

  • (cs) in reply to pjt33
    pjt33:
    Steve The Cynic:
    Melnorme:
    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.
    That's fine if you know you are looking for whole words, and never fragments of words, nor short groups of words. Not so hot if I want to find "not so hot" in the text...
    It's functionally equivalent to Bob's code, but much more efficient.

    TRWTF is that Bob thinks splitting the big string m*n times (where m is the number of search terms and n is the number of words in the big string) isn't insanely stupid. If he pulled that out of the loop then his code wouldn't be great, but it would probably be perfectly adequate for expected usage.

    Upon doing some careful reading of both Jed and Bob's code they are not even functionally the same. Bob's code does a wonderful Split(" ") which means it only does single exact word match, but it does not appear to take in account periods or commas. Jed's code will support terms containing spaces, but some hilariousness can occur when dealing with unintended special characters in terms. So Jed's solution does not do what Bob's does, so Jed's might not even meet the business need (not saying that Bob's solution meets the it either).

  • Beaux Jackson (unregistered) in reply to Steve The Cynic
    Steve The Cynic:
    The documented requirements:

    Write a function that returns true if at least one string in the /search terms/ array is found in /bigstring/, otherwise returns false.

    So, my version, written in pseudocode because I'm lazy:

    func containsAtLeastOneOf( array-of-strings searchTerms, string bigString ) returns truefalse
    
    for each searchTerm in searchTerms:
      if bigstring contains searchTerm:
        return true
    
    return false

    No regex, no stacks, no hash tables, nothing. Just enumerate the search terms and return true as soon as you find one that's in the big string. If you don't find one, return false.

    Jed is an idiot, as is Bob. So are most of you.

    </thread>
  • ¯\(°_o)/¯ I DUNNO LOL (unregistered)
       If searchTerms.Length > 1 Then
         For i As Integer = 1 To searchTerms.Length - 1 Step 1
           regexString += "|" + searchTerms(i)
         Next
       End If
    

    DocContainsAtLeastOneTerm(("Wal*Mart", "$49.95"), searchText)

    Jed's "solution" makes Little Bobby Tables cry.

  • ¯\(°_o)/¯ I DUNNO LOL (unregistered) in reply to ¯\(°_o)/¯ I DUNNO LOL
    DocContainsAtLeastOneTerm(("Wal*Mart", "$49.95"), searchText)

    Jed's "solution" makes Little Bobby Tables cry.

    Also, as written, wouldn't it always return TRUE? The regex would look something like "(|ONE|TWO|THREE)", with a zero-length term that would always be found.

  • Rob (unregistered) in reply to ¯\(°_o)/¯ I DUNNO LOL
    ¯\(°_o)/¯ I DUNNO LOL:
    DocContainsAtLeastOneTerm(("Wal*Mart", "$49.95"), searchText)

    Jed's "solution" makes Little Bobby Tables cry.

    Also, as written, wouldn't it always return TRUE? The regex would look something like "(|ONE|TWO|THREE)", with a zero-length term that would always be found.
    That's why the first element is added separately, to prevent that leading |

    And why didn't Jed use a StringBuilder?

  • jarfil (unregistered)

    Whoa...

    For i As Integer = 0 To searchText.Split(" ").Length - 1 Step 1 searchTable.Push(searchText.Split(" ")

    ...

    For i As Integer = 1 To paramTable.stackHeight() - 1 Step 1 If searchTable.Contains(paramTable.Pop()) Then

    Way to go! NN complexity just for the sake of it, followed by MN to top it off.

    No wonder they had a performance bottleneck, if everything else was written like that.

  • Rafael (unregistered)

    Too bad Jed didn't remember to sanitize the search terms... Try and search for 1.56 and you shall fail when matching with 1056. That should be nice if we are speaking of figures.

  • (cs) in reply to Melnorme
    Melnorme:
    Melnorme:
    Pro Coder:
    Melnorme:
    Why would need a sort and binary search? Just put all the words in the text into a hash set and do a contains (or whatever the VB equivalent is) for each word.
    Because Contains() is O(n) and a binary search is O(log n). If you work for me, I demand mandatory suicide from all team members who can't find a solution that is at least as good as O(log n). \m/

    This illustrates the point as to why using built-in library functions is considered harmful. It short-circuits critical thought and analysis.

    Please read up on what a hash set is.

    Wait a minute, I'm being trolled, right?

    PROTIP: If you're posting here, and you're not trolling, you're being trolled.

  • jarfil (unregistered) in reply to Hoffmann
    Hoffmann:
    minuses for Jed: - regEx is expensive
    No, is not.

    RegEx initial parsing is somewhat expensive. RegEx execution on the other hand, is log(N), blazing fast for large amounts of search terms.

Leave a comment on “The Regex Code Review”

Log In or post as a guest

Replying to comment #:

« Return to Article