• faoileag (unregistered) in reply to Mark
    Mark:
    jarfil:
    Give me a break.
    Or just return false within the loop itself and return true immediately after.
    That would violate the principle of a single exit point for functions, which some coding styles mandate.

    Yes, it can make for ugly code. It nevertheless can be found in a lot of rule books because it has a nice effect on cyclomatic complexity.

  • Tinctorius (unregistered) in reply to jarfil

    What they meant was allPreviousElementsWereAlpha, but anyone with a brain would see what isChar means.

  • Jason (unregistered)
    I almost choked on a Frito when I read this. I had to try this myself to believe it.

    Interestingly, SQL Server will also sort the results even if you omit the ORDER BY clause. You can't make this stuff up.

    UNION doesn't guarantee ordering, but it is likely that the DBMS will sort the intermediate results in order to de-duplicate them. The observed ordering in this case is just a side effect of the implementation, and shouldn't be relied upon (as the specification will probably say that the order of results returned is arbitrary)

  • JJ (unregistered) in reply to 4c's A.G.N
    4c's A.G.N:
    Cantabrigian:
    "regex scares me a little, OK, alot actually"

    Aaaaargh! It's "a lot" - two words. You wouldn't write "afew", would you? Then again, it's probably better than writing "allot" (when meaning "a lot", not "share out"). </petpeeve>

    Oh, great, another grammar nazi. As if we needed more here, you guys are diamond dozen.
    For all intensive purposes, you're correct.
  • (cs) in reply to JJ
    Jason:
    UNION doesn't guarantee ordering, but it is likely that the DBMS will sort the intermediate results in order to de-duplicate them. The observed ordering in this case is just a side effect of the implementation, and shouldn't be relied upon (as the specification will probably say that the order of results returned is arbitrary)

    yeah, sounds like the order of an identity field. tends to appear in order, but no guarantee.

    JJ:
    For all intensive purposes, you're correct.

    troll bait?

  • David (unregistered) in reply to Peteris

    [quote user=Peteris]The OR solution would have different results in cases where a record matches multiple search criteria - in that case, the current solution would include duplicate rows.[/quote]

    No OR and UNION should give the same result. You are thinking of UNION ALL. UNION is the same as "select DISTINCT * from (select UNION ALL select)".

  • Paul Neumann (unregistered) in reply to planB
    planB:
    Nonody != Nobody
  • Paul Neumann (unregistered) in reply to Cantabrigian
    Cantabrigian:
    "regex scares me a little, OK, alot actually"

    ]b]<petpeeve>[/b]Aaaaargh! It's "a lot" - two words. You wouldn't write "afew", would you? Then again, it's probably better than writing "allot" (when meaning "a lot", not "share out"). </petpeeve>

    Please use proper markup.
  • Norman Diamond (unregistered) in reply to Steve The Cynic
    Steve The Cynic:
    Yank:
    So you're one of those who likes to think the UK knows anything about good English?
    I think England might know something about English.
    Spike Milligan:
    The English, The English, The English are best. I wouldn't give tuppence for all the rest.
    ("tuppence": Two pence, that is £0.02, which is worth more than your feeble $0.02, but still not worth all that much.)
    If you're English you oughta know that Spike Milligan's tuppence were 2/240 of a pound not 2/100 of a pound.
  • Norman Diamond (unregistered) in reply to Cbuttius
    Cbuttius:
    If Spike Milligan made this quote before 1971 then tuppence was 2d not 2p and was worth 1/120th of a pound as there were 240 pence in the old pound.
    Yes, you beat me to it.
    Cbuttius:
    Therefore for 2d to be worth more than 2 US cents, the exchange rate GBPUSD would have to have been more than 2.4 (2.4 US dollars to the pound sterling). I'm not sure it ever got that high, even before 1971.
    You mean like when 1 GBP was 5 USD? The US copied the dollar from Spanish Mexico. England copied some too: two kings' heads weren't worth a crown.
  • Norman Diamond (unregistered) in reply to pencilcase
    pencilcase:
    I'm not surprised at you being a little off the mark.
    OF COURSE he is. They've converted to the euro now. (Though not the part that still uses pounds.)
    pencilcase:
    Now, where did I leave my specs...?
    WTF? We're computer professionals here, you know? We don't get no stinking specs.
  • Ben Vail (unregistered) in reply to MySQL geek

    Indeed, union means the individual parts of the query can use indexes, whereas an OR would prevent any index, so it's not actually a bad idea here, :)

  • Ben Vail (unregistered) in reply to chubertdev
    chubertdev:
    JJ:
    For all intensive purposes, you're correct.

    troll bait?

    It's an eggcorn. I used to think this was the phrase too... :)

  • Meep (unregistered) in reply to MySQL geek
    MySQL geek:
    Actually, here UNION is the same as OR, but in the MySQL point of view, presents a better performance ;)

    Bullshit.

    There is no way

    select x from foo where a = 1 union select x from foo where a = 2
    is more performant than
    select x from foo where a = 1 or a = 2
    .

    You simply can't do multiple table scans and beat a single one; aside from the redundant IO, your UNION requires a hash join to toss duplicate rows.

    And if MySQL's optimizer is not complete shit, the two should have an identical query plans.

  • W. (unregistered) in reply to jarfil
    jarfil:
    for god's sake, don't name them "isChar" like it was the actual result...

    Thanks! I've spent the last two days redoing all my code to include as many premature assertions as possible :)

  • nuby (unregistered) in reply to @Deprecated
    @Deprecated:
    Larry:
    The problem with regexen is a lot like the problem with library functions: you can never be entirely sure what they're doing. So, in cases like this where the results are important, it is safest to write your own.

    In many cases it is safest NOT to write your own! EG., encryption. Or did I just win a "whooosh"?

    Did he really have to use charAt for two linear ranges? I think that if (c >= 'a' && c <= 'z') is less error-prone than "abcdefghijklmnpqrstuvwxyz".chartAt(c) != -1 Quick, spot the error!

    the error seems to be that in your keyboard the 'o' key is malfunctioning

  • Josh Lange (unregistered) in reply to MySQL geek

    The union query makes a lot of sense in this situation. It is an optimization over the OR.

    Most database engines don't do a very good job with OR, especially if it is the only condition (there are no columns which can be used as the first column in an index to make the data returned more selective), the table has a lot of rows, and the data is very sparse.

    The OR will be downgraded into a table scan. If there are millions of records, that means millions of compares, and a lot of disk IO.

    In the union query, since the like condition is bounded on the left side (and assuming each of these columns are the first column in at least one index), each sub query will be very selective. The union would then sort and filter out the duplicates (which shouldn't be expensive, as long as the query was very selective).

  • Barf 4Eva (unregistered) in reply to Meep
    Meep:
    MySQL geek:
    Actually, here UNION is the same as OR, but in the MySQL point of view, presents a better performance ;)

    Bullshit.

    There is no way

    select x from foo where a = 1 union select x from foo where a = 2
    is more performant than
    select x from foo where a = 1 or a = 2
    .

    You simply can't do multiple table scans and beat a single one; aside from the redundant IO, your UNION requires a hash join to toss duplicate rows.

    And if MySQL's optimizer is not complete shit, the two should have an identical query plans.

    This is not an example of the problem in the query listed by the OP. Granted, I feel for the OP's example it is due to a lot of "WTF?", but here the filter is an OR on different columns checking for the same attribute data.. Sometimes these scenarios lend themselves to exlcuding use of an index, depending on the complexity and index setups of course.

  • ledtvreviews (unregistered)

    Thank you, very much.................................

  • TOuchy (unregistered) in reply to jarfil
    jarfil:
    var isChar = true; 
    Obvious WTF apart, I really HATE it when people state things in their code that are just not true.

    Do you know it's a Char? NO, not yet. So don't freaking set "isChar" to true! Make your checks, use whatever temporal variables you need, but for god's sake, don't name them "isChar" like it was the actual result... like it IS the actual result, and then you get on and on checking it on each iteration of the loop... WTF!

    Oh, right, you have to be able to break out of the look somehow. Let me guess, maybe "break" would work? Maybe setting the counter out-of-bounds so the loop condition is no longer met? But oh no, you had to use "isChar", the freaking result variable, to break out of the loop.

    Give me a break.

    Other than the naming of the variable, I think your wrong - a LOT.....

    The boolean is used to indicate whether we have found something that invalidates our string from an alpha-only perspective. Assuming you agree NULL/Empty strings are ok, there is little issue startiing in the TRUE state.

    I also have no issue with is being used as a loop condition. If it were better named, it adds clarity that we might shortcut the loop.

    Of course it might have been better to simply return false rather than play with a var, a'la...:

    for (var i=0;(i<sStr.length);i++)    
        {
        Char = sStr.charAt(i);
        if (alpha.indexOf(Char)==-1)
            { 
            // This is not an alpa
            return false;
            }
        }
    return true;
    //oops, forgot to use num
    </pre>
    
  • JimFin (unregistered) in reply to jarfil
    jarfil:
    var isChar = true;

    Obvious WTF apart, I really HATE it when people state things in their code that are just not true.

    Do you know it's a Char? NO, not yet. So don't freaking set "isChar" to true.

    OK, what do you suggest?

    function checkAlpha(sStr)
    {
    var alpha = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    var Char;
    var isSoFarAssumedToBeAlpha = true; 
    for (var i=0;(i<sStr.length) && (isChar==true);i++)    
        {
        Char = sStr.charAt(i);
        if (alpha.indexOf(Char)==-1)
            { 
            isSoFarAssumedToBeAlpha=false;
            }
        }
    return isSoFarAssumedToBeAlpha;
    } //end function
    </pre>
    

    Bummer. That, too, states something that is not true. It's no longer assumption within the test, it's a fact there.

    Let's try again.

    function checkAlpha(sStr)
    {
    var alpha = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    var Char;
    var isSoFarAssumedToBeAlpha = true; 
    for (var i=0;(i<sStr.length) && (isChar==true);i++)    
        {
        Char = sStr.charAt(i);
        if (alpha.indexOf(Char)==-1)
            { 
            isSoFarAssumedToBeAlpha=false;
            }
        }
    var isReallyAlpha = isSoFarAssumedToBeAlpha;
    return isReallyAlpha.
    } //end function
    </pre>
    

    There. Now, we have explicitly coded the transition from assumption to a fact. But, there's still a problem. Assumption actually changed inside the loop, not after it. How do you state something that should always be true when it comes to meaning of the works?

    So, once more.

    function checkAlpha(sStr)
    {
    var alpha = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    var Char;
    var isReallyAlphaDuringAndBeforeLoop = true;
    for (var i=0;(i<sStr.length) && (isChar==true);i++)    
        {
        Char = sStr.charAt(i);
        if (alpha.indexOf(Char)==-1)
            { 
            isReallyAlphaDuringAndBeforeLoop=false;
            }
        }
    var isReallyAlpha = isReallyAlphaDuringAndBeforeLoop;
    return isReallyAlpha.
    } //end function
    </pre>
    

    There, we nailed it! And now that we see what's really going on, we can start fixing the actual bugs there. Also, that is why regexes are so wrong: you really don't know if there are such assumptions, too.

  • G (unregistered) in reply to MySQL geek

    I can imagine with a too-dumb query planner it will not use the indexes in the "OR" version... do you mean to say MySQL is in fact that dumb though?

    But the "UNION" version is not in fact equivalent in its result, it will return the same row multiple times if it matches on multiple fields. To fix that he would need to add "DISTINCT" or "GROUP BY", and that may very well kill any performance advantage there was...

  • shinyemptyhead (unregistered) in reply to Larry

    Or, you know, test it.

  • shinyemptyhead (unregistered) in reply to shinyemptyhead
    shinyemptyhead:
    Or, you know, _test_ it.

    Bah, was intended to be a reply to the "you can't know what regexen do" comment, but hit reply instead of quote.

  • Paul W (unregistered) in reply to Larry

    You're an idiot. If that logic were applied, nobody would ever use library functions, but would write everything themselves. How is that safe, or a good idea? Library functions have, we assume, been checked a lot more thoroughly than your ad-hoc code!

  • AllThatJazz (unregistered) in reply to HAL

    The EntityTypeDescription should have been 'AE35'

  • old mysql user (unregistered) in reply to Meep
    Meep:
    MySQL geek:
    Actually, here UNION is the same as OR, but in the MySQL point of view, presents a better performance ;)

    Bullshit.

    There is no way

    select x from foo where a = 1 union select x from foo where a = 2
    is more performant than
    select x from foo where a = 1 or a = 2
    .

    You simply can't do multiple table scans and beat a single one; aside from the redundant IO, your UNION requires a hash join to toss duplicate rows.

    And if MySQL's optimizer is not complete shit, the two should have an identical query plans.

    Mysql before 5.0 had a limitation of 1 index (you are using indexes, right?) per table per query. So if you had an index on (ProjectCode), an index on (ProjectDesc), an index on (TaskCode), and an index on (TaskDesc), doing the query as written would be much faster because it could use each index once, and merge the results together. Otherwise, with the OR clauses, you'd be looking at a single table scan.

    e.g. http://www.mysqlperformanceblog.com/2012/12/14/the-optimization-that-often-isnt-index-merge-intersection/

Leave a comment on “The New TODO and More”

Log In or post as a guest

Replying to comment #:

« Return to Article