Admin
http://blog.stevenlevithan.com/archives/javascript-regex-and-unicode
captcha: facilisi JS lacks the facilisi to match ελληνικές λέξεις
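For what it's worth, engines that implement the ES2018 Unicode property escapes can do this now. A minimal sketch, assuming such an engine; the helper name isAllLetters is made up for illustration:

```javascript
// Assumes an engine with ES2018 Unicode property escapes (the `u` flag plus \p{...}).
// isAllLetters is a hypothetical helper name, not something from the article.
function isAllLetters(str) {
  // Normalize first so an accented letter typed as base + combining mark
  // becomes a single precomposed letter where one exists.
  var normalized = str.normalize("NFC");
  // \p{L} matches any code point in the Unicode "Letter" general category.
  return /^\p{L}+$/u.test(normalized);
}

console.log(isAllLetters("λέξεις")); // a single Greek word
console.log(isAllLetters("abc123")); // contains digits
```

A space is not a letter, so a phrase such as "ελληνικές λέξεις" would have to be split into words before checking.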
Admin
True, my bad - looked at my old SQL code, and it's UNION ALL where needed. But even then the statements are not identical: if the table contains multiple identical rows, UNION would throw out the duplicates, but the OR criteria would return as many rows as there were in the original dataset.
Admin
Word, yo.
captcha: ingenium - quite so
Admin
Assuming the last one is Javascript, and it looks like it, it isn't so trivial. Can be done with a regex.
See http://stackoverflow.com/questions/6800536/isalpha-replacement-for-javascript
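A rough sketch of the regex route, assuming only unaccented ASCII letters count as alphabetic (isAlphaRegex is a made-up name):

```javascript
// isAlphaRegex is a hypothetical name; it only covers unaccented ASCII letters.
function isAlphaRegex(str) {
  // ^ and $ anchor the match so every character must be in A-Z or a-z.
  // + requires at least one character, so the empty string is rejected.
  return /^[A-Za-z]+$/.test(str);
}

console.log(isAlphaRegex("Hello")); // letters only
console.log(isAlphaRegex("He11o")); // contains digits
```

Swap the + for a * if an empty string should count as alphabetic.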
Admin
I'm the one who submitted that or/union sample. I can appreciate that sometimes less logical SQL can be faster, but I can assure you that that thought would never have entered this guy's head. The query completes close to instantly either way.
Admin
Not only MySQL. Also in some specific cases in Sybase (which Mssql derives from). So this might "just" be premature optimization ;)
Admin
Except, of course, when you're using a language that compiles regexes, in which case it will most likely produce faster code. And in this particular case the regex will be more readable and certainly easier to check (are you sure the uppercase O hasn't been miskeyed as a zero?)
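To illustrate the miskeying point, here is a small sketch. The typedAlpha string is invented for this example and has a zero where the O should be:

```javascript
// typedAlpha is a made-up example of a hand-typed lookup string;
// the digit zero after N is a deliberate mistake.
const typedAlpha = "ABCDEFGHIJKLMN0PQRSTUVWXYZ";

// Collect every character that is not an ASCII letter.
const strays = Array.from(typedAlpha).filter((ch) => !/[A-Za-z]/.test(ch));

console.log(strays); // the stray characters, if any
```

With the character class there is nothing to miskey in the first place, which is the whole argument.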
Admin
That comment is so 1971.
Admin
Need to fix this
Admin
Has anyone considered turning option strict on for comments?
Admin
I really like how this was chosen as a featured comment so that it would immediately piss off anybody who views the article. Hilarious!
Admin
I was under the impression that modern regex libs compile the regex to a provably optimal version - so if you're really good, you may code a version that's as fast in processing and lacks some initialization overhead; but if you're not perfect, then your code will be slower.
Admin
Actually, while UNION and OR yield the same result in this example, the UNION requires a sort to remove duplicate rows (because it returns DISTINCT rows). It will not perform as well as an OR.
(My conclusion is based on IBM host DB2, but I confirmed similar behavior for MySQL; see UNION vs UNION ALL performance.)
As he notes in the link, UNION ALL will perform much better (because no sort is required) but it is not exactly the same as OR, since overlapping predicates could cause a row to be returned more than once.
Addendum (2012-11-28 14:22): Oh, and a side-case for UNION (UNION DISTINCT):
Suppose a table exists where there is no unique key, such that duplicate rows could exist (that is, multiple rows having exactly the same value in each column). On a table of that type, UNION is not the same as OR either, since OR would return all matching rows, but UNION would delete the duplicate rows.
That's because UNION operates on the distinct data value of the whole row. So if two rows have the same values in all columns, UNION will remove the second row.
This can't happen if the table has a unique key because no duplicate rows can exist.
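The three behaviours described above can be mimicked outside a database. This is only a simulation with plain arrays, not real SQL, and the table contents and column names a and b are invented:

```javascript
// Made-up table with no unique key: the first two rows are exact duplicates.
const rows = [
  { a: "abc1", b: "xyz" },
  { a: "abc1", b: "xyz" },
  { a: "abc2", b: "abc3" }, // matches both predicates
  { a: "foo", b: "bar" },   // matches neither
];

const matchesA = (r) => r.a.startsWith("abc");
const matchesB = (r) => r.b.startsWith("abc");

// WHERE a LIKE 'abc%' OR b LIKE 'abc%': each stored row comes back at most once.
const orResult = rows.filter((r) => matchesA(r) || matchesB(r));

// UNION ALL: concatenate the two result sets; overlapping rows appear twice.
const unionAll = rows.filter(matchesA).concat(rows.filter(matchesB));

// UNION: same concatenation, then keep one copy of each distinct row value.
const seen = new Set();
const union = unionAll.filter((r) => {
  const key = JSON.stringify([r.a, r.b]);
  if (seen.has(key)) return false;
  seen.add(key);
  return true;
});

console.log(orResult.length, unionAll.length, union.length); // 3 4 2
```

OR keeps both duplicate rows, UNION ALL additionally counts the overlapping row twice, and UNION collapses everything down to the distinct values.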
Admin
The problem with prepackaged CPUs is that you can't never be sure how the gates are wired. So in cases like this, where the results are important, it is safest to layout your own IC.
Admin
And as for "EntityTypeDescription", I don't think it's enterprise-y enough. He also needs "EntityTypeMetaType", "EntityTypeMetaTypeDescription", "EntityMetaEntityType", "EntityMetaEntityTypeDescription", "EntityMetaEntityTypeMetaType", and "EntityMetaEntityTypeMetaTypeDescription". Just to achieve that full enterprise flavor.
Admin
Also true for Oracle, provided the referenced columns are indexed...
Admin
Gives false positives in EBCDIC.
Admin
After playing around with UNIONs and ORs in SSMS, it's interesting to note that the ORDER BY clause has a bigger performance hit on the UNION version than the OR version. At least in my tests.
Admin
a) That's only a valid excuse if the code is running on one of these alleged old systems. b) "Think a bit" is bullshit - this isn't something you could magically figure out just by thinking if you didn't already know it worked like that.
Conclusion: you're a worthless retard.
Admin
Maybe if the results needed to be ranked by the number of criteria matches? Group the results and sort by highest count of duplicates. Then filter out the duplicates and you have a list sorted by relevancy. Probably a better way though.
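Something like this, sketched with plain arrays rather than SQL. The records and the column names col1 to col3 are invented; counting matching columns per row stands in for counting duplicates in the UNION ALL output:

```javascript
// Made-up rows; col1..col3 are hypothetical column names.
const records = [
  { id: 1, col1: "abcd", col2: "zzz", col3: "abc" },
  { id: 2, col1: "abc", col2: "abcx", col3: "abcy" },
  { id: 3, col1: "nope", col2: "nope", col3: "nope" },
];

const prefix = "abc";

const ranked = records
  .map((r) => ({
    id: r.id,
    // Number of columns matching the prefix, i.e. how many times the row
    // would show up in the UNION ALL of the three single-column queries.
    hits: [r.col1, r.col2, r.col3].filter((v) => v.startsWith(prefix)).length,
  }))
  .filter((r) => r.hits > 0)         // drop rows that matched nothing
  .sort((x, y) => y.hits - x.hits);  // most matches first

console.log(ranked.map((r) => r.id)); // [ 2, 1 ]
```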
Admin
It's not limited to using regular expressions, though. I have seen a massive amount of code where the developer definitely was not entirely sure what they were doing.
Admin
Same with IBM DB2. I once re-wrote a large query with 20 ORed predicates into 20 UNIONed queries. Massive performance gain - due to each sub-query being run in parallel.
The bigger WTF is the use of 'SELECT *'.
Admin
Not sure how I feel about the UNION issue. That could very well be valid, if the OR statement is going to cause the RDBMS to generate a bad exec plan per not being able to logically choose the right indexes. In the example shown here, the UNION might make sense. However, if we had something that would filter down a majority of rows from the table such that it would read as follows:
WHERE bestFilter = some_value and (something1 like 'abc%' or something2 like 'abc%' or something3 like 'abc%' )
The UNION will end up being more costly, most likely. I have already filtered out most rows with an index that will certainly be used.
Whether or not it will be a WTF, well... It just depends. :)
Admin
Then again, if the example here is close to the original, the querying of three different attributes for the same data seems a bit flaky. :)
Perhaps the UNION is not the WTF, but everything leading up to it is... More information is needed?
Admin
It is, but I think for ease of writing views that need to expose everything to the sql developer, it's not all that bad as long as you rebuild your views nightly to capture any schema changes that won't be picked up otherwise...
Admin
The Linux makefile used to right clean code (not sure if it still does or not): make mrproper
Admin
Och aye the noo
CAPTCHA: conventio : fellatio at a convention?
Admin
Obvious WTF apart, I really HATE it when people state things in their code that are just not true.
Do you know it's a Char? NO, not yet. So don't freaking set "isChar" to true! Make your checks, use whatever temporary variables you need, but for god's sake, don't name them "isChar" like it was the actual result... like it IS the actual result, and then you go on and on checking it on each iteration of the loop... WTF!
Oh, right, you have to be able to break out of the loop somehow. Let me guess, maybe "break" would work? Maybe setting the counter out-of-bounds so the loop condition is no longer met? But oh no, you had to use "isChar", the freaking result variable, to break out of the loop.
Give me a break.
Admin
I read that as text directed at Dave, not written by Dave. When attributing something to myself I usually use a double dash at the end. -- Mark
The smiley face makes me think you're kidding, but I can't parse any humor out of your statement, so I'm not 100% sure.
Which processor are you talking about? A regex engine certainly doesn't save any CPU power when compared against a purpose-built parser. I'm surprised when engineers can't understand that terse code isn't necessarily fast code, and vice-versa.
It may save some processing power in your wetware, however, which is entirely the point of using abstractions like regex in the first place.
Probably not. The LIKE operator can theoretically use an index if the wildcard is at the end of the query string. Of course, the likely SQL injection means a user can submit "%" in their query string in order to force a full table scan.
I almost choked on a Frito when I read this. I had to try this myself to believe it.
Interestingly, SQL Server will also sort the results even if you omit the ORDER BY clause. You can't make this stuff up.
It shouldn't. UNION is a set operator, and as such, it should deduplicate results.
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.ase_15.0.commands/html/commands/commands89.htm
+1
+1
+π
Admin
My preferred way of doing this is to return false within the loop and just have return true at the end of the function.
Admin
Or just return false within the loop itself and return true immediately after. No need for a temporary or result variable and no extra condition to check during iteration. In the case of this function, the intent would be perfectly clear.
Admin
Because they know SQL?
Admin
Actually, the variable isChar isn't needed at all. Just return false for the first char that's not in alpha. If the loop ends, return true.
Like:
for (var i = 0; i < sStr.length; i++) {
    var Char = sStr.charAt(i);
    if (alpha.indexOf(Char) == -1) {
        return false; // first character not in alpha, so stop here
    }
}
return true; // every character was found in alpha
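Wrapped up so it actually runs, that might look like the sketch below. The contents of alpha are an assumption, since the original definition isn't shown in the thread, and isAlphaLoop is a made-up name:

```javascript
// alpha is an assumed lookup string; the original definition is not shown in the thread.
var alpha = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// isAlphaLoop is a hypothetical name for the function being discussed.
function isAlphaLoop(sStr) {
  for (var i = 0; i < sStr.length; i++) {
    // Bail out on the first character that is not in the lookup string.
    if (alpha.indexOf(sStr.charAt(i)) === -1) {
      return false;
    }
  }
  // The loop finished without finding a bad character.
  return true;
}

console.log(isAlphaLoop("Hello")); // letters only
console.log(isAlphaLoop("He11o")); // contains digits
```

Note that an empty string falls straight through the loop and comes back true, which may or may not be what you want.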
Admin
(Cue Spike Milligan.)
("tuppence": two pence, that is £0.02, which is worth more than your feeble $0.02, but still not worth all that much.)
Admin
yeah, you should initialise it to FileNotFound
Admin
If Spike Milligan made this quote before 1971 then tuppence was 2d not 2p and was worth 1/120th of a pound as there were 240 pence in the old pound.
Therefore for 2d to be worth more than 2 US cents, the exchange rate GBPUSD would have to have been more than 2.4 (2.4 US dollars to the pound sterling). I'm not sure it ever got that high, even before 1971.
Of course, back in those days, 2d was worth something, more than 2p is worth now.
Admin
I thought that was part of a song by Flanders and Swann. Basic googling didn't show any connection to Spike Milligan.
Admin
Before 1970 the pound sterling regularly traded at above $2.40 to the pound.
Admin
And yes, I know about pounds, shillings, and pence. I was born before the old system was abandoned. As opposed to my youngest colleagues, who weren't even born when the UK stopped putting "NEW PENCE" on coins, in favour of the simpler "PENCE".
Admin
What I really HATE is people ranting about how the style of otherwise bug-free code is wrong.
Admin
Frankly, my dear, I don't give a damn.