- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Me no want neither.
PS. No funny comments to read yet...
Admin
wow. Sad thing is that I have seen worse. But... wow
Admin
OMG - Where can I hire the boy-genius that came up with this? :)
Admin
Wouldn't it be great if it turned out that 'replace' and 'REPLACE' were two completely different functions? ;)
Admin
Wow, I'm a Complete Noob at SQL coding, and I've could of come up with something better then that, i wonder, is this PHP?
Admin
Maybe some art of SQL obfuscation?
Admin
i'm writing sql right now @ work and this makes me want to quit and go skydiving with no parachute.
greets from Argentina, the bad-english country!
Admin
No, this is just SQL. SQL has functions too.
Admin
Well AFAIK there is no better way of doing character substitution in SQL. Unless you create a table & perform lookup from that table using a stored procedure.
Admin
I agree. Thing I cannot figure out is that this person seems to be replacing all control characters between 1 and 31, inclusive, w/ nulls. My guess is that somewhere, in either the translate or filter functions, these nulls are removed. I understand the desire to eliminate the control chars...but to do it in sql? What a pain in the arse.
Admin
I'm guessing this is Oracle, which only introduced regular expressions in 10g - so granted, in earlier versions this sort of thing can be annoying - but then such cleanup would normally be done in an input filter, not as an in-table update. At least use a UDF...
Admin
Good night... Can anyone tell me what exactly that does? I lost interest somewhere around the 15th "Replace".
Admin
Similar gotchas exist in MySQL.
Admin
If this is Oracle (kinda looks like), "translate" is a build-in function. Using a UDF would have been by far more beautiful, but it comes with a small performance penalty.
Admin
Can oracle not call out to c functions? If so, wouldn't character replacement be done more efficiently and comprehensibly there?
Admin
Admin
This can't be SQL Server .. replacing a single character in a string (i.e., varchar) with a NULL results in NULL for the entire string:
select replace('jeff', 'j',null)
returns
NULL
(1 row(s) affected)
As expected .... I mean, why would you think that you could replace a single character with NULL ? Makes no sense. If the goal is to take a string with ANY of these characters ANYWHERE and simply return NULL for the entire string, I guess it would work "properly", but I doubt that's it ...
Admin
get a better database
REPLACE isn't an SQL95 standard call either
postgresl has
translate(string text, from text, to text) Any character in string that matches a character in the from set is replaced by the corresponding character in the to set
e.g. translate('12345', '14', 'ax') a23x5
so we have to translate any of \1-\31 with sentinel (I've chosen \1) and then replace that with our desired value (NULL for some reason)
set name=replace(translate(new_name, '\1\2\3\4\5\6\7\8\9\10\11\12\13\14\15\16\17\18\19\20\21\22\23\24\25\26\27\28\29\30\31', '\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1'), '\1', NULL);
but this will output a NULL string if any of \1-\31 are encountered
Admin
Maybe that code was written in a contest to see who could call replace() the most.
Admin
PostgreSQL would return NULL for that input just like SQL Server.
Admin
I'd guess this is Oracle, too. Replacing a character with null will effectively delete that specific character from the string:
select replace('abcdef','a',null) from dual;
REPLA
bcdef
1 row selected.
Admin
There's different ways to do that, but the most elegant solution would be something like:
Then, you can just call that ONE function for ALL columns. And since the functions are compiled optimized, the code will also run faster.
Admin
This is when you need to write a (Perl/Python/PHP/C/whatever) program that writes your program. It's the only thing that will keep you out of the padded office if you have to, dog forbid, change it.
Admin
I think the function idea would be best - something along the lines of a series of replaces, each on a different line.
however, the original writer may have been trying to avoid the need for create function or procedure privileges - so just jamming it all in one script leaves you with less options.
if it is Oracle (i suspect so) then at least the script could be written as a PLSQL block - and it could look a little more like traditional code
captcha = craptastic
Admin
In Oracle, the "translate" function does exactly the same; unfortunately, escaping special characters like '\1' doesn't work.
Admin
and god invented for loops for a reason...
Admin
I don't see why anyone would design a table that had a record that would need to be updated like this. This code is just the tip of the WTF.
Designing a database, you should have fields that change often, and fields that change rarely if at all. Records that change often should be foreign keyed to fields that change seldom, and when they need to be changed, they should be REPLACED, not abused with some crazy ass serial update stored procedure.
Temporary items like semaphore implementations, should be deleted, or the entries should be moved to a log table or something. They should never be cleaned.
If you do need to clean up your tables, it should be done by dates, which is easy as pie.
This sort of thing is almost always the result of poor table design. It's just absurd.
Admin
In SQL Server:
DECLARE @BadStrings VARCHAR(50) SET @BadStrings = 'Hello' + CHAR(5) + ' There'
SELECT REPLACE(@BadStrings, SUBSTRING(@BadStrings, PATINDEX('%[^a-zA-Z0-9 '''''']%', @BadStrings), 1), '')
That shoudl about do it.
Admin
I was going to put something witty here, however my captcha is "giggity" and that seems to sum it up pretty well.
Admin
I dunno about you, but I would use a trigger that calls a cleanup stored procedure for each field... :new.name=fix(:new.name) ....etc???
Admin
Data cleanup scripts should only be used once, and then only after every bit of data entry has been fixed to not insert undesired data in the first place.
Admin
You people are morons. This is the only way to perform the required operation in sql without sacrificing performance. Not all programming is rainbows and sunshine. There will come days with the code gets ugly in oreder to perform. Get a grip, and let this one go.
Admin
It should about do it if you only want to replace only the very first "bad" character, and if you are defining bad characters as including punctuation and other things that you probably would want to keep.
Admin
Admin
In Oracle, replace('foo', null) is just a stupid way of writing replace('foo', ''), because in Oracle NULL and '' are the same thing. (Which is a WTF for another day...)
There are (at least) three WTFs here:
(1) They regularly have to run a script to remove this sort of thing from a well-respected data source (can't they just fix whatever keeps putting them in there, and run it once on the already-entered data?)
(2) The writer of this fragment apparently didn't know how to write functions in SQL, though someone must have written "filter" and "translate". Even if this horrible expresssion were justified once, there's no reason to write it out once for each of the 32 columns.
(3) They aren't using regular expressions, or built-in translate function correctly...they could at least do
which would strip the control characters without all the nested calls. (I guess their translate is something else...IIRC, Oracle would allow a user-defined translate/1 (one-argument translate) to co-exist with the builtin translate/3 (three-argument translate).Admin
Mind that this is clean-up code : write once, execute once!
Not quite T-SQL I think, so I can't be sure on what other 'better' functions might be available; but apart from using a UDF (that holds more or less the same code, be it better 'hidden' and most probably a lot slower) I don't think there is much improvement to be made on this.
From personal experience, more often than not you're given 'full' access (SUID) to a set of tables, but find yourself totally locked away from making your own objects (eg. functions) when doing ETL stuff, so the guy might have had to do what was available and did so nicely. Although it might not be the nicest thing to look at, it's correct and it's fast! Where's the WTF ?
I'm pretty sure the author in question wrote a little script to generate given code (32 x 32 is A LOT of replaces =), if not, well hmm, THAT might be a WTF indeed =)
PS: translate is a 'standard' SQL function according to "SQL in a nutshell", not quite sure I understand 'what it can look like' ?!
Translate : Converts a string from one character set to another.
Admin
That will eliminate the first instance of a bad character, but not more.
For example:
DECLARE @BadStrings VARCHAR(50) SET @BadStrings = 'Hello' + CHAR(5) + char(2)+' There'
SELECT REPLACE(@BadStrings, SUBSTRING(@BadStrings, PATINDEX('%[^a-zA-Z0-9 '''''']%', @BadStrings), 1), '')
(I added a second, different, undesired character)
leaves a char(2) in the result.
A UDF with a loop in it like:
while patindex(@string, '%[-]%', @string) > 0 select @string = replace(@string, substring(@string, patindex(@string, '%[-]%', @string), 1), '')
Will clean up the whole string. Performance isn't too bad, and it's much easier to maintain than 30+ nested replace() functions.
(I hope the characters in my example show. I copied-and-pasted them into the editor here.)
Admin
deroby, you're wrong on all points. It sounds like they're running this regularly, it's not a question of access (someone of similar skill must have written filter and translate), and they're not using the built-in translate function...it's overloaded with their own. The performance penalty for user-defined functions obviously is not a concern if they're calling two of them and nesting 32 calls to a built-in (31 of which are unnecessary). They must not have written a script to generate this, since the case is not consistent.
Admin
And what if the feed is actually sending ISO-8859-1 or UTF-8? Then what? I would hope there's a better solution than enumerating all the "missing" characters in the Unicode space!
Oracle supports in-DB Java since at least Oracle 8.
Admin
I lost interest a lot sooner, but yes, I can tell you what it does. It replaces in the "name" column any character with an ASCII value of 1 to 31 with a null value. In other words, it strips out most "control" characters.
Of course, it COULD be written like this:
variable SRCH_STR VARCHAR2(40)
BEGIN :SRCH_STR := chr(1)||chr(2)||chr(3)||chr(4)||chr(5) ; :SRCHSTR := :SRCHSTR || chr(6)||chr(7)||chr(8)||chr(9)||chr(10) ; :SRCHSTR := :SRCHSTR || chr(11)||chr(12)||chr(13)||chr(14)||chr(15) ; :SRCHSTR := :SRCHSTR || chr(16)||chr(17)||chr(18)||chr(19)||chr(20) ; :SRCHSTR := :SRCHSTR || chr(21)||chr(22)||chr(23)||chr(24)||chr(25) ; :SRCHSTR := :SRCHSTR || chr(26)||chr(27)||chr(28)||chr(29)||chr(30)||chr(31) ; END; /
UPDATE OWNER_USER.all_candidates SET name = filter(translate( dbms_standard.translate( name, 'a'||:SRCH_STR, 'a' ) )), ...
This perversion probably occurred because somebody DID write their own TRANSLATE() function (therefore my "dbms_standard." junk).
But I don't even want to go there ...
Admin
Probably written by a DBA with a obsessive compulsive disorder
Admin
The biggest WTF here is that it is being done in the database. Doesn't matter what RDBMS it is. This sort of string manipulation belongs in a scrubbing application. It's like building HTML in the database. Why?
Admin
The main problem is that it is done at the wrong level, it should of been implemented as a stored function, so that the 32 other columns could call it. Or it should of been part of the applications job to validate the columns. Also it could of been implemented as a trigger.
The other problem is the way it was done. It could of been done like this (implemented using PostgreSQL, Note I've only known about Stored Procedures for a week, so if there is any little errors, blame it on lack of experience with them):
Admin
I wouldn't be surprised if this code was actually generated by a program, maybe something which translates SQL queries from one database to another...
Admin
I'm not going to read the comments.
I just want to say.. "RegEx primer"
PS this place has the funniest CAPTCHA, thanks :)
Admin
I was working for a very large US-company (which actually should know better), and we did all the system integration in VBScript (!) - where any kind of necessary text encoding cleanup was done like this:
out = replace(out, "ä", "ae") out = replace(out, "Ä", "Ae") ... out = replace(out, "é", "e") ...
etc., etc.. hundreds of lines like this. This didn't even take into account that the replacement Ä -> Ae only works with some languages.
Oh yes, and the target encoding was actually UTF-8...
I quit there after half a year.
DM
Captcha: Java
Admin
I will admit to having written code which did something like this, for a routine which generated XML files from Oracle tables. You see, ASCII 0-8, 11-12, 14-31 are not valid XML characters, so I had to strip them out from the data. I just used a single REPLACE call which was nowhere near as ugly as this example.
Cheers, APC
Admin
slamb: not going to start a YesNo war here. I simply wanted to point out that sometimes there are situations where you need 'do' with the tools you've been offered. Nested replaces are not that terrible IMHO. Sure it can be done nicer/faster/better/whatever, but without proper context I consider it a bit harsh on the author to call this a proper WTF.
Makes me think of situations where you're sent to some company with the mission to convert some mapping from the "mainframe" to "the application server". You're fully briefed and prepared on the functional part, you know plenty of the app-servers architecture and best of all, you get the entire week to 'massage' a measly couple of ten-thousand lines into the new format, easy! Off course the local IT guys will love to help you out ("damn external overpaid consultant I don't have time for"), so they give you terminal access to a 'well-respected data source' (not familiar with AS400 ? Don't worry, it's self-explanatory). After 2 days lost you manage to write the files to disk and FTP them to your server where you then get a 'just-a-bit-more-than-public' account to manipulate and load the data again, only to find they're full of strange characters after you BCP the file in. So you start looking for problems and fix them one by one, hoping it will all be done before the end of the week because you don't want to miss your flight back home. Sometimes getting it working is a matter of economics (read: within the allotted time-frame and budget), not proper design. Sad but true. Does that make it a WTF, imho not, YMMV.
FWIW :
As a general note on this web-site : get real guys, there's way too much elitists (sp?) here. Sure it's cool to write "... I quit the week after that ...", but that doesn't make the problem go away. If everyone would react like that its no surprise that people get replaced by technical "know-nothings" with total lack of "coding-style" and maybe a funny accent, because at least THAT guy got something working the way "the boss" had requested. (sort off). Although I like to think the opposite too sometimes, coding is NOT a call, it's a JOB.
Admin
So this is what happens when you have bad karma: you come back as a SQL programmer writing crap like this.
And "repeated nearly verbatim for the 32 other columns in the table". That really is icing on the WTF cake.
Admin
mmm, I love the way troll smells in the morning