• ray10k (unregistered)

    Looks like some swears were left uncensored in the article.

    Other than the list of swears, what's the WTF here? I mean, are we supposed to laugh about the narrow subset of all swears shown here? The fact that it's in a hard-coded array rather than a file/database? The fact that the body has to be processed once for each swear?

    Just looks a little underwhelming. Worth a chuckle for the list of swears, but not really WTF-worthy IMO.

  • (nodebb)

    Probably that they didn't use a regex and used a loop instead, since that thing is literally asking to get regex'd. The first one could be /sh(#|!|@)t/ (# being the block).
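    A minimal sketch of that idea in Python, assuming the goal is to catch a handful of symbol-substituted spellings with one pattern. The character class below is equivalent to the alternation in the comment, with a plain "i" added; the exact symbol set and the function name are assumptions for illustration, not the article's list.

```python
import re

# Hypothetical pattern: "sh", then "i" or one stand-in symbol, then "t".
# The symbol set is an assumption, not taken from the article's array.
OBFUSCATED = re.compile(r"sh[i#!@]t", re.IGNORECASE)


def contains_obfuscated(text: str) -> bool:
    """Return True if any of the spelling variants appears anywhere in text."""
    return OBFUSCATED.search(text) is not None
```

    One pattern replaces several array entries, and the IGNORECASE flag also covers upper-case input. It is still a plain substring search, so it inherits every false-positive problem of the loop.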

  • Fristo (unregistered)

    Nice censoring fool

  • timmy (unregistered)

    I like how Fuck and shit are so offensive that they require 'blocking' but cunt is OK.

    Next thanksgiving, while sitting around the table, try saying "holly shit" and check the reaction. Then try "You cunt". I bet one gets a stronger reaction than the other!

  • (nodebb)

    i don't see how count is initialised

  • OlPeculier (unregistered)

    That's a bit wank.

    Especially if you live in Scunthorpe or Cockermouth and want to talk about blue tits.

    Arsenal supporters must be happy.

  • Some guy (unregistered)

    Clbuttic

  • Talis (unregistered) in reply to Some guy

    Be careful - you could be buttbuttinated for that! :-p

  • Pim (unregistered)

    And if you use strpos to test for "shit", you don't also have to test for "shitty" and "shittiest", because you've already covered that.
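    To illustrate the redundancy, here is a small Python sketch that finds list entries already covered by a shorter entry. Python's `in` operator stands in for a substring test such as `strpos`; the function name and the placeholder terms are assumptions for illustration.

```python
def find_redundant(terms):
    """Return the terms that already contain another term from the list.

    With substring matching, any text that contains such a term also
    contains the shorter one, so the longer entry never changes the result.
    """
    redundant = []
    for term in terms:
        if any(other != term and other in term for other in terms):
            redundant.append(term)
    return redundant


# Placeholder list; the article's actual array is not reproduced here.
print(find_redundant(["darn", "darned", "darnedest", "heck"]))
# prints ['darned', 'darnedest']
```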

  • Herr Otto Flick (unregistered)

    Where is the WTF? Titter titter, naughty words?

    I've had to do this for URLs containing randomly generated base64 auth tokens, some users were occasionally not receiving the redirects. Turned out Cluecoat was seeing a "fuck" or a "tit" or (if very lucky) a "titfuck", dropping the response and returning an empty 200 response.

    We just check each URL generated for naughty words, and if there are any re-generate the URL.
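    A rough Python sketch of that check-and-regenerate approach, assuming URL-safe base64 tokens built from random bytes. The blocked terms, function names, token size, and retry limit are all placeholders, not details from the comment.

```python
import base64
import secrets

# Placeholder terms; a real deployment would use whatever its proxy objects to.
BLOCKED = ("darn", "heck")


def is_clean(token: str) -> bool:
    """Return True if no blocked term occurs in the token, ignoring case."""
    lowered = token.lower()
    return not any(term in lowered for term in BLOCKED)


def generate_token(num_bytes: int = 24, max_attempts: int = 100) -> str:
    """Generate random URL-safe base64 tokens until one passes is_clean."""
    for _ in range(max_attempts):
        raw = secrets.token_bytes(num_bytes)
        token = base64.urlsafe_b64encode(raw).decode("ascii")
        if is_clean(token):
            return token
    raise RuntimeError("could not generate a clean token")
```

    Here substring matching is the right tool: the aim is to avoid anything a downstream filter might match, so false positives only cost one extra random draw.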

  • Oliver Jones (google)

    This code isn't even standards-compliant. https://en.wikipedia.org/wiki/Seven_dirty_words

  • JustSomeDudette (unregistered)

    We have one of those in our legacy code but the programmer also decided our company name was a swear word too.

  • Hanzito (unregistered)

    It's not only clbuttic, checking for a substring match for fucking after checking for fuck is superfluous. I don't think it catches uppercase, though.

  • Wes (unregistered)

    Is not censoring the c word a Hot Fuzz reference? http://i.imgur.com/MggBg.jpg

  • EatenByAGrue (unregistered)

    Worst of all, this doesn't catch "Semprini".

  • EatenByAGrue (unregistered) in reply to timmy

    "Next thanksgiving, while sitting around the table, try saying "holly shit" and check the reaction."

    I actually knew a guy who accidentally blurted that out in reaction to a loud noise ... at St Peter's Basilica during a Papal Mass. He apparently got away with it.

  • chreng (unregistered) in reply to EatenByAGrue

    Nor WTF

  • Paul Neumann (unregistered)

    Funny enough, that looks like the opposite of one of my SpamAssassin (SpamButtButtin?) rules. If an email DOESN'T contain similar words it probably isn't intended for me!

  • Mr Lister (google) in reply to EatenByAGrue

    A loud noise? Oh, that would have been the Pope farting. Holy shit indeed.

  • Dave (unregistered)

    There's a group of us who have post-traumatic stress from having to explain to our CEO why the CRM routine from his anointed favourite shadow IT person refused to contact people in Scunthorpe.

  • (nodebb)

    That code is a $pos.

  • Developer Dude (google)

    That is a bunch of bovine fecal matter!

  • (nodebb) in reply to ray10k

    This supposedly clean comment I'm writing right now would get scrapped because I made the mistake of using the word "scrapped."

    See the big problem now?
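    The false positive can be reproduced in a few lines of Python. This sketch compares a plain substring test with a whole-word test on the same sentence; both helper names are hypothetical.

```python
import re


def substring_hit(text: str, term: str) -> bool:
    """Substring test, the kind of check the article's loop performs."""
    return term in text.lower()


def whole_word_hit(text: str, term: str) -> bool:
    """Match the term only when it stands alone between word boundaries."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


comment = "This comment would get scrapped"
print(substring_hit(comment, "crap"))   # prints True
print(whole_word_hit(comment, "crap"))  # prints False
```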

  • I dunno LOL ¯\(°_o)/¯ (unregistered)

    I have a sad story to tell you
    It may hurt your feelings a bit
    Last night when I walked into my bathroom
    I stepped in a big pile of ...shhhhh . . . aving cream,
    be nice and clean. . . .
    Shave ev'ry day and you'll always look keen.

  • Carl Witthoft (google)

    and my favorite anti-sanitizer: 'sofa king'

  • Ulysses (unregistered)

    Lol @ the company name as a swear word. A better list would be of fruity Britishisms: lorry, lollipop lady, trolley, fortnight, wank, colour, at around, maybe be, etc. :D

  • DeezNuts (unregistered)

    Good ol' OpenCart.

    What a cluster fark that code base is.

    Source: Was forced to code in it for 6 months.

  • Loren Pechtel (google)

    Programmers should be embar***ed to write stuff like this.

    (That's what a prudish website did to me once.)

    And what about the batter who mishit the ball?

    Or the decorator that wants some crape?

    And can't the casino advertise its craps game?

    And the scrapper can't pick up the scrap?

    And the scrappy painter can't scrape the wall?

    And the builder can't build a skyscraper?

    You might have a crap problem if you can't get a ballcock for your toilet.

    It's a cockamamie idea that Scunthorpe is the limit of the problem.

    And how am I supposed to get a cockatoo from the pet store?

    And how am I even supposed to get there with no pilot in the cockpit?

    At least he's not going to get shot for it because the gun wasn't cocked.

    On the other hand he can't eat because the fisherman didn't bring in any cockles.

    Nor can he drink because there are no cocktails.

    He can't even enjoy the scenery because there's no peacocks about.

    Nor can he play because there's no shuttlecock in the badminton set.

    (Note: There are a lot more words than this. I only took ones that were common enough to be in Firefox's dictionary, were words I recognized, had no profane implications and only one per theme.)

  • löchlein deluxe (unregistered)

    About three years back, Big Brown Soda Corp. did a print-your-own-label publicity thing, including a publicly readable JSON list of banned terms, so you couldn't "Have a brown one with Hermann Göring", nor "have a brown one with diarrhea", nor about two thousand other terms which were obviously chucked into the bin of ban by some poor temp-sty inhabitants, tyops and all.

  • Kaewberg (unregistered)

    They're not checking that the words are surrounded by whitespace. That's a clbuttic misstake.

  • Leo (unregistered)

    So mentioning tits is prohibited, but mentioning a single tit is OK?

  • I dunno LOL ¯\(°_o)/¯ (unregistered) in reply to Loren Pechtel

    The decorator better be wanting crepe or she won't find anything but... scraps.

  • Mikey Dread (unregistered)

    Tits like coconuts, sparrows like worms

  • Loren Pechtel (google) in reply to I dunno LOL ¯\(°_o)/¯

    "Crape" is an old spelling of "crepe". The decorator is ok.

  • Chuck Ritter (unregistered)

    Last year, I was looking for a new job. Of course, that means you have to deal with half a gajillion poorly designed HR websites.

    One of them had a filter on it that would look, not just for cuss words, but anything that might be offensive or pejorative. It flagged my resume and would not let me post it until I replaced the word "Amazon" with "AWS".

    We really need to get away from this shit. Let people be who they are and speak as they will.

  • Herr Otto Flick (unregistered) in reply to Hanzito

    "It's not only clbuttic, checking for a substring match for fucking after checking for fuck is superfluous. I don't think it catches uppercase, though."

    It's not at all clbuttic, there is no replacement going on here, it is simply checking if a string contains words which might be considered swears.

  • Shmoopy (unregistered) in reply to Helix

    count is a function so...

  • Lee (unregistered)

    Fresh out of uni I wrote some password reset functionality into an internal application used by my employer. It would send an email asking for confirmation by clicking on a link, and would then send out a temporary password in a second email. (This was an old Oracle PL/SQL / application server environment, there are probably far better ways of doing it today). I put in a check for bad words against a table and a colleague and I inserted as many as we could think of.

    Anyway said colleague and I tested it and were fairly confident about its operation so we went to show the end result to the manager. He clicked on the password reset link, got the confirmation email, clicked on the link, got the second email...

    "Hi Manager,

    You've requested to reset your password for ApplicationX

    Your new temporary password is: babyrape

    You may now use this to log on at http://applicationx.intranet.company.com/

    Regards"

    "I think you need to add some more words to the bad-word list" ... was all he said.

  • Gary (unregistered)

    Looks like what happened to an SMS application I worked on years ago for a previous employer. It was determined by someone else that the SMS messages that the company sent, which were created in the software, needed a dirty word filter even though the SMS provider already did all of that. The dirty word filter needed a server and a list of dirty words and every time an SMS message was sent, several times a second, the message had to be sent through the dirty word filter before it was sent to the SMS provider. I dropped out of that project before it was put into operation.

  • KD (unregistered)

    Had a similar list at a place I worked at. Oddest entry was 'Ramjam D Funky Boogaloo Smythe'. Finding that amongst all the generic swearing etc raised quite a titter.

  • Howard Lewis Ship (unregistered)

    I actually did something like this for a project that generated six character verification codes as part of a notification system. Anyway, six characters is enough to stumble on some four letter words, so I created a namespace (this was in Clojure) that listed all the banned names along with a comment to the effect of "Yes, I know creating something like this can be a career limiting move!".

  • Ashley Sheridan (unregistered) in reply to Kaewberg

    No, no, no! Don't check for whitespace surrounding words you're trying to match, if you're going to try doing it properly at least look to match against line endings and word boundaries. Otherwise tit-fuck would be allowed.

    Mind you, this kind of blind approach is just begging for someone to get around it in 5 seconds by writing stuff like t1t or c0ck. If you're going to use regular expressions, do it and check against all characters that could be used.
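    One way to sketch both points in Python: fold common character substitutions back to letters first, then match on word boundaries rather than whitespace. The substitution table and the placeholder terms are assumptions for illustration, and real evasion uses far more characters than this.

```python
import re

# Assumed substitution table; deliberately small and incomplete.
LEET = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text: str) -> str:
    """Lower-case the text and undo the substitutions listed in LEET."""
    return text.lower().translate(LEET)


def matches(text: str, terms):
    """Return the terms found as whole words in the normalized text."""
    normalized = normalize(text)
    return [
        term
        for term in terms
        if re.search(r"\b" + re.escape(term) + r"\b", normalized)
    ]


print(matches("h3ck-darn", ["heck", "darn"]))  # prints ['heck', 'darn']
print(matches("hecklers", ["heck"]))           # prints []
```

    The hyphenated input is caught because `\b` treats the hyphen as a boundary, which a whitespace-only check would miss. The trade-off is that normalization also rewrites legitimate digits and symbols, so it can create new false positives.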

  • Tomasz Struczyński (unregistered) in reply to Herr Otto Flick

    There are several.

    First of all, they only check for occurrences of strings, not words. There are many words containing 'bad' parts (as Loren Pechtel pointed out).

    Secondly, there is no regular expression. This one connects with the first: searching for whole words (using word boundaries) and compiling several spellings into one pattern (sh(i|[^a-zA-Z])t as a quick and dirty example) would make this script better.

    Third, the loop. This is not a WTF, but it can be avoided (possibly improving performance) by using a regular expression. You make one regexp by concatenating all array elements with '|' and surrounding the result with word boundary markers. I am pretty sure that this will be faster than a loop.

    And last, but not least: reinventing the wheel. A quick Google search found https://github.com/snipe/banbuilder , a library doing just what the authors wanted to achieve, but in a more elegant manner (and in more languages).
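    The single-pattern idea from the third point might look like this in Python. This is a sketch with placeholder terms and hypothetical names; whether it actually beats the loop is not measured here.

```python
import re


def build_pattern(terms):
    """Join all terms into one alternation wrapped in word boundaries."""
    # Escape each term so characters like '.' or '$' are matched literally.
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


# Placeholder list standing in for the article's hard-coded array.
PATTERN = build_pattern(["darn", "heck"])


def is_flagged(text: str) -> bool:
    """Return True if any listed term appears as a whole word."""
    return PATTERN.search(text) is not None
```

    Compiling once and reusing the pattern means the body is scanned a single time instead of once per term, and the word boundaries address the first point at the same time.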
