Admin
Damn and in my app I really needed to get Jesus's birthday verified in a reg -- oh well back to the drawing board
Admin
There ain't nothing regular about THAT
Admin
Well, 0001 would have to be allowed in order to squeeze a date in year 1 (AD or CE, your choice) into a system that assumes the current century if no century value exists (I've actually had to code that sort of thing for historical time-line entries). 0000 does not represent a year, since the year before 0001 is 1 BC (with optional E). That being said --- oy, ve!!!!
Even if the system had no native date support (is there one?), there are easier (and more maintainable) ways of validating a formatted date than RegEx. Okay -- test for and fail on "not-digit, not-virgule" (deliberately not in character group format), or strip (replace with nuthin') before continuing if desired -- but that should be about the end of the game. As powerful as RegEx is, it is also nearly unreadable at the best of times when taken in quantity. Any code monkey coming behind can read (or learn how to read) something short like the "not-digit, not-virgule" example, and can adjust allowable dates using alternative methods (the Boss doesn't like April 7 -- ever). What happens to the RegEx when the boss doesn't like April 7?
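For what it's worth, here is a minimal Python sketch of the approach described above: reject anything that is not a digit or a slash, then let ordinary code do the rest. The function name `validate_request_date`, the `BLOCKED_DAYS` set, and the month/day/year field order are all assumptions made for illustration, not anything from the original system.

```python
import re
from datetime import date

# Hypothetical policy list: (month, day) pairs the boss has vetoed.
BLOCKED_DAYS = {(4, 7)}

def validate_request_date(text):
    """Return a date for 'M/D/YYYY' input, or raise ValueError saying why."""
    # Step 1: fail on anything that is not a digit or a slash.
    if re.search(r"[^0-9/]", text):
        raise ValueError("only digits and '/' are allowed")
    parts = text.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected month/day/year")
    month, day, year = (int(p) for p in parts)
    # Step 2: the date type rejects impossible dates (Feb 30, month 13, year 0).
    parsed = date(year, month, day)
    # Step 3: business rules live in plain code, not in the pattern.
    if (parsed.month, parsed.day) in BLOCKED_DAYS:
        raise ValueError("that day is not allowed")
    return parsed
```

When the boss changes his mind about April 7, the change is one entry in a set rather than surgery on a pattern.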
Admin
This is my all time favourite regexp:
http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html
Admin
Yes, indeed -- that parrot is deceased [:|] Thanks for the link. WOW!
Admin
Blink Blink,, Waaaaaaaaaaa[:'(]
Admin
Obviously, the built-in date validation routines should have been used; much easier to tell exactly what he's trying to do that way. If there weren't validation routines [and there were], the more verbose approach with if statements is a million times easier to comprehend, test, and change.
That said, I did read through the RegEx, and, on a first pass, it looks like it will do what I'm guessing was intended. As Matt says, even granting the context of using a RegEx for this, the mixture of non-capturing and capturing groupings when none are backreferenced [and none look reasonable to be backreferenced] is at least somewhat of a WTF. The weird year construction seems to be due to the fact that I don't know of a way to specify a multiple character string to not match in the middle of a RegEx that you are trying to match; he couldn't say "match any four digits in a row, except 0000." Of course, if year != "0000" is very easy and should be a clue to approach the problem in a different way.
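On the "any four digits except 0000" point: engines that support negative lookahead (Perl, .NET, and Python's `re` among them) can express this directly. A small sketch in Python, where the names `YEAR` and `is_valid_year` are made up for the example:

```python
import re

# (?!0000) is a negative lookahead: the match fails if the next four
# characters are "0000", and the lookahead itself consumes nothing.
YEAR = re.compile(r"(?!0000)[0-9]{4}")

def is_valid_year(text):
    """True for exactly four ASCII digits other than 0000."""
    return YEAR.fullmatch(text) is not None
```

That said, comparing the captured year to "0000" in ordinary code is arguably still easier to read.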
Admin
The point was, this is all for a company Intranet site, and we certainly haven't been in business long enough to worry about allowing any numbers into a "request date" form before this millennium, let alone two thousand years ago.
Admin
The built-in validation routines are tightly nestled into system.web, so you would not want to use them in a desktop app
Admin
How odd - why not? Breathes there a desktop so finely tuned that its user has deleted all the system libraries that aren't relevant to the precise setup? Breathes there a language programmer who hasn't read up on smart linking and dead code elimination? Surely including a few routines from one library wouldn't also include ten million others unless they were directly referenced?
(I ask from a position of unaccustomed ignorance here, since the languages I use are smarter than this, but I realise it's possible they're not normal.)
Incidentally, obWTF: {^(\d+)[-./](\d+)[-./](\d+)$} should do the trick; anything further inside a regex is a sign of a diseased mind. Validate data in code, not in your regexes; that, or wait for Perl 6, which looks like it finally fixes regexes permanently.
Admin
(Sigh... this forum software is really broken, you know that? Anyone else seeing my last comment in Flyspeck 3pt?)
Admin
another WTF
Admin
There are other options, obviously. In VB.NET, IsDate would be an obvious choice (and C# has similar functionality).
Admin
Evidently you have not learned what the word 'preview' means.
Admin
For some reason, this one reminds me of BrainFuck. [:S]
Admin
It's fast! You can't argue with fast! (It's a good argument that the grammar of the address field is too complex. It's a good argument for committal of the regex creator, too.)
One of the biggest changes in Perl 6 will be turning regexes from an increasingly overburdened and complicated grammar into something more context-based. Sure, it won't all fit into 400 characters of line noise, but it also won't have to be recreated every time it needs to change because even the author needs an hour to puzzle it out later.
btw, tinyanon, you should use \d{1,2}[./-]\d{1,2}[./-]\d{2,4} etc. Unless months with hundreds of days are now in vogue. ^_~
Admin
99% of the time you need to parse the date anyway, so just stuff it into DateTime.Parse and see if it complains ;) Faster than validating, then parsing XD
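The same "parse it and see if it complains" idea, sketched in Python rather than .NET. This uses `datetime.strptime` as a stand-in for DateTime.Parse; the `FORMATS` tuple and the `parse_date` name are assumptions for illustration only.

```python
from datetime import datetime

# Assumed input formats; adjust to whatever the form actually accepts.
FORMATS = ("%m/%d/%Y", "%m-%d-%Y", "%m.%d.%Y")

def parse_date(text):
    """Return a date if any assumed format parses, else None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format complained, try the next one
    return None
```

The caller gets the parsed value and the validity check in one step, which is the whole appeal.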
Admin
Actually I can argue with fast. Who cares how fast it is when nobody other than the author can understand it? From my point of view its effective performance is ZERO.
http://www.codinghorror.com/blog/archives/000185.html
code that makes sense is code which can be analyzed and maintained, and that makes it performant.
Admin
Fug smucker, aren't you?
Yep - first time I've ever posted without previewing. Goes to show.
Seems to be a bug relating to indented text. Lemme check... Nope. Maybe it was the italics at the start... Nope. Must be related to selecting a bit of text from a parent message and pasting it in. I could probably debug that... Naah.
Admin
Word, bat.
I still haven't figured it out either... But in all (or at least most of that I can recall) the cases where it happened to me, I didn't copy, paste, or quote anything that I can recall. Just clicked "Reply", typed in some text, and clicked "Post". I see no need to have to preview "trivial" postings just to make sure the forum software (or more specifically, the edit control) didn't screw me in the process...
Admin
Actually, after thinking about it more, I did in fact cut and paste a few times without remembering to paste into notepad and recopy. That must be related...
Ok, so I'm a doofus who should preview his posts. Fair enuff.
Admin
Don't worry, the RSS gateway is a bit munted too -- not only can't you see any of the comments, but every few days I mysteriously get a duplicate copy of the last 11 posts, for no readily apparent reason.
(And I'm fairly sure it's not my reader, because I'm subscribed to a number of feeds and this is the only one it happens to.) Sigh.
Admin
I get the duplicates too - Thunderbird.
Admin
Better yet: \d{1,2}([./ -])\d{1,2}\1\d{2}(\d{2})?
Which forces the date separator to be the same either side of the month, and only allows 2 or 4 digit years. Of course, neither solution restricts months or days to be valid.
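The same-separator idea can be sketched with Python's `re` as below. This is only a shape check under the assumptions stated in the comments, and `DATE_SHAPE` and `has_date_shape` are illustrative names.

```python
import re

# Group 1 captures the separator; \1 requires the same one again.
# The hyphen sits last in the class so it is a literal, not a range.
# The optional trailing group allows a 2- or 4-digit year, never 3.
DATE_SHAPE = re.compile(r"[0-9]{1,2}([./ -])[0-9]{1,2}\1[0-9]{2}(?:[0-9]{2})?")

def has_date_shape(text):
    """True if text looks like d/m/yy or d/m/yyyy with one consistent separator."""
    return DATE_SHAPE.fullmatch(text) is not None
```

As noted, this still accepts nonsense such as 99/99/9999; range checks belong in code after the match.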
Admin
And of course, neither of those actually validates an international ISO standard date.
Admin
...which is exactly what?
Admin
If you download and read the source code for module Mail::RFC822::Address you will notice that it is quite easy to read and understand, presuming that you have some understanding of regular expressions. The big beast on that page is only for display purposes.
Admin
Here's a doc on the subject: http://www.cl.cam.ac.uk/~mgk25/iso-time.html
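A rough Python sketch of checking ISO 8601 calendar dates, covering only the extended form (YYYY-MM-DD) and the basic form (YYYYMMDD). Week dates, ordinal dates, and times are deliberately left out, and `parse_iso_date` is a name invented for this example.

```python
import re
from datetime import date

# Extended (YYYY-MM-DD) and basic (YYYYMMDD) calendar-date forms only.
ISO_DATE = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})|([0-9]{4})([0-9]{2})([0-9]{2})"
)

def parse_iso_date(text):
    """Return a date for a valid ISO calendar date, else None."""
    m = ISO_DATE.fullmatch(text)
    if m is None:
        return None
    # Only the three groups from the alternative that matched are not None.
    year, month, day = (int(g) for g in m.groups() if g is not None)
    try:
        return date(year, month, day)  # rejects month 13, Feb 30, year 0
    except ValueError:
        return None
```

The regex only establishes the shape; whether the day exists is left to the date type.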
(I'm starting to understand the gripes about the forum software. Is it /really/ necessary to use a bleeping word processor to compose these posts? Not to mention one that doesn't even work in one of the popular alternatives to that other wtf that people use to infect their computers with spyware.</rant>)
Admin
Now I don't know much about RegExp but does this code try and validate for 29th Feb only on leap years? If so, is it doing it properly (every 100 years it's not a leap year unless it's also a multiple of 400 see: http://www.codeproject.com/datetime/leap_year.asp) or just the 'is the year divisible by 4' rule?
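For reference, the full Gregorian rule mentioned above fits in one line of ordinary code. A minimal Python sketch (the function name is just for illustration):

```python
def is_leap_year(year):
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

So 2000 was a leap year and 1900 was not, which a plain "divisible by 4" test gets wrong. Python's standard library offers the same check as `calendar.isleap`.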
Admin
I can assure you, writing the check hardcoded will be a lot faster [:)] (at least if you're working in a compiled language and not an interpreted one where the regex is a native library; then it could become close, depending on the language)
Anyway - I hate regex in code, it's a script thing, a quick hack, a commandline tool, but please not in code... It's hell to debug or extend something like that.
For me it's the same as invoking a perl-interpreter to execute a small perl script because that particular thing is easier to write in perl, and sadly enough - I can't say I haven't seen such practices. Ok - the guy that did that was nice enough to add a comment where he explained what the perl script did, but it was slight overkill to add a complete perl-installation to a windows-client program that was supposed to be "lightweight"... His argument was also "yeah but perl regex is fast"... [:@]
Admin
that crazy address regex could be simplified greatly by splitting it up into sensible parts, like:
the /o on the end of the last one means it'll only be compiled once, so this should be no slower.
(and yes, I know it's only supposed to be an example of why you don't want to do it by regex)
Admin
Yep. It's been a little while since I broke the regex down and tried to figure it out (and submitted it here) but I believe one of the three main groups in the expression was devoted solely to that.
Admin
I appreciate that this is kinda missing the point but there was no year "0000" but there was a year "0001" so that feature is kinda ok depending on whether we want dates going back that far...
Admin
Actually, this is the coup-de-taut from Mastering Regular Expressions, so it's supposed to be an example of RegEx zen. I believe the point (if memory serves me correctly) is that it doesn't have any NFA-style rollbacks, so it's really fast and doesn't end up consuming a lot of memory in the process of NFA-to-DFA conversion by a regex compiler.
Admin
I'm sure you mean a coup d'état. Okay I have no idea where that link is gonna go...
I'm all for using regular expressions instead of a series of 'instr' commands in VB.Net. Just don't make them too complex. It needs to be maintainable too.
Drak
Admin
The main thing I hate about using regex in code is that you have to escape so many of the metacharacters (e.g. " becomes \", or even worse, the \s in the expression become \\) that it becomes a nightmare to separate the actual regex from the mangling you had to do to get it into a string variable.
Thank god C# allows the literal string construct (prefix with @), so it is no longer quite so bad for me.
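Python's raw string literals (prefix with r) play roughly the same role as C#'s @ strings for this purpose. A tiny sketch showing the same pattern written both ways; the constant names are made up for the example:

```python
import re

# The same pattern twice: once with doubled backslashes in a normal
# string literal, once as a raw string where backslashes are left alone.
ESCAPED = "^\\d{1,2}/\\d{1,2}/\\d{4}$"
RAW = r"^\d{1,2}/\d{1,2}/\d{4}$"

# Both literals produce the identical pattern text.
SAME_PATTERN = ESCAPED == RAW
```

Only the second form looks like the regex you would write on paper.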
Admin
Actually, I think he meant coup de grâce.
Admin
I prefer a coup soleil, though... [:P]
Admin
I don't know about "coup de grace" either -- which is usually defined as a mercy stroke, designed to kill a (usually) badly wounded foe who would suffer unnecessarily otherwise. (A death blow given in other contexts may be wrongly termed a coup de grace in English, but it misses the whole "grace" part of the deal.) Coup d'état, a sudden, violent overthrow of the government, is definitely wrong. Coup de génie (stroke of genius) may fit, but it's hardly a common find in English, as would chef d'oeuvre (masterpiece). The most probable fit for French-originated-but-common-in-English phrases would be "tour de force"; the effect it has on you may be likened to a "coup de foudre".
Admin
I think he didn't know what he meant. [N]
Admin
No need for the /o. The great thing about qr// is that it precompiles regexe(s|n)... :)
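The general idea of building a big pattern out of named, precompiled pieces can be sketched in Python too. To be clear, this is not the Perl code from the earlier post, and the grammar here is deliberately simplified; it is nowhere near the RFC 822 address grammar. All names are invented for illustration.

```python
import re

# Deliberately simplified pieces; NOT the RFC 822 grammar.
ATOM = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
LOCAL_PART = rf"{ATOM}(?:\.{ATOM})*"
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
DOMAIN = rf"{LABEL}(?:\.{LABEL})+"

# Compiled once at import time and reused on every call.
ADDRESS = re.compile(rf"{LOCAL_PART}@{DOMAIN}")

def looks_like_address(text):
    """Loose shape check only; a real validator needs a real parser."""
    return ADDRESS.fullmatch(text) is not None
```

Each piece can be read and tested on its own, and the compiled object plays the part that qr// plays in Perl.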
Admin
I believe the point of the forum software is to get you in the mood for a proper appreciation of the collection of wtfs.
I didn't know ISO supported leaving the dashes and colons out. Nice, the Exslt and XPath specs never go over that, and probably don't support the full 'standard'.
Why the hell would anyone include a perl binary/installer with a compiled project? wtf? PCRE exists for a reason, and will definitely be much faster than marshalling arguments into perl scripts, calling perl, and (sometimes) getting the results back. Just call it all from C/C++ and be happy.
Admin
Perhaps he just meant 'Coup':
coup (ko͞o)
n. pl. coups (ko͞oz)
Sorry, I overlooked the fact that d'etat wasn't in there for definition 1 the first time round[:S]
Drak
Admin
Re: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html
I once worked at one of the first companies to offer design-your-own egreet. Well, it was a pay service, and the folks using it weren't too hip to tech, and we really really wanted people to get the messages (so we'd have happy customers).
I was tasked with writing an email address validation routine that would specify exactly what was wrong with the address.
It checked parts, it validated domain names, etc.
It would return messages like "the domain name (after the @ sign) must not begin with a numeral."
It took about 3 days and was about 400 lines of VBScript.
I'm not sure if it actually sold any more cards...
It wasn't until much later that I realized a parser was probably already available (though doubtfully in VBScript).
Admin
I saw the ex-parrot URL and then read the next line as being about a company that offered a "design-your-own egret". I skimmed the rest of the comment looking for other references to birds, assuming this was some meme new to the blogosphere and before I knew it I'd be knee-deep in obscure species of winged creature. All your geese are belong to us?
Topic? What topic?
Admin
There is, in fact, nothing wrong with a domain name starting with a numeral.