Admin
I suppose you could just SWiG it from C. Good luck on getting that one past an idiot boss such as the OP has. In fact, the idiot boss might actually be right, for once.
Admin
It is possible to write recursive regexes in C++ with boost.xpressive.
Admin
It's not like anybody would notice you getting the language completely wrong.
Admin
You are assuming that my comments were responses to the article. They were not; they were responses to the first guy who did it in C. It wouldn't make sense for me to criticize someone's C code and say "you could have done it better in language X," would it? (Ignoring the fact that the original poster did exactly that.)
I am told that C# supports pointers to some extent. Probably it wouldn't take much massaging to get my little snippet into C#. I wouldn't know. I'm a lowly UNIX sysadmin, not a super-genius programmer. That means I only know C, Perl, shell, etc. So I am told, anyway.
Admin
Why ask for two spaces or more? Replacing one space with one space is perfectly acceptable, and (without having done any benchmarking) I'd expect a simpler regex to work faster.
regexReplace(itemDesc, " +", " ");
strikes me as more sensible.
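For anyone who wants to see what that one-liner amounts to without a regex engine, here is a minimal sketch in C. It assumes a NUL-terminated, writable string, and `collapse_spaces` is a made-up name, not anything from the article's codebase.

```c
/* Collapse every run of one or more spaces into a single space, in place.
   Hypothetical helper: it mirrors what replacing " +" with " " would do. */
void collapse_spaces(char *s)
{
    char *out = s;        /* next position to write to */
    int prev_space = 0;   /* was the last character written a space? */

    for (const char *in = s; *in != '\0'; ++in) {
        if (*in == ' ') {
            if (!prev_space)
                *out++ = ' ';   /* keep only the first space of a run */
            prev_space = 1;
        } else {
            *out++ = *in;
            prev_space = 0;
        }
    }
    *out = '\0';
}
```

It makes a single pass over the string, however long the runs of spaces are.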
Admin
When somebody told you that C# supports pointers to some extent, did you ask them why? And when you should use them?
Thought not. I'm a C and C++ programmer, myself, and I'm damned if I'm going to descend to pointer semantics like "unsafe" if I program in C#.
Best left to idiot SysAdmins who have support from their idiot PHBs. And well worthy of the next WTF.
Admin
So what's your problem anyway? Asperger's?
Admin
From at least that measure, I get that NFA vs DFA isn't a big deal. What is a big deal is either FA-based approach vs. what many real regex libraries (Perl, PCRE, Python, and Ruby) do, which is a third option -- recursive backtracking.
And the point of the article is that recursive backtracking algorithms have extraordinarily poor worst-case behavior, as compared to the Thompson FA construction (either DFA or NFA).
Obviously the recursive backtracking construction is usually OK, or it wouldn't be so prevalent. But it's harder to characterize the performance of such an engine than one based on an FA, and ukslim's original comment, "When you write a regex, you're programming a finite state machine," is flat-out wrong for most regex libraries.
Admin
Got an example of "a large class of very complex regular expressions <that back-references make> much simpler?"
Or are you just bull-shitting?
That's the great thing about blogs. They appeal to the worst in all of us.
You'll note (or perhaps you're too stupid to note) that I didn't specifically object to using PCRE and back-tracking where they make "a large class of very complex regular expressions much simpler."
I'm merely offering links to other possible solutions.
Can I help you with your problem?
Admin
I'm yet to be convinced that a non-Thompson FSA is the way to go. In thirty years of computers, I've come across too many cases where "you don't need to worry about that exponential thang, honey..." right up to the point where you do.
I was just trying to add a few relevant references. They won't help many people. You've parsed stuff (I parsed, you parse, they fuck up). Just hoping it would give you a few good leads ... that's all.
PS I'll go back and look at the thin blue line. Good to have an intelligent response, for once.
Admin
So... did the guy fix it, or did he just quit because there is (gasp) bad code in the company? Even Microsoft has shit code for chrissakes.
If he quit because of that then I think his expectations are too high. And the phrase "many past jobs" would indicate that he is either a very old guy or just some arrogant prick who loves jumping ship every few months.
Admin
What always amazes me is that people producing this kind of code never look at it and think, "Wow, that really looks weird. Can this really be the best approach?"
Admin
Great, I just spit sunflower seeds all over the carpet.
Admin
FTFY
Admin
Three pages of RegEx talk, and no one thought of posting this link?
Where is this all going to end? Did you guys give up reading other sites completely?
Shame on you!
;-)
Admin
There are many scenarios where regular expressions are unnecessarily complicated and unreadable. But this is not one of them. This is exactly the kind of situation where a simple regex is appropriate.
Admin
I reckon the " {2,}" version might be a tiny bit faster. That said, I'd go with yours (" +") for readability unless it proved to be a bottleneck.
Somewhere in the comments is a really nice one: "[ ]+", which is effectively the same expression as " +" but even more readable.
My favourite solution is this one:
Effectively exactly the same expression, but so readable. One I'll be using in future.
Admin
reads thread
sighs
Admin
Say no! gasp!
Admin
Delphi uses := as an assignment operator
Admin
catch(DullAndPredicatableStockResponseException yawn)
Admin
I'd suggest the same about someone unable to understand " +".
Admin
I'd say this is bullshit. See, 24 replacements of " " with " " mean that 2^25 spaces are turned into two spaces. So for two consecutive spaces to be left there must be 34MB! of spaces. I don't believe anyone will actually have 34MB of spaces, really.
For those who don't believe me, imagine 8 spaces. After replacing " " with " " once there will be 4 spaces. Again, there will be 2. It will divide by 2 every time.
Unless, of course, stringReplace would replace only the first occurrence of a string, which would be another wtf in itself.
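The halving is easy to check with a rough sketch in C. This assumes stringReplace replaces every non-overlapping occurrence, scanning left to right. Both function names are made up for illustration.

```c
#include <stddef.h>

/* One pass of a non-overlapping "two spaces -> one space" replace, in place.
   Assumption: every occurrence is replaced, scanning left to right. */
void replace_pairs_once(char *s)
{
    char *out = s;
    const char *in = s;

    while (*in != '\0') {
        if (in[0] == ' ' && in[1] == ' ') {
            *out++ = ' ';   /* a pair of spaces becomes one */
            in += 2;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}

/* Number of passes until a run of n spaces shrinks to a single space. */
int passes_needed(size_t n)
{
    int passes = 0;

    while (n > 1) {
        n = (n + 1) / 2;    /* a run of n spaces becomes ceil(n / 2) */
        ++passes;
    }
    return passes;
}
```

Under that assumption, a run of n spaces becomes ceil(n / 2) after each pass. So 24 passes fully collapse any run of up to 2^24 spaces (about 16.8 million), and only longer runs leave a double space behind. That is half the figure above, but still not something an item description is likely to contain.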
Admin
Hooray for the board replacing two spaces with one! I'm tempted to try 2^25 spaces. Let's try just a few: " ".
Admin
How many stringReplaces does it take to change a lightbulb?
Admin
[0-9]
Admin
On the "goto" issue:
In C/C++ at least, the code will typically end up the same in assembly whether you use a do/while loop or if/goto. They both say the same thing.
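A small illustration, with made-up function names: the two functions below express the same control flow, a loop body followed by a conditional jump back to the top. Whether a given compiler emits identical instructions for both depends on the compiler and optimization level, so treat this as a sketch of the equivalence rather than a guarantee.

```c
/* Sum 1..n with a do/while loop (assumes n >= 1). */
int sum_do_while(int n)
{
    int i = 1, total = 0;

    do {
        total += i;
        ++i;
    } while (i <= n);
    return total;
}

/* The same loop written with if/goto (assumes n >= 1). */
int sum_goto(int n)
{
    int i = 1, total = 0;

top:
    total += i;
    ++i;
    if (i <= n)
        goto top;
    return total;
}
```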
Admin
(Schlemiel the Painter)
Interesting. So part of the inefficiency problem is that strcat(s1, s2) returns s1. For multiple concatenations it would be much better if it returned a pointer to the NUL terminator (called strcat_z() below).
While returning s1 allows you to nest calls, you might as well write s1 since it has the same value:
strcat(strcat(s, "a"), "b");
is the same as the barely longer
strcat(s, "a"), strcat(s, "b");
but
strcat(strcat_z(s, "a"), "b");
would be more efficient.
Further optimizations are left as an exercise for the reader.
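For the curious, a minimal sketch of what such a strcat_z() could look like. It is hypothetical, not a standard function, and like strcat it assumes the caller has made sure the destination buffer is large enough.

```c
#include <string.h>

/* Hypothetical strcat_z: append src to dst and return a pointer to the
   new NUL terminator. The caller must ensure dst has enough room. */
char *strcat_z(char *dst, const char *src)
{
    size_t dst_len = strlen(dst);
    size_t src_len = strlen(src);

    memcpy(dst + dst_len, src, src_len + 1);   /* copies the NUL too */
    return dst + dst_len + src_len;
}
```

Passing the returned pointer as the next destination means the next call's strlen(dst) is zero, so the start of the buffer is never rescanned. POSIX stpcpy() uses the same return convention for copying.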
Admin
TRWTF is that he used regex instead of just enabling global search.
Admin
And, therefore, you are also a nob for the same reasons.
Oh, shit...
Admin
w00t Compilers...my FAVOURITE class. C coding and language theory, awesome.
Admin
I hate RegEx!!! RegEx Die! Die! Die!
how about:
Admin
Wouldn't regexReplace(itemDesc, " +", " "); be slightly slower, though, because it is running extra replaces?
Admin
Any flavor of regex is guaranteed to be more efficient than this if you have more than 3 spaces. A run of n spaces will be replaced in one pass, whereas with this code it would take about log2(n) passes.
Admin
I would imagine that it would be slower, but only negligibly so. I'd leave it in as it's more readable unless it actually proves to be a bottleneck.
Admin
Maybe it's because I have a fairly strong compiler background, but I don't understand the objections to something like "[ ]+". How is a regexReplace call with that as the regex at all hard to understand?
You say the loop is only negligibly slower than the regex, but it's also only negligibly easier to understand, if at all.
Admin
I think the commenter meant that " +" is slower than " {2,}"
Admin
I know sod all about C#, but I do know that in this case the language is perfectly feature-rich for the task. I'd either look up the various string library features, or else I'd settle for a simple regexp.
There are typically more important ways to utilise your mad skillz.
Admin
When your superior does a task in a certain way you shouldn't rip all his code out and replace it with something different. That could make him look bad or even be considered insubordination. If your boss used a large stack of stringReplace()s instead of a single regular expression, he probably had a good reason and you shouldn't question it, and especially not change the whole thing without at least asking him about it respectfully.
And you know what they say about regular expressions...
Admin
Maybe not, but don't you think it might take someone who has seen Delphi or C# code before?
Didn't care. I don't have any intention of using C# any time soon.
Has anyone mentioned you're kind of an asshole?
Admin
Good, at least they use source control.
Admin
I envy all of you that get and understand this programming regex stuff. I just can't seem to wrap my little brain around it all. Maybe I haven't tried hard enough, who knows. It's just depressing to think that it could very well be a mental capacity issue, that's kind of hard to swallow or admit to. Anyhow, it may not be related to mental capabilities so where would be the best place to start learning all of this in your opinions??
Thanks in advance.
Admin
fucking idiot SysAdmins might well be relevant to you, on the other hand. It's just not what I said.
Nobody I give a shit about, no. Did Mommy never tell you that it's impolite to misrepresent other people with fraudulent quotes?
The point is, this is written in C#. I don't like it. You don't like it. Get real. It should be fixed in C#, not in some fantasy reality involving random choices based on your own preference.
You won't be maintaining it.
Admin
In fact, http://lmgtfy.com?q=learn+regular+expressions.
May the burning shame you now feel imbue you with the power to answer your own questions in the future. I see good things for you.
IOW, retarded troll is retarded.
Admin