- Feature Articles
- CodeSOD
-
Error'd
- Most Recent Articles
- Secret Horror
- Not Impossible
- Monkeys
- Killing Time
- Hypersensitive
- Infallabella
- Doubled Daniel
- It Figures
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
I guess changing the format string would have been to simple.
Admin
String constructs have been around for a while. Can we stop teaching character arrays in language classes/books?
Admin
Hear hear! I am sick of CIS classes that teach language agnostic code in favor of OO concept but fall back to pointless character array massaging that has nothing to do with the implementation of a String object anyway. I have been picking this bone for quite a long time.
Admin
Anding with 0xFF wasn't very necessary either.
Admin
Another reason not to use char arrays:
char tempStr[16];
Is an overflow waiting to happen. The input could be 18 chars long.
Admin
The natives must have taken one look at this and went "Ooh. Scary!" then toddled off and wrote a CS101 implementation of a replace (remove) routine.
Admin
It's a one-off function, probably copy-pasted from elsewhere and unlikely to every be called anywhere else... although I'll grant there's a chance one of these code-monkeys could end up calling RemoveColonsFromMacAddr to remove the colons from a path or something.
Wow, it must have been lovely coming back to find a bunch of twits have hacked in whatever they needed with a crowbar and dynamite.
Admin
But of course it was necessary! What if they were going to run this code on a computer designed in the 50s or early 60s*, before a 'byte' became universally defined as 8 bits**? Machine independence aside, let's say I did
this:
char macstr[18]; // 01:23:45:67:89:AB + \0 terminator
If one of my bytes was greater than 255, it would overflow my buffer, smash the stack, and someone could take control of an networked machine that may well have almost as much computing power as a barcode scanner!
* Ignore the fact that these machines preceded Ethernet by at least a couple of decades.
** See http://dict.die.net/byte/
Admin
I agree. Learning how to use arrays can't hurt, and character arrays can be good for demonstrating array functionality, but aspiring programmers should learn how to use what's already out there (string libraries) before they learn how to circumvent it (character arrays).
Admin
Well, apparently original code was in C so whoever maintained that code had no choice really.
Admin
Most of the responses here seem to be a WTF. The WTF here has nothing to do with the WAY the colons are being removed. This is C. IT DOESN"T HAVE STRINGS. The only way to remove the colons is to do it on a character by character basis.
The WTF is that the colons are added by the sprintf format string:
Admin
<FONT face="Courier New" size=2>wow. this is how it happened:</FONT>
<FONT face="Courier New" size=2>customer: "our application wants to read in the mac address as a string without colons."
project manager: "yes, of course. we'll get on that right away! steve-dave, the customer wants the format of the mac address changed. the new requirement is that the mac address string should have no semi-colons. top priority! can you get this done by 5:00?"
steve-dave (looks at watch, reads 4:28): "sure thing!"</FONT>
<FONT face="Courier New" size=2>meanwhile, inside steve-dave's head:</FONT>
<FONT face="Courier New" size=2>"hrm. hexadecimal. those consultants know their shit!"
"arg. there's no way i can figure out this sprintf crap."
"they're using pointers, too! this is impossible."
"4:47!? shit shit shit."
"ah! i got it! i'll just loop through and take out the colons!"
"there, all done. check-in...done! i'm outta here!"</FONT>
<FONT face="Courier New" size=2>monday morning:</FONT>
<FONT face="Courier New" size=2>project manager: "laird, we're getting tons of bug reports about the app crashing. get right on it!"</FONT>
<FONT face="Courier New" size=2>later that day:</FONT>
<FONT face="Courier New" size=2>laird: "i can't find the problem. the system looks hosed. it's like some kind of fundamental function was changed to do something entirely different."
project manager: "noooo! we need to get the consultants here right away! i just knew they had ripped us off with their buggy system!"</FONT>
<FONT face="Courier New" size=2>later that week:</FONT>
<FONT face="Courier New" size=2>consultant: "what the..."</FONT>
Admin
I've had to use character arrays occasionally for input sanitisation...one more level of security to make sure someone wasn't typing "/#hahaha you suck in the password field in hopes of whacking a SQL statement.
Oooo, scary memory surfaces. I had to work with a TurboImage/XL database for a couple of years once...One of the old flat table jobs, where the "fields" are only defined in terms of numbers of characters, so you know that the "name" field starts at position 23 and ends at position 62, and every line of data fetched had to be split into it's components to be used. Talk about a nightmare.
No doubt some of those things are still around, so it's probably not time to kill characters just yet.
Admin
There are many good reasons to use a character array, size and speed being the two that immediately come to mind. For my current application, my code needs to compile to be as small as possible. Objects such as std::string or CString take a considerable amount of behind the scenes code to manage, compared to just using a built-in type. Besides the cost of actually using the library (STL Port's version of std::string adds something like 1.5K into your executable), the code associated with creating and destroying objects is considerable. I'm only talking about 40 or so bytes in this example, but 40 bytes extra in 1000 places would more than double the size of some of my executables.
The following code is considerably smaller (less than half the size with Dev Studio 6):
char *bob = new bob(strlen(bobA)+strlen(bobB)+strlen(bobC)+1);
strcpy(bob,bobA);
strcat(bob,bobB);
strcat(bob,bobC);
...
delete [] bob; //This is only bad if you need to throw exceptions or return in many places
than if you use an object:
std::string bob(bobA);
bob += bobB;
bob += bobC;
which is smaller than:
std::string bob = bobA + bobB + bobC; //This creates a temporary std::string to assign to bob
I'm guessing size was not a consideration in the code above though. Clearly they could have made their code smaller by removing tempStr completely and doing an inplace move in macAddrStr :
Admin
Sure. Just as soon as we stop teaching low level languages that don't HAVE strings.
Admin
Are you sure? I think each of those arguments are passed to sprintf as ints.
Admin
Depends whether the utf8 type is signed or not.
Admin
I'm wondering if all the code presented to the public is written by internal developers.
Admin
As long as there is hardware which does not directly implement string constructs, we will need character arrays.
Admin
Well, it's not utf8, it's uint8. Which conventionally mean unsigned 8-bit int... Which is usually an unsigned char. And if that's the case, then it works... but not because it's unsigned, but because it's 8 bits and passed by value.
But the fact that I had to spend 10 minutes figuring it out means that putting the 0xFF is worthwhile, since it removes all doubt.
Admin
Then the code wouldn't compile, because there's no way to get an uint8 type on that platform. Simple.
Admin
You've missed the point: you're optimizing code that shouldn't be there in the first place.
Admin
I just wonder where all these horrible programmers come from.
Admin
Here is a better idea that is exception safe:
char *bob = (char *)alloca(strlen(bobA)+strlen(bobB)+strlen(bobC)+1);
If you want to make it fast you could:
size_t alen = strlen(bobA);
size_t blen = strlen(bobB);
size_t clen = strlen(bobC);
char *bob = (char *)alloca(alen + blen + clen + 1);
strcpy(bob, bobA);
strcpy(&bob[alen], bobB);
strcpy(&bob[alen + blen], bobC);
bob[alen+blen+clen] = '\0';
Exception safe, faster than any string class, and cannot leak memory. Course the first non-expert programmer to come along will not understand it, go WTF, and introduce a lot of bugs. (Actually since I didn't test this I wouldn't be surprised if I wrote a bug in there, even though I've done essentially this more than once)
If you want to do it right:
No bugs (other than maybe a syntax error the compiler would catch), hard to mess up. Works are all modern operation systems except Linux because the glibc guys are convinced that real programmers never make a mistake in C strings, so they won't give you the safe strl* functions for handling strings.
As you might guess, I've had to work with C pointers andcharacter arrays an awful lot in my "C++" code. (Actually much of the time I'm working with void *, and treating it like a sequence of bytes. Embedded nulls are expected. I've done it with strings too though because out lead programmer hates the STL enough to forbid using it, but that is a different rant)
Admin
Um, no, it will ALWAYS be 13 characters long, including the '\0' at the end. Always.
But tempStr is completely unnecessary - this works just fine:
void RemoveColonsFromString(char *m)
{
for (char *p = m; *m != '\0'; m++)
{
if (*m != ':')
{
*p++ = *m;
}
}
*p = '\0';
return;
}
Admin
Everybody seems to have missed the most obvious replacement for the original chunk of code, so we might as well include it for completeness sake.
Admin
That'll return the hexadecimal value of the location of the pointer to the MAC address, not the actual MAC address, or even its location.
Admin
Oops. Correction: That'll return a 12 digit hexadecimal string of the first digit in the MAC. Eg, 0000000000xx.
Admin
The most obvious wrong replacement, since the third argument to sprintf() will be an unsigned char promoted to an int.
In other words, only one-sixth of the intended value.
ok
dpm
Admin
I'm not sure what you are getting at here. 'String' is a programming abstraction as is 'character array'. What does that have to do with hardware?
Admin
Your reason for assuming that RemoveColonsFromString()
will never be called from any other function is not apparent to me.
Do you make this mistake in real life?
ok
dpm
Admin
No, it will always be 18 characters long, including the '\0' at the end. You forgot about the colons! 6 * 2 (for the hex digits) + 5 * 1 (for the colons) + 1 = 18.
Admin
Be nice if borland included it in older C++ compiler versions. Since it didn't, I just put it in my project:
http://fxr.watson.org/fxr/source/libkern/strlcpy.c
http://fxr.watson.org/fxr/source/libkern/strlcat.c
They shouldn't add much to the size of any C program; my own rewrites are a barbaric combination of these and the optimizations of the regular strcat/spy, but that's not your goal.
Admin
I pray I never have you verifying my code.
Admin
Thank you for the suggestion Hank. I'm sure that code is faster, but unfortunately this code is actually 32 bytes bigger. But is the same size, and exception safe, if you simply use alloca instead of new:
char *bob = (char*)alloca(strlen(bobA)+strlen(bobB)+strlen(bobC)+1);
strcpy(bob,bobA);
strcat(bob,bobB);
strcat(bob,bobC);
Use of alloca or new depends on available stack space. alloca definitely uses less memory than an tr1::shared_ptr!!
Sadly my compiler does not support strlcpy and strlcat.
Thanks Jan, you made my day. ^_^ I was hoping someone would enjoy my pointless "correction".
Admin
The input will, sure. But the temp working area doesn't need to be as big as the input. 18 is better than 16, but 13 is better than both of them.
But your point is taken. Why introduce an artificial limit? Better to measure the length of the input string and do a malloc/free, or avoid the issue altogether by doing an in-place replacement.
To answer someone else's question, I assume that the function will only be called with MAC addresses, because of the way its named. But there's a good point there, too... MAC addresses aren't always separated with colons, sometimes people use dashes. So perhaps the if (...) should be
if (
('0' <= *m && *m <= '9') ||
('a' <= *m && *m <= 'f') ||
('A' <= *m && *m <= 'F')
)
{
/* extract *m ... */
}
Admin
RemoveColonsFromString("AreYou100PercentAbsolutelyPositivelySureAboutThat?");
Admin
Um, the function's name is RemoveColonsFromMacAddr( )
It's fair to assume that the input will be a MAC address, delimited either by colons or by nothing at all.
No, in real life I, too, try to avoid arbitrary limits. See my sample code.
Admin
I do make other mistakes, though. :) Here's a version of RemoveColonsFromString(char *s) that actually compiles and runs and works and everything :)
void RemoveColonsFromString(char *s)
{
char *t;
for (t = s; *s != '\0'; s++)
{
if (*s != ':')
{
if (t != s && *t != *s)
{
*t = *s;
}
t++;
}
}
if (*t != '\0') { *t = '\0'; }
return;
}
Admin
Okay, I'll try to defend this one...
I imagine that the Mac address will be needed both with colons (as probably required by the original code base), and without colons: probably the result of some new requirements and new additions to the code.
In this case, what are the solutions?
Now, because they are using C, the second method is actually more cumbersome and less readable than the first one, but I can appreciate where these guys (probably) came from: a language that gives you a string-replace function.
Admin
Admin
Except MacAddrToChar itself calls RemoveColonsFromMacAddr.
Admin
Now to take it back in the opposite direction: obfuscation (why? Because I'm bored.).
Can anyone make it more obfuscated?
Admin
Not really. I mean, that would make some sort of sense, but it can't be the explanation here, as they modified MacAddrToChar to call their take-the-colons-out function; so they no longer have a version *with* colons.
Admin
? Doesn't compile for me (gcc)
warning: pointer/integer type mismatch in conditional expression
I get that ':' == 58, but what does 0[t] do?
Admin
It gives a warning, but it still compiles just fine.
x[y] is exactly equivalent to y[x] is exactly equivalent to *(x + y). It's all just syntactic sugar. :)
Admin
Ah, I see... yes, beautiful :)
Admin
That's it? That's your reason?
I fail to see how the name of the function will prevent it from being called improperly.
Someday there will be an access violation, and your "reason" is not going to impress anyone.
Just admit that the code sucks in many ways, and this is one of them.
ok
dpm
Admin
Well, you should still remove the pointless add-colons-then-remove-them from MacAddrToChar(), whether you keep RemoveColonsFromMacAddr() around or not.
Admin
Well, sure, but only because some guy who likes to abuse the people who have to maintain projects didn't bother to put in comments explaining what the code is intended to do, and why it does it that way. At the very least, he could have included a '//WARNING COMPLEX CODE AHEAD'
Seriously, when writing code that someone will have to maintain (often some less experianced than the original author) its a good idea to explain intension, unusual methods or finer points that they might not see right away. There are a lot of new guys out there that get stuck on maintainance, show 'em why you do what you do so they can learn faster.