- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
I forgot to add a couple important details: adding the CR/LF and removing it serves no discernable purpose and is indeed a WTF, and just because you won't get an exploit from a buffer overflow doesn't mean you shouldn't protect against it. Most embedded programs are contrived in such a way that they can avoid the potentially costly bounds checking by always knowing it won't happen.
Admin
Finally a WTF in a language I fully understand.
If this is an embedded system, then most likely the lengths of header and value are known not to exceed a given length. If we obtain buffer contents from an internal trusted source, we just need to make sure our buffer size N is big enough to contain the longest string possible.
Ignoring return value from sprintf is ignorant but common among C programmers. Usually you don't care about it too much.
Dropping a NULL|0|'\0' into the middle of a string is a common way to truncate junk off the end of the string. We do it all the time. Just don't do it in anything security related like passwords, keys, etc....
Putting stuff into your own string constant that you then truncate away is a WTF. The rest is just apathetic programming.
Admin
In defense of using ZeroMemory if you don't have to worry about portability.
(BTW, neither g++ -W -Wall or Visual C++ at level 4 warnings warn about this.)
Admin
Not if the string is in UTF-8 (like the guy writing this code would care).
Admin
What? you mean I can't have a buffer of 3.5 characters? how awful!
Admin
Admin
Solution - implement own stpcpy :) stpcpy - copy a string returning a pointer to its end
char *stpcpy(char *dest, const char *src);
"CONFORMING TO: This function is not part of the C or POSIX.1 standards, and is not customary on Unix systems, but is not a GNU invention either. Perhaps it comes from MS-DOS."
stpcpy(stpcpy(stpcpy(buffer, header), ":"), value);
YAY!
Admin
You could also work with counted strings too. Not sure what the tradeoffs would be of the two approaches... would probably depend on how your strings were created and what modules they were passed to.
(BTW, I like the MS-DOS comment.)
Admin
YES! Somebody else has noted the WTF about snprintf!!!
Imagine... A function that was created to get rid of a long standing WTF (no bounds checking) introduces another WTF (not ensuring the string gets terminated).
Admin
Ahhh... so the WTF is the GNU documentation for snprintf! I won't quote it here, since I'd have to quote the whole thing to show it's not there, but the GNU/Linux man page for snprintf is quite ambiguous as to whether the string gets terminated or not.
Sorry 'bout my confusion in earlier post.
Admin
Are you american?
Admin
There is nothing wrong with this code, its fine. I'm presuming sprinf is a transcript typo, since the linker wouldn't have let it pass. Yes the call to strlen is redundant because of sprintf returning the length, but its really No Big Deal unless this code has to run at absolute lightning speed.
Get over it.
Jon
Admin
Some of the comments here suggest that their submitters are good candidates to become the next CodeSOD makers.
Watch this space...
(Captcha - digdug, means "a tickle" in Hebrew)
Admin
Well, I've visited their website, where I learned that "Your Require. Our Passion.", but I'm still not really sure who or what 'sprinf' is.
Admin
Yeah, my F indeed :).
Here're two variations on the correct version:
Admin
Actually there is something wrong with this code. Or do you routinely add stuff, and the remove it just because you can?
Just because code works, doesn't been there is nothing wrong with it. My predecessor would use the default MSVC templates to build a file, and then add his code to it. I kept asking him to clean the code up and remove extraneous calls (for example the call to ShowWindow(SW_HIDE); followed by an UpdateWindow(); for an app that uses a hidden window. I've been fighting the extraneous cruft he left in his code ever since.
The WTF isn't that we ignored the return result to sprintf. Its that he is remove something that he purposefully added for no good reason the line before.
Admin
Mwahahaha I love TDWTF ! Even the users comments are great WTFs :)
No, boy. N is the array length, not the string length. 0 == '\0' -> surprising isn't it ? This one has obviously never heard about the standard end-of-line marker in network protocols. Wrong ! man snprintf. This should be N, not N-1. Another one ! Your friend akatherder made the same mistake. Wrong ! snprintf is C99. Nope, he removes the LF (\n).Etc, etc. I could go on and on.
Admin
Good one! reddited.
Since it's zero length, it just compiles to a no-op. If memset() is defined as a macro, the savvy compiler might throw it out altogether.
Admin
And two more bonus point to the one that explains why the coder first added \n\r int the sprintf and then removed it.
Admin
I should probably fess up... that's not my idea to search for it, and in fact it's not even my regex. The search came from here which was linked to from a /. post. That link has a bunch of other Code Search links, mostly to amusing searches. For some reason most of the links are broken though... but the one to the memset one works.
Actually the really amusing thing is that even if you get it right, compilers may still throw it away when you use it to clear a sensitive buffer before freeing. The optimizer says "hmm, you assign 0 to this region, then free it immediately. Since you don't look at it afterwords, it's a useless assignment. I'll just remove that for you." There were a couple real-world products that had this sort of vulnerability, including a crypto library and something by MS. (This example is part of my research group's talking points for why (some) code analysis should be done on compiled binaries instead of source ;-).)
Admin
Hmmm... Just to get shure...
Assuming the following: 1.) this function doesn't need to be reentrant 2.) embedded system - so probably not multithreading 3.) multiple instances of buffer wouldn't be needed..
... so I'd say that the real WTF(TM) would be that buffer isn't static... Because: a.) static variables generate fewer Bytes in the Binary (no Stack magic - especially if the embedded CPU doesn't have ENTER/LEAVE-Opcodes) b.) faster access c.) some embedded CPU's don't have Stacks (or very small ones) - so the Stack is emulated in SW (Example: Compiler cc65) d.) Binaries using static variables are faster (see [a])
Admin
C guarantees that (int) 0 or (foo*) NULL will be converted to (char) '\0' here. However, the compiler may warn about converting a pointer to a char.
Admin
Sigh. Kids today don't even know how C-style zero-terminated strings work anymore. Reminds me of this blog entry someone posted a link to not too long ago on this very forum: http://www.joelonsoftware.com/articles/fog0000000319.html
Admin
the first half of the first page of comments for this article is so full of WTF's, that i simply cannot bear to read further.
Admin
For me, this is the first TDWTF in which the WTF-ness of the comments utterly eclipses the (already considerable) WTF in the article.
Way back when, TDWTF used to have plenty of idiots (GoatCheez, anyone?) but at least they were idiots who had actually done a fair bit of work related to computer programming.
The generation of idiots we have now seem not to be actual computer programmers, and as a result their comments are no longer grand follies that begin "Let me tell you, good sir, why it is that your solution can never work on AIX...". No, they are petty and silly follies that could be generated algorithmically without much trouble.
Shape up, people. I'm talking to the idiots in particular here. It's ok to be wrong, but you have to be wrong in a grand, majestic way -- not just plain thick.
Admin
But I just wanted to bring my all-time C-coding favorite: http://www.plethora.net/~seebs/faqs/c-iaq.html
Some of the statements here just reminded me :-)
Admin
But when you're running the thing once you call the function, that array would stick around if it's static. (In fact, it's probably taking up [virtual] address space even if you haven't yet called the function.) Is that worth maybe saving a couple bytes of code?
(You'd only save on the stack magic at all if there aren't any other locals; it's as cheap to allocate one word as a hundred on the stack in terms of code size. The only place you're likely to save is if instructions the reference the buffer would be longer with [reg+offset] than [address]. I don't know enough about embedded programming to be sure to what degree this is true.)
Also, why would static be faster, other than if making it static would let you drop the stack increment and decrement?
(c) seems like your best argument.
Admin
"Only sometimes. It's not portable, because in EBCDIC, i and j are not adjacent."
Ha, best answer ever.
My favorite bug I've seen was when I was working for IBM's z/VM group for some client software that would allow you to configure the servers. Our software connected to z/VM using sockets, and we'd send commands and receive the response over the sockets. Most of the talking was text, but in the responses z/VM prepended the length of strings before transmitting them, but the lengths were in binary rather than "123" strings. The whole load went through EBCDIC to ASCII translation before going out on the wire, so by the time we got them the lengths looked like garbage. (The sockets server was in development at the time, and this was a bug carried over from when it used RPCs.)
Admin
I've a different point of view :-)
1.) Stack-operations: Well, on "bigger" CPUs, you are right - especially when you have both a Base and Stack-Pointer and operations like "movl 8(%ebp), %eax" are possible. There are tons of CPUs that only allow PUSH/POP and few registers - so Stack-working is expensive (you want the 10th element on the Stack ? ok: POP A, POPA, .. :-)
2.) Some CPUs allow Stack-adressing with one "general-purpose"-Register - so you could write "LDX #8; LDA 100,X; " - again more work than "LDA VARIABLE".
Therefore I meant that this problem exists specifically on CPUs without Enter/Leave (or direct Stack-Access via ESP, EBP, ... )
And therefore static variables would be /in these cases/ faster :-)
Admin
On my Ubuntu 6.06 system, 'man snprintf' says (the manpage is dated 2000-10-16, btw):
"The functions snprintf() and vsnprintf() do not write more than size bytes (including the trailing '\0')."
That's clear enough for me.
Admin
Actually, the real wtf here is that they didn't just do:
Admin
Looking for the most efficient solution, you might call strlen() on both the source strings and then use memcpy to build the target string using the now known positions. memcpy could well be compiled more efficiently than any handwritten loop. strlen is only scanning for a 0 character we have to look for anyway.
Note I have assumed one of the later standards of C that allows variables to be declared near first use. If that isn't allowed in your standard then declare them at the top.
Note that if the text doesn't fit in the buffer I have written it to write what it can, with no 0 terminator.
Either way it returns what should be the required length of the buffer, i.e. target string + 1.
A revised strcat that returns the end of the string instead of the beginning, but uses strlen and memcpy might be preferred. In addition we may have an option of that that allows the length of the source string to be passed in (could be more efficient if we know it).
Admin
Inconceivable!
Admin
Ah, I see. I'm not apparently familiar enough with emebedded programming; I didn't realize that you wouldn't necessarily be able to do, say, esp-relative addressing.
You're still doing excess work; you're traversing str1 and str2 twice, so you've got more loop counter tests/increments than you need.
How much of a difference this makes, I don't know. But I'm pretty sure you could do measurably better than combinations of C library functions. I don't know if it'd be worth it though.
Admin
I'm completely with you...
We assume the following: 1.) You are using actual Compilers (therefore good optimization) 2.) the CPU is extremely limited (because the 2 bugs shown by the thread-opener are worth mentioning despite of [1]) 3.) Memory is /probably/ limited, too...
This means: (a) Because auf (3): The worst thing you can do is building a secondary libc. Even worse would be linking this "mylibc" into /each/ binary you write :-) (b) If you really need a general function like "mystringcat" - the function should be highly optimized for the specific CPU... Using (at least inline-)Assembler and Blockmoves.. Even good old Z80, 8086 and others had some commands for this... (c) If your real Bottleneck is only CPU - why writing functions ? Why not Macros ? (d) I assume I miss the point... Using the "most efficient solution" from Earl Purple: He wants to
So... Why wouldn't one use memccpy ?
Or do I miss something ??? I hope there are no assumptions that memmove, .. exist but this function doesn't :-)Admin
Seconded. C string handling generates a lot of WTFs anywhere.
I work in a company doing networking hardware code, embedded in C (no, not 'cisco). UI is all done in India, where it's easy to get C n00bs (but, of course, any programmer can do UI, right?) and that's where all our string-handling is.
I've had to clean up stuff like folks clearing out a buffer with:
As a bonus, lots of our targets are VxWorks where the above won't crash. Hit it on a Linux target, though... blam!
captcha: pointer (appropriate)
Admin
Because you have to know n before calling memcpy. Thus the dynamic execution looks like you have the following two loops (assuming a naive implementation of strlen and memcpy; in reality you can do rather better than this, but not in such a way that it affects my point; I also wrote these using array references while pointers would improve the code, but this is a good opportunity for the compiler to step up):
Admin
And therefore I included the manpage... You needn't go over the string 2 times...
Example - pseudo: startpoint_for_string2 = memccpy(buffer,string1,0,strlen(buffer)); if (startpoint_for_string2) { memccpy(startpoint_for_string2,string2,0,...); } buffer[sizeof(buffer)-1] = '\0';
Mixing up memcpy and memccpy is a real WTF :-)
Where is the problem ?
Admin
:-)
Admin
Smacks head This is why I shouldn't post on the internet after 3 hours of sleep.
I noted the memccpy at the end, but for some reason assumed it was a typo. Sorry about that.
Admin
Simple fix:
char *str = NULL; int len = asprintf(&str, "%s: %s", header, value);
Note: To remove the "\n\r", you might want to remove it from the format string. The get the length, you might use the return value of sprintf/asprintf. To allocate the buffer, you might leave it to the only one knowing the length of the string (sprintf is insecure). If asprintf() isn't available in your libc, COMPLAIN ! Sidenote: Wasn't it "\r\n" anyway ?
Fix for asprintf (dirty, costly): int fd = fdopen("/dev/null", "w"); // fastest write on earth int len = fprintf(fd, "my format string", ...); char *buf = malloc(len * sizeof (char)); sprintf(buf, "my format string", ...); // whatever with buf free(buf); // don't be stupid
Admin
sprintf ? Funny... So you'd
hehe, I think the original method (the one posted by the thread-owner) was extremely faster :-)
Admin
Finally, someone who actually knows "C". :-)
Admin
int len = strlen(buffer); buffer[len-2] = 0;
Hm.... it makes me wonder what happens when length of the buffer <= 1?
buffer[-1] = 0; <--- Oops!
Admin
char buffer[N]; ...... int len = strlen(buffer); buffer[len-2] = 0;
Don't we have 'N' that specifies that buffer length? what's the point to use strlen() then?
BTW - how many point do I get for this one? ;-)
Admin
GCC and any decent C++ compiler should have caught all these errors (I don't use "C" much anymore, but I do a lot of "C" compiled as C++, which is a lot nicer).
Addendum (2007-02-15 15:07): Whoops, shouldn't post while taking cold medicine. C++ will catch a few of the memset errors listed, but not most of them.
Admin
-1 because it's wrong, -1 because you didn't read earlier in the thread to see that this has already been posted an corrected. So that's, uh, negative two points.
N is the maximum length of a string the buffer will support, strlen gives the string length that is ACTUALLY stored in the buffer.
If you do, for instance,
#define N 100 char buf[N] = "Hello World"; printf("N=%d, strlen(buf)=%d\n", N, strlen(buf));
you'll see what I mean. Adding a '\0' at strlen(buf)-2 would drop off the "ld" from the end; adding it at N-2 would do nothing.
(If you're too lazy, the output is "N=100, strlen(buf)=11")
Admin
It may be on your machine, but the C standard does not guarantee it.
Admin
I assumed the following:
1.) Character constants are of type int.
2.) '\0' means a character constant (therefore an int) with all bits zero.
3.) (char) 0 means a character (NOT a character constant) with all bits zero.
4.) and integer with all bits zero is equivalent to a long with all bits zero is equivalent to a short with all bits zero is equivalent to a character with all bits zero ... and so on.
Therefore the C-Standard guarantees that the condition '\0' == char(0) matches.
5.) The only exception is the (in)famous NULL-Pointer.
If you don't know the machine, you can't say the bits of 'p' - because the NULL-Pointer just guarantees to compare unequal to a pointer to any object or function - not more. So the null-pointer is implementation-defined... whereas a char of value 1 isn't :-)
Please tell me where I'm wrong according to standards...
Admin
NULL is neither in C nor in C++ standard. It is a pre-processor artifact and usage dates back to beginning of C. Try compiling C/C++ the following program on a Unix system see what happens (clue: it does not compile)
int main(int c, char **v) { return NULL == 1; }