- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
I came across a similar bit of Java code in our project. Feeling all clever, I re-wrote the thing using Java's utility number format, then benchmarked a few million iterations so I could feel even more clever about improving both the codebase and the performance. Turned out that clunky old pile of WTF was actually 2 orders of magnitude faster than a one-liner using the standard number format libs. I knew the number format classes weren't exactly super fast, but that was an eye-opener. I learned a lesson, but I'm not sure it was a good one...
Admin
So is 1 000 000.
Admin
sprintf(buffer, "%d.%d", n/100, abs(n)%100);
Enough said.
Admin
Admin
That said, though, it took far less than "a few minutes" for me to see what this code was doing, and exactly how. More like a few seconds of looking at the first several lines for a general clue, the next few to be a bit more specific, and the rest is just confirmation and details. It has a few issues, like the potential buffer underflow if ported to 64-bit or if the external buffer isn't big enough, and yes the names could be clearer, and some of the predecrements aren't clear if you don't already grasp what it's doing, but other than that, this is perfectly good fast clean C.
TRWTF is people who can't understand this code pretty much instantly, having jobs in programming.
Admin
If it takes you a few minutes to understand it, go chose another profession. This function is pretty obvious and, in fact, it's a reasonable implementation, assuming the paradigm it was obviously written in.
One might look down on the global variable prtbuffer, but such things weren't always thought of as bad.
Admin
If it is a snippet inside of a small application, that might work, but still I don't like that. I guess I've just had to deal too much with problems because so many people skip localization issues. Here in Finland the locale is spaces between thousands etc. and comma as decimal separator. Of course that does not mean that all input will come in as such: the thousands separators may be missing, users may write decimal separators in GUI as commas OR as periods, when parsing output by another program or service you can never know for sure which locale it has used. So, even setting a locale or parsing with a certain locale is not a solution, in real life you need to put strings first through some kind of wrappers or filters.
I've worked mostly with Java enterprise stuff lately, which includes lots of moving data as xml or something so this is quite a common issue for me. As someone said that branch of IT often means using third party libraries and tools and too many of them just ignore the issue and assume that the whole world uses just one kind of formatting. It is quite annoying that such a basic thing like parsing a numeric value from a string must always be paranoidly checked and watched.
And of course it is not just numbers, Christmas Eve is 24.12. here, not 12/24, but the decimal separator issue is a clear winner here.
For not to be completely off topic, I also vote this to not to be real WTF. After rotting my brain with that Java EE stuff and not using pointers actively for almost a decade or so that seemed quite trivial. Maybe pointer arithmetics are like what they say about driving a bicycle, once you get it ...
Admin
Interesting, been to Sweden a couple of times, and hadn't noticed that... (which doesn't paint me in a good colour, I know ;-) Thanks for pointing it out!
Admin
It looks like old code. So you can assume long is 32 bits, and, thus, 20 is long enough. A global "print buffer" array was pretty common, and, in fact, wasn't really any burden on the environments we had.
It's not 2008's webservicy OO code, of course, but you can complain about that to me when you do an accounting program in 4 KB RAM.
Admin
In binary you just have to know is it little-endian or big-endian. Why bother with text formats, locales and stuff?
Admin
There are some (small?) WTF's in here:
However, it seems the real WTF is the submitter, in this case:
Doing that completely makes sense. Also, there is nothing goofy about using modulus math, especially not to do something like this. In fact, short of using built-in library functions (not always available on embedded devices), I would hardly have a clue how to implement this *without* using modulus math (although I'm sure it can be done in some convoluted way). "some reason" being that otherwise it would not fucking work, at all. Yup, seems fine to me. The remaining problem is that it does not fill the remaining buffer size with spaces, i.e. it assumes the buffer is filled with spaces beforehand, which is a bit awkward.Admin
I like the code and learned a bit by reading and playing with it.
The only thing I'd do different is this, but maybe there's a reason not to.
// only put a comma if there are more bytes in n if ( ( ( ++d % 3 ) == 0 ) && n ) *--buffer = ',';
Admin
Why would it need to fill the buffer with spaces? It returns a pointer to the start of the string it has generated, i.e. the '-' sign or the first digit.
Admin
Btw it returns the pointer to the last position it wrote to so it has no requirement that the buffer is filled with spaces or anything.
Admin
sadwings: That'll require a few more operations than just disregarding a surplus comma at the end. In fact, the modulus on d is going to be a lot slower than comparing d with 0 and resetting it to 3. These are very low-level differences with extremely small gains in speed but for an embedded device with a cruddy processor it might make a bit of a difference. Whoever wrote the code was obviously aware of this.
Admin
TODAY. Once, it was guaranteed to be 32 bits.
Admin
Yup, gonna sing with the chorus today. I saw the function name, saw it looped around the digits of a number, saw the insertion of commas, and immediately knew what this function was doing. Spent a few more seconds looking for anything WTF-worthy, and didn't find anything.
Sorry Alex, not today.
Admin
You did not read the code. It returns the pointer at the beginning of the number, NOT the beginning of the buffer.
Admin
Admin
I won't argue with you. I deal with a lot of financial software and converting strings to numbers which means I have seen a LOT of ways of representing -2910.74 such as "(291074)", "2,910.74-", "2910.74 (overdrawn)", "-2911", "10000291174", etc., along with a bunch of date formats I won't get into. That's how it the numbers are shown on screen for humans to read. Bizarre just starts the conversation.
OTOH: A financial app written for the US market is likely closely tied to US tax code and regulation anyway... nobody is going to pick up a copy while they are on holiday and install it when they get home...or, if they do, the localization oddities won't be the biggest problem. :)
Not everything is innately international.
Admin
Sure, it's Implementation Defined, and you can find out what your implementation defines it as by reading LONG_MAX and LONG_MIN from <limits.h>. Does that make it a WTF?
Assuming that the original programmer read the documentation for his implementation and was aware that it was non-portable, I still don't see a WTF. Ok, so maybe he should assert(sizeof(long) == 4).
Admin
Admin
♫ Program in C, Program in C, Nothing is better than that single letter, take it from me! Interpreters take your speed away and python is totally gay, while we're devoting our time to floating -point numbers in C
Admin
Admin
There is nothing wrong with this function. A good programmer understand it in far less than a minute. Of course it would be better to have a comment on top of it, but that's all. If your are not a good programmer (or if you are just a Java XML drone) you might of course want to fill it with storytelling comments.
A good programmer also understand why it has been written like that.
Indeed a good programmer already have written lots of functions like that.
Please come back to us when you have enough programming experience, in various fields.
Admin
Admin
Have we really sunks this low?
Okie, repeat and summerise...
In a lot of places, there is no justifiable reason for calling out to sprintf for this. It's very clear what it does.
With the exception of the mentioned overflow, it's good.
Comments: If you need comments to read something like this, you probably need more coffee or a book on C. Thread safetfy: Not all programs are threadded. If this is embedded code, or a heavy linear cruncher, threading could well be counter productive. Going in tens/Modulus use: There's no other way to do it. It's very simple char offset printing. Variable names: um, why? digits_left_til_comma wouldn't gain you much over d in this case. Overflow possiblity: Cash register? If it's outputting to a fixed size display, or outputting a limited set of data, it probably has bounds checks elsewhere· Spaces to the front of the buffer: why? that assumes the author wanted to pad.
A couple of special mentions ...
Which in most numbers is less efficient, since you're doing an extra comparison every loop, instead of a single comparison and increment at the end.
Yeah... sprintf to one buffer and loop the array assigning to a second buffer which is returned. So you have all the overhead of sprintf and then do a manip loop anyway, with the added bonus of having to pull the data from a buffer instead of compute it.
Talk about cutting your nose off to spite your face lol.
As has been mentioned, there are library functions for this. And considering the first paragraph of your post, I find the comment of C being about getting stuff done to be funny. Yes, I know you wouldn't use sprintf, but it's still funny :p
The global may have been defined a couple of lines before the function, or may be shared between several such formatters. That doesn't make in unmaintainable. Neither do magic numbers... the only one I see is the buffer size, and we don't even know for sure that the buffer isn't a 4k buffer shared between formatters. :pNow goodness help us if someone had done int len = (buffer - prtbuf) - 20;
Or... for (i=in,o=out;o;(i++)=*(o++));
Heck... I suppose you'll find this a wtf:
Addendum (2008-11-28 13:52): for (c=s;*c;c++) if (ALLOW[c] == 0) return (char)0;
coughs, looks sheepish, and hopes no one replied yet
Admin
OK, in the vein of minor improvements (other than maintainability, like variable names), I think this would improve the speed by a couple of clicks:
It takes just as long while looping through n, but because it tries to add a comma at the beginning of the loop and then decrements d at the end, it won't add any superfluous commas at the beginning of the string. Thus, we save time by not adding and then having to check and remove it.
Admin
Admin
Damn, it looks like a perfectly good function to me... I hope I don't end up there some day.
Admin
It's a very self-descriptive function name, perfectly understandable variable names, totally obvious control flow, and only idiots who having learn PHP a week ago think they know about programming think comments in a function like that are anything but an obstruction that makes it harder to understand. Not to mention that what it does is on the level of the second hour of programming 101 class. Too bad Jake apparently hasn't heard about modulo division or positional number systems.
That's a perfectly fine function, and guess what, sometimes there aren't standard functions available. My bet (especially given the apparent threading/overflow thing pointed out before) is that it's from embedded code, where it's totally okay to write unsafe code to save a few bytes.
Admin
Struggling to find any WTFs?
In today's world where multithreaded apps are everywhere, this function really sucks. It puts the resulting string in a global buffer! Call that from two different threads at around the same time and at best one of those threads will see the other thread's output instead of its own.
Or how about the fact that it fails for the most negative number a long can represent (0x80000000), because in that case -n equals n in two's complement.
Other than those major problems and the lack of comments and short variable names (which lets be honest many of us would do the same), it seems usable in limited circumstances.
Admin
When I was 12 years old, I wrote similar code in CBM BASIC 2 for some task. However, IIRC, the code was much simpler than that.
That were times...
Admin
In the end, after formatting 0 -> 5,000,000 (did that formatting myself, no program needed!) they all took about the same amount of time (under 20 seconds each set...I have a lousy pc)
In defense of this SOD being a WTF - I write piles and piles of C code at the day job. Midday, I could figure this out no problem. If I had to figure it out during a middle-of-the-night production problem, no chance. In other words, TRWTF is that C stinks at re-formatting numeric output.
Admin
Yeah, and in today's world, Babe Ruth wouldn't make it out of minor ball, Joe Namath wouldn't last 5 minutes and Wilt Chamberlain would've been luck to sleep with 50 women. (Although, in today's world, Bobby Orr would've been able to have knee surgery and likely play another 5-7 years...)
My point being that this looks like code from before. Not worried about 64 bit or multi-threading. It does what it's supposed to do quickly and efficiently. In it's world, this code is WTF-free and I'm kinda surprised to see it posted here.
Admin
Admin
I would not want to go back to those days, but really -- I cannot accept code quality judgements from people who need a Java EE IDE, 2 GHz, and 512 MB just to produce "Hello, world".
-Harrow.
Admin
A) No it's not from embedded code
B) If you think that function is 'fine' then I seriously hope I never have to maintain any of your code. Only poor programmers think along the lines of "if someone can't instantly understand this then they must be stupid". Adding a one line comment above a function to say what it does takes 5 seconds. Each additional programmer who comes to see this function has to decide what it does and that will ALWAYS take longer than 5 seconds. When you are scanning your way through MILLIONS of lines of code, stopping for even 30 seconds to figure out what EACH function does slows down the whole operation.
People who fail to document their code should be banned from life.
Admin
...the same stuff, over and over, a hundred ways, all alike excepting the variety of bugs which cause program crashes in special cases. Brillant!
Admin
In terms of code correctness/performance/efficiency, this function seems pretty solid. The only thing I immediately noticed was that he could have saved 3 bytes by using bool instead of int for the "neg" variable, seeing as how it only has 2 possible values. Then again, this function may very well pre-date the bool type.
As for people not understanding the code, I can certainly see why. If you're not familiar with C++ pointers, the pre-increment operator, and "C-strings", then this probably won't make a bit of sense to you. And the vaguely named variables and lack of comments won't help, either.
It isn't pretty, but it's not even close to the worst I've seen. I'm so glad I don't write code in C++ anymore... just looking at this gave me chills.
Admin
So timB's defense is that the code is not commented? That's a WTF? And what happened to the "horribly implemented" comment? No longer defending that?
Admin
Admin
Speak to Jake Vinson about his comments. The fact that it lacks any kind of documentation isn't the only problem with this code. The function name for a start is very bad. nice_num could mean many things. The idea that you have nice numbers implies that you have bad numbers. Personally my implementation of nice_num would only print out values that are divisible by 12 as those are the kind of numbers I consider nice. It's just a long formatted to a string for the user to see and there are better ways to name a function that does that. The variable names are also awful. Single letter variables should have been banned at the same time as slavery. The smallest permissible variable name should be idx, which accurately portrays it's purpose in it's name.
Admin
You're the kind of person we write about here on the Daily WTF. Never mind the failures in that code (just FYI, don't ever post code online in a coding forum unless you've tested it.); let's look at the general idea you're working on there.
What you have to remember is that the code you write is never seen by the computer. It's compiled. That compiled machine code is what tell the computer to do something.
When you pull out a huge function like sprintf, you're going to get SLOW performance. Even though it LOOKS like it's a one-line instruction, you're looking at hundreds of lines of code once the compiler's done with it. This is true with ANY string manipulation. Leave it the hell alone if you want speed. Debugging statements can be the worst thing you can do - the slow-down resulting from the printing can change how the code interacts, and often these "load-bearing prints" are left in, because the developer doesn't know how to fix the race condition.
Remember that one line in your code isn't one instruction, and the more you pile on, the more likely it is that your code will be interrupted and write random values to random places. I've had huge bugs caused by a macro that took up a total of three lines.
It was to fix a problem with analog ports on a PIC16F88. Any write would turn all the pins off, so it was fixed with a shadow register. To set a pin, it would read the existing PORT A, then set the pin, then write the shadow register to the physical port.
Occasionally, an interrupt would be triggered between the read and write instructions, putting random data on the port, causing the system to fail.
Admin
I only see one problem with that codem the commonly forgotten MINT_MIN. You should either treat it separtly or not negate the input, the former probably being the better idea.
The code could use some cleaning up and a short comment and fixing of this but but there is nothing that makes this a WTF.
And single letter variables are not bad in any way as long as they are few and their scope is short. I have never understod this fobia. Just because something can be abused doesn't mean it is bad. These days there is actually more abuse of long variable names than there is of short. I have seen programs rendered unreadbale from both too much logn variable names and too many one letter variables. Everything has to be used with common sense and in moderation.
Admin
MINT_MIN, mmmmmmm
Obviosuly I meant MIN_LONG.
Admin
Yes, there's printf, which does (for 10+ years now, anyway) support adding a locale-dependent thousands-grouping character. But that's a pain to deal with when you don't want to be locale-dependent.
Admin
Why would you be going through a million lines of code bottom up? You'll just confuse yourself and blow a gasket.
Structured code makes much more sense from the top down, seeing a high level function and then figuring out what each part does if needed. That way you have a context and know what sort of problem the previous programmer was trying to solve.
How hard is this to figure out?
You don't need a comment more code to know what is going on. Not unless there is a problem with how some numbers are printing -- which doesn't look, from the function posted, likely.
Admin
Agree'd. Not an enterprisey solution.
Admin
This code is just fine: static char *nice_num(long n) ...
This code would be a WTF: // String formatting library for use with all threading // configurations, countries, and future number sizes.
static char *nice_num(long n) ...
If you're writing a general purpose library to be used for years to come, then you should make sure your code handles all possible cases. If you're writing a function for use in a single program with defined parameters, then doing so would be a waste of time, space, and quite possibly obfuscate the code unnecessarily.