- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
The real WTF is that when *I* make "fist" posts like that, mine get deleted with a nasty note from Alex.
Admin
This is the standard practice for the Win32 API. Getting it wrong is also apparently standard practice (e.g. the registry functions on Win95 that always reported the size in 1-byte characters but actually copied data in 2-byte characters or vice versa).
It's somewhat efficient if you guess correctly, and if you guess incorrectly, well, that will get caught by QA before the product ships. ;-)
Admin
I could go into the Whole Spiel ™ (memory pointers, correct spellings) but I'll just cut to the chase: The answer is 4.
Admin
TCHARs are decided at compile time in C++. If you pass a char* to GetTempPathW, the compiler will complain.
Therefore, since there is no cast to wchar_t*, I assume UNICODE is not defined.
Therefore, no buffer double overrun. Just the main overrun of "If path is larger than my buffer,
write past the end of the buffer anyway, thereby bypassing any security Windows tries to offer me
(and people will blame Microsoft for security holes. BWA HA HAH HA HAAAaaaaaa....)"
followed by the lack of a smart pointer for exception safety.
Admin
As Joel from JoelOnSoftware says, he tests applicants on two fundamentals: 1) recursion, and 2) pointers. Recursion isn't so useful in the real world, but understanding it and understanding indirection via pointers reveals a sort of comprehension that is required for effective programming -- especially at a low level.
On the other hand, lots of programming is done at a higher level ... where these modes of thinking are perhaps not so crucial?
Admin
Good spot. And the answer is "NO"
#include <Windows.h>
#include <stdio.h>
int main(int argc, char **argv){
char buff[1];
int i=GetTempPath(1,buff);
printf("Try 1 %d\n",i);
SetEnvironmentVariable("TMP","c:\\");
i=GetTempPath(1,buff);
printf("Try 2 %d\n",i);
return 1;
}
Try 1 32
Try 2 4
Need a while loop to get a temporary directory? WTF!
Rich
Admin
The Real WTF is that the guy uses the MFC instead of STL.
Admin
There's a third error - it doesn't handle the case where GetTempPath fails and returns 0. In this case it returns a string containing whatever random bytes were in the allocated char block.
Admin
Admin
The whole TCHAR -vs-char-vs-wchar_t thing is not really all that important - unless you specifically specify a Unicode build, using a standard char with this function is perfectly fine because GetTempPath(...) will resolve to GetTempPathA(...). Just because code is not TCHAR-capable does NOT mean that it is incorrect. That is what allows older Windows 3.1-era code to compile and still work correctly without changing the code to be TCHAR capable. Writing TCHAR-capable code helps mostly when you have to make both Unicode and ANSI builds from the same codebase.
As far as the code itself, the first call to GetTempPath(...) should not cause a overrun because the function is bounded by the length passed to it. At worst, it would just not put a terminating NUL into the buffer (assuming that the buffer really is 256 characters long).
The length value passed to GetTempPath(...) is specified in characters, not chars, so passing a value that is (length * sizeof( TCHAR )) is incorrect - it will pass the wrong buffer length in a Unicode build. Regardless of using TCHAR, char or wchar_t, the buffer is still 256 elements long. (There are some Win32 functions that expect a character/element size, and others that expect a byte-size.)
AFAIAC, the WTF here is the use/abuse of dynamically allocated memory. There is likely more then enough stack available to create a buffer of __MAX_PATH length on the buffer (n.b __MAX_PATH includes space for the terminating NUL character). So it is much faster to just get that buffer from the stack and pass it in instead of dynamically allocating the memory. It is also much faster than calling the function twice!
A previous poster was worried about [the overhead of] "copying the memory twice", but it takes much more time and instructions to hit the function once to determine the required length, hit the heap to allocate the required memory (+1 for the NUL), call the function a second time, and then deallocate the allocated memory. Stick to the stack for little memory requirements like this - there is no need to complicate things and no need for additional handling of new failures - if the stack could not handle the request, you would not even get to that part of the code. Also, when dealing with multithreaded code, code that abuses the heap like this kills performance (when using the default shared heap).
Using __MAX_PATH as the buffer size here is correct - this particular function cannot handle extended filename syntax, and as a general rule, only the wide (Unicode) versions of functions are capable if handling them, so you know that the function will not be trying to hand you more than __MAX_PATH characters.
CString abuse is a whole other problem but not one that is specific to this code... Lots of MFC developers abuse CString due to ignorance of how they really work.
I would have written the function much simpler as:
The memory management is passed to the caller - if they want it in a CString, they can manage the CString and pass me its buffer and size. If not, they can pass me a buffer allocated however the hell they needed it allocated. Stack, Heap, Shared Memory Segment, Memory Mapped File, etc. - it does it matter, the function just works for all of them. It can also be used in a MFC or non-MFC build, Unicode or ANSI. If the function succeeds, the return value will be ERROR_SUCCESS; if not, it will be a valid Win32 error code - just what the caller needs to figure out if something went wrong and if so, exactly what did go wrong.The amount of data copied into the buffer (if any) is returned via reference, to optimize further manipulation of the data by the caller, but can safely be ignored in most cases.Of course, this is a very thin wrapper around the function, so I would have removed it completely and had the developer call the function directly and not through a poorly designed and written minimal wrapper function.As usual, exception handling needs to be done by the caller, because the callee (in this case) does not have enough information, not is complex enough, to react intelligently to any exceptions. Peace!Admin
Darwin? Stupid little children...
Admin
Your mastery of the non-sequitor* is breathtaking. But, you're still wrong.
*That means it doesn't really support your argument.
[image]
As for the TCHAR stuff: I wasn't saying that his example was wrong because it didn't use TCHAR, but I don't think it makes any sense to make use of it in some places and not others. (see the retval string length calculation) Not using it at all and not defining UNICODE or _UNICODE is fine, but using char * and defining them makes for more potential errors than are necessary.
Admin
Um, no. You hand it a char[] that sits on the stack. If it's too small, allocate one on the heap and use that. It works pretty well in the common case and works in the general case.
Admin
Admin
You are correct. I was weened before there were TCHARs, I'm not used to it. =)
The reason you need a return string is because the interface is returning a CString by value. I wasn't going to take the liberty of changing the interface, but if I did I would pass an input CString by ref as an earlier poster showed.
Admin
The interface "const T* foo(...)" where the return value of foo is NULL on failure and something read-only on success allows implementations to do the minimum possible amount of work in cases where the implementation already has the data in memory in an appropriate format.
For example, getenv() returns a char * to the environment variable memory of the process. No string copying, no memory allocation of any kind is required to implement the interface if the implementation already has the appropriate character strings handy (and if it doesn't, the implementation is best suited to determine buffer lengths and so forth so it can dynamically allocate memory). Of course the pointer becomes invalid if you modify the environment, but if your application is asynchronously reading and modifying environment variables in multiple threads then you've got a WTF on your hands already, and if you really need persistent access to the data you should make your own copy of it.
gethostbyname returns pointers into internal cache structures that were already allocated in the process of processing the response to a DNS request. If your application has multiple threads asynchronously resolving host names...uhhh...wait, that's a fairly normal case. At least there's the GNU extension gethostbyname_r...and gethostbyname2_r...ok, bad example. Well, there's getnameinfo...which tells you that the buffers are too small but not how big they should be...a binary search should suffice...no, it's still a bad example.
localtime...uhhh...ok, there's just no excuse for localtime. It uses a data structure of static length to return values based on a user-supplied parameter. The only possible implementation cost savings would be if localtime(NULL) was equivalent to "get the current system time" and returned you a buffer to shared memory in the OS that pointed directly to the registers of some kind of clock chip which happened to have exactly the same layout as struct tm (or vice versa). Or maybe your CPU uses some strange memory architecture like segmented address spaces with different segments accessed at different speeds, where you might want to put latency-sensitive data like time data in memory with a faster access path. Or maybe your library has hand-optimized assembler code that uses hardcoded addresses for speed while doing DST calculations to a resolution of one-second.
Note that localtime_r relies on the caller to provide the buffer, so it's not like localtime saves a memory allocation relative to localtime_r. There's just no excuse.
Admin
Um, I think it's a fair guess that the whole point of the function is to have a wrapper around GetTempPath that returns a CString. Because quite frankly, your wrapper adds nothing to the function except a different error handling strategy - laudable, perhaps, but not exactly the point.
Admin
Read the docs again. It says that if the temp path fits inside the buffer, it returns the length of the string (without null). If it does NOT fit inside the buffer, it returns how many characters are needed for the buffer (including null). If it returns 300, you pass in a 301, it will now return 299. Try it out.
Admin
Well, what happens if the CString constructor throws an exception? Better to use a smart pointer for instead of a raw char * for path.
Admin
No, it returns the size of the buffer required. It only returns the length of the string if it fits in the buffer.
So if the path is c:\
and the buffer you pass it is 2 bytes, it returns 4
If the buffer you pass it is 6 bytes, it returns 3.
Rich
Admin
The problem was I had misread your code. I was reading "char *pStr = retval.GetBufferSetLength ((pathlen+1) * sizeof (TCHAR));" as "char *pStr = new char[*something something retval*((pathlen+1) * sizeof (TCHAR))];" or something equally silly. I missed that you got rid of the new and were just setting the CString buffer size directly.
Admin
The ahem real WTF here is that two pages of posts into it, no one has yet to suggest checking the return value of new and properly handling our of memory counditions instead of just crashing.
Admin
I was wondering if someone would mention that. Does anyone actually check the return value of new? I know we "should", but I have yet to see production code that does so that wasn't intended to run within a very memory limited environment.
Captcha: STFU. Priceless.
Admin
ERROR_SUCCESS? WTF!
(Yes I'm sure that's correct -- but I don't want to know.)
Admin
If you don't have a good escape plan you're boned anyway. In much code, even if a few functions can handle an out of memory error, the whole thing may come down because of something further up the stack.
Now, if you're writing something more important than regular desktop apps, you're going to need a lot better justification than that, but I hope that most of us aren't writing heart monitor firmware or anything like that.
Admin
That is my whole point - the entire function is a WTF. Putting a CString wrapper around a simple function like this smacks of a newbie MFC developer. Doing multiple dynamic memory allocations is just icing on the cake. Forcing a particular memory management method, regardless of returning a raw pointer or hiding the memory management behind a higher-level class, is always a bad idea. This is exactly what the original code did - it forced the use of a CString.
Sometimes this kind of "forcing" is unavoidable or even required, but not in this case. For something as trivial as wrapping a call to GetTempPath(...), it is just plain stupid. My version can work with any memory, regardless of how it was allocated. As you more away from trivial single-threaded applications and start taking advantage of modern real multi-CPU (and even fake ones like a HyperThreaded core) you start to give serious thought to how a simple misuse of dynamically allocated memory can bottleneck your code to a single-thread. The solution is not to avoid it in all situations, but to allow the caller (who has more knowledge of the particular situation at hand) to manage the memory as required.
My wrapper was thin for two reasons - (1) to show that using new here is serious overkill (on the level of shotgun vs. fly), and (2) that the wrapper is pretty much useless in the first place - I even mention that I would just have gotten rid of the wrapper entirely.
Peace!
Admin
Heh... It is not as bad as it sounds - all of the standard Win32 Error Codes start with ERROR_. Kinda like some people name all of their status constants by starting them with STATUS_ (STATUS_FAIL, STATUS_OK, STATUS_MAF, etc.). Nothing to see here, carry on... :P
Peace!
Admin
A conforming implementation will never return NULL from a 'new' operation and will instead throw an exception (std::bad_alloc, I think, but it's been a while).
Admin
I did. Kinda... I just avoid the entire situation in the first place by not using new at all. In the CString version of my function, an exception would get thrown that would be handled by the caller - I would not be able to do much with it, anyway.
Oh, and an as aside, in my "real" code, I always check pointers that are (1) passed to me, two (2) after I allocate into them. It is part of my coding standards, in fact. My apps still have bugs and issues, of course, but rarely due to NULL pointers. Getting passed invalid pointers is another issue entirely, and I know of not a single case where all non-NULL pointers are validated at runtime (e.g. IsBadWritePtr(...) on Win32) - the checks just take too long.
Peace!
Admin
Of course they do. Not all systems can afford the the initial overhead of garbage collector overhead of Lisp, Smalltalk, or Java. And even those that can afford it still have operating systems that are written in C/C++/Pascal/Assembler, where these issues still are important. Or maybe the real WTF is that there are people who still think that everyone who doesn't follow their own way of doing things is wrong.
Admin
Not to mention run the risk of corrupting your address space. The first rule of checking for valid pointers is: don't.
Admin
Why not just do this:
char path [MAX_PATH + 1];
Then none of that ugly dynamic memory allocation has to be used. Though there's a drawback that your stack has to be large enough. Though I guess most Windows programmers don't care about either.
Admin
Admin
You can achieve some hideous WTFs in Java or C# if you imagine that "no pointers but has garbage collection" is a substitute for knowing how they handle passing objects to functions and understanding the subtlties of how the garbage collector actually works.
There's huge flaws in C++ and many many good things about <insert your language of choice here> - but the real WTF is a belief that any computer language that lacks a sentient telepathic AI module in the compiler will somehow eliminate the need for actual thinking and knowledge. If that guy put as much effort into learning Java or C# or Lisp as he put into learning C++ then he'ld still have been able to come up with something to suit this site.
Admin
Don't use IsBadWritePtr. Here's a real-life lesson on why not. Or, go straight to the source and learn from Raymond himself.
Admin
Why bother with the check? Everyone knows that MAX_PATH characters are enough for every string in the universe.
Admin
Not necessarily. It's a sign that the industry is maturing and with better, more appropriate tools for the task (i.e. high level languages that hide the notion of pointers to the programmer, a la java, .net, VB) when designing and implementing business application allow the programmer to focus on the business rules he/she implements versus allocating, checking, freeing, locking, unlocking, tracking every little bit of buffer they need.
Different languages are good for different tasks.
I would be happy to write device driver code* or low level application protocols over TCP in C or C++. I would be less happy to build a GUI front end w/ database access with these languages.
I personally like C# for the syntax, power (and being lazy, visual studio) and you're always one unsafe away from messing up with real pointers. It's the best of both world.
* I don't write device drivers for a living, just for fun.
captcha: tango? uh?
Admin
Nice try, but there's another potential memory leak here.
(This illustrates nicely, by the way, why that garage sale book on C is pretty useless when you're trying to write C++.)
Admin
You're thinking of malloc. In C++ you never need to check the return value of new.
Admin
This is not true by default for Visual C++ (which is the most widely used dialect)
I don't see it. After reviewing the C++ draft, I'm pretty sure the sequence goes like this:
Where am I going wrong?
Admin
You do if you disable exceptions in the compiler. This is commonly done for performance reasons, especially on embedded systems.
Admin
the problems are a memory leek and a buffer overflow. There is also another problem which realy isn't a problem, it more an effeincy problem. it the returmn type it's not a pointer(ie. the object is made twice).
if (pathLen > 256){
/iff line executes it will always cause an exception(or a segmentation fault)
pathLen = GetTempPath(pathLen, path);
}
There is acturaly nothing wrong with what he was doing i.e GetTempPath if GetTempPath does not have enougth space to return the path. then it returns the space required. I guess that this was done either when he was pissed or at done very early in the morning(i.e 3AM)
the fix would be
Admin
Sorry I just realized i forgot to return NULL when an error has occured
<font face="Lucida Console">CString * GetTempDirectory()
{
char* path = new char[65536];
CString *ret;
int pathLen = GetTempPath(65536, path);
if (pathLen > 65536){
printf ("GetTempPath failed with error %d. ", </font><font face="Lucida Console">
GetLastError());
return NULL;
}
ret = new CString(path);
delete path;
return ret;</font>
<font face="Lucida Console">}</font>
<font face="Lucida Console"></font>
Admin
You do not need the +1 for _MAX_PATH - _MAX_PATH already includes space for the NUL terminator. And since you are using the ANSI version of the function, you can be pretty certain that it is not capable of returning a path longer than _MAX_PATH, anyway. Extended filename syntax, and thus paths longer than _MAX_PATH, are generally not avaialble with the ANSI versions of Win32 functions.
As far as the stack goes, I have never blown a stack doing small alocations like this in non-recursive scenarios, and by default the stack in Win32 is 1MB (adjustable via linker options). Of course, I am not doing embedded development now so..... :)
If you are that worried about the stack, you can always dynamically allocate from the stack using alloca(...), although its availability is limited. The benefit of using alloca(...) instead of normal stack variables is that with normal stack variables, you will blow the stack upon entry to the function if the automatic variables exceed the available stack - none of your code will even run, but if you use alloca(...), the exception gets through at the point where alloca(...) is called, so you can put a try/catch around it and react accordingly.
When used like this, you are able to "test the stack" to get memory and if the stack space is unavailable, resort to heap allocation. Tricky yes, but available nevertheless.
Peace!
Admin
OK - where to start? :) First, you will never get a path that long from GetTempPath(...), especially the ANSI version (which is being used in this case). A buffer the size of _MAX_PATH will work well, and it does not have to come from the heap. In fact, the largest path that can be handled by the Unicode versions of Win32 functions is much closer to 32KB than 64KB. Please do not ever put that code like that into production.
You are invoking multiple heap hits (both directly and indirectly), each one a possible failure point, without any error handling. These are also potential points that can serialize a multi-threaded application (if the heap is being abused here, it likely is elsewhere in the application).
The checking of the return value is incorrect - the function returns zero for failure. Since the function is designed to tell you the required buffer size when you specify a buffer length that is too small, that is considered a successful return from the function - it did what it was designed to do: it successfully told you that you needed more buffer space.
Reporting failures via the console is fine - just make sure that you have a console in the first place so that your message actually goes somewhere.
The delete operator is incorrect, although (IME) this is benign in cases using simple types like this.
Replacing WTF-ed code with code that is more WTF-ed might not be such a good idea... :)
Peace!
Admin
quote from http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/naming_a_file.asp
In the Windows API, the maximum length for a path is MAX_PATH, which is defined as 260 characters. A path is structured in the following order: drive letter, colon, backslash, components separated by backslashes, and a null-terminating character, for example, the maximum path on the D drive is D:\<256 chars>NUL.
The Unicode versions of several functions permit a maximum path length of approximately 32,000 characters composed of components up to 255 characters in length. To specify that kind of path, use the "\\?\" prefix.
Maybe 64 KB is a bit to big.
Admin
If the CString constructor throws an exception, the temporary is not destroyed. Yes, it's subtle, but these things are important.
99.99% of the time, there's no reason to write a manual delete or delete[] in C++. Use a smart container instead.
Admin
It was true of Visual C++ 6.0, which was one of the reasons why it was every C++ developer's least favourite legacy compiler for many years. The behaviour was incorrect, and would even crash the version of the STL that shipped with the compiler under low-memory conditions.
It's been fixed.
Admin
In which case you're not programming in C++. You're programming in a related dialect.
There's embedded systems and there's embedded systems. If space is at a premium and you have to scrounge every byte, then yes, you often can't justify the cost of exception tables. But in that situation, you've probably got specialised operator new implementations anyway.
Larger embedded systems (I'm thinking MRI scanner-sized embedded systems here) often use exceptions freely.
I think the chance that you'll need to call GetTempPath in your pacemaker code is pretty slim, though. If you're ever working on a project that implements something like that on a Win32 platform, please submit your story to The Daily WTF.
Admin
That's not really a problem. Firstly, CString is ref-counted which makes it cheap to copy. Copying it doesn't do a memory allocation. (This is one of many reasons to prefer it over std::string.) Secondly, a good compiler (eg VC8) will optimise away the copy anyway.