The Daily WTF: Curious Perversions in Information Technology

Anonymous · 2014-12-03

But even on null terminated strings, this code is dangerous. Since arrays in C, like any sane language, are zero indexed, this code may attempt to access memory beyond the end of the array, overwriting whatever’s there with a null terminator.

From what I understand, this is incorrect.

Let's say we have `char coal[] = "Hello";`. The array will be identical to `char coal[] = { 'H', 'e', 'l', 'l', 'o', '\0' };`, which has a size of 6 chars.

Now `strlen(coal)` will return 5, because the string is obviously 5 characters long. Since arrays in C are zero-indexed, `coal[5]` refers to the 6th element of the array, which is exactly the null terminator. In this case, `coal[strlen(coal)] = '\0';` is actually a no-op. Hell, the compiler may even optimize it out.

However, if it is `char *fail = "Hello";`, then `fail[strlen(fail)] = '\0';` can possibly result in an attempt to write to a read-only memory location, depending on the compiler and execution environment.
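The array case can be checked with a minimal sketch. The function names `hello_array_size` and `reterminate_is_noop` are made up for illustration and do not come from the article's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the size of the array that char coal[] = "Hello"; creates. */
size_t hello_array_size(void)
{
    char coal[] = "Hello";      /* 'H','e','l','l','o','\0' */
    return sizeof coal;         /* five characters plus the terminator */
}

/* Re-terminates a writable, NUL-terminated array in place and reports
   whether its contents were left unchanged (1) or not (0). */
int reterminate_is_noop(void)
{
    char coal[] = "Hello";
    char before[sizeof coal];
    memcpy(before, coal, sizeof coal);

    size_t len = strlen(coal);  /* counts characters before the terminator */
    assert(len < sizeof coal);  /* the index is still inside the array */
    coal[len] = '\0';           /* overwrites '\0' with '\0' */

    return memcmp(before, coal, sizeof coal) == 0;
}
```

The `char *fail = "Hello";` case is deliberately left out: writing through a pointer to a string literal is undefined behaviour in C, so there is no portable way to demonstrate it.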

Anyway, the code is a WTF, but so is this article.

Screen capture in case the article gets modified.

lcrawford · 2014-12-03

The code is a pure WTF for the reasons you mention, but the article is not. Suppose someone builds a string in a local variable but doesn't add a NUL for whatever reason - the code above could indeed write to memory that is unrelated to the allocated string space.

Maciejasjmj · 2014-12-03

I looked at the title and went "who the hell pulls out strings out of their nose, that's disgusting".

lcrawford:
Suppose someone builds a string in a local variable but doesn't add a NUL for whatever reason

But even on null terminated strings, this code is dangerous.

If it's null-terminated, it should work (and be a noop). If it's not, it crashes anyway.

VinDuv · 2014-12-03

I was going to say exactly that. @Remy didn’t take into account the pendantic subset of TDWTF readers :stuck_out_tongue:

HappyCerberus · 2014-12-03

It's not as WTF as some might suspect.

It's actually a pretty neat "crash early catcher". Especially when used with some type of memory checking tool.

However, if that is the case, it definitely should have been well documented.

boomzilla · 2014-12-03

lcrawford:
Suppose someone builds a string in a local variable but doesn't add a NUL for whatever reason

Yeah, and something named buf is there for doing stuff with. It's not going to be a static-ish thing like char coal[] = "Hello";

Erik_Nilsen_Haga · 2014-12-03

It would be a neat fail-fast implementation if it reliably caused the application to crash whenever buf is not NUL-terminated. However, it causes undefined behaviour, so you have no guarantee that your application will crash at this point, or indeed any guarantee about what your application will do.

lcrawford · 2014-12-03

If it's null-terminated, it should work (and be a noop). If it's not, it crashes anyway.

For the CPU cycle obsessed, if they're not using the str* functions, but instead using a strn* function, they may omit adding a nul terminator to save cycle time or maximize buffer use. But then the buffer space is technically no longer a classic C nul-terminated string.

(That happens more often with embedded software)
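A minimal sketch of how a strn* function can leave a buffer unterminated, assuming `strncpy` is the function in question. The helper name `copy_and_check` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src into dst[0..size-1] with strncpy and returns 1 if the result
   is NUL-terminated inside dst, 0 if strncpy filled every byte. */
int copy_and_check(char *dst, size_t size, const char *src)
{
    assert(dst != NULL && src != NULL && size > 0);
    strncpy(dst, src, size);    /* writes no terminator if strlen(src) >= size */
    return memchr(dst, '\0', size) != NULL;
}
```

With a 5-byte destination and the source "Hello", all five bytes are used for characters and the check reports no terminator. With a 6-byte destination the terminator fits.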

Eldelshell · 2014-12-03

when Java programmers

That was not necessary and totally untrue, because in every sane language with any support for strings, you don't have to think about this sort of problem. So TRWTF is writing string handling ops in C.

HappyCerberus · 2014-12-03

That is what stack/heap protectors are for. They crash the program if you go out of bounds.

But yes, it isn't guaranteed.

Kuro · 2014-12-03

C++ added a string class. So there is that. I don't know what problems arise with that but I would agree that strings (and normal arrays) in C are an interesting topic.

Filed Under: \0

Anonymous · 2014-12-03

lcrawford:
Suppose someone builds a string in a local variable but **doesn't add a NUL** for whatever reason

But even on null terminated strings, this code is dangerous. Since arrays in C, like any sane language, are zero indexed, this code may attempt to access memory beyond the end of the array, overwriting whatever’s there with a null terminator.

Not sure if you are really replying to my reply...

accalia · 2014-12-03

@Remy.... you were reading that thread weren't you... or is this just a coincidence? :-D

http://what.thedailywtf.com/t/a-problem-with-big-numbers/5182/15
http://what.thedailywtf.com/t/a-problem-with-big-numbers/5182/17
http://what.thedailywtf.com/t/a-problem-with-big-numbers/5182/18

Dogsworth · 2014-12-03

Since arrays in C, like any **sane** language...

You're saying C is sane?

accalia · 2014-12-03

Dogsworth:
You're saying C is sane?

more so than the way VisualBasic has things sometimes 0 indexed and sometimes 1 indexed leading me to invariably rewrite the thing in C# whenever i encounter legacy VB (and even VB.net)

Maciejasjmj · 2014-12-03

lcrawford:
For the CPU cycle obsessed, if they're not using the str* functions, but instead using a strn* function, they may omit adding a nul terminator to save cycle time or maximize buffer use. But then the buffer space is technically no longer a classic C nul-terminated string.

It's still strlen, which AFAIK reads until it encounters a NUL. If it doesn't for a long enough time and wanders off the program's address space, well...

cyneric · 2014-12-03

This is almost certainly a NOP, even if the string is not NULL terminated. When strlen() reaches the end of the buffer, it will keep going until it finds a NULL or it reaches unreadable memory. If it finds a NULL in the stack or heap or any writable memory, it will replace the NULL with NULL. It's only if it reaches unreadable memory before finding a NULL or finds the NULL in unwritable memory that it will crash.
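The "replaces a terminator with a terminator" effect can be sketched without leaving defined behaviour, assuming the buffer and its neighbouring data live inside one larger array. The layout and the name `reterminate_index` are invented for illustration; a real unterminated buffer would be scanning into unrelated objects, which is undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* mem is one 8-byte object; pretend only the first 4 bytes are "the buffer".
   Returns the index that buf[strlen(buf)] = '\0' ends up writing to. */
size_t reterminate_index(void)
{
    char mem[8] = { 'a', 'b', 'c', 'd',     /* logical buffer, no terminator */
                    'X', 'Y', '\0', 'Z' };  /* neighbouring data */
    char before[sizeof mem];
    memcpy(before, mem, sizeof mem);

    size_t len = strlen(mem);   /* scans past the logical buffer to the first '\0' */
    mem[len] = '\0';            /* replaces an existing '\0' with '\0' */

    assert(memcmp(before, mem, sizeof mem) == 0);   /* nothing changed */
    return len;
}
```

The write lands two bytes past the neighbouring 'X' and 'Y', outside the logical buffer, and changes nothing, so no crash and no visible symptom.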

Medinoc · 2014-12-03

Anonymous:
**However**, if it is `char *fail = "Hello";`, then `fail[strlen(fail)] = '\0';` can possibly result in an attempt to write to a read-only memory location, depending on the compiler and execution environment.

If someone calls buf a string literal, you have a bigger WTF on your hands. That said, it often happens by accident in WTF code written by beginners:

char* buf = malloc(20);
buf = "fail";

To think that Visual C++'s obsolete C compiler still doesn't have an equivalent to gcc's -Wwrite-strings...
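For contrast, a sketch of what the beginner presumably meant: copy the characters into the allocated block instead of reassigning the pointer, which leaks the allocation and leaves buf pointing at a literal. The name `make_buffer` is hypothetical, and sizing the allocation from the text is an assumption (the snippet above uses a fixed 20 bytes).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated, writable copy of text, or NULL on failure.
   The caller owns the result and must free() it. */
char *make_buffer(const char *text)
{
    assert(text != NULL);
    size_t size = strlen(text) + 1;     /* + 1 for the terminator */
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;
    memcpy(buf, text, size);            /* copy the bytes; do not reassign buf */
    return buf;
}
```

The returned buffer is writable, so `buf[strlen(buf)] = '\0';` on it is merely pointless rather than a potential write to read-only memory.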

VinDuv · 2014-12-03

cyneric:
If it finds a NULL in the stack or heap or any writable memory, it will replace the NULL with NULL.

If the program is multithreaded and strlen stops on a null byte in memory used by another thread, you may still end up with memory corruption if the other thread manages to replace the null byte with another value before you write 0 at that address.

Gaska · 2014-12-03

Maciejasjmj:
It's still strlen, which AFAIK reads until it encounters a NUL. If it doesn't for a long enough time and wanders off the program's address space, well...

strnlen() has two params - a pointer to string like strlen(), and a maximum number of chars to read. So, using strnlen() with max length of (buffer_size-1) would actually make sense. Except buffer[buffer_size-1]='\0' would work just as well, and be faster, so it'd still be a WTF to write such code.
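A minimal sketch of the simpler alternative, assuming the buffer size is known at the call site. The name `force_terminate` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Guarantees that buf[0..size-1] holds a NUL-terminated string by writing
   the terminator at the last valid index, size - 1. Returns the length. */
size_t force_terminate(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    buf[size - 1] = '\0';       /* last element; buf[size] would be out of bounds */
    return strlen(buf);         /* now guaranteed to stop inside the buffer */
}
```

This never reads or writes outside the buffer, at the cost of silently dropping the last character of a buffer that was completely full.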

Steve_The_Cynic · 2014-12-03

VinDuv:
I was going to say exactly that. @Remy didn’t take into account the pendantic subset of TDWTF readers :stuck_out_tongue:

I'll bite by pointing out that you spelled "pedantic" incorrectly.

I carefully checked the spelling of my first sentence in order not to fall foul of Muphry's Law.

aliceif · 2014-12-03

Steve_The_Cynic:
Muphry's

hehehe

Steve_The_Cynic · 2014-12-03

There's a great deal of pedantry that can be handed out up above concerning the difference between NUL, NULL, null, and '\000'. Only the last of those is suitable for actually terminating a C string.

NUL is a synonym for '\000', but requires you to explicitly define it.
NULL and null are pointers, and normally can't be used when terminating strings of characters.
'\000' can also be written '\0' or 0.
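The spellings in the list can be sketched in C. The `NUL` macro here is defined by the sketch itself, since the standard library does not provide one, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUL '\0'    /* not provided by the C standard library; defined here */

/* Returns 1 if every spelling of the terminator has the value zero. */
int terminator_spellings_agree(void)
{
    return NUL == 0 && '\0' == 0 && '\000' == 0;
}

/* Writes a terminator at index i of buf; the caller guarantees i is in range. */
void terminate_at(char *buf, size_t i)
{
    assert(buf != NULL);
    buf[i] = NUL;
}
```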

hungrier · 2014-12-03

Steve_The_Cynic:
I'll bite by pointing out that you spelled "pedantic" incorrectly.

More like, you've incorrectly spelled "pedantic" correctly.

Jaloopa · 2014-12-03

aliceif:
hehehe

http://en.wikipedia.org/wiki/Muphry%27s_law

Zylon · 2014-12-03

There isn't nearly enough making fun of Remy here for making the article's image--

Hotlinked from wikimedia.org
A freaking 2560 x 1700, 773K JPG.

Covarr · 2014-12-03

Eldelshell:
> when Java programmers
That was not necessary and totally untrue, because in every sane language with any support for strings, you don't have to think about this sort of problem. So TRWTF is writing string handling ops in C.

Even so, a C programmer would be prepared to deal with C's idiosyncrasies. A Java programmer would not be. It's always a WTF to try and code in one language as though it were another, regardless of which one is more reasonable.

redwizard · 2014-12-03

I gave it a like just for the shoot yourself in the foot link provided.

chubertdev · 2014-12-03

accalia:
more so than the way VisualBasic has things sometimes 0 indexed and sometimes 1 indexed leading me to invariably rewrite the thing in C# whenever i encounter legacy VB (and even VB.net)

Comparing things to VB isn't saying much. :laughing:

That being said, this looks like band-aid code.

chubertdev · 2014-12-03

Steve_The_Cynic:
I'll bite by pointing out that you spelled "pedantic" incorrectly.

Pedantry and forum memes don't mix...

nmclean · 2014-12-03

VisualBasic has things sometimes 0 indexed and sometimes 1 indexed leading me to invariably rewrite the thing in C# whenever i encounter legacy VB (and even VB.net)

How does behavior present in VB lead you to rewrite VB.NET into C#, when VB.NET does not have said behavior?

EatenByAGrue · 2014-12-03

lcrawford:
For the CPU cycle obsessed, if they're not using the str* functions, but instead using a strn* function, they may omit adding a nul terminator to save cycle time or maximize buffer use.

But if you're going to be CPU-cycle-obsessed, and doing any kind of complex string manipulation, then you'll track the length separately (Pascal-style strings), storing "Hello" as {5, 'H', 'e', 'l', 'l', 'o', '\0'} instead of {'H', 'e', 'l', 'l', 'o', '\0'}, which allows you to skip strlen and the O(n) walk down the string that it has to do to give you a count.
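A minimal sketch of a length-prefixed string along those lines. The struct name `pstr`, the helpers, and the fixed capacity of 32 are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PSTR_CAPACITY 32   /* hypothetical fixed capacity for the sketch */

/* A length-prefixed string that also keeps a trailing NUL for C APIs. */
struct pstr {
    size_t len;                     /* number of characters, excluding the NUL */
    char   data[PSTR_CAPACITY];
};

/* Initialises s from a C string; returns 0 on success, -1 if it does not fit. */
int pstr_set(struct pstr *s, const char *text)
{
    assert(s != NULL && text != NULL);
    size_t len = strlen(text);      /* the only O(n) scan, done once */
    if (len >= PSTR_CAPACITY)
        return -1;
    memcpy(s->data, text, len + 1); /* include the terminator */
    s->len = len;
    return 0;
}

/* O(1): reads the stored length instead of walking the string. */
size_t pstr_len(const struct pstr *s)
{
    assert(s != NULL);
    return s->len;
}
```

The trade-off is that every operation that changes the contents has to keep `len` in sync with `data`.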

accalia · 2014-12-03

nmclean:
How does behavior present in VB lead you to rewrite VB.NET into C#, when VB.NET does not have said behavior?

1) because AFAIK VB.net still has 1-based indexes for things like arrays, same as VB
2) because i can't always tell the two apart at a glance because of identical syntaxes
3) because VS seamlessly mixes C# code and VB code without even being asked to so conversion is simple
4) because i have a script that does 99% of the conversion for me. i just have to validate its output and tweak a few things here and there if it gets confused

nmclean · 2014-12-03

accalia:
1) because AFAIK VB.net still has 1-based indexes for things like arrays, same as VB
2) because i can't always tell the two apart at a glance because of identical syntaxes
3) because VS seamlessly mixes C# code and VB code without even being asked to so conversion is simple
4) because i have a script that does 99% of the conversion for me. i just have to validate it's output and tweak a few things here and there if it gets confused

It was a rhetorical question. You said the reason you rewrite VB.NET is because of behavior in VB, but obviously that is not the reason since it's not in VB.NET. The facts that the syntax is similar and that you have a script don't change the fact that arrays in VB.NET are consistently 0-based.

chubertdev · 2014-12-03

accalia:
4) because i have a script that does 99% of the conversion for me. i just have to validate **it's** output and tweak a few things here and there if it gets confused

*twitch*

accalia · 2014-12-03

nmclean:
don't change the fact that arrays in VB.NET are consistently 0-based.

huh. TIL that VB.net isn't as cromulent as i thought it was.

I still don't want it in any code base i maintain though.

accalia · 2014-12-03

hmm? i see no stray' apostrop'he he're....

do yo'u?

chubertdev · 2014-12-03

accalia:
hmm? i see no stray' apostrop'he he're....
do yo'u?

Indeed. [image]

chubertdev · 2014-12-03

accalia:
huh. TIL that VB.net isn't as cromulent as i thought it was.
I still don't want it in any code base i maintain though.

sigh

It's much, much, much, much closer to C# .NET than VB6.

accalia · 2014-12-03

curses! foiled again!

and i would have gotten away with it too if it weren't for that darned edit pencil!

accalia · 2014-12-03

chubertdev:
It's much, much, much, much closer to C# .NET than VB6.

hmm... having looked up some articles now that @nmclean pointed that fact out to me i agree.

it's still a context switch to move from one syntax to the other in the middle of trying to track down a bug or make a change to a system. and for that reason, if for no other, it should be removed.

Pick ONE language and stick with it. (you are allowed a second if it's something like server side/client side, but no mixing and matching! also you'll probably want someone else to do the client side as i have limited patience for the oddities of IE)

chubertdev · 2014-12-03

accalia:
hmm... having looked up some articles now that @nmclean pointed that fact out to me i agree.
it's still a context switch to move from one syntax to the other in the middle of trying to track down a bug or make a change to a system. and for that reason, if for no other, it should be removed.

Pick ONE language and stick with it. (you are allowed a second if it's something like server side/client side, but no mixing and matching! also you'll probably want someone else to do the client side as i have limited patience for the oddities of IE)

I don't really struggle going back and forth. We even have a .NET app that has its four main projects in VB, with a dependency on two DLLs, written in C#, that we bought the source code for.

That being said, I normally point people to this article: http://visualstudiomagazine.com/Articles/2011/05/01/pfcov_Csharp-and-VB.aspx?Page=1

accalia · 2014-12-03

chubertdev:
That being said, I normally point people to this article:

i didn't mention which language you had to pick did I? :-P

if VB.NET works for you then fine, but if 90% of the code's already in C# guess which one i'm picking?

CoyneTheDup · 2014-12-03

No sale: It won't always crash, even if the string is not null terminated. This is because it will always replace a \0 with a \0. That will fail if the target location is read-only, but otherwise it succeeds even if a memory location outside the bounds of the target string is accessed.

As a failure detector...it is a failure.

blakeyrat · 2014-12-03

Choose what you want, but it's no excuse for ignorance about what VB is.

accalia · 2014-12-03

blakeyrat:
Choose what you want, but it's no excuse for ignorance about what VB is.

TI-BASIC => Devil's jockstrap
VB => evil icky bad
VB.NET => now that I've bothered to read up on it a bit more: meh, i'd rather C# but whatevs

my main point is you shouldn't mix languages within an application if you have any other choice. Pick a language that will work (and if it's VB.NET whatever) and stick with it.

antiquarian · 2014-12-03

Steve_The_Cynic:
I'll bite by pointing out that you spelled "pedantic" incorrectly.

Brillant!

Filed under: not sure if trolling or really didn't read the Memes wiki thread

Maciejasjmj · 2014-12-03

Gaska:
strnlen() has two params - a pointer to string like strlen(), and a maximum number of chars to read. So, using strnlen() with max length of (buffer_size-1) would actually make sense. Except buffer[buffer_size-1]='\0' would work just as well, and be faster, so it'd still be a WTF to write such code.

It might be kinda useful to determine whether the data actually is null-terminated - if you call it with buffer_size and it returns buffer_size, then it's not.

Still, I was pretty sure this one doesn't exist, because it just barely makes sense.
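The check described above can be sketched in standard C. POSIX does provide `strnlen()`; this sketch swaps in `memchr` so it does not depend on POSIX being available, and the names `bounded_len` and `is_terminated` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bounded length: like POSIX strnlen(), returns the number of bytes before
   the first NUL, or max if none is found in the first max bytes. */
size_t bounded_len(const char *buf, size_t max)
{
    assert(buf != NULL);
    const char *nul = memchr(buf, '\0', max);
    return nul != NULL ? (size_t)(nul - buf) : max;
}

/* Returns 1 if buf[0..size-1] holds a NUL-terminated string, 0 otherwise. */
int is_terminated(const char *buf, size_t size)
{
    return bounded_len(buf, size) < size;
}
```

Unlike `buf[strlen(buf)] = '\0';`, this only ever reads inside the buffer, so a missing terminator is detected with defined behaviour rather than left to chance.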

chubertdev · 2014-12-03

antiquarian:
Brillant!
Filed under: not sure if trolling or really didn't read the Memes wiki thread

@Steve_The_Cynic mainly posting in the Article category is a barrier to knowing about the Memes wiki thread

carleeto · 2014-12-03

Yep. That's right. If buf already points to a null terminated string, then the statement is a no-op regardless of encoding.

Nasal String Length
