The Daily WTF: Curious Perversions in Information Technology

2021-02-01 Reply Admin

Complete guide about null and terminated. Thanks for sharing

2021-02-01 Reply Admin

Yeah, c and all that modern malarkey. Bah, you kids of today. At least you're not trying to maintain a suite of Fortran in which someone has created variables with spaces in their names.

(Shuffles off in his slippers, muttering into his grey beard and cardigan.)

Steve_The_Cynic · 2021-02-01 Reply Admin

Fortran in which someone has created variables with spaces in their names.

Tsk. The Fortran compiler removes all extraneous whitespace. That's why the logic/comparison operators all begin and end with dots, among other things, but it also means that those variables don't have spaces in their names.

2021-02-01 Reply Admin

Well yes, the compiler does remove the whitespace, but if you're trying to find instances of a particular variable by means of a global directory search on all the source code, it gets really frustrating if you find that some shambolic ne'er-do-well has put spaces in some of the instances of them and not others. Particularly when this is inconsistent.

Seriously, whose stupid idea was it to allow spaces in the damn variables? I fail to see one single benefit that would outweigh the disadvantages.

But anyway, be that as it may, being serious for the moment, the worst feature of any language anywhere in the universe, without a doubt, is that "define" keyword that causes the bug described in TDWTF today. I'm completely on board with that, and it does make me relieved that I'm silo'ed off in Fortress Fortran.

Steve_The_Cynic · 2021-02-01 Reply Admin

Seriously, whose stupid idea was it to allow spaces in the damn variables?

You'd have to go back to the end of the Fifties to ask around, sadly. And the "feature" allows spaces in lots of stuff that isn't variables, as well.

Example (possibly less useless than for variables):

      I = 1 000 000

But yeah, inconsistent use of spaces for whatever purpose is a problem. Fortunately, the "search for it despite inconsistent use of spaces" is actually something where use of regular expressions reduces, rather than increases, the number of problems...

    grep -i "s *p *a *c *e *y *v *a *r *i *a *b *l *e" *.f

Ugly as fuck, but effective.

2021-02-01 Reply Admin

TRWTF is, of course, C style strings.

Even in ye olden pre-managed days, the Pascal approach (length first) is better in pretty much every way. If you have a 'short string' type where the length is limited to 255 then it isn't even any more expensive in memory.
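For anyone who hasn't met the length-first layout, here is a minimal sketch in C. The names ShortString, short_from_cstr and short_length are made up for illustration and are not any real Pascal runtime API. The one length byte takes the place of the one NUL byte, which is why the memory cost comes out the same.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical Pascal-style "short string": one length byte followed by up
   to 255 characters. The type and helper names are illustrative only. */
typedef struct {
    unsigned char len;   /* number of characters in use, 0..255 */
    char data[255];      /* payload, deliberately not NUL-terminated */
} ShortString;

/* Copy at most 255 bytes from a C string; returns the number of bytes stored. */
static size_t short_from_cstr(ShortString *dst, const char *src)
{
    size_t n = strlen(src);
    if (n > sizeof dst->data)
        n = sizeof dst->data;   /* truncate rather than overrun */
    memcpy(dst->data, src, n);
    dst->len = (unsigned char)n;
    return n;
}

/* Length lookup is constant time: no scan for a terminator. */
static size_t short_length(const ShortString *s)
{
    return s->len;
}
```

The trade-off is the hard 255-character ceiling, since the length has to fit in a single byte.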

2021-02-01 Reply Admin

@Prime Mover ref

Seriously, whose stupid idea was it to allow spaces in the damn variables? I fail to see one single benefit that would outweigh the disadvantages.

Agree there are none which outweigh the disadvantages. It's even kinda hard to find any advantages.

I forget if you can break a variable name across a line (=punch card) continuation; that might be one if true.
It's the moral equivalent of PascalCasing making multi-word variables more readily parseable to the human.

Most likely it's simply an artifact of step 1 of phase 1 of the compiler being "delete all whitespace and comments" to save on core.

2021-02-01 Reply Admin

@Steve_the_Cynic ref

Example (possibly less useless than for variables):

      I = 1 000 000

C# was real proud of themselves when they added that feature in 2017. In a burst of sanity they used "_" instead of " ". But the readability gain is still there. See https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-7#numeric-literal-syntax-improvements

2021-02-01 Reply Admin

#define true false

TheCPUWizard · 2021-02-01 Reply Admin

"with guarantees that it would never allow buffer overruns.".... Ha

char target[10]; strlcpy(target, "this is a really long string", 20);

TheCPUWizard · 2021-02-01 Reply Admin

With regard to strncpy (was using this close to 40 years ago) - pseudo-C

struct safe { public: char buffer[10]; private: char terminator = 0; }

2021-02-01 Reply Admin

Problem being that (at least in Borland's Pascal versions) there were only short strings, so 255 chars was a hard limit for strings. If you, even rarely, needed longer strings, tough luck! (Source: I did program in them back then.) Workarounds included using multiple strings to keep parts of the actual string (very cumbersome), brewing your own strings (and reimplementing the whole string library), or ... wait for it ... using C-compatible NUL-terminated strings (which Borland actually introduced with much fanfare, though I'm not sure if the main reason was to allow longer strings or for Windows API interfacing).

Note that simply making the length field bigger isn't that easy. First, how long? In Borland's time, probably 16 bit (because it was a 16 bit compiler to run on a 16 bit OS, and 32 bit arithmetic was expensive). Strings longer than 64KB are unusual but possible, so the problem is only postponed. Then there are the usual endianness and alignment issues, and string functions wouldn't be compatible between short and long strings. That's at least a small advantage of C strings: they can be as long as memory layout allows.

2021-02-01 Reply Admin

struct safe str;
strncpy (str.buffer, "mylongstring", sizeof (str));
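For comparison, a sketch of the more conventional way to get a terminated result out of strncpy, without relying on a neighbouring struct member. The helper name copy_terminated is hypothetical. It assumes dst_size is the size of the destination buffer itself, not of an enclosing struct.

```c
#include <string.h>

/* Hypothetical helper: copy with strncpy, then terminate explicitly.
   dst_size is the full size of the destination buffer in bytes. */
static void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;                       /* nowhere to put even a terminator */
    strncpy(dst, src, dst_size - 1);  /* leave the last byte alone */
    dst[dst_size - 1] = '\0';         /* strncpy adds no NUL when it truncates */
}
```

With a 10-byte buffer this keeps the first 9 characters of the source and always ends in a NUL.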

2021-02-01 Reply Admin

One of the often overlooked features of the C struct is that the internal order of the members need not bear any relation to their order in the source code. This decision was (apparently) made to allow compilers to account for different CPU element alignment requirements in optimizing the struct's storage. That's why the offsetof() macro is in the standard C library.

2021-02-01 Reply Admin

It wasn't to allow spaces in names, but to allow the removal of spaces between things. That pesky limit on the size of a punch card... The simple rule was "spaces are meaningless", applied everywhere.

So, yes, every beginning FORTRAN programmer had to have at least one declaration like:

INTEGERS EX

CodeJunkie · 2021-02-01 Reply Admin

Seriously, whose stupid idea was it to allow spaces in the damn variables? I fail to see one single benefit that would outweigh the disadvantages.

Probably the same people who thought case sensitivity was useful.

2021-02-01 Reply Admin

I would suggest that those two propositions are orthogonal.

See above for "sane" (ie 1970s) reasoning about "spaces within tokens." Seriously. If you never used punched cards, you wouldn't know. Also, in the old days, 1 000 000 was a hell of a lot more readable than the equivalent with no spaces. And compilers were weedy things back then (as far as I know, every single Fortran compiler was a two-pass contraption, not even based on BNF), so ... it all sorta made sense.

Case sensitivity? Well, it depends upon your lexical requirements really, doesn't it? Mine are clearly more severe than yours.

Barry Margolin 0 · 2021-02-01 Reply Admin

Seriously, whose stupid idea was it to allow spaces in the damn variables? I fail to see one single benefit that would outweigh the disadvantages.

There was no other good option for separating words when Fortran was being designed.

I don't think underscore was in most character sets in the 50's and 60's (the ASCII code we now use for it was originally left-arrow). - is the subtraction operator. And I/O devices were mostly uppercase-only, so it would be years before CamelCase was developed.

I'll bet the designers thought they were pretty clever being able to design a language where whitespace was insignificant.

2021-02-01 Reply Admin

features of the C struct is that the internal order of the members need not bear any relation to their order in the source code

I think this isn't right. The C compiler is free to insert padding between elements, that's true, but the order stays as written. See cppreference ("Within a struct object, addresses of its elements (and the addresses of the bit field allocation units) increase in order in which the members were defined") and last C17 standard draft, 6.7.2.1.15, p. 82 ("Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.").
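A quick way to check this on your own compiler is to compare offsetof() values. This is a sketch only: struct record and members_in_declaration_order are hypothetical names, and the member types are picked just to make padding likely.

```c
#include <stddef.h>

/* Hypothetical struct, chosen so that padding is likely between members. */
struct record {
    char   tag;
    int    count;
    char   flag;
    double value;
};

/* Returns 1 if every member sits at a higher offset than the one declared
   before it. Padding may widen the gaps but cannot reorder the members. */
static int members_in_declaration_order(void)
{
    return offsetof(struct record, tag) == 0
        && offsetof(struct record, tag) < offsetof(struct record, count)
        && offsetof(struct record, count) < offsetof(struct record, flag)
        && offsetof(struct record, flag) < offsetof(struct record, value);
}
```

The exact offsets and the total size are implementation-defined; only the ordering is guaranteed by the wording quoted above.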

2021-02-01 Reply Admin

So Amsterdam <> AMSTERDAM ...

Case preservation is good for variables. Case sensitive might be useful in strings but gives headaches when searching.

R3D3 · 2021-02-01 Reply Admin

Part was probably also trying to avoid wasted punchcards. Imagine having to throw away a punchcard, just because you wrongly thought the variable name still fits into the line.

Modern Fortran though... It is a language that COULD be good. So many good ideas are there, and have fixed the worst historical inconveniences. ALLOCATABLE variables, especially, and automatic-destructor semantics.

But then there are the odd omissions. Function semantics are heavily hamstrung by "return by value", so half the time you need to use output parameters of subroutines instead for efficiency. But on the other hand you have MOVE_ALLOC, so move-semantics are already in the language anyway -- just not for function return values.

Or that automatic destructors have to be declared for each rank separately -- or need to be declared IMPURE ELEMENTAL, which you first need to find out. (More likely, you're going to see bugs when something unexpectedly doesn't get finalized.)

Or that you need to create a wrapper class, if you need a ragged or polymorphic array or array of pointers. You can't make an "array of allocatables", but instead you have to make an "array of a user-defined type, that contains an allocatable/polymorphic field".

This wouldn't be much of an issue though, if only you could create type-safe generic data structures (hashes, lists, and yes, allocatable/pointer wrappers). The best you can currently do is use a wrapper type with a CLASS(*),ALLOCATABLE field, which can be allocated for any type. And then you can retrieve the stored value with SELECT TYPE -- which means you can't do something like PRINT *, list%get(1). Alternatively you might resort to preprocessor shenanigans, in which case you'll probably be the only one on the project actually using those features.

Or the syntax, that forces you to separate the (verbose) variable declaration from the initialization statement.

:(

2021-02-01 Reply Admin

Well, there's all that, or there's C++ from 14 onwards.

On the whole I would prefer to see Fortran transposed through a suitable set of tools (based on a robust AST) into C++. Or even Rust, to be absolutely honest.

99% of legacy Fortran (and I speak as somebody who grew up with, and loved, the language) is essentially library functions. Obviously, the multi-threading goodness is going to be difficult to "port" to a new language, other than C, perhaps. But the rest of it? Figure out a wrapper a la SWIG and just move on.

2021-02-01 Reply Admin

When you port code with lotsa strlcpy to a platform that does not have it, I would not try to fix the callers but take a (reasonably licensed) version of strlcpy from one of the many c libraries that come with it, and add it to my project. Less error prone than trying to fix all the strncpy callers.

That being said, if an architecture provides strcpy_s and strcat_s (and it is better for the application to terminate itself instead of showing undefined behaviour in case of a programming error), I prefer those to strlcpy and strlcat.

PJH · 2021-02-01 Reply Admin

Now that was odd, as strlcpy is the "good" way to copy strings, with guarantees that it would never allow buffer overruns.

Because strlcpy(dest, src, strlen(src)); would never happen. Would it (Lounge post on main forum)?

It's not safe.

2021-02-01 Reply Admin

Those were the days, where software engineers were real heroes who wrote their own device drivers using a 9V battery and some jumper wires to flip bits, and could tell what baud your connection was at by the sound of the line training.

"Oh, that timbre sounds harsh, guess I'm only getting 14.4k today"

Get off my lawn!?

2021-02-02 Reply Admin

"Second, arrays are actually just pointers to the first item in the array, "

No. Just, no. This is not right, it is not even wrong. Array-of-T and Pointer-to-T are distinct types in C and are not the same. Not now, not then, not ever.

An Array-of-T will decay to a pointer in many, but not all, circumstances and C will happily and silently transform

void foo(int a[10])

into

void foo(int *a)

but they are not the same. A pointer is an object that can hold an address. An array is an object that will hold all the items. A pointer can be re-written, an array cannot:

int a[10]; int b[20]; int *p;

p = a; // works due to pointer decay
a = p; // Does not work
a = b; // Does not work.

This "array and pointers are the same in C" mistake is probably the cause of as many errors as NULL.
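One place the difference shows up directly is sizeof. A small sketch (array_bytes and pointer_bytes are hypothetical function names): the array's size is that of all its elements, the pointer's size is that of one address.

```c
#include <stddef.h>

/* sizeof applied to an array yields the size of the whole array. */
static size_t array_bytes(void)
{
    int a[10];
    return sizeof a;          /* 10 * sizeof(int) */
}

/* sizeof applied to a pointer yields the size of an address, regardless of
   what it points at. */
static size_t pointer_bytes(void)
{
    int a[10];
    int *p = a;               /* a decays to &a[0] here */
    (void)a;
    return sizeof p;          /* sizeof(int *) */
}
```

sizeof is one of the contexts where an array does not decay, which is why the two functions return different values on typical platforms.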

UserK · 2021-02-02 Reply Admin

Defensive programming. I can't get enough of it. Unit tests are for the weak mind! There will be absolutely no need for anything remotely automated because you know, we don't just hire only the best, we also keep them fully motivated and never overworked. I love crunch cereals for breakfast, what are you talking about.

Now, can you make this port by the end of the day?

2021-02-02 Reply Admin

I remember that. God I miss the modem music. Why they don't include an audio clip of it when connecting to your internet these days I'll never know. Like at least it could be the default sound for when your phone catches wifi.

2021-02-02 Reply Admin

TRWTF is, of course, using C-style strings in C++, and that the article calls this the "good way".

2021-02-02 Reply Admin

More interesting, is IZMIR == izmir? If you don't know, ask a Turkish speaker. The answer may surprise you.

2021-02-02 Reply Admin

Even accessing element 10 of a 10 character array is bad news, since the array elements are 0 to 9, not 1 to 10.

2021-02-02 Reply Admin

That was a pretty good description of the basic problems with strings in C, except Remy could have mentioned explicitly that you need to add that extra byte. If you want a string 10 characters long, you have to give it 11 to make room for the null.

But I guess people who find C strings troublesome never had to work with them in assembly language.

That guy was lazy too: (I probably got this wrong somewhere, but...)

<s>#define strlcpy strncpy</s>
#define double float
size_t strlcpy(char *dst, const char *src, size_t size)
{
  int n = strncpy(dst, src, size);
  dst[size] = 0;
  return n;
}
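Joking aside, a copy that keeps the usual strlcpy contract could look roughly like the sketch below. It uses a hypothetical name, my_strlcpy, so it doesn't clash with platforms that already ship strlcpy. It assumes size is the full size of dst, and it is not taken from any particular libc.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of strlcpy-like behaviour under a hypothetical name.
   Copies at most size - 1 bytes and always terminates when size > 0.
   Returns strlen(src), so a result >= size means the copy was truncated. */
static size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);
    if (size != 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';   /* terminator lands inside the buffer, never at dst[size] */
    }
    return src_len;
}
```

The guarantee only holds if the caller passes the destination's size. Passing 20 for a 10-byte buffer, or strlen(src), defeats it.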

Probably the most annoying thing about using a 16-bit string length is that now your string length is a different size than the string characters. No problem, we can just go with UTF-16!

And spaces in variable names? Now we can put freaking emoji in variable names! pileOfPoo++

hedronist · 2021-02-02 Reply Admin

Ah yes, null pointers and too-short character buffers! That's how I started my first company back in 1983. I created a C source-level debugger. It was initially just for myself, but then a friend talked me into getting a booth at the 1983 UniForum convention in San Diego. My customers coming out of that included HP, Siemens, and Microsoft (they had recently acquired Xenix from The Santa Cruz Operation). That little debugger, called CDB, definitely helped pay the bills.

2021-02-03 Reply Admin

Rather sadly, you are completely wrong, in both respects.

There is no such thing as an "object" in C. Not even in C99. A C array is just a pointer. Your stipulations don't make no difference. There is no such thing in C as an "object."

Specifically, in the case you quote (Array-of-T and Pointer-to-T)? That's actually a parsing limitation. It's actually counter-intuitive, because:

int a[10];
int b[20];

are, in all other cases, treated as:

int* a, b;
a = malloc(10);
b = malloc(20);

That's just the way that C works. Otherwise, how could you explain the fact that you can add a pointer to an array variable (or vice versa) and ... well, something happens. I used to be able to figure it out.

But, sorry. In C, the array syntax is nothing more than syntactic sugar. Don't believe me? Try this:

int a[10];
int b[20];
int* aptr = a;
int* bptr = b;

Do whatever you like with aptr and bptr. They don't even need a downcast (as youngsters these days would say).

Fundamentally, a C array (or a C string) is nothing more or less than a pointer. You're only fooling yourself if you believe otherwise.

... oh, and the other thing?

"This "array and pointers are the same in C" mistake is probably the cause of as many errors as NULL."

"Probably?" I am of the opinion that the probability tends to zero. And even if I am wrong, I believe that any C compiler past about GCC 2.95 would tell you that it's about time you started wearing a tin foil hat. Can I interest you in my free, no questions asked, lobotomies come with the deal, subscription to Q-Anon?

2021-02-04 Reply Admin

Sorry, but you are dead wrong. Read the standard. Objects in C are units of storage. Array-of-T is a distinct type from Pointer-to-T. See https://en.cppreference.com/w/c/language/object

"But, sorry. In C, the array syntax is nothing more than syntactic sugar. Don't believe me? Try this:

int a[10]; int b[20]; int* aptr = a;"

And yes, this works specifically because of pointer decay. Nothing else. Try the reverse, in your case:

int a[10]; int b[20]; int* aptr = malloc(100); a = aptr;

Won't work so well, will it?

or try another variant, in file1:

int a[100] = { .. };

in file2:

extern int *a;

And then use a in both file1 and file2, see how well that goes. or try:

int x[1000];
printf("%p %p\n", (void *)x, (void *)&x);
printf("%p %p\n", (void *)(x+1), (void *)(&x+1));

Why the difference? Because a naked x decays to a pointer, while address-of-x does not decay. Hence &x is not a pointer to int, nor a pointer to a pointer to int, but a pointer to an array of 1000 ints, e.g.: int (*p)[1000];

Or a multitude of other examples. When you do int x[100], x is an object (yes, an object, a unit of storage), contiguous, and large enough to hold 100 ints with whatever alignment requirements your platform may have. It is not a pointer. A pointer would likewise create a unit of storage (object), but only of a size large enough to hold an address, no more.
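The x versus &x point can be checked without reading printed addresses, by measuring how far "plus one" moves each pointer. A sketch, with hypothetical function names step_of_decayed and step_of_address:

```c
#include <stddef.h>

/* x decays to int *, so x + 1 advances by one int. */
static size_t step_of_decayed(void)
{
    int x[1000];
    return (size_t)((char *)(x + 1) - (char *)x);
}

/* &x has type int (*)[1000], so &x + 1 advances by the whole array. */
static size_t step_of_address(void)
{
    int x[1000];
    return (size_t)((char *)(&x + 1) - (char *)&x);
}
```

Both expressions start at the same address; only the pointed-to type, and therefore the stride, differs.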

""Probably?" I am of the opinion that the probability tends to zero. And even if I am wrong, I believe that any C compiler past about GCC 2.95 would tell you that it's about time you started wearing a tin foil hat. Can I interest you in my free, no questions asked, lobotomies come with the deal, subscription to Q-Anon?"

Oh, that escalated quickly. No, sorry. The delusion that arrays are just syntactic sugar for a pointer is not only patently wrong, it is also the cause of many, many problems. Almost as bad as people not understanding undefined behaviour.

Null and Terminated
