The Daily WTF: Curious Perversions in Information Technology

2016-12-12

I always end up having to look up what the exact order is for big/little endian, but at least I don't do this kind of thing.

2016-12-12

Did I miss something? Why are PutInt16 and PutInt32 increasing _index? It is already increased in PutByte.

2016-12-12

Maybe they're using those unaltered bytes to indicate boundaries? That would still cause a 32-bit int to end up in two "blocks" of 16 bits, with one unaltered byte in between and two more unaltered bytes after. I guess the most logical explanations are either "programmed by a potato" or "misremembered the code."

2016-12-12

I'd vote for potato.

Note that PutByte writes to _data, which doesn't exist.

I suspect that most of the glitches could be fixed by making this a simple class (no instance nonsense, move the initialise code into the constructor), and by removing the index shenanigans from PutInt16 and PutInt32 methods.

2016-12-12

"Everyone gets everything he wants. I wanted to fix a WTF, and for my sins, they gave me one."

2016-12-12

Yup, Initialize() is an obvious antipattern in C++, in this case a corollary to the misused singleton. The classic singleton pattern is flawed though, as it's subject to the static initialization order fiasco. I have mine expose an instantiator object which must be put inside main(), or an object tied to Init/ExitInstance() for example.

2016-12-12

Why the array external to the class? Just use a std::vector<unsigned char> member variable!

2016-12-12

Aside from not being thread safe... the class simply changes the endianness of the integers?

Basically, it goes from 'I don't know the endianness of the integer' to 'the opposite of that endianness, whatever it is'.

TRM (The Real Magic) must lie in the ByteReader then... :)

2016-12-12

TRWTF is not checking for overflows in PutByte()

2016-12-12

TRWTF is multiple-of-8 word lengths.

2016-12-12

Singleton horribleness aside, the other WTF is using raw memory allocations in C++. Use smart pointers, or better yet have the (non-singleton) class manage the buffer itself so it can allocate more memory when needed and not cause a buffer overflow.
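
A minimal sketch of what that could look like, assuming a hypothetical non-singleton `ByteWriter` (the method names mirror the ones discussed in this thread; `Data()` and the example function are made up here, and this is not the article's code). The object owns a `std::vector`, so the buffer grows on demand, and only `PutByte` moves the write position:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical replacement, not the article's class: the writer owns its
// buffer, so a caller cannot hand it too little memory.
class ByteWriter {
public:
    // Appending to the vector is the only place the write position moves.
    void PutByte(std::uint8_t b) { data_.push_back(b); }

    // Wider integers are built on PutByte, least significant byte first,
    // with no extra index arithmetic of their own.
    void PutInt16(std::uint16_t v) {
        PutByte(static_cast<std::uint8_t>(v & 0xFF));
        PutByte(static_cast<std::uint8_t>(v >> 8));
    }

    void PutInt32(std::uint32_t v) {
        PutInt16(static_cast<std::uint16_t>(v & 0xFFFF));
        PutInt16(static_cast<std::uint16_t>(v >> 16));
    }

    const std::vector<std::uint8_t>& Data() const { return data_; }

private:
    std::vector<std::uint8_t> data_;
};

// Usage: two bytes come out, low byte first, with nothing skipped.
inline void ByteWriterExample() {
    ByteWriter w;
    w.PutInt16(0x1234);
    assert(w.Data().size() == 2);
    assert(w.Data()[0] == 0x34 && w.Data()[1] == 0x12);
}
```

Because each writer is an ordinary object, two threads (or two messages) can simply have one each.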

2016-12-12

The first WTF is the use of a singleton when you need more than one. And if you use threads, you do need more than one. The second WTF is that the caller must provide the memory for the data, which makes it prone to buffer overflows, and makes it unnecessarily hard to use.

2016-12-12

-- Easy Reader Version: The Singleton design pattern is one of the simplest to understand, and easiest to implement, thus it's also the ~~abused pattern~~ most often misimplemented. --

2016-12-12

This is the endian, beautiful friend
This is the endian, my only friend, the endian
Of our elaborate plans, the endian
Of everything that stands, the endian
No safety or surprise, the endian
I'll never look into your eyes, again

2016-12-12

This is possibly a case of globalitis: action at a distance. They may have needed it to be accessible from multiple places at some point and, rather than passing it around, made it a global singleton. There are better patterns for that, for example the registry pattern, or dependency injection at run time by passing the object as a parameter.

2016-12-12

Not quite. It's going from "I don't know the endianness of the integer" to "I know the endianness of this integer because it's stored in this byte sequence in a predictable way across all platforms". It's useless without the ByteReader class, which would do the exact opposite. In this case, the author provided a ByteWriter class that serializes integers in little-endian order regardless of host endianness. The ByteReader class would read the integers as if they were little-endian regardless of the host endianness.

Further reading: https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html
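
As a rough illustration of the point above, here is a sketch of the writing half, assuming a hypothetical helper named `ToLittleEndian` (not something from the article). It never asks what the host's byte order is, because shifts operate on the value rather than on its memory layout:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: produce the little-endian byte sequence of a 32-bit
// value. The result is the same on big- and little-endian hosts.
std::array<std::uint8_t, 4> ToLittleEndian(std::uint32_t v) {
    return {
        static_cast<std::uint8_t>(v),        // bits 0..7
        static_cast<std::uint8_t>(v >> 8),   // bits 8..15
        static_cast<std::uint8_t>(v >> 16),  // bits 16..23
        static_cast<std::uint8_t>(v >> 24),  // bits 24..31
    };
}

// Usage: the least significant byte always lands first.
inline void ToLittleEndianExample() {
    const auto bytes = ToLittleEndian(0x12345678u);
    assert(bytes[0] == 0x78 && bytes[3] == 0x12);
}
```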

2016-12-12

Thankfully, when the internet was being developed, they standardized with "Network Byte Order", and said "You shall conform". Machine makers unfortunately had other ideas, and we're stuck with such conversions. And so it goes.

Yes, even the bits in RFCs are numbered from left to right (it is just documentation, though), like it should always have been. That is mostly because one of the first machines to be connected was a Sigma 7, which numbered things that way (as god intended!).

2016-12-12

Well, there's using C-style casts in C++, then there's using signed data with masks and shifts, and then there's casting something to an int32_t which is passed to a function taking an int32_t. Can't possibly do anything unexpected.

The reader is going to be a nightmare of shifts and casts if he's taking two int16_t's and making an int32_t out of them (and likewise from bytes, which should be int8_t for consistency, to int16_t). I don't think byte is standard anyway. And if it isn't, I wonder if whoever set it up knew that the signedness of char is implementation-defined.
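
One way to sidestep the signedness traps is to do all masking and shifting on unsigned values and only convert at the edges. A sketch under that assumption, with hypothetical helper names (`PutInt32LE`, `GetInt32LE`) that do not come from the article:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: split a signed 32-bit value into little-endian bytes.
// The cast to uint32_t is well defined (modulo 2^32), and shifts on unsigned
// values never sign-extend.
void PutInt32LE(std::int32_t value, unsigned char out[4]) {
    const std::uint32_t u = static_cast<std::uint32_t>(value);
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<unsigned char>((u >> (8 * i)) & 0xFFu);
    }
}

// Hypothetical helper: rebuild the value. Each byte is widened to uint32_t
// before shifting, so promotion to (signed) int cannot get in the way.
std::int32_t GetInt32LE(const unsigned char in[4]) {
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i) {
        u |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    }
    std::int32_t value;
    std::memcpy(&value, &u, sizeof value);  // copy the bit pattern back
    return value;
}

// Usage: a negative value survives the round trip.
inline void SignedRoundTripExample() {
    unsigned char bytes[4];
    PutInt32LE(-2, bytes);
    assert(bytes[0] == 0xFE && bytes[3] == 0xFF);
    assert(GetInt32LE(bytes) == -2);
}
```

Using `unsigned char` for the buffer also avoids depending on whether plain `char` is signed.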

2016-12-12

If only there were some simple API methods to convert from network to host short or network to host long, and host to network short and long. You could even give them short names like htons(), htonl(), ntohs() and ntohl().

Oh well, maybe they will be implemented in Linux and Windows sometime.
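
For anyone who has not met them, a small sketch of those calls in use, assuming a POSIX system (the wrapper names `PutUint32Network` and `GetUint32Network` are made up for illustration). Note that network byte order is big-endian, so this fixes the wire format to the opposite order from the little-endian layout described elsewhere in this thread:

```cpp
#include <arpa/inet.h>  // POSIX; on Windows the same functions come from <winsock2.h>
#include <cassert>
#include <cstdint>
#include <cstring>

// Convert to network byte order, then copy the bytes out. The memory image
// of htonl(x) is big-endian on any host.
void PutUint32Network(std::uint32_t value, unsigned char out[4]) {
    const std::uint32_t wire = htonl(value);
    std::memcpy(out, &wire, sizeof wire);
}

// Copy the bytes in, then convert back to the host's representation.
std::uint32_t GetUint32Network(const unsigned char in[4]) {
    std::uint32_t wire;
    std::memcpy(&wire, in, sizeof wire);
    return ntohl(wire);
}

// Usage: most significant byte first on the wire, same value back.
inline void NetworkOrderExample() {
    unsigned char bytes[4];
    PutUint32Network(0x01020304u, bytes);
    assert(bytes[0] == 0x01 && bytes[3] == 0x04);
    assert(GetUint32Network(bytes) == 0x01020304u);
}
```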

TheCPUWizard · 2016-12-12

"I have mine expose an instantiator object which must be put inside main(), or an object tied to Init/ExitInstance() for example."

Now to find a portable implementation that forces those usage constraints....

Probably easier to find a winning Lotto ticket, then up and leave..

CoyneTheDup · 2016-12-12

But this is the coolest evar! How else would you do things like this?

    byte* tempData = new byte[128];
    byte *lenPtr, *next;  // both pointers; "byte* lenPtr, next;" would make next a plain byte
    ByteWriter* writer = ByteWriter::Instance();
    writer->Initialize(tempData);
    writer->PutInt32(someIntValue);
    lenPtr = writer->getIndex();
    writer->PutInt16(0);  // placeholder for the length
    writer->PutInt32(someOtherIntValue);
    writer->PutByte(someByteArray);
    writer->PutInt32(someOtherIntValue);
    writer->PutInt32(someOtherIntValue);
    next = writer->getIndex();
    // now write the length
    ByteWriter* writer2 = ByteWriter::Instance();  // the singleton: same object as writer
    writer2->Initialize(lenPtr);
    writer2->PutInt16(next - lenPtr);
    // back on track
    writer->Initialize(next);

2016-12-13

When it comes to serializing, the data endianness matters (and must be specified by the format), but the native endianness is often irrelevant: https://commandcenter.blogspot.se/2012/04/byte-order-fallacy.html

2016-12-13

Just make one like I did. Portability is trivial and hopefully the vast majority of your singletons are under your control.

urkerab · 2016-12-13

The only sensible bit numbering scheme is the one where bit n is the bit whose value is 2ⁿ.

2016-12-13

The best and only chapter worth reading in GoF is chapter 1. Everything else, especially Singleton, is worthless.

2016-12-14

Protip: Click the word "endianness" in the first sentence!

Medinoc · 2016-12-15

"The classic singleton pattern is flawed though, as it's subject to the static initialization order fiasco." Huh? Wasn't the Singleton pattern made specifically for working around the static initialization order fiasco? (by causing initialization on the first call rather than statically)
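
For reference, the "initialize on first call" variant is usually written with a function-local static. A sketch with a made-up `Registry` class (hypothetical, not from the article):

```cpp
#include <cassert>
#include <string>

// Hypothetical example class. The instance is a function-local static, so it
// is constructed the first time Instance() runs rather than at some
// unspecified point during static initialization. Since C++11 that
// first-call initialization is also thread-safe.
class Registry {
public:
    static Registry& Instance() {
        static Registry instance;
        return instance;
    }

    Registry(const Registry&) = delete;
    Registry& operator=(const Registry&) = delete;

    void SetName(const std::string& name) { name_ = name; }
    const std::string& Name() const { return name_; }

private:
    Registry() = default;
    std::string name_;
};

// Usage: every call hands back the same object.
inline void RegistryExample() {
    Registry& a = Registry::Instance();
    Registry& b = Registry::Instance();
    assert(&a == &b);
}
```

This covers construction order only; the order in which such objects are destroyed at exit can still cause trouble if one static uses another in its destructor.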

2016-12-18

See the double-checked locking pattern...

2017-01-13

Endianness is not a problem. You need to know the endianness of file formats, so you know how to read them in, but otherwise, you don't need to care about it. If you're writing code that takes advantage of the computer's endianness (whether because you assume it's little endian or you test it to figure out what the endianness is), you are going about it the wrong way.

Taking a blob of memory and just mindlessly writing it out to file is a recipe for disaster. In addition to endianness problems, if you're using structs, you're opening yourself up to failures relating to alignment, the sizes of types (Is an int 2, 4, or 8 bytes? Trick question - it could be any of them!), and padding, so that, not only will your code not be portable across platforms, it likely won't be portable across versions of your compiler! And, yes, that's absolutely a pain in the ass, but that's only because you're going about it in the wrong way entirely!

When doing binary IO, you should treat it like what it really is - an array of bytes. If you need to read a little endian 32-bit integer, you read the 4 bytes separately, and create the integer by shifting each byte by the correct amount and ORing them all together. If you're working with structs, read and write each field individually and explicitly - do not just dump the struct to file via fwrite.
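
A minimal sketch of that advice, assuming a made-up 6-byte record layout (a 16-bit version followed by a 32-bit length, both little-endian). The `Header` struct and the function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory form of the record. Its layout in memory is the
// compiler's business; the file layout is defined by the offsets below.
struct Header {
    std::uint16_t version;
    std::uint32_t length;
};

std::uint16_t ReadU16LE(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

std::uint32_t ReadU32LE(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode each field from its documented offset instead of copying the whole
// struct, so padding and host byte order never enter the picture.
Header ReadHeader(const unsigned char* buf, std::size_t size) {
    assert(size >= 6 && "buffer too small for a header");
    Header h;
    h.version = ReadU16LE(buf);      // bytes 0..1
    h.length  = ReadU32LE(buf + 2);  // bytes 2..5
    return h;
}
```

Writing is the mirror image: emit each field's bytes in the documented order, one field at a time.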

2017-02-06

Dumping the struct via fwrite() is great fun. It lets you find out where the compiler decided to pad the struct for word alignment.

For even more fun, send them across a network connection and let the other end try to figure out where the extra bytes are.
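
If you would rather find the padding without the network round trip, `sizeof` and `offsetof` will tell you. A sketch with a hypothetical two-field `Record` (the constant names are made up for illustration):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record with a 1-byte field ahead of a 4-byte field.
struct Record {
    std::uint8_t  tag;
    std::uint32_t value;
};

// Bytes of payload actually declared in the struct.
constexpr std::size_t kPayloadBytes = sizeof(std::uint8_t) + sizeof(std::uint32_t);

// Bytes the compiler added for alignment. fwrite(&r, sizeof r, 1, f) would
// write these as well, with unspecified contents.
constexpr std::size_t kPaddingBytes = sizeof(Record) - kPayloadBytes;

// Where `value` really starts. That is decided by the ABI, not by the
// declaration, which is exactly the portability problem.
constexpr std::size_t kValueOffset = offsetof(Record, value);

static_assert(sizeof(Record) >= kPayloadBytes, "a struct is never smaller than its fields");
```

On common ABIs `value` is typically aligned to 4 bytes, which leaves 3 bytes of padding after `tag`, but nothing in the language guarantees that.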

This is the Endian