The Daily WTF: Curious Perversions in Information Technology

xtremezone · 2008-04-30 Reply Admin

The real WTF is XML. There, I said it.

2008-04-30 Reply Admin

"The error, for those that care, was that our code expected values in the range 0 to 255, rather than -127 to 128. "

Or if you use one of those newfangled computers that uses two's complement, it would be -128 to +127.
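
For what it's worth, a minimal Python sketch of the conversion the client code would need, assuming the server really is sending two's-complement signed bytes. The function name is made up for illustration:

```python
def to_unsigned(signed_byte: int) -> int:
    """Map a two's-complement signed byte (-128..127) to its unsigned value (0..255)."""
    if not -128 <= signed_byte <= 127:
        raise ValueError("not a signed 8-bit value")
    # Masking with 0xFF keeps the low 8 bits, which is the unsigned reading.
    return signed_byte & 0xFF
```

Non-negative values pass through unchanged; negative ones wrap around, so -1 becomes 255 and -128 becomes 128.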

2008-04-30 Reply Admin

Another method would be to gzip the data and store it inline (just like PDF does)...

2008-04-30 Reply Admin

bold:
Remembering that XML's golden selling point is "human readable", image expression should be more like this: <image scale="(about yay big)"> <background> <color-palette class="sunsetty"><not-too value="orange"><possibly-some-kind-of type="clouds" class="whispy"/></not-too> </background> <overlay origin="(in the middle(left a bit))"> <sort-of a="dog"><with a="tail"><but more-like="ref:horse"/></with></sort-of> </overlay> <paint-brush style="wooden-handled"> <set id="horse" merge="(closing one eye hold up your thumb)" canvas="linen"> <paint style="jackson-pollock"/> <paint style="something nice"/> </set> <paint inspiration="shetland pony"/> </paint-brush> </image>

Hey, it's svg!

2008-04-30 Reply Admin

cthulhu:
There are many standardized color charts that could be utilized that I believe cover all possible real-life colors, eg:
<pixel>Sunset Orange</pixel> <pixel>Light Green</pixel> <pixel>Reddish</pixel> <pixel>Lime Green</pixel> <pixel>Sea Breeze Blue</pixel> <pixel>Woodchip Brown</pixel>

This way the underlying encoding for images could be understood by anyone. This might be a worthwhile project for anyone with sufficient XML skills.

Are you kidding? What could possibly be gained by encoding an image like that?

2008-04-30 Reply Admin

mister:
What? And limit yourself to rgb colors? No transparency? This is how it should be done:
--snip--
So, kind of like x3d does it?

2008-04-30 Reply Admin

And for some generations of javac, most of the silly string + string operations people do have been silently optimized to use StringBuilder. It's still not smart to depend on it happening, but that's probably what Phil meant.

2008-04-30 Reply Admin

But wait! I thought garbage collection made it so that I never had to worry about memory allocation and deallocation! It all just goes away, right? Right?

2008-04-30 Reply Admin

Anon:
The above code creates 100 String objects... the first contains "", the second contains "xx", the third contains "xxxx", etc. Of those, the only String that remains reachable after the loop completes is the String object containing 200 'x' characters. The above loop has created String objects containing 9900 characters, none of which are reachable when the loop ends, all of which have to be garbage collected.
...

I can't believe this had to actually be explained to anyone.

Explained again at http://java.sun.com/developer/JDCTechTips/2002/tt0305.html

Welcome to 2003(ish)!

http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html

Or how about: http://java.sun.com/j2se/1.3/docs/api/java/lang/String.html

To save some clicking:

The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuffer class and its append method. String conversions are implemented through the method toString, defined by Object and inherited by all classes in Java. For additional information on string concatenation and conversion, see Gosling, Joy, and Steele, The Java Language Specification.

Right then. Carry on.

2008-04-30 Reply Admin

The real wtf is that people are proposing XML schemas by posting sample documents when there are multiple unambiguous languages for defining a schema.

2008-04-30 Reply Admin

[...] rather than -127 to 128.

Why not use -42 to 213, that would be much more beautiful!

2008-04-30 Reply Admin

The Real WTF is that they're using XML when there already is a perfectly valid text format for storing images:

<imagedata>
/* XPM */
static char * tdwtf_favico_xpm[] = {
"16 16 5 1",
" 	c None",
".	c #CE0808",
"+	c #D00708",
"@	c #CD0808",
"#	c #CE0809",
"                ",
"  .....         ",
"  ......    ....",
" ........  .....",
".......... .....",
" ... ..... .....",
"    ...... .... ",
"   ......  ...  ",
"  .....    ...  ",
"  +...     ...  ",
"                ",
"           ...  ",
" @#..      .... ",
"  ...     ....  ",
"  ...      ...  ",
"                "};
</imagedata>

And, yes, XPM is a real format. It stores images as valid C source files for use in X11 programs via #include.

2008-04-30 Reply Admin

cthulhu, you're assuming the image is rasterized. What if it is vector art? The file is being transported in its binary form, which would be agnostic to the file format.

I don't get why they didn't just base64 encode the image. Talk about bloat.
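
A rough Python sketch of what that would look like, assuming a hypothetical <data> wrapper element and the standard library's base64 module. It is an illustration only, not what the library in the article does:

```python
import base64

def wrap_base64(raw: bytes) -> str:
    """Wrap raw bytes in a single (hypothetical) <data> element as base64 text."""
    payload = base64.b64encode(raw).decode("ascii")
    return f'<data encoding="base64">{payload}</data>'

def unwrap_base64(element: str) -> bytes:
    """Inverse of wrap_base64, for the exact format produced above."""
    start = element.index(">") + 1   # first character after the opening tag
    end = element.rindex("<")        # start of the closing tag
    return base64.b64decode(element[start:end])
```

Base64 costs 4 characters of text per 3 bytes of input, so the overhead is roughly a third rather than a large multiple.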

2008-04-30 Reply Admin

GF:
Anon:
The above code creates 100 String objects... the first contains "", the second contains "xx", the third contains "xxxx", etc. Of those, the only String that remains reachable after the loop completes is the String object containing 200 'x' characters. The above loop has created String objects containing 9900 characters, none of which are reachable when the loop ends, all of which have to be garbage collected.
...

I can't believe this had to actually be explained to anyone.

Explained again at http://java.sun.com/developer/JDCTechTips/2002/tt0305.html

Welcome to 2003(ish)!

http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html

Or how about: http://java.sun.com/j2se/1.3/docs/api/java/lang/String.html

To save some clicking:

The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuffer class and its append method. String conversions are implemented through the method toString, defined by Object and inherited by all classes in Java. For additional information on string concatenation and conversion, see Gosling, Joy, and Steele, The Java Language Specification.

Right then. Carry on.

Except now it's allocating a bunch of unreachable StringBuffer instances on top of the unreachable String instances, since

String str = "";

for(int i = 0; i < n; i++)
{
  str += "blah";
}

is really converted to

String str = "";

for(int i = 0; i < n; i++)
{
  str = (new StringBuffer(str)).append("blah").toString();
}

Yes, huge savings...especially with all that excess wasted space in the private char[] in StringBuffer.
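
To put a number on the garbage, here is a small sketch in Python rather than Java. It only models the lengths of the intermediate strings and ignores the StringBuffer copies entirely; the function name is hypothetical:

```python
def discarded_chars(iterations: int, piece: str) -> int:
    """Total characters held by intermediate strings that become garbage."""
    total = 0
    current = ""
    for _ in range(iterations):
        # The old value of `current` is unreachable once the += completes.
        total += len(current)
        current += piece
    return total
```

With 100 iterations of appending "xx" this gives 9900 discarded characters, matching Anon's figure, and the total grows quadratically with the iteration count.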

2008-04-30 Reply Admin

Edward Royce:
Hmmmm.
"<bytes>37</bytes>"

Are you telling me that your library is wrapping a "<bytes>" + "</bytes>" tag pair around each and every single byte of image data??

Doesn't that blow up image sizes by 16? A 10k would become 160k. Isn't that a little ridiculous?

Actually, including whitespace it expands byte sizes by about a factor of 20 (ranging from 19x (for 0) to 22x (for 0127/0xFF)), plus the header/footer overhead.
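
A quick sketch of that arithmetic in Python. The 3 characters of whitespace per element are an assumption chosen to illustrate the estimate (for example a newline plus some indentation), not something taken from the actual payload:

```python
def chars_per_byte(value: int, whitespace: int = 3) -> int:
    """Characters of XML spent on one byte of image data."""
    element = f"<bytes>{value}</bytes>"
    return len(element) + whitespace
```

A one-character value such as 0 costs 19 characters, and a four-character value such as -127 costs 22.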

2008-04-30 Reply Admin

This is why I hate XML.

For every good example of XML use, there are 10000 bad examples.

2008-04-30 Reply Admin

Greg:
cthulhu:
There are many standardized color charts that could be utilized that I believe cover all possible real-life colors, eg:
<pixel>Sunset Orange</pixel> <pixel>Light Green</pixel> <pixel>Reddish</pixel> <pixel>Lime Green</pixel> <pixel>Sea Breeze Blue</pixel> <pixel>Woodchip Brown</pixel>

This way the underlying encoding for images could be understood by anyone. This might be a worthwhile project for anyone with sufficient XML skills.

Are you kidding? What could possibly be gained by encoding an image like that?

Hmmmm.

Look at it this way. If you're using descriptive language to name each and every single permutation of RGB then there are going to be RGB combinations for which there are no standardized names.

So you can immortalize ex-girlfriends and other annoying people all you like.

Example: this color's name is now "Lisa's Pimply Ass".

2008-04-30 Reply Admin

compiler-guy:
But wait! I thought garbage collection made it so that I never had to worry about memory allocation and deallocation! It all just goes away, right? Right?

Oh yeah! Especially if you convert a 4mb image to XML by string concatenation.

LOL. I'd wait to run this specific piece of software until a coworker is trying to compile something really big.

"WTF!? Where did all the resources go ..."

2008-04-30 Reply Admin

Robin Goodfellow:
Edward Royce:
Hmmmm.
"<bytes>37</bytes>"

Are you telling me that your library is wrapping a "<bytes>" + "</bytes>" tag pair around each and every single byte of image data??

Doesn't that blow up image sizes by 16? A 10k would become 160k. Isn't that a little ridiculous?

Actually, including whitespace it expands byte sizes by about a factor of 20 (ranging from 19x (for 0) to 22x (for 0127/0xFF)), plus the header/footer overhead.

Makes it even more attractive!

Why it's not an integral part of the HTML 4.0 spec, I have no idea.

2008-04-30 Reply Admin

Edward Royce:
compiler-guy:
But wait! I thought garbage collection made it so that I never had to worry about memory allocation and deallocation! It all just goes away, right? Right?

Oh yeah! Especially if you convert a 4mb image to XML by string concatenation.

LOL. I'd wait to run this specific piece of software until a coworker is trying to compile something really big.

"WTF!? Where did all the resources go ..."

Duh, it's a hardware problem.

2008-04-30 Reply Admin

That is almost as bad as ASN.1/BER

dtech · 2008-04-30 Reply Admin

Brillant! I'm going to use this in every project I do from now on. Look at all the features!

Human readable: any text editor can show you what the value of each byte is!
Flexible: ever feel the need to add more bytes? In this format you can!
Speed: It is clear that a nice semantic structure will make the reading of the bytes much more faster. Now the application won't have to read every bit, combine 8 to a byte and number each byte! It can just read the whole file and use the semantics to instantly access each byte!
Future proof: The same syntax can easily be adjusted to be future proof! Think of all of the different types of bytes we could use in the future! You don't need to redesign your applications just because some stupid "quantum computer" wants to use his fancy "imaginary bytes"

2008-04-30 Reply Admin

Don't worry, they're adding it to the HTML 5.0 spec.

Well, not quite as bad, but they ARE requiring that browsers allow you to embed images via "data:" URLs. (Worse, browsers must be capable of generating such URLs via JavaScript.) By default, it requires bytes to be escaped in URL format, like:

data:application/octet-stream,%01%02%03%04

However, you can also use Base64-encoding, by adding something like:

data:application/octet-stream;base64,AQIDBA==

Such things are already supported by Firefox, Opera, and Safari. Really.
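
For the curious, a minimal Python sketch of building both flavours of "data:" URL with the standard library. The function name and its parameters are invented for the example:

```python
import base64
from urllib.parse import quote_from_bytes

def data_url(raw: bytes, mime: str = "application/octet-stream",
             use_base64: bool = False) -> str:
    """Build a data: URL from raw bytes, percent-encoded or base64-encoded."""
    if use_base64:
        payload = base64.b64encode(raw).decode("ascii")
        return f"data:{mime};base64,{payload}"
    # safe="" forces every byte outside the unreserved set to be escaped.
    return f"data:{mime},{quote_from_bytes(raw, safe='')}"
```

Percent-encoding can cost up to three characters per byte, while base64 stays at four characters per three bytes, so base64 is usually the better choice for binary data.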

andrewbadera · 2008-04-30 Reply Admin

TRWTF here is the obvious lack of experience that almost anyone in this forum apparently has when it comes to XML serialization and web services. Not to mention, transfer of binary data.

lolwtf · 2008-04-30 Reply Admin

<image>
  #######
 #       #
#  #   #  #
#    #    #
# #     # #
#  #####  #
 #       #
  #######
</image>

I don't know where I'm going with this.

2008-04-30 Reply Admin

Edward Royce:
Hmmmm.
"<bytes>37</bytes>"

Are you telling me that your library is wrapping a "<bytes>" + "</bytes>" tag pair around each and every single byte of image data??

Doesn't that blow up image sizes by 16? A 10k would become 160k. Isn't that a little ridiculous?

They did say it was enterprisey.

2008-04-30 Reply Admin

Physics Phil:
... using + for String concatenation in Java is just a shorthand for creating a new StringBuilder ...

Well poo on your java because .NET "string1" + "string2" is the same thing as

string string3 = string.Concat("string1", "string2");

Might not be as bad as string string3 = string.Format("{0}{1}", "string1", "string2"); which I bet someone, somewhere has done exactly that. (Note: this would use a StringBuilder, and three string objects!!)

If you used a StringBuilder you'd only have two objects (though one more complex than the other) instead of three string objects, which is "slightly" better, since the fewer objects you have, the better off you are, as the GC eats your waste of space faster :D

2008-04-30 Reply Admin

I think this is actually even more sinister than anyone here has guessed. I don't think this is meant to be a bitmap format translated to XML. As others have pointed out, it doesn't contain the information necessary to render an image (dimensions, color channels). I think each <bytes> element is actually a byte from the image file. The serialization code simply opens the file as a character stream and reads from beginning to end, writing each byte out to XML.

Pure, unmitigated evil.

2008-04-30 Reply Admin

Edward Royce:
Why it's not an integral part of the HTML 4.0 spec, I have no idea.

I've heard it's part of the OOXML standard. Check page 3000-something.

2008-04-30 Reply Admin

You could easily define the actual values of each color earlier in the image, also:

<colors>
   <color name="Sunset Orange">
      <red>FF</red>
      <green>8A</green>
      <blue>00</blue>
   </color>
</colors>
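
Reading such a table back is short enough. A sketch in Python using xml.etree.ElementTree, assuming the element names above and hex channel values; the function name is made up:

```python
import xml.etree.ElementTree as ET

SAMPLE = """<colors>
   <color name="Sunset Orange">
      <red>FF</red>
      <green>8A</green>
      <blue>00</blue>
   </color>
</colors>"""

def load_palette(xml_text: str) -> dict:
    """Map each color name to an (r, g, b) tuple of ints parsed from hex."""
    palette = {}
    for color in ET.fromstring(xml_text).findall("color"):
        rgb = tuple(int(color.findtext(channel), 16)
                    for channel in ("red", "green", "blue"))
        palette[color.get("name")] = rgb
    return palette
```

Each <pixel> would then be a lookup into this dictionary by name.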

2008-04-30 Reply Admin

schemas aren't funny

crystal mephistopheles · 2008-04-30 Reply Admin

The error, for those that care, was that our code expected values in the range 0 to 255, rather than -127 to 128.

Well thank heavens for the XML then. If those bytes had been in binary, he never would have noticed that they were signed!

2008-04-30 Reply Admin

Don't mean to ruin the fun... but, just base-64 encode the image thusly: <data type="bmp"> xsfgsdjnsdasdfvsavhbasdovuinsadv sdfgdsfsdfsdjndsifndsfisdufnsfs asdfiojndfdisufnsdufdsuifnsdfsdufn </data>

(btw: don't try and decode that. I'm far too lazy to encode something clever into this post.)

2008-04-30 Reply Admin

FTW!!!

The sad thing is I've seen this example back in my Uni days :) The lecturer for 'Internet Programming' was truly clueless

2008-04-30 Reply Admin

draeath:
Hopefully the xml is gzipped on-the-fly. We can only hope.

No it wasn't. But it did come with terribly useful HTTP headers, such as Content-Length and Content-Type. You'd think those would be useful for something.

2008-04-30 Reply Admin

Deron:
"The error, for those that care, was that our code expected values in the range 0 to 255, rather than -127 to 128. "
Or if you use one of those newfangled computers that uses two's complement, it would be -128 to +127.

Hey - I was the one expecting it in the range 0-255 and preferably not in XML. Don't give me a hard time for what was being returned from the server.

2008-04-30 Reply Admin

That looks just like the first draft of Microsoft OOXML!

2008-04-30 Reply Admin

Bob N Freely:
I think this is actually even more sinister than anyone here has guessed. I don't think this is meant to be a bitmap format translated to XML. As others have pointed out, it doesn't contain the information necessary to render an image (dimensions, color channels). I think each <bytes> element is actually a byte from the image file. The serialization code simply opens the file as a character stream and reads from beginning to end, writing each byte out to XML.
Pure, unmitigated evil.

Yep. It was a PNG, but it was a whole PNG, not image data. Yep. The actual file.
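
A sketch of how a client could reassemble the file from such a payload, in Python. The <image> wrapper element in the test data is an assumption, since the article only shows the <bytes> elements, and the function names are invented:

```python
import xml.etree.ElementTree as ET

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def decode_payload(xml_text: str) -> bytes:
    """Rebuild the original file from one <bytes> element per byte."""
    root = ET.fromstring(xml_text)
    # Values may arrive signed, so mask each one down to 0..255.
    return bytes(int(element.text) & 0xFF for element in root.iter("bytes"))

def looks_like_png(data: bytes) -> bool:
    """Check for the 8-byte PNG file signature."""
    return data.startswith(PNG_SIGNATURE)
```

Because the payload is the whole file rather than pixel data, the first eight decoded bytes should be the PNG signature.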

2008-04-30 Reply Admin

xmlicious:
XML is like violence: if it doesn't solve your problem, you're not using enough of it.

this is the best thing ive heard all day.

Eternal Density · 2008-04-30 Reply Admin

dkf:
<bytes>
    <bit number="0" lsb="true">1</bit>
    <bit number="2">1</bit>
    <bit number="5">1</bit>
    <bit default="true">0</bit>
</bytes>
By doing this, you eliminate the interoperability problem of just how to represent bytes and instead have something that is easy to interpret unambiguously every time!!

I'd love to see a calculator program built on this. Even seeing an adder that takes two of these bytes and adds them would be interesting to see done...
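
Here is a rough go at that adder in Python. It assumes bit number 0 is the least significant bit and that the default="true" element supplies the value of every bit not listed explicitly; both are my reading of dkf's sample, not a spec:

```python
import xml.etree.ElementTree as ET

SAMPLE = """<bytes>
    <bit number="0" lsb="true">1</bit>
    <bit number="2">1</bit>
    <bit number="5">1</bit>
    <bit default="true">0</bit>
</bytes>"""

def parse_byte(xml_text: str) -> int:
    """Turn one bit-per-element <bytes> document into an int in 0..255."""
    default = 0
    explicit = {}
    for bit in ET.fromstring(xml_text).findall("bit"):
        value = int(bit.text)
        if bit.get("default") == "true":
            default = value          # applies to every unlisted bit
        else:
            explicit[int(bit.get("number"))] = value
    return sum(explicit.get(i, default) << i for i in range(8))

def add_bytes(a_xml: str, b_xml: str) -> int:
    """Add two XML-encoded bytes, wrapping around at 8 bits."""
    return (parse_byte(a_xml) + parse_byte(b_xml)) & 0xFF
```

dkf's sample sets bits 0, 2 and 5, which decodes to 37, so adding it to itself gives 74.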

2008-04-30 Reply Admin

<?xml version="1.0" encoding="utf-8"?>
<DiskImages>
	<Disk Brand="Maxtor" Size="250 GB">
		<Partition Type="Primary" FirstSector="1" LastSector="999999999" Label="Personal Files" SectorSize="4 KB">
			<Sector>
				<Byte>
					<Bit>True</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
				</Byte>
				<Byte>
					<Bit>True</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
				</Byte>
				<Byte>
					<Bit>True</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
				</Byte>
				<Byte>
					<Bit>True</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
				</Byte>
				<Byte>
					<Bit>True</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
					<Bit>False</Bit>
					<Bit>True</Bit>
					<Bit>False</Bit>
				</Byte>
			</Sector>
		</Partition>
	</Disk>
</DiskImages>

With tabs as 4 spaces, Windows newlines, and Unicode, such a disk image would be 6,000 times bigger than the actual disk. 1 GB Disk => 6 TB disk image.
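
A back-of-the-envelope check in Python under those assumptions: tabs expanded to 4 spaces, CRLF newlines, and "Unicode" taken to mean 2 bytes per character, which is my reading rather than something spelled out here. It counts only the <Byte> and <Bit> lines and ignores the Sector, Partition and Disk wrappers:

```python
TAB = " " * 4            # tabs rendered as 4 spaces
NEWLINE = "\r\n"         # Windows line endings
BYTES_PER_CHAR = 2       # assumption: "Unicode" means a 2-byte encoding

SAMPLE_BITS = [True, True, False, True, False, False, True, False]

def xml_bytes_per_disk_byte(bits) -> int:
    """Size in bytes of the XML needed to describe one byte of disk."""
    lines = [TAB * 4 + "<Byte>"]
    lines += [TAB * 5 + f"<Bit>{bit}</Bit>" for bit in bits]
    lines.append(TAB * 4 + "</Byte>")
    text = "".join(line + NEWLINE for line in lines)
    return len(text) * BYTES_PER_CHAR
```

For the bit pattern in the sample this comes to 698 bytes of XML per byte of disk, which is about 5,600 bits, so the final multiplier depends heavily on whether you count in bits or bytes.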

I am not making fun of DriveImage XML, which uses XML to match filenames to chunks within the binary disk image.

The XML should appear indented

2008-04-30 Reply Admin

Anonymouse:
FTW!!!
The sad thing is I've seen this example back in my Uni days :) The lecturer for 'Internet Programming' was truly clueless

University of Auckland?

2008-05-01 Reply Admin

I think you mean

2008-05-01 Reply Admin

since each element is a byte, that means it's binary right? like binary data?

better to base64 encode it first then, no? of course then you need an extra attribute on each element. It's a little wordier, but best to be clear:

DOA · 2008-05-01 Reply Admin

xmlicious:
XML is like violence: if it doesn't solve your problem, you're not using enough of it.

hehehehe... I want a t-shirt with this logo

donniel · 2008-05-01 Reply Admin

Matt:
Bob N Freely:
... I don't think this is meant to be a bitmap format translated to XML. Pure, unmitigated evil.

Yep. It was a PNG, but it was a whole PNG, not image data. Yep. The actual file.

Hi, not being sarcastic, got a question:

Why is this worse? Is it because the MIME headers are also unnecessarily encoded, instead of just the actual 'data'?

2008-05-01 Reply Admin

dkf:
Edward Royce:
Are you telling me that your library is wrapping a "<bytes>" + "</bytes>" tag pair around each and every single *byte* of image data??
There is one way to fix this problem. Convert each of those to something like this (with namespaces omitted for brevity):
<bytes>
    <bit number="0" lsb="true">1</bit>
    <bit number="2">1</bit>
    <bit number="5">1</bit>
    <bit default="true">0</bit>
</bytes>
By doing this, you eliminate the interoperability problem of just how to represent bytes and instead have something that is easy to interpret unambiguously every time!!

HOLY CRAP WHAT ARE YOU SAYING????

You're going to use XML to describe each bit??? Dude, for one byte you'd eat up loads of XML. That is unreadable, will be problematic to parse and basically increases the size of the file 8 fold (worst case - not to mention you already added attributes).

2008-05-01 Reply Admin

phill:
That is unreadable, will be problematic to parse and basically increases the size of the file 8 fold (worst case - not to mention you already added attributes).

Apart from your total unfamiliarity with the concept of 'sarcasm', you also fail on mathematics. It's more like 150-fold for average case.

2008-05-01 Reply Admin

The real WTF here is the use of the plural "bytes" to represent a single byte. It should obviously be: <bytes> <byte>12</byte> <byte>193</byte> : </bytes>

There. much better.

Oh, XML