The Daily WTF: Curious Perversions in Information Technology

DES · 2008-10-01 Reply Admin

HTML is not XML, it's SGML. XML is (mostly) a subset of SGML. The XML version of HTML, which practically no one uses, is called XHTML.

2008-10-01 Reply Admin

TopCod3r:
Look. <snip>
But the bottom line is any decent developer should be able to write a specialized XML library that is easier to use and out-performs the generic XML library included in .NET.

If I had to write a "specialized XML library", I would seriously doubt the XML requirement. I'd opt for a simpler (and more lightweight!) CSV or JSON scheme instead. Either you use standard XML (...) or not.

DES · 2008-10-01 Reply Admin

Both are allowed in XML. Placing a space before the closing slash is commonly done because some older (pre-XML) HTML parsers understand <foo /> but not <foo/>. As for upper-case characters: XML is case-sensitive, but that doesn't mean it rejects upper-case characters in element or attribute names.
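For anyone who wants to check this rather than take it on faith, here is a minimal sketch using Python's standard-library parser. The `parses` helper is just a name made up for this illustration; the `Carrier` element is the one from the article.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Both spellings of the empty-element tag are accepted and yield the same tag name.
with_space = ET.fromstring("<Carrier />")
without_space = ET.fromstring("<Carrier/>")

# Upper-case names are fine; what case sensitivity forbids is a mismatch
# between the start tag and the end tag.
upper_ok = parses("<CARRIER></CARRIER>")
mismatch_ok = parses("<Carrier></carrier>")
```

A conforming parser should accept the first three documents and reject only the last one.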

2008-10-01 Reply Admin

Nerf Herder:
Ledward:
In Soviet Russia, tags close you!

I don't know who votes on featured comments, but if this isn't blue by the end of the day there is a conspiracy.

./ called, they want their joke back.

2008-10-01 Reply Admin

RiF:
Nerf Herder:
Ledward:
In Soviet Russia, tags close you!

I don't know who votes on featured comments, but if this isn't blue by the end of the day there is a conspiracy.
/. called, they want their joke back.

What typo? ;-)

2008-10-01 Reply Admin

Sorry, but in XML an empty-element tag is clearly equivalent to a start tag followed immediately by an end tag, with no content in between. That carries through to XHTML. (I just checked the standards documents.)

Unfortunately, existing browsers tend to be broken on this behaviour.
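One way to see the equivalence from the XML side is to canonicalize both spellings and compare the results. This is only a sketch, assuming Python 3.8 or later (where `xml.etree.ElementTree.canonicalize` is available); it says nothing about what a browser's HTML parser does.

```python
import xml.etree.ElementTree as ET

# Canonical XML writes every element as a start tag / end tag pair,
# so the three spellings should collapse to the same string.
minimized = ET.canonicalize("<script/>")
minimized_with_space = ET.canonicalize("<script />")
expanded = ET.canonicalize("<script></script>")

all_equivalent = minimized == minimized_with_space == expanded
```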

2008-10-01 Reply Admin

nikki9696:
I know a certain major bank whose "web service", AKA an email address that you send XML to, doesn't like "malformed" XML either. It also doesn't like when you switch the element order. That is
<wtf>foo</wtf> <hi>there</hi>

was not treated the same as <hi>there</hi> <wtf>foo</wtf>

It's a sad thing.

???

I hope the above was a clever troll...

(In case there are any newbie XML folks here who may not pick up on this: in XML order does matter!)

2008-10-01 Reply Admin

eXtreme Markup Languages aren't for wusses.

I don't think the XML will last more than a season or two.

Kermos · 2008-10-01 Reply Admin

Nerf Herder:
Kermos:
Dood:
Michael responds to let them no that their software hasn't changed in a month

no?

You know what is sad? I've gotten so used to people writing "no" for "know" that I don't even notice anymore...

It's the same with people writing "you" instead of "your". I realize it's a different case because they are not homophones, but so many people for whatever reason cannot type "your" correctly.

I cannot count how many times a week I read something like: "Don't forget to submit you timesheets by Friday"

Drives me nuts

You know, that I can almost still dismiss as a typo. At least they didn't write "Don't forget to submit ur timesheets by Friday." I absolutely hate it when I see something like that. My boss does it all the time, as well as instant message me with "r u there?" to which I generally just give him an answer in Japanese.

"their", "they're", "there" is another favorite of mine. I mean for crying out loud, English is my second language and I can get it right! Why can't a lot of native speakers I meet?

2008-10-01 Reply Admin

(In case there are any newbie XML folks here who may not pick up on this: in XML order *does* matter!)

(Oops, that should say element order does matter.)
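To make the distinction concrete: attribute order carries no meaning, while element order is preserved and visible to the application. A small sketch with Python's standard-library parser follows; the `msg` wrapper element and its attributes are invented for the example, the child elements are the ones quoted above.

```python
import xml.etree.ElementTree as ET

a = ET.fromstring('<msg id="1" lang="en"><wtf>foo</wtf><hi>there</hi></msg>')
b = ET.fromstring('<msg lang="en" id="1"><hi>there</hi><wtf>foo</wtf></msg>')

# Attributes are an unordered set of name/value pairs.
same_attributes = a.attrib == b.attrib

# Child elements come back in document order, so the two lists differ.
same_child_order = [child.tag for child in a] == [child.tag for child in b]
```

Whether the receiving application cares about the difference is up to its schema, but the parser does report it.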

TopCod3r · 2008-10-01 Reply Admin

RiF:
RiF:
Nerf Herder:
Ledward:
In Soviet Russia, tags close you!

I don't know who votes on featured comments, but if this isn't blue by the end of the day there is a conspiracy.
/. called, they want their joke back.
What typo? ;-)

I don't know what you are trying to say, but shouldn't it be \. ? You got the slash the wrong way. Just thought I would try to help you out.

DeLos · 2008-10-01 Reply Admin

Nerf Herder:
Ledward:
In Soviet Russia, tags close you!

I don't know who votes on featured comments, but if this isn't blue by the end of the day there is a conspiracy.

But then we would have to make it blue in every other wtf since there is one for every post.

2008-10-01 Reply Admin

While it's funny that the application rejected valid XML, that's only half the story. Clearly this is a specialized application. Just because the XML is valid doesn't mean that the data is valid. If it didn't barf out on parsing the XML, it would have barfed on the carrier and service being left undefined.

A little better error checking would have been nice, but the email DID say "Invalid data near <Carrier". How much better would it have been to have an email that said "Error: Invalid Request. Carrier not defined!"?
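A sketch of what that separation could look like: let a real parser decide whether the XML is well-formed, then validate the data and name the missing field. The `Request` root element and the `validate_request` helper are assumptions made up for this illustration; only `Carrier` and the idea of a service field come from the article.

```python
import xml.etree.ElementTree as ET

REQUIRED_FIELDS = ("Carrier", "Service")  # assumed field names


def validate_request(xml_text):
    """Return a list of human-readable errors; an empty list means the request is usable."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Only genuinely malformed XML ends up here.
        return [f"Malformed XML: {exc}"]

    errors = []
    for tag in REQUIRED_FIELDS:
        node = root.find(tag)
        # An absent element and an empty one are both treated as "not defined".
        if node is None or not (node.text or "").strip():
            errors.append(f"Invalid request: {tag} not defined")
    return errors
```

With this, `<Carrier />` produces "Invalid request: Carrier not defined" instead of a complaint about the markup.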

El Duderino · 2008-10-01 Reply Admin

I accidentally your XML.

2008-10-01 Reply Admin

Wouldn't that be backslashdot

fruey · 2008-10-01 Reply Admin

Kermos:
You know, that I can almost still dismiss as a typo. At least they didn't write "Don't forget to submit ur timesheets by Friday." I absolutely hate it when I see something like that. My boss does it all the time, as well as instant message me with "r u there?" to which I generally just give him an answer in Japanese.
"their", "they're", "there" is another favorite of mine. I mean for crying out loud, English is my second language and I can get it right! Why can't a lot of native speakers I meet?

Native speakers are much more likely to make errors with words that sound the same. Non-native speakers are likely to have learned the words via translation from their own language.

Example in French:

their = leur
they're = ils sont
there = là

If they translate back, they know which is which without working it out.

Example of an English person speaking French as a foreign language:

A common French mistake is confusing the phonetically identical "noté" and "noter":

noté = wrote down
noter = to write down

An Englishman would translate back, and realise which is which by the same token.

2008-10-01 Reply Admin

Kermos:
Dood:
Michael responds to let them no that their software hasn't changed in a month

no?

You know what is sad? I've gotten so used to people writing "no" for "know" that I don't even notice anymore...

Wow. I didn't even catch it until you pointed it out.

2008-10-01 Reply Admin

I can think of a valid reason not to use an XML parser. A malicious DTD can cause a DoS attack. It is shocking how many web services and how much software are vulnerable. Any code that parses XML and attempts to process the DTD can be bogged down with the million laughs exploit. For example, SVG is an XML image format and the Opera browser is vulnerable to malicious SVG images.
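One common mitigation is to keep using a real parser but refuse any document that declares entities in its DTD. Below is a minimal sketch of that idea with Python's standard-library expat binding; the exception class and function name are made up for the example, and a production system may need stricter rules (for instance, rejecting DTDs entirely).

```python
import xml.parsers.expat


class EntityDeclarationForbidden(ValueError):
    """Raised when a document declares an entity in its DTD."""


def parse_without_entities(xml_bytes):
    """Parse the document, aborting as soon as an entity declaration is seen."""
    parser = xml.parsers.expat.ParserCreate()

    def reject_entity(name, is_parameter_entity, value, base,
                      system_id, public_id, notation_name):
        # Abort before any nested entity expansion can take place.
        raise EntityDeclarationForbidden(f"entity declaration not allowed: {name}")

    parser.EntityDeclHandler = reject_entity
    parser.Parse(xml_bytes, True)
```

Documents without entity declarations parse as usual; anything carrying the declarations the exploit relies on is rejected up front.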

2008-10-01 Reply Admin

Regarding the <script/> vs. <script></script> issue, yes, they are different, but only because IE has a bug with its DOM parser. (ugh!)

It affects all versions of IE:

http://webbugtrack.blogspot.com/2007/08/bug-153-self-closing-script-tag-issues.html

2008-10-01 Reply Admin

I've seen worse. One day, I got an upset email from one of our partners claiming that we were not compliant with our own XML specification. We sent something like:

<orderid>12345</orderid>
<fname>John</fname>
<lname>Doe</lname>

Turns out, the tag order was different in our specification so they were expecting this:

<orderid>12345</orderid>
<lname>Doe</lname>
<fname>John</fname>

Thinking about what the XML parser must look like on their side scares me ...

2008-10-01 Reply Admin

WTF?:
nikki9696:
I know a certain major bank whose "web service", AKA an email address that you send XML to, doesn't like "malformed" XML either. It also doesn't like when you switch the element order. That is
<wtf>foo</wtf> <hi>there</hi>

was not treated the same as <hi>there</hi> <wtf>foo</wtf>

It's a sad thing.

???

I hope the above was a clever troll...

(In case there are any newbie XML folks here who may not pick up on this: in XML order does matter!)

WTF?:
(In case there are any newbie XML folks here who may not pick up on this: in XML order *does* matter!)

(Oops, that should say element order does matter.)

There's some sort of circular reference created by this, can you spot it? rolls eyes

2008-10-01 Reply Admin

SpasticWeasel:
Wouldn't that be backslashdot

That's the place where everyone is evil and wears a goatee.

2008-10-01 Reply Admin

sig:
I've seen worse. One day, I got an upset email from one of our partners claiming that we were not compliant with our own XML specification. We sent something like:
<orderid>12345</orderid>
<fname>John</fname>
<lname>Doe</lname>
Turns out, the tag order was different in our specification so they were expecting this:
<orderid>12345</orderid>
<lname>Doe</lname>
<fname>John</fname>
Thinking about what the XML parser must look like on their side scares me ...

Your partners were right and you were wrong: See http://www.ibm.com/developerworks/xml/library/x-eleord.html and several dozen more hits on http://www.google.com/search?hl=en&q=xml+element+order&btnG=Google+Search&aq=f&oq=

2008-10-01 Reply Admin

funny. Google base did the exact same shit.

2008-10-01 Reply Admin

For WSDL, I thought that you could make the order for elements matter? I don't use webservices much, but I could swear that there is a way to force elements to be in a certain order...

2008-10-01 Reply Admin

Damn. I almost responded to this as if it was a serious post....

2008-10-01 Reply Admin

umbrage:
Damn. I almost responded to this as if it was a serious post....

Crap, it was supposed to be a quote on TopCod3r's post....

2008-10-01 Reply Admin

DES:
HTML is not XML, it's SGML. XML is (mostly) a subset of SGML. The XML version of HTML, which practically no one uses, is called XHTML.

from this site's source:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

2008-10-01 Reply Admin

RTFM:
sig:
I've seen worse. One day, I got an upset email from one of our partners claiming that we were not compliant with our own XML specification. We sent something like:
<orderid>12345</orderid>
<fname>John</fname>
<lname>Doe</lname>
Turns out, the tag order was different in our specification so they were expecting this:
<orderid>12345</orderid>
<lname>Doe</lname>
<fname>John</fname>
Thinking about what the XML parser must look like on their side scares me ...
Your partners were right and you were wrong: See http://www.ibm.com/developerworks/xml/library/x-eleord.html and several dozen more hits on http://www.google.com/search?hl=en&q=xml+element+order&btnG=Google+Search&aq=f&oq=

They may be right or wrong. The order may or may not be important. That's why it's important to write a DTD for your project. If there is a disagreement, it is then still possible to say who's right and who isn't.

If your specification doesn't specify the order or whatever, it's your fault. You should have written it in the language made for that purpose instead of natural language.

2008-10-01 Reply Admin

There seems to be a lot of confusion about this...

The order of elements IS significant in XML, unless the DTD/schema says otherwise (and it usually doesn't)

In XSD, when you specify child elements with xsd:sequence, their ordering matters. Putting them in a different order is a violation of the schema.

There's a very good reason for this - it allows streaming parsers to know what element to expect next, instead of needing to bounce all over the current parent to find what you're looking for.
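A rough sketch of what that looks like for a streaming consumer, in Python. The `order` wrapper element and the function name are made up for illustration; the child names come from the example earlier in the thread. This is a hand-rolled check, not real XSD validation (Python's standard library does not ship an XSD validator).

```python
import io
import xml.etree.ElementTree as ET


def check_sequence_streaming(xml_bytes, expected):
    """Return True if the root's children arrive exactly in `expected` order."""
    depth = 0
    position = 0
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            depth += 1
            if depth == 2:  # a direct child of the root element
                if position >= len(expected) or elem.tag != expected[position]:
                    return False  # fail as soon as something is out of order
                position += 1
        else:
            depth -= 1
    return position == len(expected)


in_order = b"<order><orderid>12345</orderid><lname>Doe</lname><fname>John</fname></order>"
swapped = b"<order><orderid>12345</orderid><fname>John</fname><lname>Doe</lname></order>"
```

Because the expected next element is always known, the check can reject the swapped document without reading it to the end.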

2008-10-01 Reply Admin

ethan:
XHTML is an attempt to produce valid XML which can nonetheless make it through existing web browsers' HTML parsers, which were not designed to parse XML. So yes, when you're producing XML which has to make it through a parser which wasn't designed to parse it, it turns out that you can't just use any old XML you like.

That probably also explains why you have to comment out CDATA sections inside script tags, or you get javascript errors.

<script type="text/javascript">
//<![CDATA[
alert("foo");
//]]>
</script>

2008-10-01 Reply Admin

I like your "standard" English

2008-10-01 Reply Admin

Sorry if sound a bit asshatty, I think you mispelled "there" in your sentence.

2008-10-01 Reply Admin

You must be referring to §C.3, which says not to use the minimized form for non-EMPTY elements.

However, you apparently missed the first line of Appendix C, "this appendix is informative." Set in boldface, even.

So that's informative, not normative. You write <script></script> not because the standard requires it (it doesn't; it's XML), but because you want your pages to work with browsers that try to parse your XML as HTML. Same reason you write "<br />" instead of "<br/>" (that's §C.2, BTW).

Have a nice day, and thanks for playing.

2008-10-01 Reply Admin

RTFM:
Your partners were right and you were wrong: See http://www.ibm.com/developerworks/xml/library/x-eleord.html and several dozen more hits on http://www.google.com/search?hl=en&q=xml+element+order&btnG=Google+Search&aq=f&oq=

Well, our definition looked something like:

<xsd:choice minOccurs="0" maxOccurs="1">
<xsd:element name="orderid" type="tns:_orderid"/>
<xsd:element name="fname" type="xsd:string"/>
<xsd:element name="lname" type="xsd:string"/>
</xsd:choice>

so I think (hope) it's valid. But I see what you mean. We usually accept just about any element order (on same level) so I thought it's the normal behaviour. But I guess that's not very common.

At first, I actually thought they've built a custom parser expecting a certain keyword on a certain line number or something (they were also complaining about indentation) -- but I'm probably wrong here. Thanks for the info!

2008-10-01 Reply Admin

Bean:
There seems to be a lot of confusion about this...
The order of elements IS significant in XML, unless the DTD/schema says otherwise (and it usually doesn't)

In XSD, when you specify child elements with xsd:sequence, their ordering matters. Putting them in a different order is a violation of the schema.

There's a very good reason for this - it allows streaming parsers to know what element to expect next, instead of needing to bounce all over the current parent to find what you're looking for.

True. Unless you use xsd:all instead of xsd:sequence. When it comes to object serialization (for example, in a web service) rather than xpath/searching, I'd much prefer for order not to matter, but as luck would have it, most of the WS frameworks I've seen use the xsd:sequence element for collections. I know the new serializer in WCF, the DataContractSerializer, doesn't even support xsd:all at all, while the old .NET serializer allowed you to force it. Annoying.
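For comparison, the order-insensitive behaviour of xsd:all can be approximated in a few lines of Python. This is only a sketch of the default case (every listed element exactly once, in any order), not a schema validator; the `order` wrapper and the function name are assumptions for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def matches_all(xml_text, required):
    """True if the root has each required child exactly once, in any order."""
    found = Counter(child.tag for child in ET.fromstring(xml_text))
    return found == Counter(required)


fields = ["orderid", "fname", "lname"]
doc_a = "<order><orderid>12345</orderid><fname>John</fname><lname>Doe</lname></order>"
doc_b = "<order><orderid>12345</orderid><lname>Doe</lname><fname>John</fname></order>"
```

Both documents pass here, whereas a sequence-style check would accept only one of them.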

2008-10-01 Reply Admin

levi_h:
Article:
<Carrier />

A space before the slash? And tag names that contain upper case characters? I'd reject that too if I were a parser :)

Whitespace is ignored, and xml is case-insensitive. If, that is, the spec is followed.

2008-10-01 Reply Admin

Joey,

The first is not a hack and is just as valid as the second.

http://www.w3.org/TR/REC-xml/#dt-empty

EmptyElemTag	   ::=   	'<' Name (S Attribute)* S? '/>'

I'll translate that grammar production into more real-world language. The opening '<' and the tag name must go together with no space between them. Then come zero or more attribute clauses, each preceded by whitespace. Then optional whitespace. Finally, finish with '/>'. This is the XML spec.

http://www.w3.org/TR/REC-xml/#dt-etag

End tags cannot be '</ name>'. This is invalid according to the spec.

ETag ::= '</' Name S? '>'

And as for your third "wouldn't have" example: completely valid by the spec, both the empty-string AttValue and the space preceding the '>'.

http://www.w3.org/TR/REC-xml/#dt-stag

STag ::= '<' Name (S Attribute)* S? '>'
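The productions quoted above can be approximated with regular expressions to see which spellings they admit. This is a deliberately simplified sketch: `NAME` here is ASCII-only and attribute values may not contain references, both of which the real spec handles more generally. It is meant for poking at the grammar, not as a replacement for a parser.

```python
import re

NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"          # simplified, ASCII-only Name
S = r"[ \t\r\n]+"                               # S: one or more whitespace characters
ATT_VALUE = r"""(?:"[^<&"]*"|'[^<&']*')"""      # simplified AttValue, no references
ATTRIBUTE = rf"{NAME}(?:{S})?=(?:{S})?{ATT_VALUE}"

# EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
EMPTY_ELEM_TAG = re.compile(rf"<{NAME}(?:{S}{ATTRIBUTE})*(?:{S})?/>")
# ETag ::= '</' Name S? '>'
END_TAG = re.compile(rf"</{NAME}(?:{S})?>")
# STag ::= '<' Name (S Attribute)* S? '>'
START_TAG = re.compile(rf"<{NAME}(?:{S}{ATTRIBUTE})*(?:{S})?>")


def is_empty_elem_tag(text):
    return EMPTY_ELEM_TAG.fullmatch(text) is not None


def is_end_tag(text):
    return END_TAG.fullmatch(text) is not None


def is_start_tag(text):
    return START_TAG.fullmatch(text) is not None
```

For example, `is_empty_elem_tag("<Carrier />")` and `is_end_tag("</Carrier >")` hold, while `is_end_tag("</ Carrier>")` does not.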

2008-10-01 Reply Admin

sig:
Well, our definition looked something like:
<xsd:choice minOccurs="0" maxOccurs="1">
<xsd:element name="orderid" type="tns:_orderid"/>
<xsd:element name="fname" type="xsd:string"/>
<xsd:element name="lname" type="xsd:string"/>
</xsd:choice>
so I think (hope) it's valid. But I see what you mean. We usually accept just about any element order (on same level) so I thought it's the normal behaviour. But I guess that's not very common.

Did you mean "all" instead of "choice"? Choice, IIRC, would mean that you could only have one of the three elements, not all three.
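To spell out that reading of the quoted schema: with minOccurs="0" and maxOccurs="1", the choice group permits at most one of the listed elements. A hedged sketch in Python follows; the `order` wrapper and the function name are invented for the example, and this is an approximation of the content model, not an XSD validator.

```python
import xml.etree.ElementTree as ET


def matches_choice(xml_text, alternatives, min_occurs=0, max_occurs=1):
    """True if the root's children are a valid pick from a choice group."""
    children = [child.tag for child in ET.fromstring(xml_text)]
    # Every child must be one of the alternatives...
    if any(tag not in alternatives for tag in children):
        return False
    # ...and the number of picks must stay within the occurrence bounds.
    return min_occurs <= len(children) <= max_occurs


alternatives = {"orderid", "fname", "lname"}
three_fields = "<order><orderid>12345</orderid><fname>John</fname><lname>Doe</lname></order>"
one_field = "<order><orderid>12345</orderid></order>"
```

Under this reading, the three-field message from the earlier post would not validate, which is why "all" (or "sequence") seems closer to what was intended.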

(they were also complaining about indentation)

Woah. That's a complete WTF on their part...

2008-10-01 Reply Admin

TopCod3r:
Look. The way I understand XML, the way I was taught, it is up to the receiving party to define the standard they are willing to accept, so this means things like which conventions from the spec will apply with any XML conversation you have with them.

The whole frickin' point of XML is to be standard, and to be agreed on by both parties. wtf?!

One BIG reason I can think for them making this design decision is for performance. I mean if you have to look for closed tags two different ways, then that is twice as much processing time. Another reason might be cost for development, I mean why would you want to write extra unnecessary code.

Maybe if you're sending 10Gb/s of data down the pipe. Performance reasons? Twice as much processing time? pffff

So you might be saying, .NET (or Java) already has an XML library included standard. But again this has distinct disadvantages. First, you have to ask can you trust this implementation. I personally have found more bugs in the .NET framework than I can remember. I wish I had kept a list of them so I could tell you.

Yeah, I believe you...

BTW, nice troll, TopCod3r. Somebody had to be the one to do the feeding today. :)

2008-10-01 Reply Admin

Shouldn't that be:

Michael responds to them "no, that there software hasn't changed in a month"?

2008-10-01 Reply Admin

Thunder:
levi_h:
Article:
<Carrier />

A space before the slash? And tag names that contain upper case characters? I'd reject that too if I were a parser :)
Whitespace is ignored, and xml is case-insensitive. If, that is, the spec is followed.

Ahh, no. XML elements are definitely case sensitive; the reason? Performance: http://www.tkachenko.com/blog/archives/000354.html

2008-10-01 Reply Admin

Jeff:
How much better would it have been to have an email that said "Error: Invalid Request. Carrier not defined!"?

Or even better: "#`%${ - NO CARRIER"

TopCod3rsBottom · 2008-10-01 Reply Admin

TopCod3r:
But the bottom line is any decent developer should be able to write a specialized XML library that is easier to use and out-performs the generic XML library included in .NET.

But both Java and its .NET components are free, Open Source softwares, and so rather than writing your own buggy parser, you should be fixing their bugs, and submitting them so that everyone benefits. I don't know anything about XPath, though.

ruijoel · 2008-10-01 Reply Admin

TopCod3r:
Look. The way I understand XML, the way I was taught, it is up to the receiving party to define the standard they are willing to accept, so this means things like which conventions from the spec will apply with any XML conversation you have with them.

In XML, you're free to define the vocabulary, that is, the allowed tags, but you can't define the grammar, that is, how you write the tags.

TopCod3r:
One BIG reason I can think for them making this design decision is for performance. I mean if you have to look for closed tags two different ways, then that is twice as much processing time. Another reason might be cost for development, I mean why would you want to write extra unnecessary code.

Optimization is a complex issue, and your assertion is wrong. Twice the possibilities is not twice the processing time. Anyways, if you think that parsing XML is too slow, then don't. Use CSV or JSON or, what the heck, an INI hybrid monstrosity or something. That'll give you even better performance.

And that's assuming that the XML parsing is even an issue. You're saying that the average web server can't process an XML document as fast as the network can send it? Come on! Oh, wait, I didn't realize the whole Internet was already in optic fiber.

TopCod3r:
So you might be saying, .NET (or Java) already has an XML library included standard. But again this has distinct disadvantages. First, you have to ask can you trust this implementation. I personally have found more bugs in the .NET framework than I can remember. I wish I had kept a list of them so I could tell you. Second, .NET gives you only a general-purpose XML implementation, so the developer still has to write a lot of processing code... or use XPath, but don't get me started on XPath, it has burned us on multiple occasions, so we have banned it.

Ok, so you're suggesting that Microsoft was unable to come up with a decent XML parsing library? And you claim you're better than the Redmond engineers? Yes, yes, of course. In my experience most bug reports are due to user arrogance: users think they did everything right and the implementor made a mistake. Library bugs do happen, just a lot less often than bugs in code I write myself.

And of course you're expected to write custom code, you're a developer for Christ's sake! What's worse is that nobody ever thought of creating a framework which maps big XML documents to plain old C# objects in a few lines of code. (note: this is sarcasm).
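The same point holds outside .NET. As a rough Python analogue, mapping a small document onto a plain object takes a handful of lines on top of the standard parser. The `Order` class, the `order` wrapper element, and the field names (borrowed from an example elsewhere in this thread) are assumptions for the sketch.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Order:
    orderid: int
    fname: str
    lname: str


def order_from_xml(xml_text):
    """Build an Order from its XML form; lookups are by name, so child order is irrelevant."""
    root = ET.fromstring(xml_text)
    return Order(
        orderid=int(root.findtext("orderid")),
        fname=root.findtext("fname"),
        lname=root.findtext("lname"),
    )
```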

TopCod3r:
But the bottom line is any decent developer should be able to write a specialized XML library that is easier to use and out-performs the generic XML library included in .NET.

No. XML has a flexible and complex grammar. That's why CSV is still very popular, and JSON was invented. You could go ahead and try to write your own parser, but anything more complex than a few regular expressions is too much. From the project management point of view, there is too much cost (number of bugs, developer time) for the gain (a rigid class which reads a specific version of a specific document, provided it isn't quirkily formatted). This is what today's WTF is all about! I just wonder what the CIO or the project manager were thinking when they allowed this solution.

I wouldn't ever try to write an XML parser. There are people a lot smarter than me who already did it. So why would I bother? I have some actual work to do, with more value for my enterprise.

2008-10-01 Reply Admin

I have to hand it to TopCod3r... This guy is the best long-term troll (in the best sense of the term) I've seen since Usenet in the 90s.

He knows exactly how far to go to get people riled up but not going so far as to give the game away blatantly.

So, thank you, TopCod3r, for reviving an art that I thought was practically dead!

2008-10-01 Reply Admin

Oh, do I know how this is. I did a project a few years ago with a startup that needed an application that talked to people over the phone. We didn't have a physical location so we decided to use angel.com to do the phone stuff.

It worked pretty well, except for their 'XML' schema called AngelXML. It was XML... except that not putting whitespace between tags in the right way made it not validate and silently do nothing. People need to learn to use someone else's parser that actually works.

2008-10-01 Reply Admin

One of our XML parsers(!!!!!!) (implemented in LabView iirc) doesn't support new lines.

The entire document must exist on a single line.

I kid you not.
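For contrast, a conforming parser does not care how the document is broken across lines or indented. A quick sketch with Python's standard library; the sample document and the `content` helper are made up for the illustration.

```python
import xml.etree.ElementTree as ET

single_line = "<order><orderid>12345</orderid><fname>John</fname></order>"
multi_line = """<order>
  <orderid>12345</orderid>
  <fname>John</fname>
</order>"""


def content(xml_text):
    """Return (tag, text) pairs for the root's children, ignoring surrounding whitespace."""
    return [(child.tag, (child.text or "").strip()) for child in ET.fromstring(xml_text)]


same_content = content(single_line) == content(multi_line)
```

The newlines and indentation only show up as whitespace text between elements, which an application is free to ignore.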

2008-10-01 Reply Admin

Ladies and gentlemen, I present to you a Daily WTF in the making. His name is even ironic.

The Substandard Standard
