Admin
HTML is not XML, it's SGML. XML is (mostly) a subset of SGML. The XML version of HTML, which practically no one uses, is called XHTML.
Admin
If I had to write a "specialized XML library", I would seriously doubt the XML requirement. I'd opt for a simpler (and more lightweight!) CSV or JSON scheme instead. Either you use standard XML (...) or not.
Admin
Both are allowed in XML. Placing a space before the closing slash is commonly done because some older (pre-XML) HTML parsers understand <foo /> but not <foo/>. As for upper-case characters: XML is case sensitive, but that doesn't mean it doesn't accept upper-case characters in element or attribute names.
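To make that concrete, here is a minimal sketch using Python's standard-library parser. The `parses` helper and the sample element names are made up for illustration; the point is only that the space before the slash makes no difference, upper-case names are accepted, and a start tag and end tag that differ in case do not match.

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if the stdlib parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# With or without a space before the closing slash: same element.
with_space = ET.fromstring("<foo />")
without_space = ET.fromstring("<foo/>")
same_tag = with_space.tag == without_space.tag

# Upper-case characters in names are fine...
upper_ok = parses("<Carrier Name='X'/>")

# ...but names are case sensitive, so these two tags do not pair up.
mismatched_case_ok = parses("<Carrier></carrier>")
```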
Admin
Sorry, but in XML an empty-element tag is clearly equivalent to a start tag immediately followed by an end tag, with no content. That carries through to XHTML. (I just checked the standards documents.)
Unfortunately, existing browsers tend to be broken on this behaviour.
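The equivalence is easy to check with a conforming parser. A small sketch, assuming Python's standard `xml.etree.ElementTree`; the `canonical` helper is a name invented here, and it simply re-serializes every spelling with explicit end tags so they can be compared.

```python
import xml.etree.ElementTree as ET


def canonical(xml_text: str) -> str:
    """Parse, then re-serialize with explicit end tags for comparison."""
    element = ET.fromstring(xml_text)
    return ET.tostring(element, encoding="unicode", short_empty_elements=False)


forms = ["<script/>", "<script />", "<script></script>"]
results = [canonical(form) for form in forms]

# All three spellings produce the same parsed element.
all_equivalent = len(set(results)) == 1
```

Browsers that parse the page as HTML rather than XML are a different story, which is where the trouble comes from.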
Admin
???
I hope the above was a clever troll...
(In case there are any newbie XML folks here who may not pick up on this: in XML order does matter!)
Admin
eXtreme Markup Languages aren't for wusses.
I don't think the XML will last more than a season or two.
Admin
You know, that I can almost still dismiss as a typo. At least they didn't write "Don't forget to submit ur timesheets by Friday." I absolutely hate it when I see something like that. My boss does it all the time, as well as instant message me with "r u there?" to which I generally just give him an answer in Japanese.
"their", "they're", "there" is another favorite of mine. I mean for crying out loud, English is my second language and I can get it right! Why can't a lot of native speakers I meet?
Admin
(Oops, that should say element order does matter.)
Admin
I don't know what you are trying to say, but shouldn't it be . ? you got the slash the wrong way. Just thought I would try to help you out.
Admin
But then we would have to make it blue in every other wtf since there is one for every post.
Admin
While it's funny that the application rejected valid XML, that's only half the story. Clearly this is a specialized application. Just because the XML is valid doesn't mean that the data is valid. If it didn't barf out on parsing the XML, it would have barfed on the carrier and service being left undefined.
A little better error checking would have been nice, but the email DID say "Invalid data near <Carrier". How much better would it have been to have an email that said "Error: Invalid Request. Carrier not defined!"?
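Something along these lines would do it. This is only a sketch: the element names `Request` and `Service` and the `check_request` helper are guesses for illustration (only `Carrier` appears in the article), and it separates "the XML is malformed" from "the XML is fine but the data is missing".

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical required children of the request element.
REQUIRED = ("Carrier", "Service")


def check_request(xml_text: str) -> List[str]:
    """Return a list of human-readable errors; an empty list means OK."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Syntax problem: the document is not well-formed XML at all.
        return [f"Error: Malformed XML ({exc})"]

    errors = []
    for name in REQUIRED:
        node = root.find(name)
        # Data problem: element missing, or present but empty.
        if node is None or not (node.text or "").strip():
            errors.append(f"Error: Invalid Request. {name} not defined!")
    return errors
```

For example, `check_request("<Request><Carrier /><Service>Ground</Service></Request>")` reports only the missing carrier, even though the XML itself is perfectly valid.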
Admin
I accidentally your XML.
Admin
Wouldn't that be backslashdot
Admin
Native speakers are much more likely to make errors based on things being the same phonetically speaking. Non native speakers are likely to have learned, via translation with their language.
Example in French:
their = leur they're = ils sont there = là
If they translate back, they know which is which without working it out.
Example of English person speaking French as foreign language
French mistake phonetically identical "noté" and "noter"
noté = wrote down noter = to write down
An Englishman would translate back, and realise which is which by the same token.
Admin
Wow. I didn't even catch it until you pointed it out.
Admin
I can think of a valid reason not to use an XML parser. A malicious DTD can cause a DoS attack. It is shocking how many web services and how much software are vulnerable. Any code that parses XML and attempts to process the DTD can be bogged down with the billion laughs exploit. For example, SVG is an XML image format and the Opera browser is vulnerable to malicious SVG images.
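That said, you don't have to give up the parser to avoid this. One possible mitigation, sketched here with Python's standard `xml.parsers.expat`, is to refuse any document that carries a DOCTYPE at all. The `DoctypeForbidden` exception and `parse_without_dtd` function are names made up for this example, and whether rejecting every DTD is acceptable depends on your application.

```python
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_without_dtd(xml_text: str) -> None:
    """Parse the document, aborting as soon as a DOCTYPE is seen."""
    parser = expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Bail out when the DOCTYPE is announced, rather than
        # going on to expand whatever entities it defines.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.Parse(xml_text, True)
```

A plain document such as `<a><b/></a>` goes through untouched, while anything that declares entities in an internal subset is rejected up front.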
Admin
Regarding the <script/> vs. <script></script> issue: yes, they are different, but only because IE has a bug in its DOM parser. (ugh!)
Affects all versions of IE:
http://webbugtrack.blogspot.com/2007/08/bug-153-self-closing-script-tag-issues.html
Admin
I've seen worse. One day, I've got an upset email from one of our partners claiming that we were not compliant with our own XML specification. We sent something like:
Turns out, the tag order was different in our specification so they were expecting this:
Thinking about what the XML parser must look like on their side scares me ...
Admin
There's some sort of circular reference created by this, can you spot it? rolls eyes
Admin
Your partners were right and you were wrong: See http://www.ibm.com/developerworks/xml/library/x-eleord.html and several dozen more hits on http://www.google.com/search?hl=en&q=xml+element+order&btnG=Google+Search&aq=f&oq=
Admin
funny. Google base did the exact same shit.
Admin
For WSDL, I thought that you could make the order for elements matter? I don't use webservices much, but I could swear that there is a way to force elements to be in a certain order...
Admin
Damn. I almost responded to this as if it was a serious post....
Admin
Crap, it was supposed to be a quote on TopCod3r's post....
Admin
From this site's source:
Admin
They may be right or wrong. The order may or may not be important. That's why it's important to write DTD document definitions for your project. If there were some kind of disagreement, it would be still possible to say who's right and who isn't.
If your specification doesn't specify the order or whatever, it's your fault. You should have written it in the language made for that purpose instead of natural language.
Admin
There seems to be a lot of confusion about this...
The order of elements IS significant in XML, unless the DTD/schema says otherwise (and it usually doesn't).
In XSD, when you specify child elements with xsd:sequence, their ordering matters. Putting them in a different order is a violation of the schema.
There's a very good reason for this - it allows streaming parsers to know what element to expect next, instead of needing to bounce all over the current parent to find what you're looking for.
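To illustrate what a sequence-style check amounts to, here is a rough sketch in Python. It is not an XSD validator: `first_out_of_order` is a hypothetical helper that treats every element in the expected list as optional and allowed at most once, and the element names are invented. It walks the children once, in document order, which is the same single pass a streaming parser would make.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def first_out_of_order(parent: ET.Element, sequence: List[str]) -> Optional[str]:
    """Return the tag of the first child that is out of order, repeated,
    or unknown with respect to the expected sequence; None if all is well."""
    position = 0
    for child in parent:
        # Skip forward over expected elements that were simply omitted.
        while position < len(sequence) and sequence[position] != child.tag:
            position += 1
        if position == len(sequence):
            return child.tag
        position += 1
    return None


expected = ["Carrier", "Service", "Weight"]
in_order = ET.fromstring("<Shipment><Carrier/><Service/><Weight/></Shipment>")
swapped = ET.fromstring("<Shipment><Service/><Carrier/></Shipment>")
```

With these inputs, `first_out_of_order(in_order, expected)` finds nothing wrong, while the swapped document is flagged at `Carrier`, because by then the checker has already moved past that slot.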
Admin
That probably also explains why you have to comment out CDATA sections inside script tags, or you get javascript errors.
Admin
I like your "standard" English
Admin
Sorry if I sound a bit asshatty, but I think you misspelled "there" in your sentence.
Admin
You must be referring to §C.3, which says not to use the minimized form for non-EMPTY elements.
However, you apparently missed the first line of Appendix C, "this appendix is informative." Set in boldface, even.
So that's informative, not normative. You write <script></script> not because the standard requires it (it doesn't; it's XML), but because you want your pages to work with browsers that try to parse your XML as HTML. Same reason you write "<br />" instead of "<br/>" (that's §C.2, BTW).
Have a nice day, and thanks for playing.
Admin
Well, our definition looked something like:
so I think (hope) it's valid. But I see what you mean. We usually accept just about any element order (on same level) so I thought it's the normal behaviour. But I guess that's not very common.
At first, I actually thought they'd built a custom parser expecting a certain keyword on a certain line number or something (they were also complaining about indentation) -- but I'm probably wrong here. Thanks for the info!
Admin
True. Unless you use xsd:all instead of xsd:sequence. When it comes to object serialization (for example, in a web service) rather than xpath/searching, I'd much prefer for order not to matter, but as luck would have it, most of the WS frameworks I've seen use the xsd:sequence element for collections. I know the new serializer in WCF, the DataContractSerializer, doesn't even support xsd:all at all, while the old .NET serializer allowed you to force it. Annoying.
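The practical difference shows up when you compare two serializations of the same object. A tiny sketch, with invented element names and helper functions: an order-sensitive comparison behaves like xsd:sequence, an order-insensitive one like xsd:all.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def same_children_same_order(a_text: str, b_text: str) -> bool:
    """Sequence-style comparison: child tags must match position by position."""
    a, b = ET.fromstring(a_text), ET.fromstring(b_text)
    return [child.tag for child in a] == [child.tag for child in b]


def same_children_any_order(a_text: str, b_text: str) -> bool:
    """All-style comparison: the same child tags, in any order."""
    a, b = ET.fromstring(a_text), ET.fromstring(b_text)
    return Counter(child.tag for child in a) == Counter(child.tag for child in b)


first = "<Order><Id/><Customer/><Total/></Order>"
second = "<Order><Total/><Id/><Customer/></Order>"
```

Here the two documents are equal under the any-order comparison and different under the same-order one.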
Admin
Joey,
The first is not a hack and is just as valid as the second.
http://www.w3.org/TR/REC-xml/#dt-empty
EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
I'll translate that regex into more real-world language. The tag name and the opening '<' must go together with no spaces. Then zero or more attribute clauses, each preceded by whitespace. Then optional whitespace. Finally, finish off with '/>'. This is the XML spec.
http://www.w3.org/TR/REC-xml/#dt-etag
End tags cannot be '</ name>'. This is invalid according to the spec.
ETag ::= '</' Name S? '>'
And as for your third "wouldn't have" example... Completely valid by the spec, both the empty-string AttValue and the space preceding the '>'.
http://www.w3.org/TR/REC-xml/#dt-stag
STag ::= '<' Name (S Attribute)* S? '>'
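If it helps, the three productions can be approximated with ordinary regular expressions. This is a simplified sketch in Python, not a conforming tokenizer: `NAME` only covers ASCII names, `ATT_VALUE` ignores the rules about `&` references, and `classify` is a helper invented for the example.

```python
import re

# Rough ASCII-only stand-ins for the spec's Name, S and AttValue.
NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"
S = r"[ \t\r\n]+"
ATT_VALUE = r"(?:\"[^<\"]*\"|'[^<']*')"
ATTRIBUTE = rf"{NAME}(?:{S})?=(?:{S})?{ATT_VALUE}"

# STag         ::= '<' Name (S Attribute)* S? '>'
STAG = re.compile(rf"<{NAME}(?:{S}{ATTRIBUTE})*(?:{S})?>")
# ETag         ::= '</' Name S? '>'
ETAG = re.compile(rf"</{NAME}(?:{S})?>")
# EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
EMPTY = re.compile(rf"<{NAME}(?:{S}{ATTRIBUTE})*(?:{S})?/>")


def classify(tag: str) -> str:
    """Say which production, if any, the whole string matches."""
    if EMPTY.fullmatch(tag):
        return "empty-element tag"
    if ETAG.fullmatch(tag):
        return "end tag"
    if STAG.fullmatch(tag):
        return "start tag"
    return "not a tag"
```

So `classify("<foo />")` and `classify("<foo/>")` both come back as empty-element tags, `classify('<foo bar="" >')` is a start tag, and `classify("</ foo>")` is rejected because nothing may sit between '</' and the name.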
Admin
Did you mean "all" instead of "choice"? Choice, IIRC, would mean that you could only have one of the three elements, not all three.
Woah. That's a complete WTF on their part...
Admin
The whole frickin' point of XML is to be standard, and to be agreed on by both parties. wtf?!
Maybe if you're sending 10Gb/s of data down the pipe. Performance reasons? Twice as much processing time? pffff
Yeah, I believe you...
BTW, nice troll, TopCod3r. Somebody had to be the one to do the feeding today. :)
Admin
Shouldn't that be:
Michael responds to them "no, that there software hasn't changed in a month"?
Admin
In XML, you're free to define the vocabulary, that is, the allowed tags, but you can't define the grammar, that is, how you write the tags.
Optimization is a complex issue, and your assertion is wrong. Twice the possibilities is not twice the processing time. Anyways, if you think that parsing XML is too slow, then don't. Use CSV or JSON or, what the heck, an INI hybrid monstrosity or something. That'll give you even better performance.
And that's assuming that the XML parsing is even an issue. You're saying that the average web server can't process an XML document as fast as the network can send it? Come on! Oh, wait, I didn't realize the whole Internet was already in optic fiber.
Ok, so you're suggesting that Microsoft was unable to come up with a decent XML parsing library? And you claim you're better than the Redmond engineers? Yes, yes, of course. In my experience most bug reports are due to user arrogance, thinking they did everything right and the implementor made a mistake. It happens. Just a lot less often than in code I write myself.
And of course you're expected to write custom code, you're a developer for Christ's sake! What's worse is that nobody ever thought of creating a framework which maps big XML documents to plain old C# objects in a few lines of code. (note: this is sarcasm).
No. XML has a flexible and complex grammar. That's why CSV is still very popular, and JSON was invented. You could go ahead and try to write your own parser, but anything more complex than a few regular expressions is too much. From the project management point of view, there is too much cost (number of bugs, developer time) for the gain (a rigid class which reads a specific version of a specific document if it's not quirkily formatted). This is what today's WTF is all about! I just wonder what the CIO or the project manager were thinking when they allowed this solution.
I wouldn't ever try to write an XML parser. There are people a lot smarter than me who already did it. So why would I bother? I have some actual work to do, with more value for my enterprise.
Admin
I have to hand it to TopCod3r... This guy is the best long-term troll (in the best sense of the term) I've seen since Usenet in the 90s.
He knows exactly how far to go to get people riled up but not going so far as to give the game away blatantly.
So, thank you, TopCod3r, for reviving an art that I thought was practically dead!
Admin
Oh, do I know how this is. I did a project a few years ago with a startup that needed an application that talked to people over the phone. We didn't have a physical location so we decided to use angel.com to do the phone stuff.
It worked pretty well, except for their 'XML' schema called AngelXML. It was XML... except that not putting whitespace between tags in the right way made it not validate and silently do nothing. People need to learn to use someone else's parser that actually works.
Admin
One of our XML parsers(!!!!!!) (implemented in LabView iirc) doesn't support new lines.
The entire document must exist on a single line.
I kid you not.
Admin
Ladies and gentlemen, I present to you a Daily WTF in the making. His name is even ironic.