XML Kōan from the Fourth Dimension

  • Foosball Girl In My Dreams 2007-10-12 09:04
    </>*

    *Comment_Not_Found
  • batasrki 2007-10-12 09:05
    Oh yes, XML's tag name limitations are well known. Damn Tim Bray and his invention. :-)
  • snoofle 2007-10-12 09:06
    Now now, maybe the intention was to say that he (in the tag) had nothing worth saying - a null tag
  • David 2007-10-12 09:08

    in order to meet the well known XML limitation of only allowing 5 characters per tag name.


    I hope you're bullshitting me right? It must be, it isn't, isn't it? Please tell me it isn't, please? Otherwise all of my XML of the last X years has been invalid...
    Nah, you must be bullshitting me, aren't you? Seriously, please tell me this is bullshit.
  • pcooper 2007-10-12 09:09
    I think constructions like that may be legal in some applications of SGML. But not in XML.

    I've always been amused that <_/> is a legal XML document, though.
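
    Both claims are easy to check. A minimal sketch, assuming Python's standard-library parser is a fair stand-in for "an XML parser" (the sample strings are only illustrative):

```python
import xml.etree.ElementTree as ET

# "_" is a legal name start character, so a lone empty element is a
# complete, well-formed document.
root = ET.fromstring("<_/>")
print(root.tag)

# The SGML-style empty end tag "</>" is not XML, so the parser rejects it.
try:
    ET.fromstring("<a>text</>")
    empty_end_tag_ok = True
except ET.ParseError:
    empty_end_tag_ok = False
print(empty_end_tag_ok)
```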
  • FDF 2007-10-12 09:15
    David:

    in order to meet the well known XML limitation of only allowing 5 characters per tag name.


    I hope you're bullshitting me right? It must be, it isn't, isn't it? Please tell me it isn't, please? Otherwise all of my XML of the last X years has been invalid...
    Nah, you must be bullshitting me, aren't you? Seriously, please tell me this is bullshit.


    Of course he's not bullshitting you
  • cynic 2007-10-12 09:17
    Imagine if namespaces were used to bridge the "limitation"

    <dicti:onary>
    <defin:ition>
    </defin:ition>
    </dicti:onary>

  • DaveK 2007-10-12 09:25
    David:

    in order to meet the well known XML limitation of only allowing 5 characters per tag name.


    I hope you're bullshitting me right? It must be, it isn't, isn't it? Please tell me it isn't, please? Otherwise all of my XML of the last X years has been invalid...
    Nah, you must be bullshitting me, aren't you? Seriously, please tell me this is bullshit.


    YHBT, fairly comprehensively!
  • ObiWayneKenobi 2007-10-12 09:26
    WTF? I've seen plenty of XML that uses more than 5 characters in tags. Are you telling me this isn't really valid XML, despite conforming to the W3C standards for XML?
  • Talchas 2007-10-12 09:27
    The five letters thing was sarcasm...
  • AndyBee 2007-10-12 09:31
    good grief.

    In future perhaps we should have <sarcasm></sarcasm> tags?
  • Sunstorm 2007-10-12 09:35
    AndyBee:
    good grief.

    In future perhaps we should have <sarcasm></sarcasm> tags?

    <srcsm></srcsm>, get it right.
  • Darkstar 2007-10-12 09:37
    Talchas:
    The five letters thing was sarcasm...


    <sarcasm>NOW you tell me, it has taken me hours to go through all our XML documents and redo the tags - not to mention building up a cross reference XML document indicating what the tags were and are now.</sarcasm>
  • Grant 2007-10-12 09:38
    And even without the <sarcasm /> tags, it's probably worth noting that the XML specs are available for free on the interweb. The interweb is a neat little tool that really helps software developers out. If you haven't checked it out yet, give it a try.
  • Mean Mr. Mustard 2007-10-12 09:40
    Actually, all English communication is supposed to be words of five characters or less. About the only legal words on the Internet are those CAPTCHA thingies.
  • An apprentice 2007-10-12 09:42
    Grant:
    And even without the <sarcasm /> tags, it's probably worth noting that the XML specs are available for free on the interweb. The interweb is a neat little tool that really helps software developers out. If you haven't checked it out yet, give it a try.


    Excuse me, but how exactly can a series of tubes help in software development?
  • Andy Goth 2007-10-12 09:42
    XML is overused and underused at the same time. I see it used for things that would have been handled better by traditional formats or custom formats. But then I also see ugly custom "I can't believe it's not XML!" formats used where XML would have done the trick nicely. I suspect the decision to use XML is primarily driven by marketing and management, rarely by developers who might actually have a clue about what's appropriate.

    Also, I once saw XML used to contain HTML. A literal ampersand had to be written out thusly: &amp;amp; . In context it looked like this: <mydata>Hello &lt;b&gt;Alex &amp;amp; Jake!&lt;/b&gt;</mydata> . Wow.
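
    The double escaping falls out mechanically once already-escaped HTML is treated as XML character data. A minimal sketch in Python, assuming the HTML fragment arrives as a plain string (`html_fragment` and `payload` are names made up for illustration):

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# In the HTML itself the ampersand is already escaped once.
html_fragment = "<b>Alex &amp; Jake!</b>"
payload = "Hello " + html_fragment

# Escaping the whole payload as XML text escapes the "&" of "&amp;" again.
xml_doc = "<mydata>" + escape(payload) + "</mydata>"
print(xml_doc)

# Parsing removes exactly one level of escaping and returns the HTML text.
recovered = ET.fromstring(xml_doc).text
print(recovered)
```

    Ugly as it looks, the result round-trips: the XML parser hands back the original HTML string unchanged.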
  • brazzy 2007-10-12 09:53
    Andy Goth:
    XML is overused and underused at the same time. I see it used for things that would have been handled better by traditional formats or custom formats. But then I also see ugly custom "I can't believe it's not XML!" formats used where XML would have done the trick nicely. I suspect the decision to use XML is primarily driven by marketing and management, rarely by developers who might actually have a clue about what's appropriate.

    You forgot "badly used". I currently work on a large banking app that uses XML to communicate with a certain backend system - perfectly legitimate use of XML. Unfortunately, the backend system provider's understanding of XML is adequately summarized by this verbatim quote:

    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."
  • Martin 2007-10-12 10:06
    All of the XHTML is bad then! <a href=""> </a> was illegal for so long!
  • FredSaw 2007-10-12 10:08
    Sunstorm:
    AndyBee:
    good grief.

    In future perhaps we should have <sarcasm></sarcasm> tags?
    <srcsm></srcsm>, get it right.
    How about...
    <sar:chasm>Signifying the expanse between what was said and what was understood</sar:chasm>
  • gabba 2007-10-12 10:10
    I like how the dump was "XML-like". No doubt the programmer thought improving on XML was way better than conforming to the standard.
  • FredSaw 2007-10-12 10:11
    brazzy:
    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."
    Terms like "justifiable homicide" spring to mind.
  • Bosshog 2007-10-12 10:14
    Sunstorm:
    AndyBee:
    good grief.

    In future perhaps we should have <sarcasm></sarcasm> tags?

    <srcsm></srcsm>, get it right.


    Brillant! :D
  • d3matt 2007-10-12 10:32
    Martin:
    All of the XHTML is bad then! <a href=""> </a> was illegal for so long!

    Wrong. The tag is <a />. The href is an attribute of <a />. Also, href is less than five characters as well, so it is perfectly fine :)
  • John Cowan 2007-10-12 10:35
    Quite so. In full SGML, "</>" means "End the current element", and "<>" means "Repeat the most recent start-tag". So "<p>foo</><>bar</>" was short for "<p>foo</p><p>bar</p>".

    It's also true that the default maximum length for tag names in SGML is 8 characters (not 5), though implementations could and did extend that.
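
    The expansion described above can be sketched as a toy in Python. This is not an SGML parser: `expand_short_tags` is a hypothetical helper that follows only the two rules as stated in this comment, and it assumes tags carry no attributes.

```python
import re

# Matches <name>, </name>, <> and </> only (no attributes, comments or CDATA).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)?>")

def expand_short_tags(text):
    """Expand "</>" and "<>" as described above; a toy, not an SGML parser."""
    open_elements = []   # names of the elements that are currently open
    last_start = None    # name used by the most recent start-tag

    def replace(match):
        nonlocal last_start
        is_end, name = match.group(1) == "/", match.group(2)
        if is_end:
            if not open_elements:
                raise ValueError("end tag with no open element")
            current = open_elements.pop()
            if name is not None and name != current:
                raise ValueError(f"expected </{current}>, found </{name}>")
            return f"</{current}>"
        if name is None:
            if last_start is None:
                raise ValueError("'<>' with no earlier start-tag")
            name = last_start
        open_elements.append(name)
        last_start = name
        return f"<{name}>"

    return TAG.sub(replace, text)

expanded = expand_short_tags("<p>foo</><>bar</>")
print(expanded)
```

    Real SGML applies more rules than this when resolving empty tags, so treat the sketch as an illustration of the idea only.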
  • Charlie 2007-10-12 10:45
    David:

    in order to meet the well known XML limitation of only allowing 5 characters per tag name.


    I hope you're bullshitting me right? It must be, it isn't, isn't it? Please tell me it isn't, please? Otherwise all of my XML of the last X years has been invalid...
    Nah, you must be bullshitting me, aren't you? Seriously, please tell me this is bullshit.


    You must be new here. This is how Alex et al. do humor.
  • LoL 2007-10-12 10:46
    Is it just me or is the real WTF some of these comments?
  • Soul-Grinding Madness Looms 2007-10-12 10:49
    John Cowan:
    Quite so. In full SGML, "</>" means "End the current element", and "<>" means "Repeat the most recent start-tag". So "<p>foo</><>bar</>" was short for "<p>foo</p><p>bar</p>".

    Just one of many, many SGML WTFs, which seems to be designed to frustrate automatic processing as much as possible. Why they ever picked this to base HTML on... XML is basically SGML after de-WTF-ification. Not that XML is completely free of brain damage itself, mind you, but SGML is from the days where people thought that ADD A TO B GIVING C would be nice to have as a statement in a programming language.
  • NiceWTF 2007-10-12 10:58
    David:

    in order to meet the well known XML limitation of only allowing 5 characters per tag name.


    I hope you're bullshitting me right?


    I think your sarcasm detector is broken.
  • Hmmmm... 2007-10-12 11:13
    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."


    Doesn't seem unreasonable to me. You have normative (the spreadsheets) and descriptive (the schemas). If there is an inconsistency between the two, you hold up the defined "correct" copy as the standard and use that to fix the schemas...
  • Synonymous Awkward 2007-10-12 11:45
    Soul-Grinding Madness Looms:
    John Cowan:
    Quite so. In full SGML, "</>" means "End the current element", and "<>" means "Repeat the most recent start-tag".

    Just one of many, many SGML WTFs, which seems to be designed to frustrate automatic processing as much as possible.

    Actually, if </> were enforced as the only way to end a tag, it'd make parsing a bit faster since well-formed-ness checking is so much simpler. Of course, you lose the sanity-checking provided by the repeated tag-name. And you've basically just reinvented S-exps.

    But oh well. :-)
  • snoofle 2007-10-12 12:00
    Soul-Grinding Madness Looms:

    ... SGML is from the days where people thought that ADD A TO B GIVING C would be nice to have as a statement in a programming language.

    I'm from those days, and I worked on a project where management thought it would be "cool" to use this new-fangled SGML (to represent component parts of documents that had been scanned in so they could be electronically updated instead of completely retyped; at the time, it was an improvement over what we had available). Of course, this was in the days where we had to hand-crank the computer to get it to do anything...
  • FredSaw 2007-10-12 12:05
    Hmmmm...:
    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."


    Doesn't seem unreasonable to me. You have normative (the spreadsheets) and descriptive (the schemas). If there is an inconsistency between the two, you hold up the defined "correct" copy as the standard and use that to fix the schemas...
    That makes sense. So, whenever a client wants to connect with a web or WCF service, the service can put the excel spreadsheet on a wooden table, take a polaroid of it, and fax that over to the client to establish the contract.
  • Michael 2007-10-12 12:05
    brazzy:
    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."

    I can't wait until they migrate to Office 2007 with their MSOOXML Excel sheet, the infinite recursion should literally make people explode.
  • Fish Basket Gordo 2007-10-12 12:10
    FredSaw:
    Sunstorm:
    AndyBee:
    good grief.

    In future perhaps we should have <sarcasm></sarcasm> tags?
    <srcsm></srcsm>, get it right.
    How about...
    <sar:chasm>Signifying the expanse between what was said and what was understood</sar:chasm>


    What great wit. I didn't know Oscar Wilde had the Internet in the hereafter. (I am purposely leaving off any tags to indicate my tone.)
  • jayh 2007-10-12 12:40
    homicide:

    accidental
    justifiable
    laudable
  • Daniel15 2007-10-12 12:50
    An apprentice:
    Grant:
    And even without the <sarcasm /> tags, it's probably worth noting that the XML specs are available for free on the interweb. The interweb is a neat little tool that really helps software developers out. If you haven't checked it out yet, give it a try.


    Excuse me, but how exactly can a series of tubes help in software development?

    On the other hand... A large truck, now *that* would be useful.
    :)
  • Darien H 2007-10-12 12:52
    Andy Goth:
    Also, I once saw XML used to contain HTML. A literal ampersand had to be written out thusly: &amp;amp; . In context it looked like this: <mydata>Hello &lt;b&gt;Alex &amp;amp; Jake!&lt;/b&gt;</mydata> . Wow.


    There was this MLS service which abruptly cut off the ability to download SQL dumps for a client and then said everyone had to use their badly-documented SOAP service and a demo/test site which would inexplicably fail every other invocation.

    The XML specification for my query simply would not work when I put it into the soap body.

    Then I found out that it wanted my XML in <![CDATA[ tags...
  • brazzy 2007-10-12 12:57
    Hmmmm...:
    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."


    Doesn't seem unreasonable to me. You have normative (the spreadsheets) and descriptive (the schemas). If there is an inconsistency between the two, you hold up the defined "correct" copy as the standard and use that to fix the schemas...

    You've never used non-trivial XML communication, have you?
    XML schemas are designed and perfectly fit to provide a formal normative description of an XML format. Using Excel sheets means you throw away all formal rigidity and introduce endless possibilities for inconsistency, vagueness and openness to interpretation that would simply not exist if you used a schema.

    Using Excel sheets instead of a schema to specify an XML format is like being faced with the task of putting a nail into a wall and deliberately choosing an empty bottle instead of a (perfectly available) hammer as a tool.
  • Desperate for True WTFs 2007-10-12 13:04
    brazzy:
    Andy Goth:
    XML is overused and underused at the same time. I see it used for things that would have been handled better by traditional formats or custom formats. But then I also see ugly custom "I can't believe it's not XML!" formats used where XML would have done the trick nicely. I suspect the decision to use XML is primarily driven by marketing and management, rarely by developers who might actually have a clue about what's appropriate.

    You forgot "badly used". I currently work on a large banking app that uses XML to communicate with a certain backend system - perfectly legitimate use of XML. Unfortunately, the backend system provider's understanding of XML is adequately summarized by this verbatim quote:

    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."


    And this is why I continue to read Daily WTF. Even when the owners of the site can't manage to post a real WTF for weeks on end, occasionally, a user comment strikes WTF gold!
  • Edward Royce 2007-10-12 13:08
    Hmmmm.

    <spoon></spoon>
  • Nonymous 2007-10-12 13:12
    Andy Goth:
    XML is overused and underused at the same time. I see it used for things that would have been handled better by traditional formats or custom formats. But then I also see ugly custom "I can't believe it's not XML!" formats used where XML would have done the trick nicely. I suspect the decision to use XML is primarily driven by marketing and management, rarely by developers who might actually have a clue about what's appropriate.

    Also, I once saw XML used to contain HTML. A literal ampersand had to be written out thusly: &amp;amp; . In context it looked like this: <mydata>Hello &lt;b&gt;Alex &amp;amp; Jake!&lt;/b&gt;</mydata> . Wow.


    Or they could just use a CDATA block...
  • AdT 2007-10-12 13:18
    David:
    I hope you're bullshitting me right?


    Hi David,

    the irony meter manufacturer just called. Your device is malfunctioning. Please take it to the nearest service station ASAP.
  • AdT 2007-10-12 13:25
    brazzy:
    Using Excel sheets means you throw away all formal rigidity and introduce endless possibilities for inconsistency, vagueness and openness to interpretation that would simply not exist if you used a schema.


    Not to mention that there aren't too many tools that can automatically check the validity of an XML document using an Excel spreadsheet "schema".
  • G Money 2007-10-12 13:39
    Grant:
    And even without the <sarcasm /> tags, it's probably worth noting that the XML specs are available for free on the interweb. The interweb is a neat little tool that really helps software developers out. If you haven't checked it out yet, give it a try.


    I'll wait until it's available on computers.
  • phaedrus 2007-10-12 13:45
    LoL:
    Is it just me or is the real WTF some of these comments?


    You must also be new here. TRWTF(tm) is always the comments. That's why you read them.
  • Pablo 2007-10-12 13:56
    The guy hoped it was a joke, and begged for confirmation. His sarcasm detector was fine.

    His hope that anything in IT, anywhere, ever, is even remotely sane, is right on the edge. It's okay David. You don't have to break down yet.
  • gasman 2007-10-12 14:39
    Nonymous:
    Andy Goth:

    Also, I once saw XML used to contain HTML. A literal ampersand had to be written out thusly: &amp;amp; . In context it looked like this: <mydata>Hello &lt;b&gt;Alex &amp;amp; Jake!&lt;/b&gt;</mydata> . Wow.


    Or they could just use a CDATA block...


    ...and turn an ugly-but-working data format into one that breaks as soon as your user input contains the characters ]]>.

  • Zygo 2007-10-12 14:39
    Soul-Grinding Madness Looms:
    Just one of many, many SGML WTFs, which seems to be designed to frustrate automatic processing as much as possible. Why they ever picked this to base HTML on...


    Because the available alternatives at the time were SGML, ASN.1, Gopher's hypertext format, or something from Microsoft.

    If the timing had played out a little differently, web browsers would have shipped with BSD and web pages would probably end up written in m4 or something. <shiver>
  • iMalc 2007-10-12 15:16
    Of course we all know it was supposed to be:
    <>There is no spoon</>
  • TallGuy 2007-10-12 15:24
    iMalc:
    Of course we all know it was supposed to be:
    <>There is no spoon</>


    I was thinking of

    <spoon><spoon><spoon><spoon><spoon><spoon><spoon><spoon>Badger, bagder</spoon></spoon></spoon></spoon></spoon></spoon></spoon></spoon>

    as an alternative to that one. (No badgers were harmed in the posting of this comment).
  • Bejesus 2007-10-12 15:28
    AdT:
    brazzy:
    Using Excel sheets means you throw away all formal rigidity and introduce endless possibilities for inconsistency, vagueness and openness to interpretation that would simply not exist if you used a schema.


    Not to mention that there aren't too many tools that can automatically check the validity of an XML document using an Excel spreadsheet "schema".


    Sounds like the manufacturer of the backend system in question has plenty of such tools employed in its development team.
  • No name 2007-10-12 16:23
    If you load that into a parser and print the xml attribute at the top level tag, you probably get:

    "I am the alpha and the omega, and there is no phenotype in between."
  • TheRider 2007-10-12 16:25
    brazzy:

    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."
    Would you be working for the same bank as I do? Then let's get together and cry a little.
  • purge 2007-10-12 16:56
    <RealWTF>XML</RealWTF>
  • barfman 2007-10-12 17:04
    An apprentice:
    Grant:
    And even without the <sarcasm /> tags, it's probably worth noting that the XML specs are available for free on the interweb. The interweb is a neat little tool that really helps software developers out. If you haven't checked it out yet, give it a try.


    Excuse me, but how exactly can a series of tubes help in software development?


    bwahahahaha! That was very funny... I had forgotten about that Senator, or Congressman, who said that line, but if you recall please feel free to share so that I may youtube it.... TOO funny
  • gremlin 2007-10-12 17:09
    TallGuy:

    I was thinking of

    <spoon><spoon><spoon><spoon><spoon><spoon><spoon><spoon>Badger, bagder</spoon></spoon></spoon></spoon></spoon></spoon></spoon></spoon>

    as an alternative to that one. (No badgers were harmed in the posting of this comment).


    Yeah but what happened to the spoons?
  • FredSaw 2007-10-12 17:15
    TallGuy:
    Badger, bagder
    Yeah, but who was she, and where did he bag her? (Mushroom, mushroom)
  • Shinobu 2007-10-12 17:37
    Actually, this happens a lot in newsfeeds. If a certain element can contain HTML, it must be escaped because valid HTML is not necessarily valid XML.
  • real_aardvark 2007-10-12 18:49
    Zygo:
    Soul-Grinding Madness Looms:
    Just one of many, many SGML WTFs, which seems to be designed to frustrate automatic processing as much as possible. Why they ever picked this to base HTML on...


    Because the available alternatives at the time were SGML, ASN.1, Gopher's hypertext format, or something from Microsoft.

    Or EDI, or
    Synonymous Awkward:
    S-exps
    or anything you make up on the spot. I mean, it could hardly be worse than SGML. Let's face it, the Web was designed by amateurs and juveniles, and it shows. Berners-Lee presumably picked SGML because, well, it was there, and it was unix-y. Without Andreessen jamming the <image> tag in, HTML would have died a quick and unlamented death.

    Actually, I don't see what's wrong with ASN.1 as an alternative to XML (obviously not as an alternative to HTML, which is a different question). ASN.1 is self-defining ("look, Ma, no schemas!" Well, practically none, anyway) and scales exceptionally well. This just might be why every single telecoms company in the world uses it. Just think how slow the intertubes would be if hardware configuration were to be done via XML messaging.
    Andy Goth:
    XML is overused and underused at the same time.
    A very astute comment, but I think you give programmers a deal too much credit. An ungodly number of them seem to have a fetish for the thing.
    Andy Goth:
    Also, I once saw XML used to contain HTML. A literal ampersand had to be written out thusly: &amp;amp; . In context it looked like this: <mydata>Hello &lt;b&gt;Alex &amp;amp; Jake!&lt;/b&gt;</mydata> . Wow.

    That's the problem with having different (broken) mark-up languages founded on a common (very broken) base. (They look the same, so they must be the same, right?) Also the problem with the twerpy in-band control sequences. I'm surprised it doesn't happen more often.
    brazzy:
    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."

    Beautiful, purely beautiful. But I think we may all be missing a significant point here. Banks don't give a rat's ass about programmers, but they do care deeply about lawyers and regulatory compliance. This is partly because the people who run banks are technological illiterates. It's mostly because the bank can suffer limitless liability over legal issues, but only limited liability over programming fuck-ups (although I've seen one or two cases that stretch this proposition near to breaking-point).

    Consider: in a legal case, your pet shark is not going to be able to defend your actions based upon an XML schema. One, he won't understand it, and two, it leaves far too little wiggle room.

    He will, however, be able to mine the imprecision of an Excel document for any amount of obfuscatory fruitfulness.

    Particularly if said document is written in legalese, with no punctuation of any kind. (Not a charge you could level against the average XML Schema, I admit...)
  • Frank Mitchell 2007-10-12 19:18
    There's also JSON (JavaScript Object Notation), which uses a subset of JavaScript to represent trees of strings, numbers, booleans, nulls, lists, and maps. Not only is it an easy-to-read textual format (given enough ignorable whitespace), it translates more easily than XML to most programming languages.

    But I guess it's not enterprisey enough.
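
    For what it's worth, the "translates more easily" part can be sketched in a few lines of Python. The document below is invented for illustration; each JSON type listed above lands directly on a built-in Python type.

```python
import json

# A made-up document using every JSON type mentioned above.
text = """
{
  "name": "spoon",
  "count": 8,
  "exists": false,
  "owner": null,
  "tags": ["badger", "mushroom"],
  "nested": {"depth": 1.5}
}
"""

data = json.loads(text)

# Maps become dicts, lists become lists, and the scalars map one-to-one.
for key, value in data.items():
    print(key, type(value).__name__, value)
```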
  • real_aardvark 2007-10-12 19:38
    Frank Mitchell:
    There's also JSON (JavaScript Object Notation), which uses a subset of JavaScript to represent trees of strings, numbers, booleans, nulls, lists, and maps. Not only is it an easy-to-read textual format (given enough ignorable whitespace), it translates more easily than XML to most programming languages.

    But I guess it's not enterprisey enough.

    It does, on the other hand, "use a subset of JavaScript."

    I believe I mentioned the Web, amateurs, and juveniles in a sentence just above. I'm sure you will find that JSON is more than up to the task of being "enterprisey enough."

    Excuse me whilst I reach for a very large and accommodating bucket.

    Addendum (2007-10-12 20:02):
    I don't normally append, but I think there's an important distinction here.

    HTML, XML and the like are mark-up languages. Javascript, insofar as it escapes the impending repeal of the Wade vs Roe judgement, is a programming language.

    It is not wise to mix the two.

    Basically, a mark-up/messaging (HTML/XML) language should define a set of relationships, preferably in an easily-parsed way, and preferably in a way that does not involve shooting off requests to every single goddamn URL on God's green and webby earth.

    Just boiling down a programming language to avoid all that nasty procedural/functional business is not the way to do this. Start from the ground up (and don't stand on a steaming pile of SGML shit while you're doing this) and build a coherent framework for the representation of the types you require. Remember: the semantics of these types differ significantly between a mark-up/messaging language (producer) and an actual programming language (consumer). You, as the creator of this language, might understand the nuances of these differences. Programmers, as the consumers, will not.
  • Jenny Simonds 2007-10-12 20:15
    whoever started the thread:
    John Y recently had to deal with an XML-like dump from a "4D" database. This dump used a peculiar form of abbreviation in which letters were chosen seemingly at random from field names, in order to meet the well known XML limitation of only allowing 5 characters per tag name.


    I'll bet they were trying to cut down on the bloated size of XML files due to having to repeat the tagnames for every element.

    somebody else:
    Quite so. In full SGML, "</>" means "End the current element", and "<>" means "Repeat the most recent start-tag". So "<p>foo</><>bar</>" was short for "<p>foo</p><p>bar</p>".


    Anonymous closing tags are legal in SGML? That's VERY cool. I really wish they'd add that to the XML standard. In one fell swoop they'd eliminate 30-40% of the size of the overhead of XML boilerplate.

    The "<>" construct seems dangerous and useless, but anonymous closing tags are an obvious and needed improvement. I can't think of any good reason not to add it to the XML spec. As for readability - well, we all do just fine with programming languages that lack "endif", "endwhile", "endfor", "endswitch", "endfunction", "endstruct", "endclass", etc. Don't we?


    Addendum (2007-10-12 20:24):
    Consider:

    <myserializeddata>
    
    <anumericdatacolumn>0</anumericdatacolumn>
    <anotherdatacolumn>2</anotherdatacolumn>
    <yetanotherdatacolumn>1</yetanotherdatacolumn>
    <heresatotallyemptycolumn></heresatotallyemptycolumn>
    </myserializeddata>

    vs.:

    <myserializeddata>
    
    <anumericdatacolumn>0</>
    <anotherdatacolumn>2</>
    <yetanotherdatacolumn>1</>
    <heresatotallyemptycolumn></>
    </>
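
    The saving on this sample can be measured rather than guessed. A rough sketch in Python, assuming a plain regex substitution is good enough for a document this simple (the output is of course no longer XML):

```python
import re

# The sample document above, with full end tags.
full = "\n".join([
    "<myserializeddata>",
    "<anumericdatacolumn>0</anumericdatacolumn>",
    "<anotherdatacolumn>2</anotherdatacolumn>",
    "<yetanotherdatacolumn>1</yetanotherdatacolumn>",
    "<heresatotallyemptycolumn></heresatotallyemptycolumn>",
    "</myserializeddata>",
])

# Replace every named end tag with the anonymous form "</>".
short = re.sub(r"</[^<>]+>", "</>", full)

saved = len(full) - len(short)
print(f"{len(full)} -> {len(short)} characters ({saved / len(full):.0%} smaller)")
```

    The figure depends entirely on how long the tag names are relative to the data, so this sample says little about XML in general.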
  • Andy Goth 2007-10-12 21:01
    Jenny Simonds:
    The "<>" construct seems dangerous and useless, but anonymous closing tags are an obvious and needed improvement. I can't think of any good reason not to add it to the XML spec. As for readability - well, we all do just fine with programming languages that lack "endif", "endwhile", "endfor", "endswitch", "endfunction", "endstruct", "endclass", etc. Don't we?
    You forgot "endunion"! The automobile industry, at least, would benefit from it.
  • David 2007-10-12 21:02
    gasman:
    Nonymous:


    Or they could just use a CDATA block...

    ...and turn an ugly-but-working data format into one that breaks as soon as your user input contains the characters ]]>.

    Not if you properly sanitize it to ]]&gt;
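
    One caveat that can be checked directly: entity references are not expanded inside a CDATA section, so "&gt;" written there comes back as the literal five characters. The usual workaround is to split the section wherever "]]>" occurs. A minimal sketch in Python (`to_cdata` and the sample input are made up for illustration):

```python
import xml.etree.ElementTree as ET

def to_cdata(text):
    """Wrap text in CDATA, splitting the section at every "]]>"."""
    # "]]" closes one section and ">" opens the next one.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

user_input = "if (a[b[0]]> 1) { x = '<b>&amp;</b>'; }"
doc = "<mydata>" + to_cdata(user_input) + "</mydata>"
recovered = ET.fromstring(doc).text
print(recovered == user_input)

# By contrast, "&gt;" inside CDATA is not decoded by the parser.
literal = ET.fromstring("<x><![CDATA[a]]&gt;b]]></x>").text
print(literal)
```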
  • Andy Goth 2007-10-12 22:28
    I have my answer. Nobody ever bothers to change the subject line. Probably nobody even read it! :^)
  • prejudiced 2007-10-12 23:48
    Talchas:
    The five letters thing was sarcasm...

    and I thought this was yet one more Windows "feature"...
  • John Cowan 2007-10-13 00:17
    In those days, it was all about keeping the markup terse, because you paid offshore guys to keyboard documents that were going to be electronically processed, and they charged by the keystroke. Anything you could do to shorten it meant $$$$.

    The reason HTML is based on SGML is that it's essentially stolen from a sample document format that was published in one of the appendixes of the standard, with the addition of the A (and later IMG) elements.

    As for ADD A TO B, SGML is from the 80s, Cobol from the 60s.

  • foxyshadis 2007-10-13 07:42
    David:
    gasman:
    Nonymous:

    Or they could just use a CDATA block...

    ...and turn an ugly-but-working data format into one that breaks as soon as your user input contains the characters ]]>.

    Not if you properly sanitize it to ]]&gt;

    When you sanitize one instance of a character, you have to sanitize them all for consistent decoding, and in that case you may as well just sanitize everything at once and skip the CDATA altogether.

    Zygo:
    If the timing had played out a little differently, web browsers would have shipped with BSD and web pages would probably end up written in m4 or something. <shiver>

    Everyone knows the best replacement for HTML/XML is Mork.

    Jenny Simonds:
    Anonymous closing tags are legal in SGML? That's VERY cool. I really wish they'd add that to the XML standard. In one fell swoop they'd eliminate 30-40% of the size of the overhead of XML boilerplate.

    People have been begging for that and for a standard binary XML format for ten years, including me. It's not going to happen, and non-standard formats will keep filling the gaps of their stubbornness.

    Desperate for True WTFs:
    brazzy:
    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."

    And this is why I continue to read Daily WTF. Even when the owners of the site can't manage to post a real WTF for weeks on end, occasionally, a user comment strikes WTF gold!

    This has to be featured on representative line next week.

    Q for old-timers: Do most SGML implementations use < >? Now that HTML is ubiquitous new ones probably do, but I know you can use any character you like for brackets and many did.
  • real_aardvark 2007-10-13 09:10
    Andy Goth:
    Jenny Simonds:
    The "<>" construct seems dangerous and useless, but anonymous closing tags are an obvious and needed improvement. I can't think of any good reason not to add it to the XML spec. As for readability - well, we all do just fine with programming languages that lack "endif", "endwhile", "endfor", "endswitch", "endfunction", "endstruct", "endclass", etc. Don't we?
    You forgot "endunion"! The automobile industry, at least, would benefit from it.

    Shareholders would. Managers on bloated bonus schemes would. There is a small but arguable case to say that customers would.

    It wouldn't do a lot for the 1 million people directly employed by vehicle manufacturers in the US, though, or for the 5 million employed in supporting industries.

    On the John Stuart Mill principle, I think you're wrong here.

    There are structural problems with the auto industry that have nothing at all to do with unionisation. I'm not a huge fan of unions: they work particularly poorly in areas like the software world. But I'm afraid that in production-line work, they're pretty much essential.
  • real_aardvark 2007-10-13 09:12
    Andy Goth:
    I have my answer. Nobody ever bothers to change the subject line. Probably nobody even read it! :^)

    Check it out, dude. Four posts up from yours.

    Feel free to read it as well, if you can spare the time.
  • real_aardvark 2007-10-13 09:33
    John Cowan:
    In those days, it was all about keeping the markup terse, because you paid offshore guys to keyboard documents that were going to be electronically processed, and they charged by the keystroke. Anything you could do to shorten it meant $$$$.

    The reason HTML is based on SGML is that it's essentially stolen from a sample document format that was published in one of the appendixes of the standard, with the addition of the A (and later IMG) elements.

    As for ADD A TO B, SGML is from the 80s, Cobol from the 60s.

    And men are from Mars, and women from this bijou little shop I know of that's down the Fulham Road ...

    Look, the first Beatles album was issued on March 22nd, 1963. The first Stones album was issued on May 30th, 1964. Does it really matter which came first? What sort of an argument is this?

    Cobol is primitive, but had no "prior art" on which to be based. It worked pretty well for thirty years. I hate it, but I've got to be honest.

    SGML is primitive, but for some reason decided to abandon entirely any "prior art" that might be lying around. That was the Unix Way. It sucked. Its spavined descendants still suck. Why anybody still thinks that having a human-readable but machine-unparseable format is a good idea -- it's parseable if well-formed, but you're fucking doomed if a single character out of the millions is out-of-place -- is beyond me. That truly is a Cobol-style WTF.

    Anonymous closures, or whatever </> is called, are probably an improvement on the current version. I'm not sure why they're any less dangerous than their complement, anonymous openers (<>), unless you refuse to believe that "it is better to travel hopefully than to arrive..."

    However, they're still just syntactic sugar. The real problem is with the whole architecture of SGML/HTML/XML/etc.

    (Thanks for the history lesson, though ... always nice to have actual information to argue with, other than loony assertions like "Check out KleenXGML -- it wipes the arse off the competition!")
  • Weekend Warrior 2007-10-13 11:51
    real_aardvark:
    blah blah blah
    D00d, you seriously need to get a real job.
  • real_aardvark 2007-10-13 13:16
    Weekend Warrior:
    real_aardvark:
    blah blah blah
    D00d, you seriously need to get a real job.

    You hiring?

    Or are you just unable to read more than a couple hunnert words without your eyes crossing?
  • real_aardvark 2007-10-13 13:22
    foxyshadis:

    Everyone knows the best replacement for HTML/XML is Mork.

    Never heard of it, so I looked it up. Wonderful stuff. Now I'm sorry I ever bashed SGML; or, as Jamie Zawinski apparently said,
    "[Mork is] the single most braindamaged file format that I have ever seen in my nineteen year career."

    And this is still being used by Firefox and Seamonkey?
  • Daniel Beardsmore 2007-10-13 23:38
    TallGuy:
    I was thinking of

    <spoon><spoon><spoon><spoon><spoon><spoon><spoon><spoon>Badger, badger</spoon></spoon></spoon></spoon></spoon></spoon></spoon></spoon>


    Protect yourself!

    (Sorry, had to ;)
  • Ingo 2007-10-14 06:24
    Hahahaha now everything below this line is sarcasm
    <sarcasm>
  • Talchas 2007-10-14 13:35
    Jenny Simonds:

    Anonymous closing tags are legal in SGML? That's VERY cool. I really wish they'd add that to the XML standard. In one fell swoop they'd eliminate 30-40% of the size of the overhead of XML boilerplate.

    The "<>" construct seems dangerous and useless, but anonymous closing tags are an obvious and needed improvement. I can't think of any good reason not to add it to the XML spec. As for readability - well, we all do just fine with programming languages that lack "endif", "endwhile", "endfor", "endswitch", "endfunction", "endstruct", "endclass", etc. Don't we?


    Addendum (2007-10-12 20:24):
    Consider:

    <myserializeddata>
    
    <anumericdatacolumn>0</anumericdatacolumn>
    <anotherdatacolumn>2</anotherdatacolumn>
    <yetanotherdatacolumn>1</yetanotherdatacolumn>
    <heresatotallyemptycolumn></heresatotallyemptycolumn>
    </myserializeddata>

    vs.:

    <myserializeddata>
    
    <anumericdatacolumn>0</>
    <anotherdatacolumn>2</>
    <yetanotherdatacolumn>1</>
    <heresatotallyemptycolumn></>
    </>

    I tend to agree, but why not go the whole way and use s-exprs?
    (my-serialized-data
    
    (a-numeric-data-column 0)
    (another-data-column 2)
    (yet-another-data-column 1)
    (heres-a-totally-empty-column))

    or better yet

    (my-serialized-data
    :a-numeric-data-column 0
    :another-data-column 2
    :yet-another-data-column 1
    (heres-a-totally-empty-column))

    Admittedly, the only real difference is a bit of syntax, but I find it makes it a lot easier to read.
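
    For anyone who wants to put a number on the saving from anonymous closing tags, here is a rough Python sketch. It takes the serialized-data example from the quoted post, swaps every named closing tag for </>, and reports the fraction of characters saved. The function names are made up for illustration, and the output is of course SGML-style shorthand, not well-formed XML.

```python
import re

# The fully spelled-out example from the quoted post.
FULL = """<myserializeddata>
<anumericdatacolumn>0</anumericdatacolumn>
<anotherdatacolumn>2</anotherdatacolumn>
<yetanotherdatacolumn>1</yetanotherdatacolumn>
<heresatotallyemptycolumn></heresatotallyemptycolumn>
</myserializeddata>"""


def anonymize_closers(xml_text: str) -> str:
    """Replace every named closing tag with the short form </>."""
    return re.sub(r"</[^>]+>", "</>", xml_text)


def saving(xml_text: str) -> float:
    """Fraction of characters saved by using anonymous closing tags."""
    short = anonymize_closers(xml_text)
    return 1 - len(short) / len(xml_text)
```

    Calling saving(FULL) gives the saving for this one document only; the figure depends entirely on how long the tag names are relative to the data they wrap.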
  • mikko 2007-10-14 15:11
    real_aardvark:
    Andy Goth:
    Jenny Simonds:
    The "<>" construct seems dangerous and useless, but anonymous closing tags are an obvious and needed improvement. I can't think of any good reason not to add it to the XML spec. As for readability - well, we all do just fine with programming languages that lack "endif", "endwhile", "endfor", "endswitch", "endfunction", "endstruct", "endclass", etc. Don't we?
    You forgot "endunion"! The automobile industry, at least, would benefit from it.

    Shareholders would. Managers on bloated bonus schemes would. There is a small but arguable case to say that customers would.

    It wouldn't do a lot for the 1 million people directly employed by vehicle manufacturers in the US, though, or for the 5 million employed in supporting industries.

    On the John Stuart Mill principle, I think you're wrong here.

    There are structural problems with the auto industry that have nothing at all to do with unionisation. I'm not a huge fan of unions: they work particularly poorly in areas like the software world. But I'm afraid that in production-line work, they're pretty much essential.


    <sarcasm> You are absolutely right </sarcasm>
    The only thing that unions do is keep the incompetent employed. Well, not the only thing, but the biggest one...

    Just remember the UAW motto:

    "If it is poorly made of inferior materials, outrageously
    overpriced, and fails to fulfill its designed function,
    IT IS UNION MADE IN AMERICA!"
  • j6cubic 2007-10-14 15:47
    There are valid reasons for XML-like dialects. There just aren't many of them.

    I use a pseudo-XML format that looks like this: <tag="value" />
    However, I'm smart enough to use it only for data entry (which requires writing lots of very simple XML by hand), not to try to store data in the format and, above all, never to inflict the format on anyone else.

    People can bastardize XML all they want, as long as they don't make anyone else use their "new and improved" version.
  • real_aardvark 2007-10-14 17:46
    mikko:
    <sarcasm> You are absolutely right </sarcasm>
    The only thing that unions do is keep the incompetent employed. Well, not the only thing, but the biggest one...

    I'm sorry to say that you're wrong here. Sarcasm is the lowest form of wit and, well, that's just not very witty, is it? Try:

    <bigoted cretin>You are absolutely right</bigoted cretin>

    There, that feels better.

    What are the other things, btw?
  • Eternal Density 2007-10-14 19:18
    <>There is no quack</>
  • Sin Tax 2007-10-14 19:26
    real_aardvark:


    Actually, I don't see what's wrong with ASN.1 as an alternative to XML (obviously not as an alternative to HTML, which is a different question). ASN.1 is self-defining ("look, Ma, no schemas!" Well, practically none, anyway) and scales exceptionally well.


    ASN.1 is not self-defining. It is defined in ISO-8824, and it is not a language in which you write data, possibly together with its specification (like SGML and XML), but a grammar (syntax) notation that you use to define how to parse binary data, together with the associated encoding rules.

    That's the problem with having different (broken) mark-up languages founded on a common (very broken) base. (They look the same, so they must be the same, right?) Also the problem with the twerpy in-band control sequences. I'm surprised it doesn't happen more often.


    No, this can happen with pure XML as well. A colleague at a former workplace once WTFed me with his webservice, which was supposed to receive XML data. Instead of doing what I suppose would be "right", namely declaring that the data inside the call should be of some schema, he simply passed the XML as a string - naturally requiring full encoding. IMO this is extremely inefficient, as you effectively have to parse the same data twice. He was very smart, and insisted this was indeed the right way to do it with XML. Either he was wrong, or XML is broken, or most likely, both.
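
    A minimal Python sketch of the difference, using the standard xml.etree.ElementTree module. The element names (call, data, order, item) are invented for illustration and have nothing to do with the actual webservice. The first style stuffs the serialized payload in as text, so it is fully escaped on the wire and the receiver has to parse twice; the second nests the payload as real elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload the caller wants to send.
payload = ET.Element("order")
ET.SubElement(payload, "item").text = "widget"

# Style 1: the payload is serialized and passed as a string,
# so every < and > in it is escaped inside the envelope.
envelope_str = ET.Element("call")
ET.SubElement(envelope_str, "data").text = ET.tostring(payload, encoding="unicode")
wire_str = ET.tostring(envelope_str, encoding="unicode")

# Style 2: the payload is nested as ordinary child elements.
envelope_nested = ET.Element("call")
ET.SubElement(envelope_nested, "data").append(payload)
wire_nested = ET.tostring(envelope_nested, encoding="unicode")


def read_string_style(wire: str) -> str:
    outer = ET.fromstring(wire)                      # first parse: the envelope
    inner = ET.fromstring(outer.find("data").text)   # second parse: the payload
    return inner.find("item").text


def read_nested_style(wire: str) -> str:
    outer = ET.fromstring(wire)                      # single parse
    return outer.find("data/order/item").text
```

    Both readers recover the same value, but the string style carries a longer, escaped message and needs the extra parse.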

    At least SGML has a different notation (DTD) for defining the "domain" language. I never understood why the XML people wanted so badly to get rid of DTDs and replace them with XML Schema. XML is good, so everything must be XML, right? Wrong. I haven't been interested much, but I think there have been other attempts at making a replacement for DTDs, RELAX NG etc., and I doubt this mess will ever get cleared up. That sure is a real WTF.

    Oh, in case you didn't know/remember: Before WWW, we had WAIS, which was based on Z39.50 using ASN.1. It worked just great. If TBL had built from there instead of making a crude protocol and even cruder markup language, the world would have been better off.

    -Sin Tax
  • Nefarious Wheel 2007-10-14 19:35
    You had a hand-crank? Sheer luuxury, mate. We had to kick-start our computers from the left side and the CDC disk drives were always leaking hydraulic fluid. Used to be a merry chase to herd them back into line Monday mornings when the oil got under the little rubber feet during a library compress. You young whippersnappers have got it sooooo easy!

    Oh, and have you heard of the object-oriented Cobol compiler? It's called "Add One to COBOL" and the system clock runs backwards.

    The above rant is all your fault for making me think of 80 column US Form 5081 punch cards with XML tags on them. Rat!
  • Mr Steve 2007-10-14 23:03
    <boobies>My face</boobies>
  • Javabeutel 2007-10-15 03:45
    <srcsm><srcsm>great</srcsm></srcsm>
  • Rhialto 2007-10-15 06:04
    mikko:

    Just remember the UAW motto:

    "If it is poorly made of inferior materials, outrageously
    overpriced, and fails to fulfill its designed function,
    IT IS UNION MADE IN AMERICA!"

    You have some strange unions in America. Really.
  • Bobco 2007-10-15 07:01
    FredSaw:
    brazzy:
    "The schemas are just considered documentation and are not binding. The actual binding specification of the communication format are these Excel sheets."
    Terms like "justifiable homicide" spring to mind.


    Ha, you are clearly unaware of another easter egg in current versions of Microsoft Excel (besides the well-known flight simulator). Press Ctrl-S, Ctrl-O, Ctrl-A and Ctrl-P at the same time on Bill Gates' birthday and Excel transforms into an enterprise-ready, fully fail-over, next-generation, industry-leading ESB/SOAP Server/DWH/BAM/BPM tool that will read all the specs that can be reached from your machine, create and compile source code in any programming language ever invented and deploy it on that old 386 in the corner, serving the entire enterprise. Part of this process is the automatic binding of any Excel sheet into valid XSDs, which in turn are used to generate XML binding code.

    The only thing that is missing is .Net 4.0 support, but that is being worked on. The only thing I have to figure out is Bill Gates' birthday...
  • Mirvnillith 2007-10-15 08:48
    Having had some indirect contact with the database engine in question (I pity those at my company that had direct contact!) nothing surprises me. I mean, would you consider fall-out from a WTF to be a WTF in itself?
  • Pingmaster 2007-10-15 09:14
    Andy Goth:
    Jenny Simonds:
    The "<>" construct seems dangerous and useless, but anonymous closing tags are an obvious and needed improvement. I can't think of any good reason not to add it to the XML spec. As for readability - well, we all do just fine with programming languages that lack "endif", "endwhile", "endfor", "endswitch", "endfunction", "endstruct", "endclass", etc. Don't we?
    You forgot "endunion"! The automobile industry, at least, would benefit from it.

    Don't you have a custom header file in all of your programs with the following:

    #define endif }
    #define endwhile }
    #define endfor }
    #define endswitch }
    #define endfunction }
    #define endclass }

    I mean, really, I can't write a program without it..


    BTW: the 5-character tag names might have been true(ish)... consider this: Senior Developer Paula has to write some program that utilizes XML in some way. The program breaks for some reason or another (maybe errors related to string length?) until, coincidentally, Paula shortens the tags in the XML file to <5 chars while at the same time fixing the actual error. When the program then runs successfully, Paula decides (without checking the W3C spec, of course) that all XML tag names must be less than 5 hars in length. Being the Senior Developer, she makes this company policy.
  • Cloak 2007-10-15 10:31
    phaedrus:
    LoL:
    Is it just me or is the real WTF some of these comments?


    You must also be new here. TRWTF(tm) is always the comments. That's why you read them.


    Well said, phaedrus. And don't turn around, Lot!
  • mrData 2007-10-15 10:33
    This reminds me of a project I was on where I needed some master data for some customers and some products, and I asked for a file with test data - and I got a text file with one line saying: "testdata"
  • Cloak 2007-10-15 10:43
    Andy Goth:
    I have my answer. Nobody ever bothers to change the subject line. Probably nobody even read it! :^)


    Sure, we did. It only took a while...
  • NotanEnglishMajor 2007-10-15 14:32
    Pingmaster:
    Andy Goth:
    Jenny Simonds:
    The "<>" construct seems dangerous and useless, but anonymous closing tags are an obvious and needed improvement. I can't think of any good reason not to add it to the XML spec. As for readability - well, we all do just fine with programming languages that lack "endif", "endwhile", "endfor", "endswitch", "endfunction", "endstruct", "endclass", etc. Don't we?
    You forgot "endunion"! The automobile industry, at least, would benefit from it.

    Don't you have a custom header file in all of your programs with the following:

    #define endif }
    #define endwhile }
    #define endfor }
    #define endswitch }
    #define endfunction }
    #define endclass }

    I mean, really, I can't write a program without it..


    BTW: the 5-character tag names might have been true(ish)... consider this: Senior Developer Paula has to write some program that utilizes XML in some way. The program breaks for some reason or another (maybe errors related to string length?) until, coincidentally, Paula shortens the tags in the XML file to <5 chars while at the same time fixing the actual error. When the program then runs successfully, Paula decides (without checking the W3C spec, of course) that all XML tag names must be less than 5 hars in length. Being the Senior Developer, she makes this company policy.


    How long is a har?
  • LightningDragon 2007-10-15 14:52
    Zygo:
    Soul-Grinding Madness Looms:
    Just one of many, many SGML WTFs, which seems to be designed to frustrate automatic processing as much as possible. Why they ever picked this to base HTML on...


    Because the available alternatives at the time were SGML, ASN.1, Gopher's hypertext format, or something from Microsoft.

    If the timing had played out a little differently, web browsers would have shipped with BSD and web pages would probably end up written in m4 or something. <shiver>

    I read that SGML was picked because it was *already* used for documentation systems at the time, and Tim Berners-Lee was trying to get support for the Web as a documentation system (he was working at CERN, which didn't have one at the time). As a result, it inherited many of SGML's WTFs, which didn't get ironed out until XHTML was created.
  • George 2007-10-15 16:36
    you need to get laid.
  • Dot For Now 2007-10-15 17:27
    On five letter limits: I thought that wasn't true. You still see lots of code where coders apparently think it's more scientific to leave vowls out of rndom placs in idntifers. Just like in the earlier versions of Fortran.

    1957 wants its compiler limitations back.
  • ronabop 2007-10-16 02:00
    Hookay.

    I guess we don't have a lot of readers who speak 4D (a real product, BTW).

    Let's say you have a RAD tool where db table names are changed on the fly, and you wanted to separate table (and column) labels out, so that what a user *sees* as a table label is actually pointing to something totally different. Thus, on Monday, a table can be named "foo", and on Tuesday, it can be renamed "bar", *without breaking anything at all*.

    This might lead to a dev making a mapping table, where "foo" can be a name one day, "bar" can be the name another day, but both names will point to a table (or column) named "_xb1x". (five chars, in ASCII, is a hella lot of tables/fields, BTW... if you exhaust that, your DB design is "wrong and stupid").

    A side effect of this is that when devs want a "raw dump" of the data, they get "_xb1x", rather than "foo" or "bar", but are totally ignorant of the *why*.

    As far as the lack of phenotype, that's a simple lack of understanding abstraction in RAD tools.

    Of course, the zen of programming is hard to explain to the unenlightened, which is why people, uhm, do insane things like make "very important names" for db tables and columns, and possibly why a simple abstraction layer shows up in the WTF pages.

    All that being said, I do think this kind of db abstraction sucks, just because it makes life hell on newbie programmers when working on raw data with no ideas on how it all maps out.
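
    To make the idea concrete, here is a minimal Python sketch of such a label-to-internal-name mapping. It is not how 4D actually does it; the class name, the method names and the "_x" plus three hex digits naming scheme are all assumptions made up for this example.

```python
class TableLabels:
    """Maps user-visible table labels to fixed five-character internal names."""

    def __init__(self):
        self._internal_by_label = {}
        self._next_id = 0

    def create(self, label: str) -> str:
        """Register a new label and hand back its permanent internal name."""
        if label in self._internal_by_label:
            raise ValueError(f"label already in use: {label}")
        if self._next_id > 0xFFF:
            raise OverflowError("out of internal names in this toy scheme")
        # Hypothetical scheme: "_x" followed by three hex digits, five chars total.
        internal = f"_x{self._next_id:03x}"
        self._next_id += 1
        self._internal_by_label[label] = internal
        return internal

    def rename(self, old_label: str, new_label: str) -> None:
        """Change what the user sees; the internal name stays the same."""
        if new_label in self._internal_by_label:
            raise ValueError(f"label already in use: {new_label}")
        self._internal_by_label[new_label] = self._internal_by_label.pop(old_label)

    def resolve(self, label: str) -> str:
        """Look up the internal name that a visible label points to."""
        return self._internal_by_label[label]
```

    Create "foo" on Monday, rename it to "bar" on Tuesday, and resolve("bar") still returns the same internal name, which is also exactly what shows up in a raw dump.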
  • DarkSprout 2007-10-16 07:45
    <Badger><Badger><Badger><Snake Sing="Mushroom" /></Badger></Badger></Badger>
  • Daniel Colascione 2007-10-16 20:14
    Your comment makes no sense.

    So what if there's an abstraction layer? Abstracting something that doesn't need abstraction usually qualifies for a WTF in itself. In this database, you have an internal name, on one hand, and a mapped name that can be used for (I imagine) queries and such on the other. What if the mapped name changes? You're back where you started, only now you have a bunch of incomprehensible internal table names as well as the higher-level table name that you need to modify.

    Why not simply give your tables decent names to start with?
  • IMSoP 2007-10-18 04:45
    I think what a lot of people here are forgetting is that HTML wasn't primarily used for storing structured data - the <head> maybe, but the <body> is basically one block of formatted text ("HyperText").

    It's not actually all that meaningful to consider something like the following as a data structure:
    This sentence contains <em>emphasis</em> and <a name="here" href="elsewhere">an anchor</a>; which is nice.

    There are no "nodes" in that data stream, only formatting markup.

    So all this talk of ASN.1, EDI, etc is completely missing the point. Only later did people start using HTML for layout (using things like <table>s); and then, later still, to denote abstract structure (using things like <div>s) and pushing the visual layout into a different layer (CSS).

    So, sure, stricter rules on things like tag closing and nesting might have made things easier further down the line. But make it too strict and machine-oriented, and no-one would have got round to writing any browsers - or content!
  • argh! 2007-10-23 19:24

    in order to meet the well known XML limitation of only allowing 5 characters per tag name.

    If that's true, how exactly is XHTML valid XML?!

    textarea
    select
    option
    ..
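
    A quick way to check this for yourself, sketched in Python with the standard xml.etree.ElementTree parser. The little form document is invented for the example; it simply uses the element names listed above.

```python
import xml.etree.ElementTree as ET

# Made-up document using XHTML element names longer than five characters.
doc = "<form><textarea>hi</textarea><select><option>a</option></select></form>"

# Raises xml.etree.ElementTree.ParseError if the document is not well-formed.
root = ET.fromstring(doc)

# Collect every element name in document order, then keep the long ones.
names = [element.tag for element in root.iter()]
long_names = [name for name in names if len(name) > 5]
```

    The parser accepts the document without complaint, and long_names ends up holding the three names that would have broken the supposed limit.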
  • Murkish 2008-09-28 13:29
    Excuse me, but how exactly can a series of tubes help in software development?


    Every programmer is technically a series of tubes...