• LCrawford (unregistered)

    At least they used JSON (which was probably the buzzword of the year) and not tortured XML.

    By the way, happy Frouth of July.

  • dusoft (unregistered)

    URLs with spaces. Proper punkers. Oi!

  • Kythyria (unregistered) in reply to LCrawford

    If you believe Lisp fanboys, XML is "just" verbose sexprs. JSON sans objects is rather more equivalent to sexprs (as long as you don't distinguish atoms and strings), therefore transitively, this is XML.

    Of course, the equivalence between XML and sexprs is very dubious: so far as I've ever been able to tell, a mapping scheme which translates arbitrary XML (after entity expansion) into sexprs and back is not one which turns arbitrary sexprs into XML and back. So lisp fanboys would also be a WTF.
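    The "JSON sans objects" half of that correspondence is easy to sketch. The snippet below is only an illustration under the assumption stated above (atoms and strings are not distinguished); `to_sexpr` is a made-up helper name, not part of any library.

```python
import json

def to_sexpr(value):
    """Render a JSON value built from arrays and scalars as an s-expression."""
    if isinstance(value, list):
        return "(" + " ".join(to_sexpr(v) for v in value) + ")"
    if isinstance(value, dict):
        # Objects are exactly the part with no direct counterpart here.
        raise TypeError("JSON objects are outside this mapping")
    # Atoms and strings are deliberately not distinguished.
    return str(value)

doc = json.loads('["define", ["square", "x"], ["*", "x", "x"]]')
sexpr = to_sexpr(doc)
print(sexpr)  # (define (square x) (* x x))
```

    Going the other way (sexprs back to JSON) would need a decision about which atoms become strings, which is where the "equivalence" starts to get lossy.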

  • Randal L. Schwartz (github)

    JSON's mistake of not permitting comments and trailing commas makes it horrible for humans to edit, although it's used like that every day.
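    As a quick check of that claim, here is a small sketch using Python's standard `json` module, which follows the strict grammar. The helper name `is_valid_json` and the sample documents are made up for illustration.

```python
import json

def is_valid_json(text):
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

plain = '{"mass": 1.53, "position": [0, 0, 10]}'
trailing_comma = '{"mass": 1.53, "position": [0, 0, 10],}'
with_comment = '{"mass": 1.53 /* kilograms */}'

results = {
    "plain": is_valid_json(plain),
    "trailing_comma": is_valid_json(trailing_comma),
    "with_comment": is_valid_json(with_comment),
}
print(results)
```

    Only the first document is accepted; the other two are rejected even though a human editor would consider them harmless.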

    XML fails for humans to edit because the closing tag always has to be matched up.

    sexprs at least didn't require verbose closing tags, but the last line of your file was always '))))))))))))'.

    YAML is sorta OK, except it's expensive to parse because of those labeled references. At least it permits a relaxed JSON when you don't feel like using indentation as a syntax.

    I can't wait for the AI people to let us start writing config in natural language. Won't that be wonderful?

  • (nodebb)

    It sounds like the primary metric is not SLOC, but MTBF: Mean Time Between Failures.

  • Kythyria (unregistered) in reply to Randal L. Schwartz

    XML requiring matching close tags is a WTF on its own: it's the one thing that prevents the grammar from being context free (contrary to popular belief, "possibly misnested XML" is a regular language regardless of this).

    This would be less of a WTF if XML wasn't a subset of SGML, which allows a syntax that is simply </> to close the current element. I have no idea why this didn't make it into XML, and my only hypothesis is that SGML's own facility for defining what shorthands are allowed doesn't have a specific switch to allow just empty close tags without also enabling empty open tags or something.

    Empty open tags are a bit less helpful: do they open another copy of the currently open element, or of the most recently closed element? Still not the most WTF bit of SGML, which is an excellent demonstration of the principle that "easy to generate manually" and "easy to parse mechanically" are very hard indeed to reconcile.

  • (nodebb) in reply to Randal L. Schwartz

    On the YAML front, I don't like thinking of it as JSON-compatible: the whitespace is significant, just like in Python. The only JSON that's YAML-compatible is one that's been pretty-printed with just the right settings, and the only YAML that's JSON-compatible is one using just the right delimiters. Whether it's written by humans or converted by machines, that JSON-YAML compatibility is awfully fragile.

  • (nodebb)

    I just looked up YAML reference out of curiosity and it's one more proof that software design on drugs is a persistent problem.

    Happy independence day, fellow Americans!!

  • (nodebb) in reply to Randal L. Schwartz

    "Programming in natural language" is older than I am. It's called COBOL, and I'm in no hurry to return to it.

  • (nodebb) in reply to Randal L. Schwartz

    JSON's mistake of not permitting comments and trailing commas makes it horrible for humans to edit, although it's used like that every day.

    I agree on trailing commas, but on comments I am somewhat split. I can perfectly see situations where a consumer rejects JSON with extra keys, so people start adding custom metadata in comments. But yes, the argument is weak, and I'd prefer having them.

  • (nodebb)

    I don't particularly like XML myself. But the closing tags are really not what I'd complain about. They are verbose, but that's it.

    What I'd rather complain about is the enormous complexity, which results in everyone implementing the same concepts completely differently. Though I can see some value even there.

    Example: You want to represent a list of input values, let's say a mass and a position. In JSON, you'd probably come up with a substructure along the lines of

    "properties": {
        "mass": 1.53,
        "position": [0, 0, 10]
    }
    

    In XML, at this level this already starts with multiple questions: Attributes or text inside nodes? In our own config files I've seen both equivalents of

    <mass value="1.53"/>
    

    and

    <mass>1.53</mass>
    

    Things get weirder for the array, because XML has no natural way to represent this. I've seen all of

    <position value="0 0 10"/>     (1)
    
    <position>0 0 10</position>    (2)
    
    <position>                     (3)
       <x>0</x>
       <y>0</y>
       <z>10</z>
    </position>
    

    and I suppose there are many more possible variations, as people try to balance the verbosity and need for conventions of (3) against the lack of structure in (1) and (2).
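    To make the cost of those variations concrete, here is a rough sketch of a reader that has to cope with all three layouts, using Python's `xml.etree.ElementTree`. The function name `read_position` is hypothetical and error handling is left out.

```python
import xml.etree.ElementTree as ET

def read_position(element):
    """Return [x, y, z] from any of the three <position> layouts."""
    if "value" in element.attrib:
        # Layout (1): space-separated numbers in an attribute.
        return [float(v) for v in element.attrib["value"].split()]
    if len(element) == 0:
        # Layout (2): space-separated numbers as element text.
        return [float(v) for v in (element.text or "").split()]
    # Layout (3): one child element per coordinate.
    return [float(element.findtext(axis)) for axis in ("x", "y", "z")]

variants = [
    '<position value="0 0 10"/>',
    '<position>0 0 10</position>',
    '<position><x>0</x><y>0</y><z>10</z></position>',
]
positions = [read_position(ET.fromstring(v)) for v in variants]
print(positions)
```

    All three inputs come out as the same list, but only because the reader encodes three separate conventions that the XML itself does not express.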

    Now a new requirement comes along: You have so far been representing lengths in meters and masses in kilograms, but some manager or customer is adamant about the need to represent different unit systems in the XML file. Or at the very least, the XML file should actually specify the units. For XML the extension paths are reasonably easy using attributes.

    <mass value="1.53" units="kg"/>
    <mass units="kg">1.53</mass>
    

    For the position, same concept. These input files are now also compatible with previous versions of the software, as long as the units correspond to the previous convention, as the previous version will just ignore the added attributes.
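    A minimal sketch of that compatibility argument, assuming the attribute-based layout. The reader functions and the `TO_KG` table are invented for illustration and are not taken from any real code base.

```python
import xml.etree.ElementTree as ET

def read_mass_v1(element):
    """Older reader: knows only 'value' and assumes kilograms."""
    return float(element.get("value"))

TO_KG = {"kg": 1.0, "g": 0.001}  # hypothetical conversion table

def read_mass_v2(element):
    """Newer reader: honours 'units', defaulting to the old convention."""
    return float(element.get("value")) * TO_KG[element.get("units", "kg")]

old_file = ET.fromstring('<mass value="1.53"/>')
new_file = ET.fromstring('<mass value="1.53" units="kg"/>')
grams_file = ET.fromstring('<mass value="1530" units="g"/>')

print(read_mass_v1(new_file))    # old reader ignores the extra attribute
print(read_mass_v2(old_file))    # new reader falls back to kilograms
print(read_mass_v1(grams_file))  # old reader silently misreads grams
```

    The last call shows the caveat: the old reader still accepts the file, but the number is only right when the units match the previous convention.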

  • (nodebb)

    (continued)

    So... how do you do the same in JSON?

    "mass_units": "kg",
    "mass": 1.53,
    

    is just awful at first glance, but might be OK for solving the "human readable and backwards compatible" aspect, if so required given the alternatives. A separation

    "unit_system_info": {
        "mass": "kg",
        "length": "meters"
    },
    ...
    "properties": {
        "mass": 1.53,
        "position": [0, 0, 10]
    }
    

    would work, but still be awkward. It is also not robust against mixed units (e.g. millimeters and meters) in the same file, which the XML version is, and it awkwardly separates pieces of information that a human reader needs together.

    A change to

    "mass": {"value": 1.53, "units": "kg"},
    "position": {"value": [0, 0, 10], "units": "meters"}
    

    lacks the backwards-compatibility property, and makes the data less human-readable due to its verbosity. A form

    "mass": [1.53, "kg"]
    

    would look natural to a human reader, but be weirdly error-prone from a parsing perspective, and lead to weird cases like

    "position": [[0, 0, 10], "meters"]
    

    which start looking very weird under pretty-printing

    "position": [
        [0, 0, 10],
        "meters"
    ],
    

    which again fails the human-readability aspect.
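    For the `{"value": ..., "units": ...}` form, a newer reader can at least keep accepting old files. The sketch below assumes that layout; `read_quantity` and its default-units parameter are hypothetical names.

```python
import json

def read_quantity(raw, default_units):
    """Accept either a bare value or a {"value": ..., "units": ...} object."""
    if isinstance(raw, dict):
        return raw["value"], raw.get("units", default_units)
    return raw, default_units

old_doc = json.loads('{"mass": 1.53, "position": [0, 0, 10]}')
new_doc = json.loads(
    '{"mass": {"value": 1.53, "units": "kg"},'
    ' "position": {"value": [0, 0, 10], "units": "meters"}}'
)

old_mass = read_quantity(old_doc["mass"], "kg")
new_position = read_quantity(new_doc["position"], "meters")
print(old_mass, new_position)
```

    This only helps in one direction: an old reader that expects a bare number still breaks on the new files, which is the compatibility property the XML attributes kept.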

    TL;DR: Whatever there is to complain about in the capabilities of common data representation formats, whether they need a closing tag or not is really not high on my list of concerns.

  • LZ79LRU (unregistered)

    Honestly I like verbose closing tags for the simple reason that they make parsing by both humans and machines easier. Consider the nesting example R3D3 posted, only with 5 or 50 levels instead of one: suddenly, having named closing tags makes things a lot easier to figure out at a glance.

    Verbosity isn't always bad. Too much information can be discarded mentally, too little can't be created.

  • (nodebb) in reply to R3D3

    On the subject of comments in JSON, the fact that JSON was designed as a wire protocol between processes, and not as some sort of file format, is all by itself enough to explain why it doesn't have comments. It was never supposed to sit around to be edited by humans. But yeah, adding comments to JSON would just encourage people to put semantically-significant stuff in those comments.

  • (nodebb) in reply to LZ79LRU

    Verbosity isn't always bad. Too much information can be discarded mentally, too little can't be created.

    Mixed bag on that. While I agree that XML can make it easier to navigate large files, the verbosity is quite frequently so extreme, that it severely hurts readability.

    I'm glad that our code-base went for the

    <mass value="1.53" units="kilograms"/>
    

    method, as I find

    <mass units="kilograms">1.53</mass>
    

    to be awful for human-readability and I often need to edit these files manually for development purposes.

  • (nodebb)

    Out of curiosity: What happens in various formats with "duplicate keys"?

    JSON:

    "properties": {
        "mass": 1.53, 
        "mass": 1.52
    },
    

    XML:

    <properties>
        <mass value="1.53"/>
        <mass value="1.52"/>
    </properties>
    

    In JSON, it would be obvious that something is going wrong. From what I've been reading, real-world parsers may not treat it as an error, but silently use the first or last value, or even try to preserve the duplicate key somehow in anticipation of possibly ill-formed input. But at least it is clear that there is an issue.
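    As one data point, Python's standard `json` module silently keeps the last value, and its `object_pairs_hook` parameter can be used to turn duplicates into an error. The hook name `reject_duplicates` below is made up; the rest is a sketch, not a recommendation.

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook that raises on repeated keys within one object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen

text = '{"properties": {"mass": 1.53, "mass": 1.52}}'

lenient = json.loads(text)
print(lenient["properties"]["mass"])  # 1.52, the last value wins

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
print(strict_error)  # duplicate key: 'mass'
```

    Other parsers may pick the first value or behave differently again, so the strict hook is the safer choice when the input is hand-edited.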

    With XML, it is not clear at all. In our own code base it has gone the very weird way of just ignoring such inputs entirely... So the two would be the same to our software:

    (a) <properties><mass value="1.53"/><mass value="1.52"/></properties>
    (b) <properties/>
    
  • SG (unregistered) in reply to R3D3

    In XML, they're not duplicate keys unless there's a schema which says that the 'mass' element cannot be repeated. It's perfectly valid XML as-is, and indeed, that's how you'd represent an array structure that allows multiple values. So it's up to the parser... it might reject it as non-conformant with the schema, or it might be fine because it does conform to the schema, or a non-validating parser might do just about anything...
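    With a non-validating parser such as Python's `xml.etree.ElementTree`, for instance, the outcome depends entirely on which call the application makes. This is a small sketch of that, not a statement about any particular code base.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<properties><mass value="1.53"/><mass value="1.52"/></properties>'
)

# Treating repeated elements as a list keeps both values.
all_masses = [float(m.get("value")) for m in root.findall("mass")]

# Asking for a single element quietly returns the first match.
first_mass = float(root.find("mass").get("value"))

print(all_masses)  # [1.53, 1.52]
print(first_mass)  # 1.53
```

    Neither call reports a problem, so any "duplicates are an error" rule has to come from a schema or from the application itself.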

  • (nodebb) in reply to R3D3

    With XML, it is not clear at all.

    Maybe. If you have a validating parser and a correct description of the schema, the double <mass ...> might be treated as invalid or as what amounts to a two-valued parameter (effectively an array of masses).

    So the two would be the same to our software:

    I would hope that it treats <properties><mass value="1.53"/><mass value="1.52"/></properties> as being equivalent to <properties></properties> because of the subtle difference between <properties></properties> and <properties/>. (As I understand it, the first contains an empty text element, while the second contains no text element.)
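    Whether a given parser exposes that difference is easy to test. As far as I can tell, the XML 1.0 specification describes both forms as representations of an empty element, and at least Python's `xml.etree.ElementTree` does not distinguish them, as this sketch shows; other parsers and APIs may behave differently.

```python
import xml.etree.ElementTree as ET

expanded = ET.fromstring("<properties></properties>")
collapsed = ET.fromstring("<properties/>")

# Compare what the parser exposes: text content, child count, serialization.
same_text = expanded.text == collapsed.text
same_children = len(expanded) == len(collapsed)
same_output = ET.tostring(expanded) == ET.tostring(collapsed)

print(expanded.text, collapsed.text)
print(same_text, same_children, same_output)
```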


Leave a comment on “Classic WTF: The Contractor”
