Admin
At least they used JSON (which was probably the buzzword of the year) and not tortured XML.
By the way, happy Frouth of July.
Admin
URLs with spaces. Proper punkers. Oi!
Admin
If you believe Lisp fanboys, XML is "just" verbose sexprs. JSON sans objects is rather more equivalent to sexprs (as long as you don't distinguish atoms and strings), therefore transitively, this is XML.
Of course, the equivalence between XML and sexprs is very dubious: so far as I've ever been able to tell, a mapping scheme which translates arbitrary XML (after entity expansion) into sexprs and back is not one which turns arbitrary sexprs into XML and back. So lisp fanboys would also be a WTF.
Admin
JSON's mistake of not permitting comments and trailing commas makes it horrible for humans to edit, although it's used like that every day.
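For what it's worth, here is a small sketch of how strict a typical parser is about this, using Python's standard `json` module. The sample documents and the helper name `is_valid_json` are made up for illustration.

```python
import json

def is_valid_json(text: str) -> bool:
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Hypothetical config fragments, invented for this example.
plain = '{"mass": 1.53, "position": [0.0, 1.0, 2.0]}'
trailing_comma = '{"mass": 1.53, "position": [0.0, 1.0, 2.0],}'
with_comment = '{"mass": 1.53 /* kilograms */}'

print(is_valid_json(plain))           # accepted
print(is_valid_json(trailing_comma))  # rejected: trailing comma
print(is_valid_json(with_comment))    # rejected: comments are not JSON
```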
XML fails for humans to edit because the closing tag always has to be matched up.
sexprs at least didn't require verbose closing tags, but the last line of your file was always '))))))))))))'.
YAML is sorta OK, except it's expensive to parse because of those labeled references. At least it permits a relaxed JSON when you don't feel like using indentation as a syntax.
I can't wait for the AI people to let us start writing config in natural language. Won't that be wonderful?
Admin
It sounds like the primary metric is not SLOC, but MTBF: Mean Time Between Failures.
Admin
XML requiring matching close tags is a WTF on its own: it's the one thing that prevents the grammar from being context free (contrary to popular belief, "possibly misnested XML" is a regular language regardless of this).
This would be less of a WTF if XML wasn't a subset of SGML, which allows a syntax that is simply </> to close the current element. I have no idea why this didn't make it into XML, and my only hypothesis is that SGML's own facility for defining what shorthands are allowed doesn't have a specific switch to allow just empty close tags without also enabling empty open tags or something.
Empty open tags are a bit less helpful: do they open another copy of the currently open element, or of the most recently closed element? Still not the most WTF bit of SGML, which is an excellent demonstration of the principle that "easy to generate manually" and "easy to parse mechanically" are very hard indeed to reconcile.
Admin
On the YAML front, I don't like thinking of it as JSON-compatible: the whitespace is significant, just like in Python. The only JSON that's YAML-compatible is one that's been pretty-printed with just the right settings, and the only YAML that's JSON-compatible is one using just the right delimiters. Whether it's written by humans or converted by machines, that JSON-YAML compatibility is awfully fragile.
Admin
I just looked up YAML reference out of curiosity and it's one more proof that software design on drugs is a persistent problem.
Happy independence day, fellow Americans!!
Admin
"Programming in natural language" is older than I am. It's called COBOL, and I'm in no hurry to return to it.
Admin
I agree on trailing commas, but on comments I am somewhat split. I can perfectly see situations where a consumer rejects JSON with extra keys, so people start adding custom metadata in comments. But yes, the argument is weak, and I'd prefer having them.
Admin
I don't particularly like XML myself. But the closing tags are really not what I'd complain about. They are verbose, but that's it.
What I'd rather complain about is the enormous complexity, which results in everyone implementing the same concepts completely differently. Though I can see some value even there.
Example: You want to represent a list of input values, let's say a mass and a position. In JSON, you'd probably come up with a substructure along the lines of
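Something like the following, say. The key names and numbers are assumptions made up for illustration, and Python is used only to show that the structure survives a round trip.

```python
import json

# Hypothetical input record: key names and values are invented for illustration.
inputs = {
    "mass": 1.53,
    "position": [0.0, 1.0, 2.0],
}

text = json.dumps(inputs, indent=2)  # serialize to pretty-printed JSON text
print(text)

restored = json.loads(text)          # parse it back into Python objects
```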
In XML, at this level this already starts with multiple questions: Attributes or text inside nodes? In our own config files I've seen both equivalents of
and
Things get weirder for the array, because XML has no natural way to represent this. I've seen all of
and I suppose there are many more possible variations, as people try to balance the verbosity and need for conventions of (3) against the lack of structure in (1) and (2).
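To make the variety concrete, here is a rough sketch using Python's `xml.etree.ElementTree`. The numbering follows (1) to (3) above, but the concrete layouts, element names, and attribute names are guesses for illustration, not the actual files.

```python
import xml.etree.ElementTree as ET

# Scalar value: attribute vs. text inside the node (both layouts hypothetical).
mass_attr = ET.fromstring('<mass value="1.53"/>')
mass_text = ET.fromstring('<mass>1.53</mass>')

def read_mass(elem):
    """Accept either layout: prefer the attribute, fall back to the text."""
    raw = elem.get("value")
    if raw is None:
        raw = elem.text
    return float(raw)

# (1) whitespace-separated text, (2) comma-separated attribute,
# (3) one child element per entry. All three are assumed layouts.
pos1 = ET.fromstring('<position>0.0 1.0 2.0</position>')
pos2 = ET.fromstring('<position value="0.0,1.0,2.0"/>')
pos3 = ET.fromstring(
    '<position><value>0.0</value><value>1.0</value><value>2.0</value></position>'
)

def read_v1(elem):
    return [float(tok) for tok in elem.text.split()]

def read_v2(elem):
    return [float(tok) for tok in elem.get("value").split(",")]

def read_v3(elem):
    return [float(child.text) for child in elem.findall("value")]
```

Each layout needs its own reader, and (1) and (2) push the list structure into string splitting that XML itself knows nothing about.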
Now a new requirement comes along: You have so far been representing lengths in meters and masses in kilograms, but some manager or customer is adamant about the need to represent different unit systems in the XML file. Or at the very least, the XML file should actually specify the units. For XML the extension paths are reasonably easy using attributes.
For the position, same concept. These input files are now also compatible with previous versions of the software, as long as the units correspond to the previous convention, as the previous version will just ignore the added attributes.
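A minimal sketch of that compatibility argument, assuming an attribute named `unit` and a made-up conversion table. The two reader functions are hypothetical stand-ins for the old and new versions of the software.

```python
import xml.etree.ElementTree as ET

# Hypothetical conversion factors to the old implicit unit (kilograms).
TO_KG = {"kg": 1.0, "g": 1e-3, "lb": 0.45359237}

def read_mass_v1(elem):
    """Old reader: knows nothing about units and assumes kilograms."""
    return float(elem.get("value"))

def read_mass_v2(elem):
    """New reader: honours an optional unit attribute, defaulting to kg."""
    unit = elem.get("unit", "kg")
    return float(elem.get("value")) * TO_KG[unit]

old_file = ET.fromstring('<mass value="1.53"/>')
new_file = ET.fromstring('<mass value="1.53" unit="kg"/>')
grams = ET.fromstring('<mass value="1530" unit="g"/>')

print(read_mass_v1(new_file))  # old reader ignores the extra attribute
print(read_mass_v2(old_file))  # new reader falls back to kg
print(read_mass_v1(grams))     # old reader silently misreads non-kg input
```

The last line is the caveat from above: the old reader only stays correct as long as the file sticks to the previous unit convention.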
Admin
(continued)
So... how do you do the same in JSON?
is just awful at first glance, but might be OK for solving the "human readable and backwards compatible" aspect, if so required given the alternatives. A separation
would work, but still be awkward. It is also not robust for allowing mixed units (e.g. millimeters and meters) in the file, which the XML version is, and it separates the pieces of needed information awkwardly from a human-reader perspective.
A change to
lacks the backwards-compatibility property, and makes the data less human-readable due to its verbosity. A form
would look natural to a human reader, but be weirdly error-prone from a parsing perspective, and lead to weird cases like
which start looking very weird under pretty-printing
which again fails the human-readability aspect.
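To illustrate two of these trade-offs, here is a sketch with two assumed layouts: a nested `{"value", "unit"}` object and a quantity string such as `"1.53 kg"`. The layouts, the regular expression, and all function names are hypothetical, not taken from any real code base.

```python
import json
import re

# Layout A (assumed): nested object. Easy to parse, but verbose.
doc_a = json.loads('{"mass": {"value": 1.53, "unit": "kg"}}')
# Layout B (assumed): quantity strings. Reads well, but needs its own parser.
doc_b = json.loads('{"mass": "1.53 kg", "position": ["0.0 m", "1.0 m", "2000 mm"]}')

def read_mass_old(doc):
    """Pre-units reader: expects a bare number under "mass"."""
    return float(doc["mass"])

def old_reader_accepts(doc):
    """Check whether the old reader copes with a document."""
    try:
        read_mass_old(doc)
        return True
    except (TypeError, ValueError):
        return False

# A number, optional whitespace, then a unit made of letters.
QUANTITY = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*([A-Za-z]+)\s*$")

def parse_quantity(text):
    """Split a string like "1.53 kg" into (1.53, "kg")."""
    match = QUANTITY.match(text)
    if match is None:
        raise ValueError(f"not a quantity: {text!r}")
    return float(match.group(1)), match.group(2)

mass = parse_quantity(doc_b["mass"])
position = [parse_quantity(p) for p in doc_b["position"]]

# Pretty-printing puts every array entry, unit included, on its own line.
print(json.dumps(doc_b, indent=2))
```

Neither layout is readable by the old reader, and the string form moves part of the parsing out of JSON and into a hand-written pattern.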
TL;DR: Whatever there is to complain about in the capabilities of common data representation methods, whether one needs a closing tag or not is really not high on my list of concerns.
Admin
Honestly I like verbose closing tags for the simple reason that they make parsing by both humans and machines easier. Consider the nesting example R3D3 posted, only instead of one level you have 5 or 50, and suddenly having named closing tags makes things a lot easier to figure out at a glance.
Verbosity isn't always bad. Too much information can be discarded mentally, too little can't be created.
Admin
On the subject of comments in JSON, the fact that JSON was designed as a wire protocol between processes, and not as some sort of file format, is all by itself enough to explain why it doesn't have comments. It was never supposed to sit around to be edited by humans. But yeah, adding comments to JSON would just encourage people to put semantically-significant stuff in those comments.
Admin
Mixed bag on that. While I agree that XML can make it easier to navigate large files, the verbosity is quite frequently so extreme, that it severely hurts readability.
I'm glad that our code-base went for the
method, as I find
to be awful for human-readability and I often need to edit these files manually for development purposes.
Admin
Out of curiosity: What happens in various formats with "duplicate keys"?
JSON:
XML:
In JSON, it would be obvious that something is going wrong. From what I've been reading, real-world parsers may not treat it as an error, but silently use the first or last value, or even try to preserve the duplicate key somehow in anticipation of possibly ill-formed input. But at least it is clear that there is an issue.
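As one data point, Python's standard `json` module silently keeps the last value, but it can be asked to hand over the raw key/value pairs so that duplicates can be rejected. The sample document and the hook name `reject_duplicates` below are made up for illustration.

```python
import json

# Hypothetical document with a repeated key.
text = '{"mass": 1.53, "mass": 1.52}'

# Default behaviour: no error, the last value silently wins.
silent = json.loads(text)
print(silent)

def reject_duplicates(pairs):
    """object_pairs_hook that refuses objects with repeated keys."""
    keys = [key for key, _ in pairs]
    dupes = {key for key in keys if keys.count(key) > 1}
    if dupes:
        raise ValueError(f"duplicate keys: {sorted(dupes)}")
    return dict(pairs)

try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```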
With XML, it is not clear at all. In our own code base it has gone the very weird way of just ignoring such inputs entirely... So the two would be the same to our software:
Admin
In XML, they're not duplicate keys unless there's a schema which says that the 'mass' element cannot be repeated. It's perfectly valid XML as-is, and indeed, that's how you'd represent an array structure that allows multiple values. So it's up to the parser... it might reject it as non-conformant with the schema, or it might be fine because it does conform to the schema, or a non-validating parser might do just about anything...
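A quick sketch of that "it's up to the consumer" point, using the non-validating parser in Python's `xml.etree.ElementTree`. The sample document reuses the `properties`/`mass` names from this thread; what a given application should do with the result is left open.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<properties><mass value="1.53"/><mass value="1.52"/></properties>'
)

# find() returns only the first matching child ...
first = root.find("mass").get("value")

# ... while findall() returns every match, i.e. the "array" reading.
all_values = [m.get("value") for m in root.findall("mass")]

print(first)
print(all_values)
```

The parser itself raises no complaint either way, so whether this is a duplicate or a list is decided entirely by the code that reads the tree.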
Admin
Maybe. If you have a validating parser and a correct description of the schema, the double <mass ...> might be treated as invalid or as what amounts to a two-valued parameter (effectively an array of masses).
I would hope that it treats
<properties><mass value="1.53"/><mass value="1.52"/></properties>
as being equivalent to <properties></properties> because of the subtle difference between <properties></properties> and <properties/>. (As I understand it, the first contains an empty text element, while the second contains no text element.)
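Whether that difference is visible depends on the parser. As a sketch of one case only, Python's `xml.etree.ElementTree` reports no text for either form and serializes both the same way. Other parsers or APIs may well behave differently.

```python
import xml.etree.ElementTree as ET

with_tags = ET.fromstring("<properties></properties>")
self_closed = ET.fromstring("<properties/>")

# Neither element carries any text in this parser's tree.
print(with_tags.text, self_closed.text)

# Both also serialize to identical bytes.
same_output = ET.tostring(with_tags) == ET.tostring(self_closed)
print(same_output)
```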