Admin
I love the new word coinage. Blaggarant. It's good!
Admin
Ha Ha Ha Ha Ha Ha.
Admin
Any chance of someone posting what a valid XML solution might look like? I am XML ignorant, and though I get some of the complaints with this snippet, it would be nice to have a better solution to compare to.
Admin
It's not a lie to call that XML. That's perfectly formed XML that will pass any legal sniff test.
Admin
/me raises hand... sheepishly....
Admin
It should have looked like this:
Admin
"Why don't you use XPATH" is not a **** (dumb?) question. Just as much as "Why don't you use a SQL query?". In many cases it makes no sense to use XML if you're not taking advantage of XSLT, XPATH, etc. If you care about performance so much, don't use XML.
Sometimes coming across like an ***hat is better than coming across as an ignorant ***hat, I'll give you that.
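For anyone wondering what "use XPATH" buys you in practice, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset. The sample document, the second user, and the `names_in_office` helper are all made up for illustration; they are not from the original system.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, modelled on the <people>/<user> layout in this thread.
SAMPLE = """
<people>
  <user><name>Jack Wade</name><officeID>214</officeID></user>
  <user><name>Jane Roe</name><officeID>305</officeID></user>
</people>
"""

def names_in_office(xml_text, office_id):
    """Return the names of all users whose <officeID> text equals office_id."""
    root = ET.fromstring(xml_text)
    # The [tag='text'] predicate matches elements by the text of a child element.
    users = root.findall(f"./user[officeID='{office_id}']")
    return [user.findtext("name") for user in users]

print(names_in_office(SAMPLE, "214"))
```

The query replaces a hand-written loop over every record, which is roughly the XML counterpart of writing a SQL `WHERE` clause instead of filtering rows yourself.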
Admin
I've had almost the exact opposite. One of our contracting groups had a timekeeping system that output CSV, but the last column in the CSV was a full XML document containing all of the data in that row, sans the XML column of course.
Well before I got there, the timekeeping system we have could already parse XML, yet two geniuses got together when they picked up the contractors and decided on CSV as the exchange medium.
What was idiotic about the whole thing: the contractors' timekeeping system was written to output XML first, and was altered to output CSV.
Admin
Not that horribly bad; maybe it could be transformed with an XSLT into a more friendly format :)
Admin
Here's an example of what they could have done. Pretend that I left the options tag part intact, as that's actually not that bad.
<people>
<user>
<name>Jack Wade</name>
<officeID>214</officeID>
<startingDate format="mm/dd/yyyy">01/02/2003</startingDate>
<personnelID>111012</personnelID>
<DOB format="mm/dd/yyyy">07/04/1975</DOB>
<sex>Male</sex>
<lastModified format="mm/dd/yyyy">02/11/2006</lastModified>
</user>
... other users go here
</people>
Bear in mind that XML is very freeform. A good XML primer is at W3Schools ( http://www.w3schools.com/xml/default.asp ).
The idea is that with XML, the data pretty much describes itself. You can look at an XML document and know exactly what the hell is going on, as long as the tags are named descriptively enough to understand. What these guys did instead is take a CSV file, throw it in, and tack on a few "descriptors" in another node. Basically, they took a CSV file and wrapped XML around it so that they could say they were using "XML".
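To make the "describes itself" point concrete, here is a rough sketch of a consumer reading one such record by name with Python's standard library. The trimmed-down `RECORD`, the `FORMATS` table, and the `read_user` helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical record, cut down from the <user> example above.
RECORD = """
<user>
  <name>Jack Wade</name>
  <officeID>214</officeID>
  <DOB format="mm/dd/yyyy">07/04/1975</DOB>
</user>
"""

# Assumed mapping from the format attribute's notation to strptime directives.
FORMATS = {"mm/dd/yyyy": "%m/%d/%Y"}

def read_user(xml_text):
    """Look fields up by tag name; their order in the file does not matter."""
    user = ET.fromstring(xml_text)
    dob = user.find("DOB")
    return {
        "name": user.findtext("name"),
        "office_id": int(user.findtext("officeID")),
        # The element says how its own date is written, so nothing is guessed.
        "dob": datetime.strptime(dob.text, FORMATS[dob.get("format")]).date(),
    }

print(read_user(RECORD))
```

Reordering the child elements or adding new ones leaves this reader untouched, which is exactly what the CSV-wrapped-in-XML approach throws away.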
Admin
Let me tell you a story. I work for a public authority of the Saarland (a state of Germany). We used to exchange data with our superior authority in Berlin once a quarter. This was a mostly painless process; we used CSV and fixed-length EDI files. Then, two years ago, some bighead decided it would be chic and politically desirable (authorities like to look modern, you know) to replace the established process with a whole new toolset and, of course, XML.
Result:
Worst of all: I am in charge of that crap.
Holli
Admin
I think you're missing the point. This was a third party. If you're using XML for extra-company communications, you should use XML in its expected format. It's one thing to WTF in intra-company programs, but WTFing in the exposed portions of your product, the parts that clients are supposed to use, is another thing entirely.
Admin
I bet their database has comma-delimited fields in it too!
Admin
XML is markup, not code.
Admin
Bravo. Justifying design decisions by appealing to performance. The last resort of the incompetent. Also known as: "turbo-might manure-ver".
I'll have to remember that the next time I'm in over my head. I should also remember to consider this possibility the next time I hear someone use that excuse on me. The bluff can be called by asking "what performance benchmarks have you run in a production environment to justify this decision?"
I'm impressed that you had the guts to try this and even more impressed that you got away with it. Well done.
Admin
That's too painfully familiar. Try exporting a defect report from Rational ClearQuest and you get something like that. Except it's so bad Excel can't even import it. 'Scuse me while I go cry again.
Admin
The correct answer to "what performance benchmarks have you run in a production environment to justify this decision?" is:
I am sorry, but I implemented the benchmark in dot net 2.0 and the license for dot net 2.0 prevents me from giving you any benchmark data. (And then they say there is no use for stupid licenses :} )
Admin
I'll have to remember that. Just make up a new word (like preformance) and people won't dare admit they don't know what it means. Brillant!
Admin
OMG, that is the best signature ever. Brings me back to the days of watching Dune while drinking vast amounts of psilocybin tea...
Yeah, totally off topic, but bad XML is more the norm than good signatures...
Admin
Unless speed isn't an issue; then XML may be better because you don't have to write any new code to parse it (unless, on the other hand, you already have libraries for csv / fixed width). And if you need to validate the data, you can add a schema (although it will be even slower).
On balance, I think XML gets a bad rap on this forum.
Admin
I believe they totally understood XML. However, I also believe they already had an existing infrastructure for CSV files, and they didn't want to change that infrastructure. Thus, they married a (quite reasonable) XML header to an (already existing) CSV parser, and the problem was solved in time for lunch. Actually doing XML schema parsing would take a long time, and thus reduce the profit on the sale. It's the Enterprise way, from what I'm led to understand!
Admin
<hee hee/>
I once inherited a project that some LAWYER wrote <wtf #1 /> that stored an XML file in a database *FIELD* <wtf #2 /> which led to "hey, how can I run a report on that XML file that's in that field?" <wtf #3 />
Needless to say, a rewrite fixed this problem.
Admin
Tags are for things there are many of. Attributes are for things there are one (or zero) of. Thus, you'd probably just want something like:
<people dateFormat='mm/dd/yyyy'>
<user name='Jack Wade' officeID='214' startingDate='01/02/2003' personnelID='111012' DOB='07/04/1975' sex='M' lastModified='02/11/2006' />
</people>
Admin
True. I think XML is a good option if you have to save nested data structures in non-binary files. Of course once your data structures reach a certain complexity or your XML files grow too large, you may be better off inventing your own file format or using a database system.
Admin
This is no WTF, this is magnificent! Obviously some passive-aggressive developer had enough of his pointy-haired boss insisting on XML, and gave him this to shut him up.
At least, that's what I choose to believe.
Admin
I have come across situations where storing XML in a database field makes sense. A record recycle system that I worked on several years ago held records that had been *deleted* from the system as XML in a database field. This removed the need to have hundreds of specific recycled record tables. So it's not quite as big a WTF as you would make it out to be. Of course there is no excuse for lawyers writing software.
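A rough sketch of that recycle-bin idea, using an in-memory SQLite database. The `recycle_bin` table, its columns, and both helper functions are invented for this example and say nothing about how the real system was built.

```python
import sqlite3
import xml.etree.ElementTree as ET

def record_to_xml(table, row):
    """Serialize one deleted row (a dict) to an XML string.

    Assumes every column name is also a valid XML element name.
    """
    root = ET.Element("record", {"table": table})
    for column, value in row.items():
        ET.SubElement(root, column).text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_record(xml_text):
    """Recover (table name, row dict); all values come back as strings."""
    root = ET.fromstring(xml_text)
    return root.get("table"), {child.tag: child.text for child in root}

# One generic table holds deleted rows from any source table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recycle_bin (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute(
    "INSERT INTO recycle_bin (payload) VALUES (?)",
    (record_to_xml("users", {"name": "Jack Wade", "officeID": 214}),),
)
payload = conn.execute("SELECT payload FROM recycle_bin").fetchone()[0]
restored = xml_to_record(payload)
print(restored)
```

The trade-off is the one raised earlier in the thread: restoring a record is easy, but reporting on what is inside the payload column is not.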
Admin
It's already a serialized format. That's actually all plists are: serialized objects. The binary format takes up less space on disk and is faster to process, which is why it's the default format for programs now; however, the file format difference is completely transparent to any well-behaved program.
As far as Human Readable goes, there's a command line tool called plutil which allows conversion from the binary format to the XML format and back (e.g. plutil -convert xml1 myfile.plist). There's also a plist editor that comes with the OS X developer tools.
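The "transparent to any well-behaved program" part can be sketched with Python's standard `plistlib` module, used here in place of the OS X tools. The `settings` dictionary is made-up sample data.

```python
import plistlib

# Hypothetical object to serialize.
settings = {"name": "Jack Wade", "officeID": 214}

# The same object written in both plist flavours.
binary_blob = plistlib.dumps(settings, fmt=plistlib.FMT_BINARY)
xml_blob = plistlib.dumps(settings, fmt=plistlib.FMT_XML)

# loads() detects the format on its own when fmt is not given,
# so the reading side never needs to know which one it received.
from_binary = plistlib.loads(binary_blob)
from_xml = plistlib.loads(xml_blob)
print(from_binary == from_xml == settings)
```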
Admin
The trouble(*) is that the "quote problem" has multiple solutions, each of which fails on a different subset of inputs. TSV's failure set (values containing tabs) is considerably smaller.
* Troubleshooter (n.) - Someone who finds trouble and shoots it.
Admin
How is that FizzbinSQL project coming along, anyhow?
Admin
This behavior on requirements goes both ways though. Did the development group ever contact their customers to let them know: "Hey, this is how we're expecting to fulfill your requirement, are you OK with this design?"
Honestly, would you build a brand new house costing $100k+ and never once inspect the design plans, how the construction was going, and whether they were meeting your expectations? Would half the builders just build a piece of shit and cover it up with drywall and paint if you never were involved in the build process? You bet your butt they would!
Yes, it's a nice WTF, but there've been better ones posted on this board than this one. I blame the Business and the 3rd Party software vendor on this one. Just typical, bad project management though - and there's a lot of poor leadership of projects, processes, and entire businesses going 'round these days...
Admin
I used to think so, but was shown wrong. Attributes are by strong convention to be used only for metadata. Reference this excellent resource.
Admin
Point taken. I guess I should have said THIS XML is markup, not code. So I stand by the OP's statement.
Admin
Wait till he finds out about the "tab problem" in TSV.
Admin
Thanks to people like you, we have this site that entertains me daily...!
I'm pretty sure you still produce those WTF moments daily.. why? Because it's too difficult for you to recognize that you don't know something... it's not much different from cheating...
Admin
When you have a library, like http://ostermiller.org/utils/CSV.html , parsing CSV is very simple. The library takes care of quoting, double quotes, escapes...
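The same point can be sketched with Python's built-in `csv` module, swapped in here for the Java library linked above. The `round_trip` helper and the `awkward` sample row are made up for illustration.

```python
import csv
import io

def round_trip(rows, delimiter=","):
    """Write rows with the csv module, then read them straight back."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerows(rows)
    encoded = buffer.getvalue()
    # newline="" leaves line endings untouched, so quoted newlines survive.
    reader = csv.reader(io.StringIO(encoded, newline=""), delimiter=delimiter)
    return encoded, list(reader)

# One row containing quotes, a comma, a tab, and a line break.
awkward = [["Jack Wade", 'said "hi", then left', "tab\there", "two\nlines"]]

csv_text, csv_rows = round_trip(awkward)
tsv_text, tsv_rows = round_trip(awkward, delimiter="\t")
print(csv_rows == awkward, tsv_rows == awkward)
```

The writer quotes any field that contains the delimiter, a quote character, or a line break, so the same code covers the tab-separated case as long as both ends agree on the dialect.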
Admin
Still.. markup... that marks-up code :-)
Admin
Maybe, but what we tend to do is give them what they'll pay for, which is often neither what they want nor what any sane person would need. ;)
That's why the spec writing is done after the analysis phase, and is updated in an iterative process driven by intelligent people who're cooperating to build a useful product.
In theory.
In reality, either or both sides are held down by idiots, and office politics means that the average customer is focussed on building their own office empire, rather than meeting the real needs of the poor bastards who're actually going to be using the software. Isn't life fun?
The guy with the chequebook is the customer. The guy in a cubicle with a workstation is the one who needs to get data from assorted locations. Meeting the needs of the user often doesn't satisfy the desires of the guy who won't personally use the damn thing.
Doesn't matter if it's software or selecting a catering company - the VP dines at a 5-star restaurant, and doesn't care if the replaceable employees get fed stale tuna every day, no matter how high the catering company broker's personal standards may be.
The real WTF is the idea that leaving the oceans was a good idea. ;)
Admin
The quote problem is a fundamental design flaw - you've got two separate standards for escaping things and the edge cases caused by this complexity mean that there are lots of incompatible csv standards, so you may as well chuck it all and start over.
I propose the following standard:
1. pipe separated values, one row per line
2. all escaping is done with \, so \r, \n, \p (pipe) and \" behave as expected
3. any line starting with | is a meta line. It can describe column headings or author info or whatever you like. Not much defined here
The problem, as always, is getting people to agree on the same thing.
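A minimal sketch of rules 1 and 2 of the proposal, to show how little code it needs. Two things here are my own assumptions: a literal backslash is written as `\\` (the proposal doesn't say), and meta lines (rule 3) are left out. Note that a row whose first field is empty would start with `|` and look like a meta line, so that rule would need tightening.

```python
# Escape table: raw character -> escaped form.
ESCAPES = {"\\": "\\\\", "|": "\\p", "\r": "\\r", "\n": "\\n", '"': '\\"'}
# Reverse table: character following the backslash -> raw character.
UNESCAPES = {"\\": "\\", "p": "|", "r": "\r", "n": "\n", '"': '"'}

def encode_row(fields):
    """Join fields with pipes, escaping every special character."""
    return "|".join(
        "".join(ESCAPES.get(ch, ch) for ch in field) for field in fields
    )

def decode_row(line):
    """Split one encoded line back into its raw fields."""
    fields, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == "\\":
            code = next(chars, None)  # the character after the backslash
            if code not in UNESCAPES:
                raise ValueError(f"bad escape: \\{code}")
            current.append(UNESCAPES[code])
        elif ch == "|":  # an unescaped pipe ends the current field
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields

row = ["a|b", 'say "hi"', "two\nlines", "back\\slash"]
print(encode_row(row))
print(decode_row(encode_row(row)) == row)
```

Because every line break inside a field is escaped, an encoded row never spans more than one physical line, which is what makes "one row per line" hold.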
Admin
Which are used as serialisation format.
A hint for people who want to preserve their sanity: Never EVER look at the native OmniOutliner 2 files; only look at the exported "XML" files, which at least attempt to be sane. (No idea if 3.x has cleaned up its act.)
Basically the files are plists. With a bazillion pieces of gunk you don't need unless you are interested, in intimate detail, in how OmniOutliner physically stores its objects. With text in the outline in RTF format. Mysterious Binary Blobs. And the whole thing wrapped in XML.
Admin
No.
captcha: genius
Admin
If Rob failed to notice these things, how could he have related them to anybody else?
-------------------------
"Are you asleep?"
"Yes."
Admin
You aren't looking hard enough...
Captcha: Quality
Admin
Uhm, it's parsable XML, but it is not legal XML.
And yeah, it's pretty bad.
The only point of the XML is to define the "fields" (and map them to something). But what "fields" are you defining?
As far as I can tell, they've tied the fields to the ORDER in which the field tags are listed (that is, the first field tag is the first column in the CSV data, etc.). And that defeats the purpose of using XML.
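The snippet being discussed isn't reproduced in this thread, so the `<export>`/`<fields>`/`<data>` layout below is an invented stand-in, as are the sample rows and the `rows_by_position` helper. It only sketches what "tied to the ORDER" means for whoever has to read such a file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV-in-XML document: field names in one node, raw CSV in another.
DOC = """
<export>
  <fields>
    <field name="name"/>
    <field name="officeID"/>
  </fields>
  <data>Jack Wade,214
Jane Roe,305</data>
</export>
"""

def rows_by_position(xml_text):
    """Pair the Nth <field> tag with the Nth CSV column, purely by position."""
    root = ET.fromstring(xml_text)
    names = [field.get("name") for field in root.iter("field")]
    reader = csv.reader(io.StringIO(root.findtext("data").strip()))
    return [dict(zip(names, row)) for row in reader]

print(rows_by_position(DOC))
```

Nothing in the data itself says which value is which, so listing the two `<field>` tags in the other order would silently label "Jack Wade" as an office ID.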
Admin
How does that solve anything? Are you saying I'm not allowed to have tabs in my fields?
Admin
the story isn't complete. did you actually use xpath in your application? i hope so.