- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Notice mysql (5.1) now has ExtractValue() and UpdateXML() to performan xpath queries and updates on xml documents stored in text fields. Can't wait to see some of the gems this will inevitably result in...
Personally, I think storing xml in a relational database is a very bad idea, unless that database is specifically designed to handle xml (ie, Oracle's XML DB ). I reckon it probably comes down to developers being too lazy to design a proper schema for their tables - whatever the reason, it leaves a freakin mess behind for those attempting to maintain the damn thing.
Admin
Actually, they'll probably use that XQuery bullshit, the wonderful Tamino database already does it (native XML database for the win, XQuery or XQL requests (with heaps of XPath goodness... the parts that've been implemented in the DB engine i mean), runs slower than molasses and overall defeats the very purpose of a database).
But at least you can run XSLs on the raw data you retrieve from the DB without having to do any transformation!
Oooh, another guy who's had the (dubious) priviledge of "using" Tamino, I'm not alone!
Admin
Uh, no. Nice sarcasm & spelling, though.
Are you going to let your users specify new schema elements, and are you going to design a front end app to do so? I thought not.
By "extensible", I'm guessing they mean extensible at run time, ad hoc, by the users. I've worked on a number of systems where this was a requirement and it's always very difficult. The best solution I found was to create an abstract db schema. This is difficult, but has the advantages of leaving the data in native, non-compound db attributes (unlike today's example).
Admin
I'm actually working on fixing something very similiar.
XML doesn't hurt applications. Developers hurt applications.
Admin
I inherited a project that has a database that stores most of the data in xml strings. Aside from the pain of extracting data, another WTF is that the storage required for a piece of data like UsersMiddleInitial, which could be captured in a char(1) column, you store it as <UsersMiddleInitial>J</UsersMiddleInitial> in each and every column.
Probably an outsourcing nightmare like the system I deal with.
Admin
This is absolutely evil. My guess is either management kept changing the requirements, or the programmers never realized what they needed to do, and in previous projects, they felt like they were always mucking with changing the database table structures all the time to handle design refinements.
Maybe they could make classes "fully extensible" by just using hashtables of strings and function pointers...it seems like "fully extensible" is another way of saying "won't have to adhere to pesky advanced plannning" and such.
Btw, how do I apply to become a hardware vendor for this company? I foresee brisk sales of large load balancing database server arrays.
Admin
Thats...not really that bad...
Seriously, I've done far worse things in SQL and I've known the language less than 3 months
Try UNIONing 3 statements each 6 lines long, reading from tables with different column names for the same data and each statement requiring 2 SELECTs in their FROM list because the unindexed lookup table otherwise gets overly linked and sends the execution time to hell... That statement only reached 3 SELECTs deep(the third was so that I could order the whole lot) and I found it relatively easy to maintain weeks after I had written it.
If you think SQL is bad, try dealing with VB... The only reason my statement was so long was because it'd be agonizing to bring the data back and do all the transformation on the local computer in Excel for Applications... -.-'
Admin
Oh no, you've just WTF'd Slashdot with that link!
Admin
Once again we are presented evidence of people who should simply not be in our business.
This one is off the charts, though.
Admin
That's just about the opposite of what a schemaless database provides. With any relational database, there needs to be some commonality (other than "housekeeping metadata") between entries/rows/records. Not every record may produce entries in every table, but when the data are truly amorphous, one may need essentially one table per record/entry/document. Yes, potentially thousands or millions of tables in a single database. In a schemaless database, only the metadata is guaranteed. All data, though, remain retrievable/searchable by field, even if that field may exist in only one or two entries of several million. Again, it's horses for courses.
Admin
Ok, here's my beef. Some people here are saying, "XML is great when used for the right purposes"...maybe so. Although for basic data, why not use something more lightweight, like JSON? XML is really bloated, and the model of having attributes and a body doesn't really map all that well to an object model. JSON does.
For markup of text, XML is great. And hey, wait a minute, that's what it was designed for! Extensible MARKUP language. Not extensible all-purpose data language. MARKUP. It's like HTML, only you can define your own tags. Who would think that HTML is a good format for sending or storing raw data? But that's what the world is doing now.
So repeat after me: XML is for MARKUP. Not for structured data. Something like JSON does structured data much better...more readably, more compactly, more logically, easier to parse.
Admin
Heh. There was a website* that was posted on both the frontpages of Slashdot and digg.com today. Poor bastard never had a chance...
* <FONT color=#808080>Aint I a stinker? Of course, I don't think WTF's effect will be nearly as bad...</FONT>
Admin
Oh, that's right. <FONT size=1>SMALL</FONT> means <FONT size=6>VERY, VERY FRIGGIN' BIG!!!</FONT>
Admin
That's ok, I hit him a +1 myself.
-dave-
Admin
I've seen worse than this. I've got a WTF that I've got to submit in the near future [as soon as I can get enough of the business logic that is owned by my company out, it's been OK'ed] where the guy invented his own format to put inside a column. I can't decide if the best part about it is that he actually had some of the columns to use in the same table or that he does SQL queries parsing the data, or if it's that the whole thing fails if you have a tilde out of place...
Admin
When I first read that word, I saw "shemales", heehee!
Admin
Holy shit. This is possibly the worst thing I have ever seen. I congratulate the programmers on their pure evilness.
Admin
Yeah... Or Lua... Heck, even the PHP serialization format is readable enough and compact enough. As they say, there are only solutions...
The real trouble is, you can't search/join on a column containing that. Not from SQL. Not if you want something resembling decent performance. But then again, SQL is even more misunderstood and misused that XML...
Admin
this reminds me of a query i wrote some time ago. i'm rather morbidly proud of it despite its complexity: 7 left joins on a well-designed database for event listings. interestingly, my query managed to properly locate, sort, and post-process a list of the week's upcoming eventsand arranged them appropriately for publication in a local newspaper.
it only required a couple lines of PHP to split the data by section and sort those arbitrarily, but the data, as produced, was perfect for the paper's needs.
i wish i could find the final query i came up with... it was a doozy weighing in at over 8,000 bytes.
Admin
That's like taking all the good parts of both SQL and XML and combining them... and then throwing them out of the window and make do with what's left over.
Admin
Working on something similar here.
I'm trying to find out why one of the linux machines on our network is so sluggish in the morning, and OK for the rest of the day. Turns out that box was selected to run a particular cron job at 3am, which imports new data from a certain source into the database. Turns out that its method for doing this is to get the files (a numbered set of a few hundred CSV files), slurp them all into memory, sort them, write them out to a numbered set of XML files, then spawn another process which reads in the XML files, checks them for validity and combines them into one large file. It then reads in and parses this file, sorts it several times to check for duplicates on several different unique fields (rewriting and re-parsing the file on disk each time), before finally breaking the data down into a big set of parallel arrays, which it loops over to insert the data into the SQL database.
Result: Transferring ~5000 rows of 31 fields from CSV files to an SQL table takes 99% CPU and ~250MB of ram, for about 10-12 hours every morning. Aaaaargh!
Stripping out all the XML crap reduces this execution time to just over 6 minutes (most of which is spent waiting for the files to arrive)
Admin
Why not store all of the critical, permanent fields in one table and secure it in the rdbms to prevent any modification. Then create another related table to store the 'optional' columns, and build a nice, safe data layer component to control addition of new columns to this secondary table?
It would then be possible to set the right data type & size for each additional column, and use simple sql syntax to retreive data, rather than parse all of that XML.
Just a thought....any feedback? Am I setting up a future WTF with this idea???
Admin
<FONT face=Verdana size=2>I think this thing is somewhere between Commerce Server ("The database design will be fully extensible") and BizTalk "the rest of the data will be stored in an XML-formatted TEXT column" [:D] [:D]</FONT>
<FONT face=Verdana size=2>pf</FONT>
Admin
What I also love is the way that performance will slowly degrade as more data is added (because the database hasn't got a cat in hell's chance of indexing this monstrosity), by which time presumably the app will be live or near-live and re-engineering it to do things properly is going to be an enormous task. Mazeltov !
Admin
Now if only I didn't have an existing system in production for several clients that did just that :(
I work for one of the world's larger ISP/CPs - and I was involved in writing a system that provides pdfs that are produced on the fly using XSLT for a few million end users. For reasons of data integrity (the PDFs can't change) we decided to store the XML directly, rather than generating it from potentially malleable tables.
We're now migrating everything, for different, but still slightly insane, business reasons to storing the PDFs in the database. Even more joy.
darjien
Admin
My eyes! The googles, they do nothing!
Admin
You're a fag.
And what would be this mystical "properly use" of XML that you are so keen on using?
Fucking poser.
Ass-monkey.
Admin
I made an application for a guy that needs to send a few reports - basically just some weirdly laid-out
s - to the authorities every few days. He used to do it all by hand. I store the data he needs to fill in in an XML file, then use XSLT to actually generate the HTML reports (used to be Microsoft .DOC) and these get e-mailed to the authorities. As an added bonus, he saves minutes of satellite time because HTML is so much smaller than Word.INI files would have been underpowered, SQL would have been overkill. I'd say this is an example of proper use of XML.
Admin
As the coder (internally or as a consultant), your presence is the extensibility. Mo' money, mo' money.
Admin
Sleepycat have a native XML database. Oracle have just bought Sleepycat. So perhaps we'll see.
Admin
And what would be this mystical "properly use" of XML that you are so keen on using?
Persisting complex objects in a text format, either for local storage or transmission to another system? Import/Export to/from databases? Displaying data on a web page (using XSLT)? AJAX? Event logging? (Yes, you read that last one right - just try setting up a CI environment with tools that don't support XML logging, and you'll see what I mean)
Perhaps there are better ways of doing it, but XML's power comes from the fact that it is widely accepted and, despite the anti-xml pseudo-religious cult that seems to be popular around here, it works. It does the job well enough that I don't see the need to force everyone to change to yet another format, even if that other format is actually better. XML came along when the only alternatives were CSV, which simply doesn't do the job if the data is anything other than a single table.
Someone mentioned JSON. I've never heard of JSON before now, and I doubt many other people have either. So why should I spend a lot of time implementing JSON into my application as, say an export format, when no-one can import it?
Admin
This is all so shameless
Admin
At the last place I worked their 'developers' (management had outsourced to el-cheapo workers in the USA, who in turn outsourced to cheaper workers in bulgaria) had done pretty much exactly that. Their app was both client side java swing and server side java running in some borland 'enterprise' server with ms-sql behind it. on the client side they use a custom persistance scheme to map the data to xml and then write that into an embedded database. on the server they wrote a subtly different xml to ms-sql.
and they complained it was too slow.
but the best / worst thing was they were not doing any of their db stuff wrapped in any sort of transactions and so during a typical step such as:
save meta-data save image generate image thumbnail save thumb
if the generate thumb step broke (and it did cos they hadn't worked out how to use JAI properly) the thumbs were never saved and the app seemed to be losing people's pictures.
sorry i seemed to veer off topic a bit there but yeah - xml as a longtext in a sql table is just stupid.
Admin
Actually, like most everything else, it all depends.
This example clearly shows that the XML consists of data that should be stored in the database tables, since it is being joined and searched on and the like. But the if the XML content is truly just a single value that is being stored, then it is no different than storing any other text in a database column.
The key is, are you putting a square peg in a round hole (as in this example), or do you simply need to store a snippet of XML for a set of entities in your data? The act of storing some text in an XML format in your database doesn't really break any rules, but trying to search, sort, or join to it using SQL is what makes this a WTF.
Admin
As a patron member of said cult, I should note that this is the *only* thing that gives XML its power. I should also note that "works" is a pretty low standard - I could base a data-exchange language on Microsoft Word, and that would still work.
There have been better alternatives to XML long before XML came around. Let us remember that XML post-dates HTML; it is, after all based on HTML (get the comment on MS Word above, they're both display documents?). Revisionist historians will tell you that "oh no, XML is a branch of SGML" -- but seriously folks, look at HTML. Then look at XML. Then look at SGL. It's pretty clear who's a child of whom.
Admin
Is it possible they aren't using SQL Server 2000?
Admin
This is another unfortunate case of people who aren't comfortable with DB's or are "developers" but don't know databases well.
As another poster stated, this is more modern "flat-file" way of "structuring" your database.
I'm going to bet too, that since the SQL is so attrocious, that the deadlines were blown and other developers resorted to selecting as much of the table as required and processing it in local code as opposed to just getting what they needed via SQL.
When are people going to learn?
Admin
As a patron member of said cult, I should note that this is the *only* thing that gives XML its power.
Indeed, I agree with that. But you can't argue that that power exists, it exists for a reason, and that at the moment there's nothing else that has that property. Ok, maybe CSV, but that can't handle anything more complex than a single table.
I should also note that "works" is a pretty low standard - I could base a data-exchange language on Microsoft Word, and that would still work.
Not without giving up a lot of functionality - just being plain text has a lot going for it, you know. I agree, it is a fairly low standard, and no-one's claiming that XML's perfect, or that if it was all done again, it might be done differently.
There have been better alternatives to XML long before XML came around. Let us remember that XML post-dates HTML; it is, after all based on HTML (get the comment on MS Word above, they're both display documents?).
Not quite... XML shares syntactical style with HTML, but that does not mean it's a direct descendant, more of a younger sibling.
Name some superior alternatives to XML that were around before XML came on the scene, and we can tell you why we're using XML rather than those alternatives. Even if that's just because no-one knew about the alternative (see my comment on JSON)Admin
no
Fully defined and specified special-purpose (DSL) markup language, SGML application
A metalanguage or a general purpose markup language used to create special-purpose language
A metalanguage or a general purpose markup language used to create special-purpose language
XML is not a child, it's a giant step.
Admin
Wow, I'm not even a database guy and this still makes my eyes bleed!
Admin
For you it's a fucking joke and I actually had to fight a consequences of someone hearing similar joke and deciding it's a great idea... No, really. Having some DBA expertise, I almost had a heart attack on the spot. The scariest part - it was advocated by two guys calling themselves architects. Yes, just to confirm, I'm talking about saving serialized objects to the SQL Server ntext column. Luckily, I won that fight, but I don't know what's next they're gonna come up with.
Admin
Sheesh. . .
CSV was a good idea before PKZip came along.
XML could not survive without something like PKZip.
XML has served to create a need for humungous disk drives. . . lots of memory for parsers to "work". For providing a standardizable way of presenting data for printing i.e., Markup repeat Markup. . . I haven't seen the use for it. I've been trying to climb the mountain to get the message. . . but sorry, I'll just keep going to a different church.
Admin
Goog god! XML is painful enough if you do it the right way.
Admin
Yeah, a giant step off the edge of the f*ckin world
Admin
I think the way to do this "properly" is to write a SQL stored procedure that uses sp_OACreate to make an OLE Automation object write and read flat files directly from the filesystem. That would greatly improve the extensibility of the system by removing all the "obstacles" presented by the RDBMS, such as columns, tables, views and indexes. Pure freedom to store the data as any format required by the end user. Heck, you could just give them the tools to write their own binary files. Ultimate freedom, ultimate extensibility. They could just type 000001101001001001111100110. Or use XML if they want. And you would not need to use that old fashioned SQL language much at all. Just pass in all the parameters to the stored proc, and off you go.
Plus it'd be really fast, by avoiding all the overhead caused by the database objects.
(Kidding. Sign me up to fix it afterward, though. Seems like there could be a lot of money in that, if you could avoid killing yourself halfway through.)
Admin
Uh, I would argue that what you are talking about (a table for every year/month/etc) is WTF in itself. There have been various posts on here with people doing that very thing. If you have too much data, archive it over to a read-only reporting server. But please don't make a bunch of duplicate tables. Repeating data structures is almost as bad as repeating data! They did exactly what you are discussing over at True.com and it was a huge WTF when they wanted some kind of report over several months or years. UNION this month table, with next months table, with the next months tables...etc. Unions galore. ug!
DRY: Don't Repeat Yourelf.
Admin
Sir, your masterful use of the "WTF" debating style leaves me astounded. Truly, you are gifted in the art of WTFery.
Admin
We use xml, as johnl outlined in an earlier post, for transmission between systems (and organisations). It's largely temporary and needs neither big disk drives or lots of memory. My company would be way too cheap to pay for either. Even where we have saved records off neither of these has been an issue. We don't use PKZip for anything XML related.
Like anything else, it can be used and abused.
Admin
This XML panacea is so similar to the proliferation of integration software I see permeating the industry. First, everyone rode the CRM wave, then the HIPAA/SarbanesOxley wave, now we're back on the (almost dead--now has new life) integration server wave thanks to all these funky XML formats flying around.
Man. I fucking hate it when people dump on BizTalk (2004, earlier versions were all crap). People dump on it for a reason that they don't understand: VERY FEW organizations should be using BizTalk. And I mean FEW. Most BizTalk implementations fail or overcomplicate environments because it's being used as a golden hammer.
If you are already using EDI format for partner transactions, maybe you could use BizTalk... (perhaps... sort-of).
If you are migrating from one database to another and you don't have the luxury of cutting everyone over all at once to a new design, BizTalk is a wonderful tool to use in a large scale migration project where you have both environments operating in your org simultaneously.
If you are seeking to integrate more than 4-5 stovepipe apps that CANNOT and WILLNOT be removed from the environment, then BizTalk will definately help you merge them closer.
If you deal with processes that begin outside your org and end outside your org, then BizTalk is the ideal solution there.
Too many people shove BizTalk into an environment where it just doesn't belong simply because they believe that their heavy use of XML passing warrants it. I have seen BizTalk used at a client site within a SINGLE application before. The app was not talking to any partners, it only had one DB2 database, and the client was WebForms C#. But the whole friggin' business logic was stuffed into BizTalk which made it a nightmare to make changes to the app. Plus, data was duplicated between DB2 and BizTalk since the wonderful "consultant" at a rather wealthy and well-to-do firm chose not to directly operate on DB2 from BizTalk (hint... the shop's first name starts with C and their last name begins with a vowel).
I'm not just nagging BizTalk. This argument applies to all integration vendors like TIBCO or K2 (but I wouldn't use K2 since it's non-performant crap... just because it's written in C# doesn't make it worth using).
I'm sure after a year or so this client is going to have to bring in an army from Hyderabad or Bangalore to fix this. What a mess.
I almost hate XML and B2B work because of this stupidity and the consultants running around rampant promoting this overcomplexity. Alas, I make $67/hr consulting these fools into fixing their own messes...
Admin
Okay, so this is clearly a WTF, but I don't see why people are constantly beating up on XML use in databases.
We run something quite similar for a number of our systems EXCEPT before I get jumped up and down on it's as a blob add on to the existing normalised databases, XML is never searched on and is not used for relationships etc.
The reason for doing it is that we have a large number of small projects where the end result is similar with clients wanting an occassional extra field here or there, just for minor customisation. We've an x-form that controls both render, validation and data collection for our stored procedures so instead of having to make any database schema changes when client X wants to add some such display field that isn't bound to anything else in the schema, or searched upon or would need to be normalised we can just copy a couple of lines in the x-form and everything works just as before.