The Daily WTF: Curious Perversions in Information Technology

2006-09-13 Reply Admin

Hubert Farnsworth:
Anonymous:

The problem that some people have with relational data, is that it conflicts with the way data are handled in the client(Client is here: The software that uses the database, it might very well, be a java servlet server) . If you for example are making a dating site, and you have a UserAccount class, which contains references to many other objecs such as MyImages,MyFriends,MyMailBox,MyGuestBook and so on,
you have an large object graph. This mean that when you need to load the user object, you alse need to reference to a lot of other tables, containing the GuestBok data, the ImageData, the Mails and so on.. The result is a major, ugly join which the java client, then have to read, and then greate the object graph again. (And things get even more ugly if you got sub objects with subobjects. Example
Class UserAccount {
    Vector<MessageGroup>myMessages;
}
class MessageGroup {
    String title; // Other data here.
    Vector<SingleMessage>allMessagesInThisGroup;
}
class SingleMessage {
String title,Message; // And so on.
}
Now to load an UserAccount from the database, including all that users mail, you must do a join between the user, the messagegroup and the SingleMessage table. Parsing the result of the join on the client is diffucult, and require quite some cpu time, becasue the reply from the database contains redundent data. (And having to writing all that code that transform data to/from a object graph sucks.)
There are however sereval solutions: There exists projects such as hibernate which contains code to automatic transform data to/from object to database. (Not perfect, and it sometimes cause to much overhead, and it somehow limit which queries you can ask about your data)
The best solution would be to use a object orientered database, but nobody have really made a good useable general purpose language neutral oodb, so it's not always a solution.
Aye mate, do you have any clue what you're talking about?
I'll leave the solution to you -- it's too embarassingly simple to write it down here.

I agree, and as Jess S said above, "Load on Demand!"

You are not going to have all your images, friends, guestbook entries and mail all on one page. If you do, that will be a WTF in itself. Basic design would give you a page with a menu, and that menu has links to each of these sub sections. In the friends list guess what, you don't need your email, guestbook, or images data loaded. Loading just the User, mailbox and messages is not that difficult nor is it hard to debug.

It also is not that time consuming to load considering the fact that you will probably be paging the information so if they kept all 1,000,000 messeges ever sent you don't try to display all of them at once. A situation like this actrually shows another flaw in thinking. If you feel you have to load the entire object at once along with all it's related data even when not displayed, then try to serialize it to disk, imagine the space requirements to hold all of that information with all the textual content of every email serialized down. Now compare that to loading only the text content of the currently displayed email message.

2006-09-13 Reply Admin

Anonymous:

ammoQ:
Here is a reason why someone would want to keep serialised Java objects in a database: Consider a product table where you have, besides all those normal columns like name, description, price, size, weight etc, a blob column containing a Java applet that somehow displays the product; depending on the kind of product, this might be an applet displaying an interactive 3D model for one product, while the applet for another product shows a short video clip.
A better solution would be to store the "Java Applet" in the file system, and just store the path to the "Java Applet" in the database. But I wouldn't do this stunt in the first place.
There is never a good reason to store binary data in the database unless you like to gouge your eyes out with a hot poker, or if you have no other choice.

Exactly!

Why have one copy of the applet per record when you can have one copy period, referenced by each record.

2006-09-13 Reply Admin

Anonymous:
Now to load an UserAccount from the database, including all that users mail, you must do a join between the user, the messagegroup and the SingleMessage table. Parsing the result of the join on the client is diffucult, and require quite some cpu time, becasue the reply from the database contains redundent data. (And having to writing all that code that transform data to/from a object graph sucks.

can be lazy loaded:

class activepattern {
   load () {
   self->data = query("SELECT * FROM" + this->clasename);
   self->touch_data = self->data;
   }

get(item) {
if ( self->touch_data[ item ]) return self->data[item];
return self->data[ item] = query("SELECT " + item + " FROM" + this->classname);
}

}

class client extends activepattern;

note to self: seems that the new html textarea of TDWTF lack the button to mark some text as monospace or some font monospace :(

Hubert Farnsworth · 2006-09-13 Reply Admin

Anonymous:
I agree, and as Jess S said above, "Load on Demand!"
You are not going to have all your images, friends, guestbook entries and mail all on one page. If you do, that will be a WTF in itself. Basic design would give you a page with a menu, and that menu has links to each of these sub sections. In the friends list guess what, you don't need your email, guestbook, or images data loaded. Loading just the User, mailbox and messages is not that difficult nor is it hard to debug.
It also is not that time consuming to load considering the fact that you will probably be paging the information so if they kept all 1,000,000 messeges ever sent you don't try to display all of them at once. A situation like this actrually shows another flaw in thinking. If you feel you have to load the entire object at once along with all it's related data even when not displayed, then try to serialize it to disk, imagine the space requirements to hold all of that information with all the textual content of every email serialized down. Now compare that to loading only the text content of the currently displayed email message.

Right, that's what I meant -- loading on demand, others call it lazy loading. SQL "limit".

http://en.wikipedia.org/wiki/KISS_principle

biziclop · 2006-09-13 Reply Admin

Anonymous:
Anonymous:
For those times when you need to persist a serialized Java object, the filesystem works really well. Why people keep dumping object graphs into the database is beyond me.
When are the times you don't want to "persist" a serialized Java object? Why would anyone go to the trouble of serializing an object, if he wasn't going to "persist" it?

RMI? Distributed hashtables?

I've used once an Oracle database for storing session data of webservers in a cluster with a typical lifetime of 5 minutes. (I mean the lifetime of the sessions, not the webservers.) That's definitely not persisting in a real world point of view. (I'm not saying I'm proud of it, but there you go, I've had a single afternoon to write it, test it and deploy it.)

edremy · 2006-09-13 Reply Admin

I once heard an amusing anecdote about a physics student taking an exam. One question had him stumped, and in a panic he wrote down some really random stuff, hoping for partial credit. When he got the test back, the professor had written a comment: "You're not even close enough to be wrong."

This sounds like a version of the classic tale about the great physicist Wolfgang Pauli. When shown a student paper he once commented "It's not even wrong." He apparently was noted for similar scathing comments about poor work throughout his life.

2006-09-13 Reply Admin

I think the worst wtf right now is the whole application is responsible for data pattern.

I currently work in an environment were we're making the same mistake in this wtf. My peers dont understand why all database queries can't be acheieved by the same application.

Good thing I'm hourly.!!!:-)

emurphy · 2006-09-13 Reply Admin

Anonymous:
don:
Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?
I think it's the opposite of a lot of other WTFs. They try to find a better way of doing it when there isn't.

That presumes that they know about relational data in the first place.

James Brantly · 2006-09-13 Reply Admin

Anonymous:

I poked into some of the underlying tables for MS Sharpoint Portal Server and found, to me, the most ostensibly bad database design. It had columns named {int01 . . .int32}, {varchar01...varchar32}, {text01...text32}. That was impressive (impressively bad), but this . . .this is AMAZING!

I submitted that as a WTF a long time ago. Alex responded that since it was by Microsoft it must have some design reason behind it. There are also other oddities in the Sharepoint database. A number of different naming conventions are used, and there are occasions where a primary key in one table is used as a foreign key in another table under a different column name that seems to have no relation. Unfortunately I do not have the specifics anymore as this was a long time ago, but the Sharepoint db design is definitely a big WTF.

ammoQ · 2006-09-13 Reply Admin

Anonymous:
Anonymous:

A better solution would be to store the "Java Applet" in the file system, and just store the path to the "Java Applet" in the database. But I wouldn't do this stunt in the first place.
There is never a good reason to store binary data in the database unless you like to gouge your eyes out with a hot poker, or if you have no other choice.
Exactly!
Why have one copy of the applet per record when you can have one copy period, referenced by each record.

I could think of reasons why every product in the database has it's own applet, but I don't want to stretch this example too far.

Anyway, having binary data in the database instead of the filesystem means that you can use the access control, replication, network, backup features etc. of the database system instead of replicating that for the file system. In some cases, this is desireable without being too much of an abuse.

2006-09-13 Reply Admin

Anonymous:
Example
Class UserAccount {
    Vector<MessageGroup>myMessages;
}
class MessageGroup {
    String title; // Other data here.
    Vector<SingleMessage>allMessagesInThisGroup;
}
class SingleMessage {
String title,Message; // And so on.

If this is Java, the WTF is using

Vector

in a variable declaration. What the heck do we have a complete collection API with the generic interfaces

Map,List

and

Set

for? Using

Vector

just invites the next half-witted developer to use the old API (e.g.

addElement()

etc) which is not compatible with the newer collection classes (e.g.

add()

etc.) Changing collection implementations from e.g.

Vector

to

ArrayList

then becomes a major pain in the a**...

captcha: tango

2006-09-13 Reply Admin

ammoQ:
Anonymous:
Anonymous:

A better solution would be to store the "Java Applet" in the file system, and just store the path to the "Java Applet" in the database. But I wouldn't do this stunt in the first place.
There is never a good reason to store binary data in the database unless you like to gouge your eyes out with a hot poker, or if you have no other choice.
Exactly!
Why have one copy of the applet per record when you can have one copy period, referenced by each record.
I could think of reasons why every product in the database has it's own applet, but I don't want to stretch this example too far.
Anyway, having binary data in the database instead of the filesystem means that you can use the access control, replication, network, backup features etc. of the database system instead of replicating that for the file system. In some cases, this is desireable without being too much of an abuse.

Instead of replicating that for the file system? Are you proposing not backing up your necessary data on your production server? Of course a single applet I wouldn't really add to the back-up if the installs were available. Personally, you back up data, not executables. Executables are (relatively) unchanging and can be re-installed or restored from a master image. Data is ever changing and needs constant backups.

Now lets apply this to the current situation. The java applet (read as player) is a single instance, rather it is for a movie or 3d model display or just an image display. You save this once on disk as it is an installed app. Now the database either holds the location of this player, or more specifically a field denoting the type of data needing to be played so the client can select the player. Now the data to be played is specific to the record, 3d model, movie, img, etc. and it can be argued as to where to save this data, either file system or blob, but the player itself should not be saved in the database in this fashion.

So yes there is a reason why you might want to use the blob, but not for the example given. Extrapolating it out and actually thinking about it gives us an area where the filesystem or database become a matter of choice for saving the data.

ammoQ · 2006-09-13 Reply Admin

Anonymous:
ammoQ:
I could think of reasons why every product in the database has it's own applet, but I don't want to stretch this example too far.
Anyway, having binary data in the database instead of the filesystem means that you can use the access control, replication, network, backup features etc. of the database system instead of replicating that for the file system. In some cases, this is desireable without being too much of an abuse.

Instead of replicating that for the file system? Are you proposing not backing up your necessary data on your production server?

This was meant in a more general context. Imagine a forum board application that allows users to upload files (e.g. images). Of course you need a backup of the server as a whole, but if the uploaded files are stored in the database instead of the file system, the database is the only thing that constantly changes so your daily backup needs to include the database only.

Of course a single applet I wouldn't really add to the back-up if the installs were available. Personally, you back up data, not executables. Executables are (relatively) unchanging and can be re-installed or restored from a master image. Data is ever changing and needs constant backups.

OK, let's go back to the Java applet example. I don't think this is true anymore, since no-one does Java apples anymore, but in the beginning, it was rather difficult to make an applet that loads data from the server, especially if the client used a proxy server. A possible strategy to circumvent this problem was to include all the necessary data in the applet itself, as big "constants". So you would have a script that converts the movie into a big "string literal" (or something similar) and merge that with the rest of the source file, thus leading to "one applet per product".

smbell · 2006-09-13 Reply Admin

Am I the only one that feels a strong urge to watch Battlestar Galactica on SCI FI October 6th at 9pm?

2006-09-13 Reply Admin

ammo, I don't feel the need to quote any more for this.

The uploading of image (without a viewer) I think I covered and said yes, maybe in the blob. SO I would say this is a choice, no argument.

As for the java applet thing, something ment to work on the web that can't load data from a server? Now that seem like another WTF in itself, but let's not start there sine you say it is no longer in common use. (I'm assuming common use is what you ment because I imagine someone out there is doing this still)

2006-09-13 Reply Admin

"The best solution would be to use a object orientered database, but nobody have really made a good useable general purpose language neutral oodb, so it's not always a solution. "

Actually, the best solution would be, if the system is highly data-oriented to not try and model everything in the client as an object with a real-world match that has to be loaded from the database for any task, no matter how trivial. Nothing wrong with "dumb" classes that simply read relational data and write them to, say, a web site grid.

We can get so married to our object models, it hurts to give them up. So, we do stupid things like persist serialized objects to database tables.

I've seen way to many Object/Relational mapping "solutions" load an entire complex object graph into memory to perform one calculation that can be performed with only a small set of the object graph. However, the large overhead (read 5-10 seconds to open a web page that should have taken <1) was spent so that the dev team wouldn't have to worry about mapping objects to data and could use a nice code-gen'd "solution".

2006-09-13 Reply Admin

I once inherited a system that stored huge chunks of XML in the DB. The previous developer was very proud of this saying "it is completely extensible and dynamic" - the killer of that is holly.

But this thing...

It makes my case look like a candidate for best design of the year.

2006-09-13 Reply Admin

Now users want to use reports using sqlplus, crystal reports etc. So they want a solution to this blob issue.

Okay... what we need is to add code to the code that stores the object into the database. If we need reports out of 10 of the object's properties, then before/after we save the object, we write the data from the 10 fields into a relational table. We can probably handle multivalues as a list - so it'll probably be just one table. Yes, the data will be duplicated, but we don't really have to worry about the cost of disk storage anymore. Think of the relational table as an overhead, just like an index. We don't have to worry about concurrency issues as long as the tables are read-only. Hell, it could even be stored in the same table as the object, since we'd need to use the object's unique ID as the primary key.

This allows the object to evolve at will, and is not constrained by the mechanism by which it is persisted. Relational storage is fine in certain circumstances. But it can also weigh a system down with a lot of complexity. And isn't it true that a highly normalized schema reduces disk space and improves data retrieval - but slows down insert/update?

Alternatively, investigate using an object-relational mapping software like Hibernate. But this is pretty complex - it'll add a few zeros to the budget and add a few months to the timetable. It's not something to just take up because the DBA's want to see highly normalized relational tables.

captcha: java

ammoQ · 2006-09-13 Reply Admin

Anonymous:
As for the java applet thing, something ment to work on the web that can't load data from a server? Now that seem like another WTF in itself, but let's not start there sine you say it is no longer in common use. (I'm assuming common use is what you ment because I imagine someone out there is doing this still)

Back then, the problem was that a java applet allowed to contact only the server it was loaded from (for security reasons). IIRC there were problems when the client used a proxy, but maybe it was just a WTF caused by incompetent programmers.

2006-09-13 Reply Admin

I, for one, welcome our new blob overlords.

CAPTCHA: craptastic

elbaz · 2006-09-13 Reply Admin

IBM's DB2 V9.1 supports XML as a data type and they are intergrating XQuery and SQL so that you will be able to build queries that return both 'normal' data types as well as the values of nodes within the XML data.

DrPizza · 2006-09-13 Reply Admin

We had some offshore devs who wanted to pull this kind of stunt; they were adamant that the database should be a black box just used for storing stuff. Fortunately we said no, and insisted that the database was an interface in itself, and had to conform to various sanity constraints. To their credit, when we explained to them about how we could use off-the-shelf reporting software when we did things sanely, they did see the benefit in doing it the normal way.

2006-09-13 Reply Admin

Anonymous:

Anonymous:
For those times when you need to persist a serialized Java object, the filesystem works really well. Why people keep dumping object graphs into the database is beyond me.
When are the times you don't want to "persist" a serialized Java object? Why would anyone go to the trouble of serializing an object, if he wasn't going to "persist" it?
And, of course the filesystem works really well... that's why databases use the filesystem to do their "persisting". You must mean something else, like, "ASCII files work really well", or something like that.

Crossing network boundaries comes to mind.

2006-09-13 Reply Admin

Anonymous:
And, of course the filesystem works really well... that's why databases use the filesystem to do their "persisting". You must mean something else, like, "ASCII files work really well", or something like that.

Are you stupid? Of course I'm stating the obvious - it's so damned obvious, the dummies in the WTF couldn't figure it out.

2006-09-13 Reply Admin

fmobus:

People who can't understand relational databases are likely to reinvent it. Poorly

Hey, I'm guilty of that ... but at least I had an excuse. I was developing an app for a friend's dad's company when I was 15, and it needed a database. Of course, I'd never heard of SQL or RDBMS, so I created the system with flat files. Here's the funny thing, though: I had lots of tables, foreign keys, normalized data, equivalents of SELECT and JOIN, etc. I'm now pretty proud that I managed to invent relational flat files in a couple weeks. And I'd do it again given the same constraints: it had to fit on a single 3.5" disk, and the database had to be editable using Notepad when offline. I doubt you could even fit an XML parser onto the disk, much less an XML-encoded DB.

----

BSG girls > beanbag and foosball girls

VGR · 2006-09-13 Reply Admin

ammoQ:
Here is a reason why someone would want to keep serialised Java objects in a database: Consider a product table where you have, besides all those normal columns like name, description, price, size, weight etc, a blob column containing a Java applet that somehow displays the product; depending on the kind of product, this might be an applet displaying an interactive 3D model for one product, while the applet for another product shows a short video clip.

That is certainly a conceivable justification, but not an acceptable one. A better way by far is to have a column called "appletType" or something, of type char or int (or whatever you prefer for representing enumerated types), and have the program map those values to applet classes (or classes containing content which can be placed in an applet). Associating data with a UI is the job of the program, not the data or the database.

technites · 2006-09-13 Reply Admin

The relational crowd are clueless about OO, and the OO crowd think the DB is just an object store. I wonder what this whole non-debate will look like in 10 years?

peaveydk · 2006-09-13 Reply Admin

ammoQ:
Anonymous:
As for the java applet thing, something ment to work on the web that can't load data from a server? Now that seem like another WTF in itself, but let's not start there sine you say it is no longer in common use. (I'm assuming common use is what you ment because I imagine someone out there is doing this still)
Back then, the problem was that a java applet allowed to contact only the server it was loaded from (for security reasons). IIRC there were problems when the client used a proxy, but maybe it was just a WTF caused by incompetent programmers.

Indeed we are dealing with folklore to some extent here. The httprequest since at least java 1.1 uses the browser and its settings to recieve anything with a normal HTML 1.0 GET and if using a proxy, no problem, the content will be fetched via that proxy. The browser HTTP-gets the data and delivers it to the java plugin just like it got the web page and the applet you are looking at.

Now without the applet being signed, you can only contact the originating site. Well if you were able to place the applet code on the site, I guess it is reasonable to believe you could put the applet's data on the site too for dynamic loading. Hating that you cannot fetch data from another site is ok. But with the default setting of a sandboxed environment this is perfectly sane to me.

Applets are used heavily for casual online gaming and alot of people make their living writing applet games.. I did, for 3 years, great fun!

lpope187 · 2006-09-13 Reply Admin

Anonymous:
This allows the object to evolve at will, and is not constrained by the mechanism by which it is persisted. Relational storage is fine in certain circumstances. But it can also weigh a system down with a lot of complexity. And isn't it true that a highly normalized schema reduces disk space and improves data retrieval - but slows down insert/update?

Highly normalized schemas do reduce disk space since you are only storing the bulk of the data once. You still need to propagate a releatively tiny key field though to join the data. Normalization doesn't slow down inserts/updates but make it faster since you make the change on one record rather than many in a denormalized (flat) structure. Normalization does make selects slower since you need to join out to other tables to get valuable information. In a nutshell, its like reading the content from a single file (denormalized) versus reading from one master file and then conditionally reading from 10 others (normalized). The denormalized is definitely going to be faster since you don't have all the overhead of managing lookups, file pointers, etc. In transactional systems, it definitely preferred to have a normalized structure since most of the operations are insert/update/deletes and you want them completed as quickly as possible to avoid locking issues.

2006-09-13 Reply Admin

biziclop:
Jeff S:
Anonymous:

The problem that some people have with relational data, is that it conflicts with the way data are handled in the client(Client is here: The software that uses the database, it might very well, be a java servlet server) . If you for example are making a dating site, and you have a UserAccount class, which contains references to many other objecs such as MyImages,MyFriends,MyMailBox,MyGuestBook and so on,
you have an large object graph. This mean that when you need to load the user object, you alse need to reference to a lot of other tables, containing the GuestBok data, the ImageData, the Mails and so on.. The result is a major, ugly join which the java client, then have to read, and then greate the object graph again. (And things get even more ugly if you got sub objects with subobjects. Example
Class UserAccount {
    Vector<MessageGroup>myMessages;
}
class MessageGroup {
    String title; // Other data here.
    Vector<SingleMessage>allMessagesInThisGroup;
}
class SingleMessage {
String title,Message; // And so on.
}
Now to load an UserAccount from the database, including all that users mail, you must do a join between the user, the messagegroup and the SingleMessage table. Parsing the result of the join on the client is diffucult, and require quite some cpu time, becasue the reply from the database contains redundent data. (And having to writing all that code that transform data to/from a object graph sucks.)
There are however sereval solutions: There exists projects such as hibernate which contains code to automatic transform data to/from object to database. (Not perfect, and it sometimes cause to much overhead, and it somehow limit which queries you can ask about your data)
The best solution would be to use a object orientered database, but nobody have really made a good useable general purpose language neutral oodb, so it's not always a solution.

Actually, the best solution is to only load data when needed, write efficient SQL, used stored procedures (and/or a data access library to handle all database interfacing), have a good database design, follow best practices, and write good code. Every "problem" with relational databases that you described in your example is the result of not doing things in an optimal or correct way. Why load up all data into objects before they are needed? Why not load them from the database when they are actually *requested*. Just because a User object might have lots of related content, there is no reason to query it all in from the database until you actual need it. And if you do need it all, and you do pull in all related data from the database and create objects from it, what's wrong with that? I fail to see what the issue is. Too much code to write?
What you describe is what occurs when someone uses a relational database as a substitue for an XML file or a text file for persisting data, and doesn't properly utilize the database engine at their disposal.
The benefit of stroing your data properly into related tables instead of a massive objects stuffed into generic "object" columns far outweights the "hassle" of writing code to do the transformations. .... not only that, it is very easy to do with simple code generation tools. The key is to let those tools generate the boring stuff like assigning columns to properties, but never let those tools write your SQL, especially when joins are involved.

You're right in a way. You've only forgotten about one crucial resource: time. Writing, testing and especially debugging stored procedures manually takes a lot of time. Usually it takes more time than what you have.
You then might choose to develop a toolkit for helping you doing all the above. But then you risk building a potential monster, a horrible mesh of scripts, triggers and helper stored procs totally incomprehensible to anyone but you.
Or you might try using a "standard" O-R mapping tool, like Hibernate, sacrificing some of the performance to shorten development time.
Or you might choose something in between, depending on the actual task and deadline and speed requirements.
But we're overcomplicating things. You simply have to use the right tools for the right job. All you need is common sense. Sadly, that's what many programmers lack entirely.

Amongst the usual plethora of smug, arrogant and predictably ignorant responses on this site, there appears to be a voice of reason. I find it incredible that people can even argue this point - it's all about context. OR Mapping tools generally speed up development on most reasonably complex projects but there is usually a tradeoff in terms of performance . Using a datacentric approach has it's place too - usually in simple applications but also where performance is critical and there's adequate resources and time. OR mappers generally do an awful job of reporting and anyone who has actually worked in the real world would know this.

Do you guys actually read about this stuff before you post? Try Fowler, Evans and Nillsson just to begin with. As he said: You simply have to use the right tools for the right job. All you need is common sense. Sadly, that's what many programmers lack entirely.

lomaxx · 2006-09-13 Reply Admin

I'm sure that people that design DB's like this started out like the OP on this thread http://forums.whirlpool.net.au/forum-replies.cfm?t=587798

2006-09-13 Reply Admin

Anonymous:
"The best solution would be to use a object orientered database, but nobody have really made a good useable general purpose language neutral oodb, so it's not always a solution. "
Actually, the best solution would be, if the system is highly data-oriented to not try and model everything in the client as an object with a real-world match that has to be loaded from the database for any task, no matter how trivial. Nothing wrong with "dumb" classes that simply read relational data and write them to, say, a web site grid.
We can get so married to our object models, it hurts to give them up. So, we do stupid things like persist serialized objects to database tables.
I've seen way to many Object/Relational mapping "solutions" load an entire complex object graph into memory to perform one calculation that can be performed with only a small set of the object graph. However, the large overhead (read 5-10 seconds to open a web page that should have taken <1) was spent so that the dev team wouldn't have to worry about mapping objects to data and could use a nice code-gen'd "solution".

Right on.

People act like using a relational database is some incredible feat of engineering.Crap like using an object oriented solution to load an entire table into objects and then scan the list of objects instead of doing a query for the data you are interested in. Or updating a selection of rows by loading all the rows into objects, updating the objects, and then saving them all back, instead of executing one simple SQL statement.

Having relationships between different types of text data is (gasp) what relational database systems are good at. It's easier to use them properly than it is to screw around looking for some object-relational framework to do your work for you.

2006-09-13 Reply Admin

don:
Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?

Yes, though a thorough explanation requires a fundamental understanding of some non-trivial mathematical concepts (such as category theory) and logical methods of reasoning that are not often taught in computer science (hence the existence of this website). Those familiar with functional programming may be able to infer meaning, but I'll attempt a brief summary in any case.

A relational structure contains attributes that are symbiotic - that is, given a relation R with attributes a and b, one cannot observe a independently of b even though there exists a legitimate context such that a exists without any relationship to b. If this does not hold, then the separation of a from b in the first place makes no sense (reductio ad absurdum) and so there should not exist a and b, but just some a as an axiom. The contexts of symbiotism between a and b are relative to the observer. I realise this explanation is full of holes, so I'll attempt a metaphor. A chair is not a chair, but a grouping of parts; the legs, the base, the back, etc. That it can be observed to be a chair is a higher-order construct used only for efficient communication and nothing more. Is the leg of a chair an axiom of chairs such that all legs are equivalent? Not necessarily, since a leg is made up of ... ad infinitum. The alternative to relational databases - acknowledging this (perhaps poorly explained) problem that I have seen is RDF.

I hope I haven't undermined the work of those (including myself) who are trying to explain this is a more fashioned way, suffice to say that it also invalidates the nature of OO programming as a purely human construct and most importantly - at the expense of software - this problem is solved by FP languages. "RDBMS is to RDF as OO is to FP" is perhaps a convenient trivialisation.

Please ignore this stuff if you (understandably) feel the need to throw out the straw man argument (as I perceive it).

2006-09-13 Reply Admin

don:
Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?

As a procedural / OO programmer, I'll tell you what some people have against relational data; it doesn't make any sense! Or, to be honest, it doesn't until you have a couple DB integration projects under your belt. The lack of a step-by-step program is foreign to those of us with a basic, pascal, ada, c, c++, or java background. Of course you can delve into stored procedures and cursors and all the rest and replicate the procedural programming environment but my guess is that most of the people who are accessing the database that way are already fairly comfortable with relational data.

On the other hand, now that I've had to delve into relational database and SQL in particular, it's pretty cool. Especially when that additional index turns the the 10 minute query into a 3 second query.

2006-09-14 Reply Admin

Blob Fields contains a java serialised object, different objects based on types, though all of them implements same interface. We access this object always thru' j2ee container, so it works fine so far.

Was this person by any chance a Finn? The slightly odd word order and lack of articles looks strangely familiar.

2006-09-14 Reply Admin

Anonymous:
I used to do mostly 'flat' databases, though I had self-discovered normalization to some extent on the more complex tables. The biggest problem is that all these SQL references and books didn't even mention the concept of it. Accordingly, if anybody reading isn't exactly sure what the concept is, google for Database Normalization. It really makes a big difference in crazy database projects.

I have discovered that normalisation is process almost reverse to optimization even before db classes in college, when I made one query executing 10sec instead of 2min.

captcha: random

2006-09-14 Reply Admin

Anonymous:
As a procedural / OO programmer, I'll tell you what some people have against relational data; it doesn't make any sense! Or, to be honest, it doesn't until you have a couple DB integration projects under your belt. The lack of a step-by-step program is foreign to those of us with a basic, pascal, ada, c, c++, or java background. Of course you can delve into stored procedures and cursors and all the rest and replicate the procedural programming environment but my guess is that most of the people who are accessing the database that way are already fairly comfortable with relational data.

[poor english follow]

Yea. Its "fun" to watch a programmer with years of experience, tyiing to learn SQL. "Its everything wrong!" will comment, or.. "how can I add comments on standard SQL?", and the answer "The standard dont support comments".

But, imho, the way a database work, how and why Its much normal and easy to understand. Its the 4GL on SQL that is shocking. And the fact that look like code, but Its not code. HTML and all tag languajes are better on that, because dont try to look like code, so programmers dont try to use some paradigm (procedural, OOP, etc.)

A SQL 101 theacher sould start with a "SQL is not programming"*, imho, and everything else will be smooth. Else people will start writing cursors and storing serialized objects on the first hour of SQL..

*:the pedantic version is "SQL is not 3GL".

I am not In fall in love with databases. I still hate I can't use my favorite paradigms. But there are no other options. The filesystem don't scale (tons of files mean slowdown, and the filenames are limited on some OS). The memory is bad location for important data. Structures arrays of structure data can be messy and is not that flexible, but hardcoded on the program.

cthart · 2006-09-14 Reply Admin

Tom,

Have you (ever) looked inside an Oracle Warehouse Builder repository? This practice also occurs inside Oracle :-(

Cheers,

Colin

frumious · 2006-09-14 Reply Admin

James Brantly:
Anonymous:

I poked into some of the underlying tables for MS Sharpoint Portal Server and found, to me, the most ostensibly bad database design. It had columns named {int01 . . .int32}, {varchar01...varchar32}, {text01...text32}. That was impressive (impressively bad), but this . . .this is AMAZING!

I submitted that as a WTF a long time ago. Alex responded that since it was by Microsoft it must have some design reason behind it. There are also other oddities in the Sharepoint database. A number of different naming conventions are used, and there are occasions where a primary key in one table is used as a foreign key in another table under a different column name that seems to have no relation. Unfortunately I do not have the specifics anymore as this was a long time ago, but the Sharepoint db design is definitely a big WTF.

I've been having to fiddle around with Sharepoint myself recently, and although it seems like an almighty mess in general, I can sort of understand, as a non DB guru, why they've made that particular WTF. Those columns are used for Lists, in which there can be variable numbers of fields of different types with different names. By setting up the Lists table with a healthy number of columns of all types, and recording what goes in which field in a special (XML-based...) definitions column, they get the advantage of flexible lists, combined with easy and fast querying (once you know what's in what column, admittedly...), as opposed to having to do some obscene mega-join to turn whatever other structure they came up with into comprehensible data, or maybe having to change the DB schema every time a particular List changed. Genuine question: Is there an entirely better way of doing that? Someone also mentioned solving a "how to put hierarchical data in a DB" problem earlier, as a holder of the wrong kind of science degree I'd be interested in how you're meant to do that, too!

Richard

2006-09-14 Reply Admin

You might be onto something there - I've lost count of the number of times that I've been supplied with a 'datafeed' that turned out to be an arbitrarily formatted spreadsheet.

Any suggestion that we provide a web front end or Oracle forms interface to the users, in order to simplify data input, is usual met with low regards from the 'project manager' - thus committing us to endless rounds of debugging/rewriting when the users decide to change layouts.

Personally, I think that it's all a great big conspiracy by Psychiatrists, to make money from the legions of developers who will no doubt need therapy for the rest of their lives.

Bob.

2006-09-14 Reply Admin

Oops - forgot the quote (Typing is really difficult whilst wearing a straightjacket!)

Anonymous:
I suspect that one contributing factor to the lack of database intelligence comes from spreadsheets, where you can store pages of simple lists with formulas.
Scroll down or to the right and you've got a completely new set of information.<>
<><>So how hard can it be to implement this in a standard database? It's just one long list, right?
<>Relational databases, structural integrity, security, and normalization are things that a lot of people can pronounce, and because of that those people think they have an understanding of it. Oh well. I get paid a bunch to fix their mistakes.

You might be onto something there - I've lost count of the number of times that I've been supplied with a 'datafeed' that turned out to be an arbitrarily formatted spreadsheet.

Any suggestion that we provide a web front end or Oracle forms interface to the users, in order to simplify data input, is usual met with low regards from the 'project manager' - thus committing us to endless rounds of debugging/rewriting when the users decide to change layouts.

Personally, I think that it's all a great big conspiracy by Psychiatrists, to make money from the legions of developers who will no doubt need therapy for the rest of their lives.

Bob.

2006-09-14 Reply Admin

Storing XML data in tables is nothing new...

SQL 2005 comes with some extra functionality such as table column definition based upon XSD which allows indexing and "easier" reporting. However, nothing stops you to extract that data to an OLAP database using SSIIS for MI reporting. This isn't a WTF... clearly just a lack of understanding.

2006-09-14 Reply Admin

"All tables will have only a few columns (such as ID's and created dates), and the rest of the data will be stored in an XML-formatted TEXT column"

So basically you want a filesystem not a database.

2006-09-14 Reply Admin

There are some cases where this makes sense. The Quartz scheduler stores a map of data for scheduled jobs. If you are persisting your schedule obviously this stuff needs to go in the DB and you have to store it as a BLOB as you don't know what the client programmer will put in there. And in this case there is no sane reason for you to query the scheduler information without going through Quartz.

ParkinT · 2006-09-14 Reply Admin

don:
Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?

Some people just don't want to have "relations" with their data!

2006-09-14 Reply Admin

Hey. I read Don Burleson's stuff. But I can't stay and chat for long. Gotta go rebuild some more indexes. Again. ;-)

musigenesis · 2006-09-14 Reply Admin

This was my favorite line from the post:

Rational Behind:

Sounds much better than "irrational asshole".

musigenesis · 2006-09-14 Reply Admin

Anonymous:
don:
Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?
I think it's the opposite of a lot of other WTFs. They try to find a better way of doing it when there isn't.

This is possibly the best comment ever.

2006-09-14 Reply Admin

Since there seems to be so much expertise here I would like to table (no pun intended) a system I am working on at the moment. I would be interested in comments/suggestions.

Here are some of the main characteristics/requirements:

There are approximately 10 main standard 'types' of things that can be related to each other in any way. Typically these form a hierarchy like this:
a -> b -> c -> d -> e -> f -> g -> h -> i -> j
But any of these relationships can also be many-to-many. That is, a graph (with vertices and edges). So, an object 'f' typically has many children of type 'g', but sometimes a 'g' has multiple parents (of type 'f').

Customers want to decide whether to use these standard types or not. For example, some customers will say "We don't need types b,c and d, so let's turn them off. Oh and we think 'f' items should appear above 'e' items."

Customers want to be able to define their own new (custom) types. They want to be able to define their own type ('m') and insert instances of the new type anywhere within system.

Customers want to be able to define their own custom properties on any of the types.

Customers want to be able to see the state of the system as it was at any point in time. For example, show the objects and the relationships between them as they looked at Jan 1, 2005.

Customers want to use standard reporting tools such as Crystal Reports for custom reports.

They want standard and custom reports that show trending over time.

Well that's some of the main issues. I do have some direction/solution on these but I wont share them at this point as I am hoping for fresh ideas rather than discussions of my current thinking.

2006-09-14 Reply Admin

Jeff S:
Every "problem" with relational databases that you described in your example is the result of not doing things in an optimal or correct way. Why load up all data into objects before they are needed? Why not load them from the database when they are actually *requested*. Just because a User object might have lots of related content, there is no reason to query it all in from the database until you actual need it.

Jeff, you're completely correct. To me, a "complex" join isn't based on the number of tables, its based on the relationships between tables. If you use surrogate keys and referential integrity constraints, there's no reason why you can't construct, populate, and manage a very large, complex object tree. Oracle IS an ORDBMS -- OR=object relational -- after all. The only thing that makes joins get ugly is when you all of a sudden to need to infer relationships that you hadn't considered before. At that point, you need to consider denormalization.

Oracle even provides functions to read a tree struture from a single table..so if a table references itself you can still extract the entire subtree from a given node in a single, simple query.

Tom Kyte on The Ultimate Extensibility

Leave a comment on “Tom Kyte on The Ultimate Extensibility”