• (cs) in reply to Otto

    Otto:
    Okay, so the solution was a WTF, but I didn't have their source code to fix the problem, and would have had to charge more for doing so anyway. As it stood, I solved the problem in 3 days, and 1 of those was spent drinking, to try to dull the pain. They thought I was a miracle worker. :)

    One bad hack to fix another!

    I approve.
     

  • (cs) in reply to Phil

    Anonymous:
    Since there seems to be so much expertise here I would like to table (no pun intended) a system I am working on at the moment. I would be interested in comments/suggestions.  

    Here are some of the main characteristics/requirements:

    There are approximately 10 main standard 'types' of things that can be related to each other in any way. Typically these form a hierarchy like this:
         a -> b -> c -> d -> e -> f -> g -> h -> i -> j
    But any of these relationships can also be many-to-many. That is, a graph (with vertices and edges). So, an object 'f' typically has many children of type 'g', but sometimes a 'g' has multiple parents (of type 'f').

    Customers want to decide whether to use these standard types or not. For example, some customers will say "We don't need types b,c and d, so let's turn them off. Oh and we think 'f' items should appear above 'e' items."

    Customers want to be able to define their own new (custom) types. They want to be able to define their own type ('m') and insert instances of the new type anywhere within system.

    Customers want to be able to define their own custom properties on any of the types.

    Customers want to be able to see the state of the system as it was at any point in time. For example, show the objects and the relationships between them as they looked at Jan 1, 2005.

    Customers want to use standard reporting tools such as Crystal Reports for custom reports.

    They want standard and custom reports that show trending over time.


    Well that's some of the main issues. I do have some direction/solution on these but I wont share them at this point as I am hoping for fresh ideas rather than discussions of my current thinking.

    I kind of think you're setting yourself up for failure.  Systems that try to be everything to everyone more often than not end up as fodder for this site.

    That said, I take it you're looking for database design suggestions?  Well, here goes:

    You clearly need to separate information about the objects from information about the relationships among the objects.  If you have just a few, similar objects, I would go with 2 tables, an object table (each row is an object) and a relationship table (each row is a binding from one object to another).  Scale this up to many object tables / many relationship tables as required by the amount of data you anticipate your system being required to hold.
     

  • sb (unregistered) in reply to technites

    Design dev, and test time allowed is always an issue. I have a similar table design (id, text field for storing compressed, custom-serialised object), but it's a case of what works in the real world - low-spec laptops running outdoors, communicating with each other over a busy and unreliable wireless network. Machines need to send high volumes of messages between each other, allowing recovery and catch up if the network is down, so messages are stored in this type of design. The rest of the database has usual relational design, but if you were to query the raw data for thousands of objects, of different types - so a generic query wouldn't work - and create these objects to send over an unreliable network, it would kill the performance on the low-spec machines and make the app unusable.

    SO: What's wrong with having a message store, where, on receipt of a request for messages, e.g. "since 12345", they can be just retrieved and sent as-is to the other machine?

    Oh, and get it working by tomorrow.

  • Leo Petr (unregistered)

    DB2 9 lets you embed XQuery, a W3C standard for querying XML, in its SQL statements to query both traditional relational tables and inside the new XML column type. It's a very nice solution in situation where a database has to co-exist with lots of xml files -- pretty much every enterprise -- because you can stick the data in the database without shredding it. Shredding is a huge pain because you have to update and test the shredding script every time the schema changes.

    Microsoft SQL Server is also going down that route, and it looks as though there will be a way of embedding XQuery in SQL in the next ISO SQL standard.

  • Gsquared (unregistered) in reply to Phil

    On the a>b>c... thing, here's an idea:

    Customers table

    Types table

    Types_Relations table that includes the ID field for the customer, a SeniorTypeID field and a JuniorTypeID field.  This is your rule-set for what types can relate to what other types for what customers.

    Types_Types join table, with RelationID in it.

    Highly normalized, and allows customer-specific rules and integrity.

    Also, I suggest posting technical questions like this on pages like TechRepublic, where that's the purpose of the site.  Easier to track your answers and such.

  • RNH (unregistered) in reply to don
    don:
    Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?

    Some mother f*ckers are always trying to swim upstream.  And the worst part of it is that they just don't know enough for you to be able to reason with them during architecture meetings.

    I feel sorry for Tom, I really do.  His AskTom blog can be a fulfilling joy to write one day and a lesson in not clawing out your eyes the next.  For him, week after week, it usually goes somehting like this: "Please for the sending me of code to do X on Oracle 9i database with websphere and .NET and QuickBasic Monkey Poo Edition.  Pls hurry, I have big enterprise contract to deliver next week.  Many thanks in advance, Mr. Foreign Name"


  • (cs) in reply to Phil
    Anonymous:
         a -> b -> c -> d -> e -> f -> g -> h -> i -> j
    Implementing a database on top of a database is bad news. It sounds like you need a layer on top of the DB to deal with schema evolution (adding columns, none of that type-attribute-value table crap), and yet another layer to handle data versioning and the temporal or bi-temporal aspects. You pretty much ruled out object DBs with the Crystal Reports requirement (though they do schema evolution, and the whole many-to-many/graph thing).
  • Mike Swaim (unregistered) in reply to don

    don:
    Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?

      In a lot of shops, the DBA/DA is in a seperate group from the application developers, so getting database changes can require running a gauntlet. Rather than fight that battle, it's easier to just stick the data in a string or blob that you control. (That was the reasoning that the Sitara team at Enron used to stick data inside CLOBs. One of the system architects later admitted that they might have made a mistake. They had 2 (two!) databases with extracted data for outside applications to use to get reports, billing data, contact info, whatnot out of.

  • Mike Swaim (unregistered) in reply to Phil

    Anonymous:
    Customers want to be able to define their own new (custom) types. They want to be able to define their own type ('m') and insert instances of the new type anywhere within system.

    Customers want to be able to define their own custom properties on any of the types.

      We can do something like this, using a modified Entity-Attribute-Value schema. We have a table that lists all of the possible attributes with names, descriptions, and data types. We also have an Entity table which corresponds roughly to an object of a given type. (There are also some tables that specify what attributes a given entity might have). Attribute values are stored in a table whose primary key is (entity id, attribute id).

  • Anon Coward (unregistered) in reply to RNH
    I feel sorry for Tom, I really do.  His AskTom blog can be a fulfilling joy to write one day and a lesson in not clawing out your eyes the next.  For him, week after week, it usually goes somehting like this: "Please for the sending me of code to do X on Oracle 9i database with websphere and .NET and QuickBasic Monkey Poo Edition.  Pls hurry, I have big enterprise contract to deliver next week.  Many thanks in advance, Mr. Foreign Name"

     

    Ahem!  Would it make you feel better if it was signed off as "Many thanks in advance, Mr. Local Name"?

  • phred lost his login (unregistered) in reply to don

    don:
    Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?

     I'm one of those who don't like relational data, and its specifically because I hate all the database

    software I've ever had to use. My least favorite stuff right now is the entire Microsoft tool chain from

    dot net to sql server 2005. Everytime I invest effort into programming any significant project, a Microsoft

    upgrade will come along and basically trash what I've done. The linux world is even worse, its favorite

    database is mysql and that damn thing isn't even ACID ready. Then theres perl, with its proposed upgrade

    path to 6 called the "apocalypse".

     

    The only sturdy programming environment I know of seems to be regular old C, and I'm sure not going

    to be very productive there on a day to day basis.

  • (cs) in reply to James Brantly
    James Brantly:
    Anonymous:

    I poked into some of the underlying tables for MS Sharpoint Portal Server and found, to me, the most ostensibly bad database design.  It had columns named {int01 . . .int32}, {varchar01...varchar32}, {text01...text32}.  That was impressive (impressively bad), but this . . .this is AMAZING!



    I submitted that as a WTF a long time ago. Alex responded that since it was by Microsoft it must have some design reason behind it. There are also other oddities in the Sharepoint database. A number of different naming conventions are used, and there are occasions where a primary key in one table is used as a foreign key in another table under a different column name that seems to have no relation. Unfortunately I do not have the specifics anymore as this was a long time ago, but the Sharepoint db design is definitely a big WTF.

    The only "fundamental design reason" I could think of for doing something like that would be to make it impossible for anyone to access the data using competing software, and I doubt they'd do that... oh wait... 

  • (cs) in reply to Mike Swaim
    Anonymous:

    Anonymous:
    Customers want to be able to define their own new (custom) types. They want to be able to define their own type ('m') and insert instances of the new type anywhere within system.

    Customers want to be able to define their own custom properties on any of the types.

      We can do something like this, using a modified Entity-Attribute-Value schema. We have a table that lists all of the possible attributes with names, descriptions, and data types. We also have an Entity table which corresponds roughly to an object of a given type. (There are also some tables that specify what attributes a given entity might have). Attribute values are stored in a table whose primary key is (entity id, attribute id).

     Are you also applying for a database WTF?

    l.

  • Anon Coward (unregistered) in reply to Mike Swaim
    Anonymous:

    don:
    Just what is it that some people have against relational data? That's only partly rhetorical - seriously, does anyone know?

      In a lot of shops, the DBA/DA is in a seperate group from the application developers, so getting database changes can require running a gauntlet. Rather than fight that battle, it's easier to just stick the data in a string or blob that you control. (That was the reasoning that the Sitara team at Enron used to stick data inside CLOBs. One of the system architects later admitted that they might have made a mistake. They had 2 (two!) databases with extracted data for outside applications to use to get reports, billing data, contact info, whatnot out of.

    Data is duplicated in many real world situations.  In many companies, data from operational systems (order processing, inventory control, etc.) is copied into other tables on a nightly/weekly basis.  These other tables are then used by Crystal Reports to generate reports.  They do this so that the operational systems do not slow down when users run reports.  Along the way, info may be reformatted, certain business rules may be applied, and data from different tables might be combined into single tables to reduce the amount of data manipulation by the Crystal Reports server.

    Ideally, of course, all the info should be in a single place.  However, in order for such a set-up to provide acceptable performance to all users, one needs to get very expensive hardware, OS from the likes of Sun (or IBM) and a big, fat multi-CPU licence from Oracle.  Such a server will have to have the horsepower to handle very high loads during normal working hours and at the end of the month when many reports have to run - and then see minimal load at night.  Hardly optimal.

  • Thornton (unregistered) in reply to l1fel1ne

    WebLogic Integration does this for it's process database. In addition to the problem of not being able to use the data outside of the app server, it causes upgrade problems. If you deploy a modified version of a process, sleeping (serialized) instances cannot be deserialized because the object ID's of the classes don't match. Genius. (WTF, BEA?)

  • nerdguy (unregistered) in reply to Martin
    Anonymous:

    The problem that some people have with relational data, is that it conflicts with the way data are handled in the client

    Class UserAccount {
      Vector<MessageGroup>myMessages; 
    }
    class MessageGroup {
      String title; // Other data here.
      Vector<SingleMessage>allMessagesInThisGroup;
    }
    class SingleMessage {
     String title,Message; // And so on. 
    }
    
    Now to load an UserAccount from the database, including all that users mail, you must do a join between the user, the messagegroup and the SingleMessage table.  Parsing the result of the join on the client is diffucult, and require quite some cpu time, becasue the reply from the database contains redundent data. (And having to writing all that code that transform data to/from a object graph sucks.)
    ...you realize you've written this ENTIRELY backwards. And you probably don't understand OOP either. Sweet jesus.
    class UserAccount {
     long id = 1234567890;
     List MessageGroups = null;
     public List getMessageGroups() {
       return MessageGroup[0..n].getMessages(long userId)
     }
    }
    class MessageGroup {
     long id = 1234567890;
     String title;
     List Messages = null;
     public List getMessages(long userId) {
      from Message where userId=userId and groupId=this.id
     }
    }
    class Message {
     long userId = null;
     long groupId = null
     String title,message;
    }
    
    That's very easily mapped to RDBMS.
  • (cs)
    Anonymous:

    If you are designing a database, and you don't know all the columns in advance, put 20 flexfields in there.  Call them attribute1, 2, etc.  Then when you find you didn't create a column for hair_color, just stick it in attribute1.

    Depending on the general conditions, you might even consider executing an "ALTER TABLE ADD (...)" in run-time, whenever necessary.

    Of course this requires a level of control that some DBAs simply will not grant you. 

  • Plonk (unregistered) in reply to Martin
    Anonymous:

    The best solution would be to use a object orientered database, but nobody have really made a good useable general purpose language neutral oodb, so it's not always a solution. 


    Yeah get rid of that stuff based on relational (set) theory and grounded in maths. Maths is for intellectual weakling who need something robust underpinning their DBMS, not that SQL = Relational anyhow. Where is the mathematical theory underpinning the OO databases?

  • Joe (unregistered)

    Hey Tom,

     The poor guy is just trying to get around the awful impedence mismatch of RDBMS's and programming languages. When will Oracle come up with something better. And don't give me those ridiculous "Object Extensions". Steaming heap. Embedded sql (JDBC,ProC, whatever) is the worst hack that has happened to the industry in the past 25 years. Get off your high horse!!

     The real WTF is why hasn't this technology advanced??

     

  • Tom Kyte's fan (unregistered) in reply to Martin

    Is it so difficult to join??????????? In case, let do that to a DB developer while you have fun with your Java, .net or whatever ;-))

    And you get only the data that you want, no "useless" data...!!!

    Sigh... :-(

  • Tom Kyte's fan (unregistered) in reply to ammoQ
    ammoQ:
    Anonymous:

    If you are designing a database, and you don't know all the columns in advance, put 20 flexfields in there.  Call them attribute1, 2, etc.  Then when you find you didn't create a column for hair_color, just stick it in attribute1.

    Depending on the general conditions, you might even consider executing an "ALTER TABLE ADD (...)" in run-time, whenever necessary.

    Of course this requires a level of control that some DBAs simply will not grant you. 

     Aaahhh, better and better, concurrency-wise almost perfect!....WTF....

     

    captcha: lunch time

  • (cs) in reply to Tom Kyte's fan
    Anonymous:
    ammoQ:
    Anonymous:

    If you are designing a database, and you don't know all the columns in advance, put 20 flexfields in there.  Call them attribute1, 2, etc.  Then when you find you didn't create a column for hair_color, just stick it in attribute1.

    Depending on the general conditions, you might even consider executing an "ALTER TABLE ADD (...)" in run-time, whenever necessary.

    Of course this requires a level of control that some DBAs simply will not grant you. 

     Aaahhh, better and better, concurrency-wise almost perfect!....WTF....

    Normaly not a big deal, since people are not likely to change their datamodel that often. Probably the biggest problem is that packages referencing that table are invalidated.

     

  • Tom Kyte's fan (unregistered) in reply to ammoQ

    Alter table (in a controlled project) OK, runtime NOT OK. It means that a lot of SQL in the packages has to be built dinamically... Another killer for the db.... And all this for what? The users will extend and customize 20 times the LOVs, then, after one year, they will prefer to change the tool. And the db is still there waiting for the next miracle-interface-language-paradigma.

     

    captcha: create materialized view as select * from tom_kyte;

  • DJ (unregistered)

    Ever wondering why Cognos is running so slow?

    Take a look at the content store, a perfect example of the database Tom described. XML stored as blobs, clobs everywhere.

    No wonder... 

  • Axel (unregistered)

    Wow! Well, I had to "convert" 2D Sprits for a game once, and quickly found out the storage format was x86 Assembler (code+data segment) so I just read the code and wrote some of my own to decompile the data segement into a pcx file.

    But imagine conveting such a mess to a relational database. Well, problably just instance the objects using a jvm, and use some orm tool on the resulting class hirachy... an entrprise-scale operation, no doubt. probably works on the first go, being enterprise and all

     ;-)

  • Rich C (unregistered) in reply to lpope187

    One comment - In sql-server at least, selects can often be faster on normalized tables, as the data-pages needed are more likely to be found in RAM, which is many times faster than if it needs to hit the disks.

  • thaJeztah (unregistered)

    Funny, in 2006 this was called a WTF, in 2014 it's called "NoSQL" and very hip!

Leave a comment on “Tom Kyte on The Ultimate Extensibility”

Log In or post as a guest

Replying to comment #:

« Return to Article