• (cs) in reply to DrPizza
    DrPizza:

    ammoQ:
    Anonymous:
    why not use a few extra megabytes and insert an extra 200 years.


    A valid question, but one possible answer is: You wouldn't want to insert those dates with the IsHoliday column incorrectly set, would you? And who knows which holidays our great-grandchildren will have...

    Er, you couldn't insert 10 years if you were worried about that.  Two or three, maybe.


    Just how old ARE your grandchildren?
  • (cs) in reply to dcturner
    dcturner:
    Hee hee hee.

    The Integers Table

    Oh dear.


    Numbers tables are extremely useful and every good database with relatively complex querying should probably have one.  If the primary key of the integers table was an identity "NumberID" or something like that, then it would be a WTF.  But if you've never used a numbers table to write an eloquent SQL statement, you'd be amazed how useful they are.  If you'd like some examples of the things you can do with a numbers table that requires cursors or other clumsy techniques otherwise, let me know, I'll be happy to direct you to some examples.



  • (cs) in reply to sniper
    sniper:
    Anonymous:

    Date tables are fine.  Once you children get around the block a few times, you'll realize that date tables are a very good way to handle things like accounting dates.  (Hint: the problem is, accounting "dates" have little or nothing to do with calendar "dates"- they are simply strings with attributes.)

    The WTF is a business process WTF- why is the IT department defining dates?  The accounting department should be defining dates, and the IT guys merely uploading them into a table. 

    Agreed - I work in a brokerage house, and many of the systems we work with need to use dates that are published by a consortium of businesses. These dates represent logical events, but have neither rhyme nor reason w.r.t. calendar dates; there's absolutely no way to program/calculate them, and you can't buy a subscription to just upload the dates because it isn't sold anywhere - all you can do is cut-n-paste the dates from the annually published list into a special-date table, and use it for lookups.



    Does anyone here understand what the WTF is?????  How many more people need to post "but date tables can be useful" ?  100 ?
  • (cs) in reply to moose
    Anonymous:
    This is my favorite site in the whole wide world.  Having said that, I feel that I must point out that most all these WTF's are microsoft related.  Perhaps the easiest tool is not the best tool?

    paula.getPaula();


    Moose -- congrats on one of the most ignorant posts in the history of this site .. that's quite an accomplishment, I can assure you!

    By the way -- you forgot to make the "s" in Microsoft a dollar sign ($) .... shame, shame !!
  • (cs) in reply to Gnictigezoink
    Anonymous:
    There are reasons to make date tables, and reasons to create surrogate keys, and still today's WTF  manages to be an example of how and why NOT to do either of those things ... that's what elevates it above mere stupidity.


    Great post, very well put !
  • (cs) in reply to Randyd

    Oh my god.  Before I even read it, I looked at the picture and thought I recognized the system...   if it wasn't for a few different field names, I would have been totally convinced.  Then again, maybe the PhD that I was working with came up with this one too.

  • Charles Bretana Jr. (unregistered)

    This entire architecture could (SHOULD) be replaced with a User Defined type ("OurDate"), which actually contains a simple well-defined integral value (You could use the same rule SQL Server uses, number of days since 1 Jan 1900, or make up your own) ,  with an appropriate constraint,  and a set of User Defined Functions which take that type as input parameter and then returns:

    1)  the actual date,
    2)  the DateName,
    3)  the DateDescription,
    4)  Whether it's a holiday,  -- This one might be complex, or have special cases...
    etc.
    etc.
    Each table using a date need only have

  • Charles Bretana Jr. (unregistered) in reply to Charles Bretana Jr.

    This entire architecture could (SHOULD) be replaced with a User Defined type ("OurDate"), which actually contains a simple well-defined integral value (You could use the same rule SQL Server uses, number of days since 1 Jan 1900, or make up your own) ,  with an appropriate constraint,  and a set of User Defined Functions which take that type as input parameter and then returns:

    1)  the actual date,
    2)  the DateName,
    3)  the DateDescription,
    4)  Whether it's a holiday,  -- This one might be complex, or have special cases...
    etc.
    etc.
    Each table using a date need only have instance(s) of this user defined type, properly set, and queries that need access to the ancillary values previously stored in the date table can use the UDFs to generate them on the fly...  Performance will be comparable to doing another join/lookup on a separate table.

  • (cs) in reply to Charles Bretana Jr.

    And one more note,  to comments regarding OLAP implementations,  All data in an OLAP star schema is used only when refreshing the internal multi-dimensional aggregate data cube... it is not hit by queries against the data cube.  So, even if date tables are commonly used in star schemas, that doesn't mean they are the best way to generate those values (like DateName, Date SDescription, etc.)  A well-defined algorithm or rule which generates the same value dynamically can be a much better mechanism even in an OLAP implemnentation. 

  • anon (unregistered) in reply to vomjom
    Anonymous:
    You guys are all retarded.  Since you guys are clearly ignorant on the topic of Data Warehouse design, I suggest you start by reading "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling".

    The date dimension is important for storing fiscal, holiday, quarterly, etc... information along with the date.  Every business has different important fiscal days and holidays, so where else would you store that information besides a table?  Remember: THIS TABLE IS NOT DESIGNED FOR LIVE PURPOSES.  It is an OLAP design, only to be used for analysis.

    For those who commented on the DateId: Kimball suggests that the DateId actually be stored in sequence with the dates, although this isn't necessarily a requirement.  Usually data warehouse information is aggregated along with a dimension's columns, for example "How much sales did we have in the last fiscal quarter?" which would necessitate a join on the Date table almost all the time.

    Also, for the comment on the naming convention regarding DateDate: would you suggest it be called Date instead and possibly clash with an internal SQL function?  What else would you name it?
    datedate isnot a good name. THE Design is fine or use just the Oracle Date trunc to midnite. as the PK.  The real Crime was not SEEDing the table. to begin with enough years
    Quick and dirty seeding 10years might be overkill but if you had to fill it why not.  Go back to bed.

    The OLAP app would use this table for anlaysis.....


  • mathew (unregistered)

    Hey, they'll be laughing if April 1st ever gets reassigned to be Smarch 32nd.

    This is a classic example of people believing the crap they are taught about normalization being some kind of holy grail.

  • (cs) in reply to none given
    Anonymous:
    Last time I had a problem like this, I wrote a utility for first-level support that lets them automatically add new dates by typing the numbers 4, 8, 15, 16, 23 and 42 into a console every 108 minutes...


    I take it your support has been "offshored"?


  • (cs) in reply to Jeff S
    Jeff S:
    dcturner:
    Hee hee hee.

    The Integers Table

    Oh dear.


    Numbers tables are extremely useful and every good database with relatively complex querying should probably have one.  If the primary key of the integers table was an identity "NumberID" or something like that, then it would be a WTF.  But if you've never used a numbers table to write an eloquent SQL statement, you'd be amazed how useful they are.


    Certainly useful yes, but one ought not to have to store a list of integers *on disk* to write eloquent code. It kind of suggests that the language is missing a useful feature. One wants to write

    <font face="Courier New">SELECT seq.value, things.name FROM things RIGHT JOIN (1..n) AS seq ON seq.value = things.value;</font>

    Where <font face="Courier New">(1..n)</font> is not a real table. Even better would be the same thing but with dates. Even *better* would be the same thing but with external code that generates an arbitrary sequence on the fly.
  • (cs) in reply to Charles Bretana Jr.
    Anonymous:

    This entire architecture could (SHOULD) be replaced with a User Defined type ("OurDate"), which actually contains a simple well-defined integral value (You could use the same rule SQL Server uses, number of days since 1 Jan 1900, or make up your own) ,  with an appropriate constraint,  and a set of User Defined Functions which take that type as input parameter and then returns:

    1)  the actual date,
    2)  the DateName,
    3)  the DateDescription,
    4)  Whether it's a holiday,  -- This one might be complex, or have special cases...
    etc.
    etc.
    Each table using a date need only have instance(s) of this user defined type, properly set, and queries that need access to the ancillary values previously stored in the date table can use the UDFs to generate them on the fly...  Performance will be comparable to doing another join/lookup on a separate table.



    You are joking, right ? 

    This has been a great thread ... Let's try to classify the (non-sarcastic, non-joking) responses so far:

    • Database "newbie":  "Date tables are stupid.  Who would create a table full of dates for any purpose?"
    • Experienced database admin/developer: "Date tables can be very useful, but creating integer DateID's to represent all dates throughout your system is not a good design and the column names are either silly (DateDate) or vague (what does the 'Name' or 'Description' of a date represent?). Plus the fact that this date table is not widely known and the UI does not provide a way to easily update or view this table apparently."
    • Data Warehousing "expert": "Data warehouses use ID's to represent entities, so this looks good to me.  Anyone who knows anything about databases knows that *everything* must be represented by an 'ID'."
    • WTF artist:  "What they should have done instead is ...." (Suggestions that are convoluted and even worse  .... triggers ???? user defined functions ????)
    • Ignorant Script Kiddie: "This bad design is the fault of Micro$oft / normalization / VB / Access / Republicans / Aliens / etc ....

    Of course, I suppose I have to add "condescending know-it-alls" to this list to classify my own post I suppose . ... :)
  • (cs) in reply to dcturner
    dcturner:
    Jeff S:
    dcturner:
    Hee hee hee.

    The Integers Table

    Oh dear.


    Numbers tables are extremely useful and every good database with relatively complex querying should probably have one.  If the primary key of the integers table was an identity "NumberID" or something like that, then it would be a WTF.  But if you've never used a numbers table to write an eloquent SQL statement, you'd be amazed how useful they are.


    Certainly useful yes, but one ought not to have to store a list of integers *on disk* to write eloquent code. It kind of suggests that the language is missing a useful feature. One wants to write

    <font face="Courier New">SELECT seq.value, things.name FROM things RIGHT JOIN (1..n) AS seq ON seq.value = things.value;</font>

    Where <font face="Courier New">(1..n)</font> is not a real table. Even better would be the same thing but with dates. Even *better* would be the same thing but with external code that generates an arbitrary sequence on the fly.


    If you feel that SQL should have this feature, next time just say it, instead of vaguely making fun of integer tables ....



  • (cs) in reply to dcturner
    dcturner:
    Jeff S:
    dcturner:
    Hee hee hee.

    The Integers Table

    Oh dear.


    Numbers tables are extremely useful and every good database with relatively complex querying should probably have one.  If the primary key of the integers table was an identity "NumberID" or something like that, then it would be a WTF.  But if you've never used a numbers table to write an eloquent SQL statement, you'd be amazed how useful they are.


    Certainly useful yes, but one ought not to have to store a list of integers *on disk* to write eloquent code. It kind of suggests that the language is missing a useful feature. One wants to write

    <font face="Courier New">SELECT seq.value, things.name FROM things RIGHT JOIN (1..n) AS seq ON seq.value = things.value;</font>

    Where <font face="Courier New">(1..n)</font> is not a real table. Even better would be the same thing but with dates. Even *better* would be the same thing but with external code that generates an arbitrary sequence on the fly.


    This "seq" thingy is relatively often usefull and the standard way Oracle guys do it is to do a
    "select rownum from some_large_table_or_view";
    rownum is a pseudocolumn that simply counts up from 1; for some_large_table_or_view", e.g.  the system view ALL_OBJECTS is used.
  • (cs) in reply to Jeff S

    Actually, No...

        Analogously, if you need your application to store the distances between two points, and therefore the queries need access to trigonometric values, you have two choices... Store them all in a table, at whatever degree of precision is required, or calculate them on the fly as you use them... For Trig functions, if performance is an issue, the answer is obviously to use the table - after all, once you populate the table with the values from 0-45 degrees, there will never be a requirememnt to add more... But for dates, if the algorithm to generate the date name/description, etc is well-defined, the requirement to manage this table may (and obviously in this case, did), outweigh whatever small performance implications this solution might incur.  

    Jeff S:
    Anonymous:

    This entire architecture could (SHOULD) be replaced with a User Defined type ("OurDate"), which actually contains a simple well-defined integral value (You could use the same rule SQL Server uses, number of days since 1 Jan 1900, or make up your own) ,  with an appropriate constraint,  and a set of User Defined Functions which take that type as input parameter and then returns:

    1)  the actual date,
    2)  the DateName,
    3)  the DateDescription,
    4)  Whether it's a holiday,  -- This one might be complex, or have special cases...
    etc.
    etc.
    Each table using a date need only have instance(s) of this user defined type, properly set, and queries that need access to the ancillary values previously stored in the date table can use the UDFs to generate them on the fly...  Performance will be comparable to doing another join/lookup on a separate table.



    You are joking, right ? 

    This has been a great thread ... Let's try to classify the (non-sarcastic, non-joking) responses so far:

    • Database "newbie":  "Date tables are stupid.  Who would create a table full of dates for any purpose?"
    • Experienced database admin/developer: "Date tables can be very useful, but creating integer DateID's to represent all dates throughout your system is not a good design and the column names are either silly (DateDate) or vague (what does the 'Name' or 'Description' of a date represent?). Plus the fact that this date table is not widely known and the UI does not provide a way to easily update or view this table apparently."
    • Data Warehousing "expert": "Data warehouses use ID's to represent entities, so this looks good to me.  Anyone who knows anything about databases knows that *everything* must be represented by an 'ID'."
    • WTF artist:  "What they should have done instead is ...." (Suggestions that are convoluted and even worse  .... triggers ???? user defined functions ????)
    • Ignorant Script Kiddie: "This bad design is the fault of Micro$oft / normalization / VB / Access / Republicans / Aliens / etc ....

    Of course, I suppose I have to add "condescending know-it-alls" to this list to classify my own post I suppose . ... :)
  • Jonathan Levy (unregistered) in reply to ammoQ

    That'll work, until the one time all_objects (or whatever table you chose) doesn't have enough rows.  You can get around this limitation using pipelined functions:

    <FONT size=1><FONT size=1>CREATE OR REPLACE TYPE "INTSET" AS TABLE OF Integer

    <FONT size=1>CREATE OR REPLACE FUNCTION ROWS
    RETURN IntSet PIPELINED IS
    BEGIN
    loop
    PIPE
    ROW(</FONT><FONT color=#0000f0 size=1>1</FONT><FONT size=1>);
    end loop;
    END;

    <FONT size=1>create or replace view integers_vw as
    select rownum - </FONT><FONT color=#0000f0 size=1>1</FONT><FONT size=1> int from table(rows)


    </FONT></FONT><FONT size=3>Then you can do select t.int from integers_vw t where rownum < X, and you'll get X rows, and you'll never run out.</FONT>

    <FONT size=3>Oracle can be really cool sometimes.</FONT>



    </FONT></FONT>
  • TheDoom (unregistered) in reply to Bullet

    Just use date.

    Yanks eh?

  • Gnictigezoink (unregistered) in reply to Jeff S

    create function ints( @from int, @to int )
    returns @seq table( i int )
    as
    begin
        declare @i int
        set @i = @from
        while @i <= @to
            begin
                insert into @seq values ( @i )
                set @i = @i + 1
            end
        return
    end

    GO

    select i from ints( 100, 200 )

    -- But having the table stored is probably faster

  • (cs) in reply to TheDoom

    I'm mostly a relational guy, but I don't see that the "dates" table is successful in OLAP terms either. For example, you can't ask "how much happened yesterday?" without joining every table. There is a fairly obvious relational solution that should make everybody happy:

    • "Transactions" is an implicit entity that should be made explicit. So, add a table called "transactions" with a date and a subtype with values like "Reservation", "Payment" and "Account_Cancellation". The other tables would then replace their "DateId" columns with a foreign key to the new table. "Customers.LastPaymentDateId" is now easy to get from the "transactions" table, so take it out of "Customers".
    • The "dates" table then becomes what I think it was supposed to be in the first place, an extension of the "Date" datatype. I bet that if you tracked it down all it would be is a lookup of holidays, in which case it would be in a relational domain. SQL (or any other DBMS AFAIK) does not support relational domains, so you'd probably implement it as a table called "holidays" or some such. However, because it is not an entity but a domain, you would be better off referencing it through a function call, not a foreign key. A SQL domain in PostgreSQL or a check constraint in Oracle ought to do the trick for validation, and "WHERE isHoliday(Transaction_Date) = 'N'" for querying.
    • Get rid of the damn suffixes. You're not supposed to know if the implementation was a table or view or array.
    Now the "how much happened yesterday?" question is easy to answer:
    SELECT type, COUNT(*) AS trx_count
    FROM transactions
    WHERE trx_date BETWEEN :start_date AND :end_date
    GROUP BY type

  • (cs) in reply to Jonathan Levy
    Anonymous:

    That'll work, until the one time all_objects (or whatever table you chose) doesn't have enough rows.  You can get around this limitation using pipelined functions:



    Nice idea, requires Oracle 9i or higher though; why no do it right?

    CREATE OR REPLACE FUNCTION intrange(p_from in integer, p_to in integer) return "intset" pipelined is
    begin
    for i in p_from..p_to loop
    pipe row(i);
    end loop;
    return;
    end;
    /

    SQL> select * from table(intrange(9,14));

    COLUMN_VALUE
    ------------
               9
              10
              11
              12
              13
              14

    6 rows selected.


  • (cs) in reply to RyuO
    RyuO:
    I'm mostly a relational guy, but I don't see that the "dates" table is successful in OLAP terms either. For example, you can't ask "how much happened yesterday?" without joining every table.


    Actually it is very helpful in terms of data warehouses especially if your integer surrogate keys are ordered along with the actual dates.  Consider the querying frontend that uses the underlying date table as the datasource for the input control.  They can be setup to be bound to the ID but display the actual date.  Therefore when the actual query is submitted, the query is "Select Blah from Yada where OrderDateID between 1000 and 1200".  Therefore you bypass the join statements completely. 

    That coupled with table partitioning and precalculated aggregates in HOLAP and MOLAP, you get extremely fast data retrieval.  Most OLAP frontends are tuned to generate optimized sql statements.

    Larry
  • Remco G (unregistered) in reply to Pvulmo
    Anonymous:
    The obvious enterprise solution is to refactor the table into three tables - year, month and day.  Finally, an appropriate place for the 'IsWeekend' column.

    Surely you jest. That column belongs in a 'couple_of_days' table, and nowhere else!
  • Jobu (unregistered) in reply to Mike5

    <FONT face=Arial color=#0000ff size=4><extremeLaughter/></FONT>

  • (cs) in reply to Bullet
    Bullet:

    Actually, No...

        Analogously, if you need your application to store the distances between two points, and therefore the queries need access to trigonometric values, you have two choices... Store them all in a table, at whatever degree of precision is required, or calculate them on the fly as you use them... For Trig functions, if performance is an issue, the answer is obviously to use the table - after all, once you populate the table with the values from 0-45 degrees, there will never be a requirememnt to add more... But for dates, if the algorithm to generate the date name/description, etc is well-defined, the requirement to manage this table may (and obviously in this case, did), outweigh whatever small performance implications this solution might incur.  



    It depends.  Some things you are missing:

    1) IsHoliday is *data* -- not a calculation.  We've covered this before, but if you feel that you can write a function that calculates all holidays and takes into account exceptions and each company's HR policy and which days of the week things fall on and so on, feel free to present it to us.  If you need to store data, you should use a table.  This also applies for accounting periods and the like (which change depending on your requirements and often where the 1st of the month falls).

    2) we are dealing with relational databases here.  Storing data allows for indexes and other things to be used.  Calc'ing dates and all of their relevant attributes on the fly will not allow for this.  If each date has a "period" assigned to it, I can quickly get all dates returned for a period.  If period is a calculation, all dates must be considered and the calculation performed for each before the proper rows can be joined to or retrieved.  You could make things a little more efficient by writing dozens of UDF's for all possibilities (GetDatesByperiod, GetDatesByYear, GetDatesByMonth, GetDatesByRange, GetHolidaysByPeriod, GetMonthsByDateRange, GetYearsByDateRange, GetDatesByYear, ... etc ...) but hopefully you will agree a simple indexed table and a few joins work pretty well.  I hear that some RDBMS's even have query optimizers and other fancy things that are designed to do this kind of stuff! :)




  • ObSimpsons (unregistered) in reply to meh

    "The Goggles:  They do nothing!"


  • (cs) in reply to Jeff S

    I agree with both yr points... especially the first... 

    1.   To speak to point 1 in this specific example,  just having a table with only the holiday DateIDs (or date values)  in it would allow for much faster performance, and data-driven management of the list of holidays,  without requireing management of the complete list of all dates....
        - And the complete system would not be subject to failure because you haven't entered enough dates...

    2.  To yr second point, I would agree, that if there are such requirements (to generate sets of data based on categories of dates such as you describe), that would be an additional argumemnt for using a table rather than dynamically generating the data...

    Jeff S:
    Bullet:

    Actually, No...

        Analogously, if you need your application to store the distances between two points, and therefore the queries need access to trigonometric values, you have two choices... Store them all in a table, at whatever degree of precision is required, or calculate them on the fly as you use them... For Trig functions, if performance is an issue, the answer is obviously to use the table - after all, once you populate the table with the values from 0-45 degrees, there will never be a requirememnt to add more... But for dates, if the algorithm to generate the date name/description, etc is well-defined, the requirement to manage this table may (and obviously in this case, did), outweigh whatever small performance implications this solution might incur.  



    It depends.  Some things you are missing:

    1) IsHoliday is *data* -- not a calculation.  We've covered this before, but if you feel that you can write a function that calculations all holidays and takes into account exceptions and each companies definition and other things, feel free to present it to us.  If you need to store data, you should use a table.  This also applies for accounting periods and the like (which change depending on your requirements and often where the 1st of the month falls).

    2) we are dealing with relational databases here.  Storing data allows for indexes and other things to be used.  Calc'ing dates and all of their relevant attributes on the fly will not allow for this.  If each date has a "period" assigned to it, I can quickly get all dates returned for a period.  If period is a calculation, all dates must be considered and the calculation performed for each before the proper rows can be joined to or retrieved.  You could make things a little more efficient by writing dozens of UDF's for all possibilities (GetDatesByperiod, GetDatesByYear, GetDatesByMonth, GetDatesByRange, GetHolidaysByPeriod, GetMonthsByDateRange, GetYearsByDateRange, GetDatesByYear, ... etc ...) but hopefully you will agree a simple indexed table works pretty well.




  • (cs) in reply to Jeff S
    Jeff S:
    • WTF artist:  "What they should have done instead is ...." (Suggestions that are convoluted and even worse  .... triggers ???? user defined functions ????)
    You got my number. Except, well, I was joking.
    Jeff S:
    • Ignorant Script Kiddie: "This bad design is the fault of Micro$oft / normalization / VB / Access / Republicans / Aliens / etc ....

    Of course, I suppose I have to add "condescending know-it-alls" to this list to classify my own post I suppose . ... :)

    Which reminds me - I'm going to check for news on Slashdot & Digg.

  • Anony Moose (unregistered) in reply to vomjom

    Anonymous:
    You guys are all retarded.  Since you guys are clearly ignorant on the topic of Data Warehouse design, I suggest you start by reading "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling".

    Well, if we're retarded, what about the genius who did this wonderful system and forgot to tell anyone that occasionally the date bin needs to be refilled?

    It may be a great idea in some situations, but an even better idea would be for this to be known amongst the people actually running the system - and that includes the support staff who did the "oh, yeah, I guess maybe we didn't put enough dates in last time this happened".

    If they were building a data warehouse, they might have mentioned that fact in the documentation, no?

    (Captcha = "perfection"  ;) )

  • (cs) in reply to Anony Moose

    I don't believe this is a data warehouse. Which company runs a mission critical data warehouse that needs to be fixed at 3:00 am?

  • (cs) in reply to Frustrated Banking Programmer
    Anonymous:

    Now imagine that you have to support these rediculously obtuse rules across dozens of applications written in everything from Cobol to Java to C# to Lisp (I kid you not).  It suddenly becomes a lot easier to be able to query the database for the next business date or previous month-end.  And I haven't even brought up the slightly sticky situations (like vacations) or the very sticky situations (like vacations across multiple regions) or the horrendous situations (like daylight savings time across multiple regions).

    WTF?

    Why would you query a database to find out the value of "tomorrow?" Why would you query a database for the value of something that is well defined, or easy to calculate?

    What do you do, update all the fields in the database at midnight everynight?

     

    I can imagine lots of reasons why you want a table of dates, but know that june 7 is tomorrow (june 8 if you are reading this one the 7th, etc) is not one of them.

  • (cs) in reply to Jonathan Levy
    Jonathan Levy:
    That'll work, until the one time all_objects (or whatever table you chose) doesn't have enough rows.

    Indeed.

    Jonathan Levy:
    You can get around this limitation using pipelined functions.

    And that, sir, is why I read this board. I have learnt something this morning. Sadly it seems to be an Oracle-only feature at the moment, but at least I now know its name. Thank you.
  • (cs) in reply to dcturner
    dcturner:
    Jonathan Levy:
    You can get around this limitation using pipelined functions.

    And that, sir, is why I read this board. I have learnt something this morning. Sadly it seems to be an Oracle-only feature at the moment, but at least I now know its name. Thank you.
    I take that back. In PostgreSQL

    <font face="Courier New">CREATE OR REPLACE FUNCTION int_list(int4, int4) RETURNS SETOF int4 AS '
    DECLARE
            counter int4;
            start_val ALIAS FOR $1;
            end_val ALIAS FOR $2;
    BEGIN
            FOR counter IN start_val..end_val LOOP
                    RETURN NEXT counter;
            END LOOP;
            RETURN;
    END;' LANGUAGE plpgsql;

    SELECT i FROM int_list(3, 10) i;


    </font> seems to do the same thing. Great.
  • John Hensley (unregistered)

    Last time we had one of these WTFs about databases that run out of dates, I suggested that the right way to handle holidays was one table of holidays and one table of non-holidays. It was a joke, of course.

    Looks like WTF is stranger than fiction.

  • Gabe (unregistered)

    Not funny, not at all. One of the applications I support has exactly this in its "design".

  • Drone (unregistered)

    At a company I used to work for, I actually avoided this sort of WTF; it involved a valid_year table for a search engine. The design originally had year_id and year_value columns, with year_id columns as a foreign key in other tables. When searching for a range of years in other tables, it made for sequential scan fun. I was able to convince them that a single year_value column in the valid_year table was best.

  • SteveOC (unregistered)

    I like this solution, and I hope to convert all of my databases to use the same design paradigm in the near future.

    So you can add rows to the date table based on the description of a day .. think of the possibilities !!

    insert into dates_tb (datedate, datedesc) values ('2006-08-02','Seven and a half weeks later');

    So when a business application invariably has to parse a date in english, (eg - "Seven and a half weeks from now"), it can merely query the database rather than try and parse the date.

    select dateid from dates_tb where datedesc='Seven and a half weeks from now';

    Now, of course, in this example, the exact record wont be found .. since the spelling of that question is slightly different.

    Not to worry - we can fix that in the database by adding a TRIGGER to the dates_tb table. On inserting a record into the dates_tb table, the trigger can call a stored procedure to deconstruct the question in the DateDesc field, and generate permuations of this for further insertion.

    In our example - the stored procedure shall generate the following permutations of the DateDesc, and (recursively) insert new records for each of these :
    'Seven and a half weeks time'
    'In seven and a half weeks from now'
    'In seven and a half weeks time'
    'Just over another seven weeks'
    'Around seven or so weeks from now'
    'Probably a bit more than seven weeks'
    'Almost eight weeks'
    'Just a tad under eight weeks, but certainly no later'
    'Couple of months is all we need to sort this one out'
    'Should be all sorted by the time the next quarter comes around'
    'I guarantee that you wont have to wait more than 3 months for this'
    ... etc

    Having generated all those date permutations, the application performing the query is saved from having to calculate the date from the description. Beautiful.

    So, at the end of the financial year, when the database server is flat out calculating all of its end of year reports ... the business application has to rapidly report to the company's creditors stating exactly WHEN those outstanding bills can be paid .. and then scheduling a funds transfer with the bank for several million dollars to keep those corporate wolves at bay !!.

    insert into funds_transfer_tb (DueDateId, Amount) select (DateId, '1000000') from dates_tb where DateDesc='Seven and a half weeks from now';

    Yeah !!!  Too easy !!!

    And so, the million dollars hits the bank on .. 2nd August 2006.

    Hang on ......   Shit !!!  Thats too early !!!   We need at least 7.5 weeks from the start of the financial year - thats like, 16th Augus, not 2nd.

    So what we need to do is add a scheduled job that goes off every week, and increments the DateDate field in the dates_tb table.  So that way, at the start of each week, all of those records that we inserted point to the correct date in 7.5 weeks from now - all the freaking time ! Luckily, most SQL engines support date calcluation functions  - DESIGNED JUST TO MAKE JOBS LIKE THIS REAL EASY ;-)

    update dates_tb set DateDate = date_add(DateDate, interval 7 days);

    Great, that completely fixes that problem, and we now have a super-smart date lookup mechanism at the heart of our database .. that stays current all the time. Arnt we freaking clever ;-)

  • franck (unregistered) in reply to Drive-by poster

    Someone forgot the month table, the year table, the day table, the hour , minute, and seconde table....

    !!!


  • dba (unregistered) in reply to ammoQ
    ammoQ:
    Apparently nobody sees the real value of this solution.
    Let's say you ask a customer when he will pay and his answer is "when hell freezes over" .
    In a normal database, all you can do is entering "31.12.2199" or null or some other magical value.
    This system allows to enter a row into dates_tb with DateDesc="when hell freezes over" and DateDate=null until you find out when hell will freeze over.



    :-D :-D :-D :-D :-D
  • I am not a robot (unregistered)
    Alex Papadimoulis:


    This pic is as scarey and frightening as the solution itself.
    I won't be able to sleep today, or even ever!!!!!!
  • Algirdas Kepezinskas (unregistered) in reply to ChiefCrazyTalk
    Anonymous:

    What...The....F.....

    This is easily the worst WTF I've seen.

    you need to look more.

  • Holy Gauss (unregistered) in reply to Algirdas Kepezinskas

    <FONT style="BACKGROUND-COLOR: #a9a9a9">This is beautiful. I strongly recommend to use this database design to bad employer. Serious!</FONT>

     

  • r937 (unregistered) in reply to Jeff S

    Jeff S:
    Does anyone here understand what the WTF is????? 
    agreed

    the real wtf here is that so few people understand that the real wtf is the surrogate key

  • 0x09 (unregistered) in reply to Tei

    Couldent agree more :-) haha

  • Jeff (unregistered) in reply to vomjom
    <font color="#a9a9a9">You guys are all retarded. Since you guys are clearly ignorant on the topic of Data Warehouse design, I suggest you start by reading "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling". The date dimension is important for storing fiscal, holiday, quarterly, etc... information along with the date. Every business has different important fiscal days and holidays, so where else would you store that information besides a table? Remember: THIS TABLE IS NOT DESIGNED FOR LIVE PURPOSES. It is an OLAP design, only to be used for analysis. For those who commented on the DateId: Kimball suggests that the DateId actually be stored in sequence with the dates, although this isn't necessarily a requirement. Usually data warehouse information is aggregated along with a dimension's columns, for example "How much sales did we have in the last fiscal quarter?" which would necessitate a join on the Date table almost all the time. Also, for the comment on the naming convention regarding DateDate: would you suggest it be called Date instead and possibly clash with an internal SQL function? What else would you name it?</font>
    I really hope I've missed some sarcasm in your post... Otherwise, are you seriously suggesting that in data warehousing solutions you can't use the date associated with records in, say, the reservations table had that been stored as a date and not some arbitrary numeric lookup value? And rather than DateDate, and not using Date as it would clash with the SQL reserved word, maybe <applicationName>Date, or something alittle more descriptive? That one took me a whopping .5 seconds to come up with. I'll bet if I spent 3-4 seconds I could come up with an even better field name. </applicationName>
  • K S Chanchal (unregistered) in reply to Jojosh_the_Pi

    the databse is obsolete you say. But the problem is not. Can you tell me what is the current implementation?.

     

    i am having a similar problem and i did a similar design  but the higher ups are in no mood to approve the design..

     

     

    SOS!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

     

     

    It is a financial module  which is to use fisical years, wuater , half years as well as months

  • (cs) in reply to Frustrated Banking Programmer
    Anonymous:

    You know, these kinds of tables are heavily used in banking.

    So let's say, in your infinite wisdom, that you used a Date column for the date for, oh, I don't know, the day a futures order is supposed to expire (can you guess what industry I worked in).  Now, you need to generate reports for the traders stating what orders are expiring in the next 1, 2 and 3 days.

    The beautiful part about working with financials people is that words like "tomorrow" and "in 2 days" don't mean "tomorrow", and they sure as hell don't mean "in 2 days".  They mean "business day + 1" and "business day + 2" where the business day rolls at 4:00 PM for one trading floor, 5:00 PM for another trading floor, and 6:00 AM for the overnight market.

    Now imagine that you have to support these rediculously obtuse rules across dozens of applications written in everything from Cobol to Java to C# to Lisp (I kid you not).  It suddenly becomes a lot easier to be able to query the database for the next business date or previous month-end.  And I haven't even brought up the slightly sticky situations (like vacations) or the very sticky situations (like vacations across multiple regions) or the horrendous situations (like daylight savings time across multiple regions).

    The real WTF is the financial industry

     And you can use a stored procedure for that??

  • Marc (unregistered)

    OMG, that response just makes me wanna cry!

     Marc.

  • Douglas Jones (unregistered) in reply to vomjom
    vomjom:
    You guys are all retarded.  Since you guys are clearly ignorant on the topic of Data Warehouse design, I suggest you start by reading "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling".The date dimension is important for storing fiscal, holiday, quarterly, etc... information along with the date.  Every business has different important fiscal days and holidays, so where else would you store that information besides a table?  Remember: THIS TABLE IS NOT DESIGNED FOR LIVE PURPOSES.  It is an OLAP design, only to be used for analysis.For those who commented on the DateId: Kimball suggests that the DateId actually be stored in sequence with the dates, although this isn't necessarily a requirement.  Usually data warehouse information is aggregated along with a dimension's columns, for example "How much sales did we have in the last fiscal quarter?" which would necessitate a join on the Date table almost all the time.Also, for the comment on the naming convention regarding DateDate: would you suggest it be called Date instead and possibly clash with an internal SQL function?  What else would you name it?

Leave a comment on “Journey to the Center of the Database”

Log In or post as a guest

Replying to comment #:

« Return to Article