- Feature Articles
- CodeSOD
- Error'd
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Just how old ARE your grandchildren?
Admin
Numbers tables are extremely useful and every good database with relatively complex querying should probably have one. If the primary key of the integers table was an identity "NumberID" or something like that, then it would be a WTF. But if you've never used a numbers table to write an eloquent SQL statement, you'd be amazed how useful they are. If you'd like some examples of the things you can do with a numbers table that requires cursors or other clumsy techniques otherwise, let me know, I'll be happy to direct you to some examples.
Admin
Does anyone here understand what the WTF is????? How many more people need to post "but date tables can be useful" ? 100 ?
Admin
Moose -- congrats on one of the most ignorant posts in the history of this site .. that's quite an accomplishment, I can assure you!
By the way -- you forgot to make the "s" in Microsoft a dollar sign ($) .... shame, shame !!
Admin
Great post, very well put !
Admin
Oh my god. Before I even read it, I looked at the picture and thought I recognized the system... if it wasn't for a few different field names, I would have been totally convinced. Then again, maybe the PhD that I was working with came up with this one too.
Admin
This entire architecture could (SHOULD) be replaced with a User Defined type ("OurDate"), which actually contains a simple well-defined integral value (You could use the same rule SQL Server uses, number of days since 1 Jan 1900, or make up your own) , with an appropriate constraint, and a set of User Defined Functions which take that type as input parameter and then returns:
1) the actual date,
2) the DateName,
3) the DateDescription,
4) Whether it's a holiday, -- This one might be complex, or have special cases...
etc.
etc.
Each table using a date need only have
Admin
This entire architecture could (SHOULD) be replaced with a User Defined type ("OurDate"), which actually contains a simple well-defined integral value (You could use the same rule SQL Server uses, number of days since 1 Jan 1900, or make up your own) , with an appropriate constraint, and a set of User Defined Functions which take that type as input parameter and then returns:
1) the actual date,
2) the DateName,
3) the DateDescription,
4) Whether it's a holiday, -- This one might be complex, or have special cases...
etc.
etc.
Each table using a date need only have instance(s) of this user defined type, properly set, and queries that need access to the ancillary values previously stored in the date table can use the UDFs to generate them on the fly... Performance will be comparable to doing another join/lookup on a separate table.
Admin
And one more note, to comments regarding OLAP implementations, All data in an OLAP star schema is used only when refreshing the internal multi-dimensional aggregate data cube... it is not hit by queries against the data cube. So, even if date tables are commonly used in star schemas, that doesn't mean they are the best way to generate those values (like DateName, Date SDescription, etc.) A well-defined algorithm or rule which generates the same value dynamically can be a much better mechanism even in an OLAP implemnentation.
Admin
Quick and dirty seeding 10years might be overkill but if you had to fill it why not. Go back to bed.
The OLAP app would use this table for anlaysis.....
Admin
Hey, they'll be laughing if April 1st ever gets reassigned to be Smarch 32nd.
This is a classic example of people believing the crap they are taught about normalization being some kind of holy grail.
Admin
I take it your support has been "offshored"?
Admin
Certainly useful yes, but one ought not to have to store a list of integers *on disk* to write eloquent code. It kind of suggests that the language is missing a useful feature. One wants to write
<font face="Courier New">SELECT seq.value, things.name FROM things RIGHT JOIN (1..n) AS seq ON seq.value = things.value;</font>
Where <font face="Courier New">(1..n)</font> is not a real table. Even better would be the same thing but with dates. Even *better* would be the same thing but with external code that generates an arbitrary sequence on the fly.
Admin
You are joking, right ?
This has been a great thread ... Let's try to classify the (non-sarcastic, non-joking) responses so far:
Of course, I suppose I have to add "condescending know-it-alls" to this list to classify my own post I suppose . ... :)
Admin
If you feel that SQL should have this feature, next time just say it, instead of vaguely making fun of integer tables ....
Admin
This "seq" thingy is relatively often usefull and the standard way Oracle guys do it is to do a
"select rownum from some_large_table_or_view";
rownum is a pseudocolumn that simply counts up from 1; for some_large_table_or_view", e.g. the system view ALL_OBJECTS is used.
Admin
Actually, No...
Analogously, if you need your application to store the distances between two points, and therefore the queries need access to trigonometric values, you have two choices... Store them all in a table, at whatever degree of precision is required, or calculate them on the fly as you use them... For Trig functions, if performance is an issue, the answer is obviously to use the table - after all, once you populate the table with the values from 0-45 degrees, there will never be a requirememnt to add more... But for dates, if the algorithm to generate the date name/description, etc is well-defined, the requirement to manage this table may (and obviously in this case, did), outweigh whatever small performance implications this solution might incur.
Admin
That'll work, until the one time all_objects (or whatever table you chose) doesn't have enough rows. You can get around this limitation using pipelined functions:
<FONT size=1><FONT size=1>CREATE OR REPLACE TYPE "INTSET" AS TABLE OF Integer
<FONT size=1>CREATE OR REPLACE FUNCTION ROWS
RETURN IntSet PIPELINED IS
BEGIN
loop
PIPE ROW(</FONT><FONT color=#0000f0 size=1>1</FONT><FONT size=1>);
end loop;
END;
<FONT size=1>create or replace view integers_vw as
select rownum - </FONT><FONT color=#0000f0 size=1>1</FONT><FONT size=1> int from table(rows)
</FONT></FONT><FONT size=3>Then you can do select t.int from integers_vw t where rownum < X, and you'll get X rows, and you'll never run out.</FONT>
<FONT size=3>Oracle can be really cool sometimes.</FONT>
</FONT></FONT>Admin
Just use date.
Yanks eh?
Admin
create function ints( @from int, @to int )
returns @seq table( i int )
as
begin
declare @i int
set @i = @from
while @i <= @to
begin
insert into @seq values ( @i )
set @i = @i + 1
end
return
end
GO
select i from ints( 100, 200 )
-- But having the table stored is probably faster
Admin
I'm mostly a relational guy, but I don't see that the "dates" table is successful in OLAP terms either. For example, you can't ask "how much happened yesterday?" without joining every table. There is a fairly obvious relational solution that should make everybody happy:
- "Transactions" is an implicit entity that should be made
explicit. So, add a table called "transactions" with a date and a
subtype with values like "Reservation", "Payment" and
"Account_Cancellation". The other tables would then replace their
"DateId" columns with a foreign key to the new table.
"Customers.LastPaymentDateId" is now easy to get from the "transactions" table, so take it out of "Customers".
- The "dates" table then becomes what I think it was supposed
to be in the first place, an extension of the "Date" datatype. I bet
that if you tracked it down all it would be is a lookup of holidays, in
which case it would be in a relational domain. SQL (or any other DBMS
AFAIK) does not support relational domains, so you'd probably
implement it as a table called "holidays" or some such. However,
because it is not an entity but a domain, you would be better off
referencing it through a function call, not a foreign key. A SQL domain
in PostgreSQL or a check constraint in Oracle ought to do the trick for
validation, and "WHERE isHoliday(Transaction_Date) = 'N'" for querying.
- Get rid of the damn suffixes. You're not supposed to know if the implementation was a table or view or array.
Now the "how much happened yesterday?" question is easy to answer:Admin
Nice idea, requires Oracle 9i or higher though; why no do it right?
CREATE OR REPLACE FUNCTION intrange(p_from in integer, p_to in integer) return "intset" pipelined is
begin
for i in p_from..p_to loop
pipe row(i);
end loop;
return;
end;
/
SQL> select * from table(intrange(9,14));
COLUMN_VALUE
------------
9
10
11
12
13
14
6 rows selected.
Admin
Actually it is very helpful in terms of data warehouses especially if your integer surrogate keys are ordered along with the actual dates. Consider the querying frontend that uses the underlying date table as the datasource for the input control. They can be setup to be bound to the ID but display the actual date. Therefore when the actual query is submitted, the query is "Select Blah from Yada where OrderDateID between 1000 and 1200". Therefore you bypass the join statements completely.
That coupled with table partitioning and precalculated aggregates in HOLAP and MOLAP, you get extremely fast data retrieval. Most OLAP frontends are tuned to generate optimized sql statements.
Larry
Admin
Surely you jest. That column belongs in a 'couple_of_days' table, and nowhere else!
Admin
<FONT face=Arial color=#0000ff size=4><extremeLaughter/></FONT>
Admin
It depends. Some things you are missing:
1) IsHoliday is *data* -- not a calculation. We've covered this before, but if you feel that you can write a function that calculates all holidays and takes into account exceptions and each company's HR policy and which days of the week things fall on and so on, feel free to present it to us. If you need to store data, you should use a table. This also applies for accounting periods and the like (which change depending on your requirements and often where the 1st of the month falls).
2) we are dealing with relational databases here. Storing data allows for indexes and other things to be used. Calc'ing dates and all of their relevant attributes on the fly will not allow for this. If each date has a "period" assigned to it, I can quickly get all dates returned for a period. If period is a calculation, all dates must be considered and the calculation performed for each before the proper rows can be joined to or retrieved. You could make things a little more efficient by writing dozens of UDF's for all possibilities (GetDatesByperiod, GetDatesByYear, GetDatesByMonth, GetDatesByRange, GetHolidaysByPeriod, GetMonthsByDateRange, GetYearsByDateRange, GetDatesByYear, ... etc ...) but hopefully you will agree a simple indexed table and a few joins work pretty well. I hear that some RDBMS's even have query optimizers and other fancy things that are designed to do this kind of stuff! :)
Admin
"The Goggles: They do nothing!"
Admin
I agree with both yr points... especially the first...
1. To speak to point 1 in this specific example, just having a table with only the holiday DateIDs (or date values) in it would allow for much faster performance, and data-driven management of the list of holidays, without requireing management of the complete list of all dates....
- And the complete system would not be subject to failure because you haven't entered enough dates...
2. To yr second point, I would agree, that if there are such requirements (to generate sets of data based on categories of dates such as you describe), that would be an additional argumemnt for using a table rather than dynamically generating the data...
Admin
Which reminds me - I'm going to check for news on Slashdot & Digg.
Admin
Well, if we're retarded, what about the genius who did this wonderful system and forgot to tell anyone that occasionally the date bin needs to be refilled?
It may be a great idea in some situations, but an even better idea would be for this to be known amongst the people actually running the system - and that includes the support staff who did the "oh, yeah, I guess maybe we didn't put enough dates in last time this happened".
If they were building a data warehouse, they might have mentioned that fact in the documentation, no?
(Captcha = "perfection" ;) )
Admin
I don't believe this is a data warehouse. Which company runs a mission critical data warehouse that needs to be fixed at 3:00 am?
Admin
WTF?
Why would you query a database to find out the value of "tomorrow?" Why would you query a database for the value of something that is well defined, or easy to calculate?
What do you do, update all the fields in the database at midnight everynight?
I can imagine lots of reasons why you want a table of dates, but know that june 7 is tomorrow (june 8 if you are reading this one the 7th, etc) is not one of them.
Admin
Indeed.
And that, sir, is why I read this board. I have learnt something this morning. Sadly it seems to be an Oracle-only feature at the moment, but at least I now know its name. Thank you.
Admin
<font face="Courier New">CREATE OR REPLACE FUNCTION int_list(int4, int4) RETURNS SETOF int4 AS '
DECLARE
counter int4;
start_val ALIAS FOR $1;
end_val ALIAS FOR $2;
BEGIN
FOR counter IN start_val..end_val LOOP
RETURN NEXT counter;
END LOOP;
RETURN;
END;' LANGUAGE plpgsql;
SELECT i FROM int_list(3, 10) i;
</font> seems to do the same thing. Great.
Admin
Last time we had one of these WTFs about databases that run out of dates, I suggested that the right way to handle holidays was one table of holidays and one table of non-holidays. It was a joke, of course.
Looks like WTF is stranger than fiction.
Admin
Not funny, not at all. One of the applications I support has exactly this in its "design".
Admin
At a company I used to work for, I actually avoided this sort of WTF; it involved a valid_year table for a search engine. The design originally had year_id and year_value columns, with year_id columns as a foreign key in other tables. When searching for a range of years in other tables, it made for sequential scan fun. I was able to convince them that a single year_value column in the valid_year table was best.
Admin
I like this solution, and I hope to convert all of my databases to use the same design paradigm in the near future.
So you can add rows to the date table based on the description of a day .. think of the possibilities !!
insert into dates_tb (datedate, datedesc) values ('2006-08-02','Seven and a half weeks later');
So when a business application invariably has to parse a date in english, (eg - "Seven and a half weeks from now"), it can merely query the database rather than try and parse the date.
select dateid from dates_tb where datedesc='Seven and a half weeks from now';
Now, of course, in this example, the exact record wont be found .. since the spelling of that question is slightly different.
Not to worry - we can fix that in the database by adding a TRIGGER to the dates_tb table. On inserting a record into the dates_tb table, the trigger can call a stored procedure to deconstruct the question in the DateDesc field, and generate permuations of this for further insertion.
In our example - the stored procedure shall generate the following permutations of the DateDesc, and (recursively) insert new records for each of these :
'Seven and a half weeks time'
'In seven and a half weeks from now'
'In seven and a half weeks time'
'Just over another seven weeks'
'Around seven or so weeks from now'
'Probably a bit more than seven weeks'
'Almost eight weeks'
'Just a tad under eight weeks, but certainly no later'
'Couple of months is all we need to sort this one out'
'Should be all sorted by the time the next quarter comes around'
'I guarantee that you wont have to wait more than 3 months for this'
... etc
Having generated all those date permutations, the application performing the query is saved from having to calculate the date from the description. Beautiful.
So, at the end of the financial year, when the database server is flat out calculating all of its end of year reports ... the business application has to rapidly report to the company's creditors stating exactly WHEN those outstanding bills can be paid .. and then scheduling a funds transfer with the bank for several million dollars to keep those corporate wolves at bay !!.
insert into funds_transfer_tb (DueDateId, Amount) select (DateId, '1000000') from dates_tb where DateDesc='Seven and a half weeks from now';
Yeah !!! Too easy !!!
And so, the million dollars hits the bank on .. 2nd August 2006.
Hang on ...... Shit !!! Thats too early !!! We need at least 7.5 weeks from the start of the financial year - thats like, 16th Augus, not 2nd.
So what we need to do is add a scheduled job that goes off every week, and increments the DateDate field in the dates_tb table. So that way, at the start of each week, all of those records that we inserted point to the correct date in 7.5 weeks from now - all the freaking time ! Luckily, most SQL engines support date calcluation functions - DESIGNED JUST TO MAKE JOBS LIKE THIS REAL EASY ;-)
update dates_tb set DateDate = date_add(DateDate, interval 7 days);
Great, that completely fixes that problem, and we now have a super-smart date lookup mechanism at the heart of our database .. that stays current all the time. Arnt we freaking clever ;-)
Admin
Someone forgot the month table, the year table, the day table, the hour , minute, and seconde table....
!!!
Admin
:-D :-D :-D :-D :-D
Admin
This pic is as scarey and frightening as the solution itself.
I won't be able to sleep today, or even ever!!!!!!
Admin
you need to look more.
Admin
<FONT style="BACKGROUND-COLOR: #a9a9a9">This is beautiful. I strongly recommend to use this database design to bad employer. Serious!</FONT>
Admin
the real wtf here is that so few people understand that the real wtf is the surrogate key
Admin
Couldent agree more :-) haha
Admin
Admin
the databse is obsolete you say. But the problem is not. Can you tell me what is the current implementation?.
i am having a similar problem and i did a similar design but the higher ups are in no mood to approve the design..
SOS!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
It is a financial module which is to use fisical years, wuater , half years as well as months
Admin
And you can use a stored procedure for that??
Admin
OMG, that response just makes me wanna cry!
Marc.
Admin