- Feature Articles
- CodeSOD
-
Error'd
- Most Recent Articles
- Secret Horror
- Not Impossible
- Monkeys
- Killing Time
- Hypersensitive
- Infallabella
- Doubled Daniel
- It Figures
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
I would post the frist comment, but I don't have time to dig through the APIs and write the interface to do it.
Admin
With only 24 rows in the "HourOfDay" column, what do they do for daylight savings?
Admin
Why use sql date functions? They might add a new day to the week! This way, the application will continue to work ;-) LOL!
Admin
Maybe they have to change it every time daylight savings changes! That generates work! (It's good right? :P )
Admin
Only 2 fields in hours table? Now that's a real wtf, the following is absolute minimum.
describe hours; +---------------+---------------------+-----+ | Field | Type | Null | Key | +---------------+---------------------+-----+ | hour_id | guid | NO | PRI | | next_hour_ref | guid | NO | | | prev_hour_ref | guid | NO | | | name | varchar(255) | NO | | | value_12h | tinyint(1) | NO | | | value_24h | tinyint(1) | NO | | | is_night_hour | bool | NO | | +---------------+---------------------+-----+
Admin
Seems like there might be aware out. What if drops the table and creates a view by the same name, the view will just be a select from a table valued function that generates the dates?
Still ugly but might swap expensive disk I/O for comparatively cheap CPU.
Admin
One thing ive learnt is that the uglier, more denormalised a DataWarehouse backing table is, the easier the query is to write against the DW.
And for anyone who has written MDX, thats a VERY good thing.
Admin
Wonderful, nearly perfect!
You know how Monday is dreaded because it's the first day of the working week. Suppose we vote to swap Mondays and Tuesdays starting from tomorrow: that table will do the work after a SET, while the rest of the world will have to wait for the next version of their RDBMS.
But something is still missing: the global chronological index! What if we decide that 27-Oct-2002 came before 3-Feb-1998?
CAPTCHA: praesent: indeed, talking about time: what comes after the paest and before the fauture.
Admin
That's pretty standard for datawarehouse tables... everything is pre-calculated, you don't do anything on the fly... plus this table may be small, but the ones with which you join this one may not...
Admin
I completely believe all the tables are very much required for smooth operation of application's report features.
Admin
I know this kind of table very well. Most of the time i encounter it, it was created while creating a OLAP-Cube with Microsoft SQL-Server. The Wizard for creating cubes often notes, that there is no Time-Dimension Table and asks if the user wants to create this table. You can even check which of all those fields you want. It seems to me, this table was created exactly as such.
See this screenshot from the Wizard: [image]
Admin
Admin
This is only true if either you are terrified of joins (because you don't understand them) or you are normalising into the extreme.
I believe the former applies to your case. Generally speaking, join equals denormalise (on demand). When selecting, a denormalised database gives the illusions of being easy to query. However, these queries probably don't consider all of the integrity problems denormalised storage is vulnerable to. Problems you may have to check for, unless someone extremely adept has written the code for anything that modifies data. Writing queries to take into account such possible anomalies may not only be hard, but may be impossible.
In summary, what you mean to say is that you suck at joins and are unable to mentally visualise the intermediate dataset created by joins with ease. You need joins to be done for you before hand. I pity the fool who hires you.
Admin
To all those who are saying this is unnecessary and wtf-worthy, let me tell you this is perfectly standard in a DW world. While this table is missing some columns that I would personally add in (eg. weekday vs weekend), it is still very useful for data warehouses for 2 reasons:
reporting tools can pick this up as a standard dimension very easily and quickly, making it intuitive for the (non-tech) user to get at data. I haven't seen a lot of reporting tools generate good sql or use functions properly, so this is a huge help
It actually makes querying a lot faster as well by avoiding on the fly calculations which can be very tedious if running against a large (think terabyte-size) fact table. This dimension table is small enough to fit in memory, so disk access is not an issue. However, if you're doing on-the-fly calculations or using functions against every row in a large fact table (quite the norm for data warehouses), it will slow down the query by a non-significant amount, besides causing cpu issues.
Admin
No, what I tink he really meant to say is that tables like this are pretty normal for the backing of a time dimension in a data warehouse. Seems like he understands data warehousing much better than you do.
Admin
Actually, one part of that sane. It's useful to store business days in the database because computing holidays can sometimes be iffy. (Especially if you're a multinational, and have to know if an arbitrary date is a business day in different countries.)
Admin
Once there was a programmer who needed some date calculations. So he decided to use a table.
Then he had two problems.
Admin
Not a WTF at all.
This is best practice for a star schema DW.
Admin
Performing a joined query with a SIMPLE "date" table can indeed simplify the exercise tremendously, especially when you're interesting in a range of records being returned but the data is stored with from/to date values... it's the difference between a very simple query and a reasonably painful stored procedure or even application code...
Admin
Admin
This is standard for data warehouses. It only looks like a WTF to people used to working with normalized databases for production application use. Data warehouses are a completely different topic.
Admin
I remember those days... when the cost of 4MB of RAM was roughly equal to 3 hours of contracting wages...
Admin
Admin
This is a perfectly normal thing to do in a data warehouse. Once you've got a few million rows in a fact table, filtering and grouping by DATENAME() functions will get really slow, whereas looking up the values in a date dimension will perform just like the lookups on any other dimension table. It also allows you to capture all the other date-related information which there isn't a function for - public holidays and the like.
Admin
Perfectly sane, except for DST problems.
Admin
The real WTF is that this is on TDWTF.
Date dimension tables are entirely normal and basically a mandatory standard in the data warehousing world. Actually, the problem with this particular table is that it's missing certain important things, such as IsWorkDay, IsHoliday and DST changes. The jury is still out on whether the primary key should be a natural (date) or a surrogate (int) in date dimension tables, as that's one of the major IT holy wars.
Admin
TRWTF is a data warehouse "developer" who doesn't know that pre-calculating all these values makes OLAP queries much quicker, which is the point of a data warehouse.
Admin
For those people that don't understand datawarehousing this table seems redundant but it is not. It will greatly reduce query times.
For those saying that you should not be scared of joins, please google "Snowflake" and you will see it is not a good thing when working with datawarehouses.
Admin
OK, the votes are now in and it looks like the whole of Florida has commented that "this is perfectly normal for data warehousing", twice. So TRWTF is Florida, then?
Admin
You should make all your columns VARCHAR. That way you can handle times like value_24h='16am'.
Admin
I see a time table, but don't see any reference of which train to catch.
Admin
At least 90% of WTFs seem to arise from attempts to make technology "easier" for people who have no aptitude, inclination or intention of learning the first thing about it.
Theres a lesson to be learned there. If only someone smart could work out what it is.
Admin
If you're not using a date table, and you think doing the calculations on the fly is better approach, then you're doing it wrong. This applies to normal database development every bit as much as data warehousing.
Admin
A Time Table inside an OLAP Cube. Is that a nature's harmonic simultaneous 4-day time cube?
Admin
This isn't a WTF...
Admin
My current implementation has several time dimension tables at different levels. So there's a day table, a month table, a quarter table and a year table. The tables get narrower as their scope gets wider. We maintain SQL whose sole purpose is to populate all the columns in all the tables, so if a column is ever added we know what all the values in all the tables will be.
Sure, the day table has 10k or 15k rows, but it mostly gets used during ETL, where most large input tables get rolled to the month level (at their most granular). Users who are crafting reports using the off-the-shelf GUI tool normally only use the quarter and year tables and performance is great for that.
I think these guys went a little off on their columns, but I can't say much because my day table has both the old-school Excel equivalent integer date and the new-school.
Admin
Admin
Your absolutely right. A Date dimensional table is very common and much much faster for large datawarehouse queries. Datawarehousing is very different from OLTP systems and standard rules of good vs bad change a lot.
Also in general using function calls within your sql code is a performance hit especially if its in the where clause, because it causes that function to be run for every row in the table. It's even worse when you nest them in a complex expression because it confuses the query optimizer and you'll often get a bad plan as a result. At least if the function call is within the list of columns to output, you only run it for those rows you are actually returning.
Admin
Someone has obviously never worked on a good Star-Schema Data Warehouse before? A few of those columns are sketchy, but the majority are extremely useful, especially when dealing with a FISCAL calendar rather than a Julian one. Good luck doing SQL Server date functions to calculate the next month's start when months vary from 28 to 42 days in length with no tie to a regular calendar in any way.
Also, the dimension tables for all the time / date related information is to allow Cognos(the real WTF?) or other Business Reporting systems to print User Strings for date/time/hour descriptions instead of having to use intelligent keys and some other hard-wired translation.
Addendum (2012-11-15 10:21): I must say though, I may disagree with the prev_prior_etc columns though. Especially if those are a poor attempt at a slowly changing dimension. Using columns to mimic a fixed size slowly changing dimension is a terrible hack of a solution 99% of the time.
Addendum (2012-11-15 10:27): Or is that prev_prior to connect to 2 years back? They really went to the extreme with the Year over Year over Year over Year reporting columns!
Admin
Last field should be named "is_midnight_hour" so one can do:
' Check in by wpickett ' Code reviewed by scropper do { wait() } until { is_midnight_hour }
me->love("begin to shine")
Admin
It seems obvious, that there is an equivalent need for a truth table with 3 rows in it.
Admin
You clearly have no experience of DataWarehousing whatsoever. This has nothing to do with "Joins". Troll harder next time.
Admin
Admin
This is the first time that i have to comment on a Daily WTF post because i cannot agree with it at all. I'm working in a BI department of a large company and there are several reasons to use this kind of tables.
For the hours of day table one thing you might want to have is a report like that shows the revenue for each hour (or each day) - the problem is that if you have no revenue for an hour all built-in functions (like to_char(date,'hh24') will just ignore it and it will be missing from the resultset. These tables solve the problem.
select hour, sum(revenue) from tblHours left join tblRevenue on to_char(date,'hh24') = hour group by hour
Other aspects that are much easier using these tables:
There are many more of these ...
greetings
Michael
Admin
"Here, add this and pick what you need." That said, the table here is a bit over the top in its WTF-ness.
(And a simple weekday/weekend flag doesn't allow for holidays etc.)
Admin
Exactly. The OP sounds like someone who understands relational databases for OLTP -- in which case this would be WTF code -- but doesn't quite grok a star schema for a data warehouse. The fact that he could write date functions in code is missing two points: 1. The table can be used to populate e.g. dropdown lists in reports, for users. 2. With appropriate indexing the report query may well run faster.
Admin
Having a time/date dimension table in a data warehouse is, as stated, perfectly normal.
This one does appear to be going a bit overboard with columns, though. A lot of those are probably better handled by MDX functions like ParallelPeriod (which introduces its own special brand of leap-year WTFs).
Admin
I once worked on an enterprisey database where I found a table with multiple fields of one was "DayOfWeek" which for some strange reason had values 1 .. 31. A colleague of mine explained:"Yes, erm, we mislabeled that field."
Admin
Yeah, sorry buddy, but you clearly don't have experience in data warehousing. This is precisely how a date dimension should be designed. Using SQL server date functions to filter large volumes of fact data is utter madness. And that's ignoring the fact that things like the definitions for quarters, week numbers, and so forth, is variable and quite complicated to calculate... there's no way I'd want to be doing that at the database level at query time.
Please, if you don't understand this stuff, go read a book or two by Kimball on data warehousing. Because this is fundamental to the way warehouses are designed.
Admin
Of course, I'm trying to learn how to play guitar, so I don't have that problem, and I know that the real purpose of USB ports is connecting Rocksmith cables.