- Feature Articles
- CodeSOD
-
Error'd
- Most Recent Articles
- Secret Horror
- Not Impossible
- Monkeys
- Killing Time
- Hypersensitive
- Infallabella
- Doubled Daniel
- It Figures
- Forums
-
Other Articles
- Random Article
- Other Series
- Alex's Soapbox
- Announcements
- Best of…
- Best of Email
- Best of the Sidebar
- Bring Your Own Code
- Coded Smorgasbord
- Mandatory Fun Day
- Off Topic
- Representative Line
- News Roundup
- Editor's Soapbox
- Software on the Rocks
- Souvenir Potpourri
- Sponsor Post
- Tales from the Interview
- The Daily WTF: Live
- Virtudyne
Admin
Hear, hear <g>. The number of IT folks who forget that those lusers pay IT salaries for the sole purpose of getting their work done more efficiently is depressing. If the user has to dedicate large blocks of time to learn about technology for its own sake, we failed.
For folks insisting on using normalized structures instead of stars, imagine you need to query by day-of-month from a table containing billions of rows of timestamped events, a not uncommon requirement, and compute aggregates on the rows. Sure, you can select the computed day of month from the timestamp, if you have 20 or 30 minutes to wait. (Please don't suggest creating an index on a timestamp.)
Or you could create an indexed view of the table, with the supporting structures needed, essentially recreating the structure of a data warehouse but with a much more inefficient design. Other benefits of this approach include saddling OLTP transactions with the overhead of updating these structures in real time, and enabling analytic and reporting tasks to interfere with OLTP so your LOB applications can slow down unpredictably.
Or, you can learn how to use data warehouse techniques, present the structure the business needs in a data warehouse, achieve sub- or unit-second query responses, and decouple analytic processing from the ongoing transaction workload.
Hmmm, I just can't make up my mind. Anyone have any Kool-Aid?
Admin
I think that those who suggest using functions instead of such table, are missing one important thing : a table is a relation, not necessarly a function. You can use relation in many different ways, while functions are usually one-way. Consider a function getDayOfTheWeek(date) and comapre it to the relation between all dates and 7 days of week. Can you use the function to actually find all fridays?
Are the following two queries equally fast: SELECT whatever FROM somewhere WHERE getDayOfTheWeek(day)=5
SELECT whatever FROM somewhere, dates WHERE somewhere.day=dates.day AND dates.day=5
No. And I do not mean that getDayOfTheWeek is slow or something. It's not about comparing constants behind the O(1) notation. It is about the fact, that the first query has to scan all rows, while the other one can first use the index on dates to find only these dates which are fridays, and then looup only those days in
somewhere
.This example is probably not so good as scanning every 1 of 7 rows is not quite fast in reality. But function getMonth -- now the rows which you are looking for form a nice slice on the disk.
Above examples are purely fictional and I am not sure if in these cases join is really faster from using a function. The point is I believe there are situations in which this is faster, if the engine stores data column-wise etc.
Admin
Nothing, since there is no "daylight savings". It's "saving," singular.
Admin
Very astute observation, and that's why a proper relational database should implement functions as virtual tables. You would, of course, still have to implement the inverses, but that's usually not too hard. SQL, alas, isn't that clever.
For a column-oriented system, you're looking for gains when doing scans of huge runs of data; it's a very different mentality than with typical transactional systems.
Admin
Rubbish! IT is the only professional discipline where people expect extremely complex systems to "just work" without any effort whatsoever on their part. Sometimes this is fair, but often it is not. If technology has failed at all, it has failed to manage user expectations so that users have some idea whether their demands are reasonable, or even (in some extreme cases) possible.
FTFY.
Admin
Oh I think you've driven a car, watched TV or played a video game? Fairly complex, those items. I tend to think that IT (app design, actually) may be the only place where people can say that something is hard to use because it does complicated things, and people still let them get away with it.
Admin
I have. It took months of lessons before I could pass my test. And I still wouldnt know how to rebuild a gearbox if it broke.
If these things "just work" for you, you were evidently born with an intuitive knowledge of things that are even now almost beyond my ken. Truly I am an unworthy interlocutor.
Admin
I agree. This sort of stuff makes life easier in data warehousing (though I don't grok the point of all the seq fields).
Hour of Day and Day of Week dimensions? Very useful for aggregate fact tables and such. Month dimensions too.
Admin
Even a table of numbers is extremely useful.
By Adam Machanic, who is a SQL guru: http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/you-require-a-numbers-table.aspx
The links from that page show cool uses of a numbers table. Date tables, as decribed in the original "wtf" here, have similar uses to a numbers table.
Admin
TRUE FALSE NULL
But perhaps you were thinking of an alternative third row?
Admin
To add to the previous posters, this is not a WTF. This is standard practice for data warehousing, as opposed OLTP.
Admin
Nothing wrong with having a Fact date table in a data warehouse. I tend to create the primary key as an int rather than datetime which makes it much easier to use. i.e. 20121128 is the pkid of the table.
Having control of the Fact date makes reporting and populating a data warehouse much easier.
Admin
Finally, a new TDWTF meme to replace "this is standard practice in embedded systems"!