• billr (unregistered) in reply to starsky51

    Hear, hear <g>. The number of IT folks who forget that those lusers pay IT salaries for the sole purpose of getting their work done more efficiently is depressing. If the user has to dedicate large blocks of time to learn about technology for its own sake, we failed.

    For folks insisting on using normalized structures instead of stars, imagine you need to query by day-of-month from a table containing billions of rows of timestamped events, a not uncommon requirement, and compute aggregates on the rows. Sure, you can select the computed day of month from the timestamp, if you have 20 or 30 minutes to wait. (Please don't suggest creating an index on a timestamp.)

    Or you could create an indexed view of the table, with the supporting structures needed, essentially recreating the structure of a data warehouse but with a much more inefficient design. Other benefits of this approach include saddling OLTP transactions with the overhead of updating these structures in real time, and enabling analytic and reporting tasks to interfere with OLTP so your LOB applications can slow down unpredictably.

    Or, you can learn how to use data warehouse techniques, present the structure the business needs in a data warehouse, achieve sub- or unit-second query responses, and decouple analytic processing from the ongoing transaction workload.

    Hmmm, I just can't make up my mind. Anyone have any Kool-Aid?

  • qbolec (unregistered)

    I think that those who suggest using functions instead of such table, are missing one important thing : a table is a relation, not necessarly a function. You can use relation in many different ways, while functions are usually one-way. Consider a function getDayOfTheWeek(date) and comapre it to the relation between all dates and 7 days of week. Can you use the function to actually find all fridays?

    Are the following two queries equally fast: SELECT whatever FROM somewhere WHERE getDayOfTheWeek(day)=5

    SELECT whatever FROM somewhere, dates WHERE somewhere.day=dates.day AND dates.day=5

    No. And I do not mean that getDayOfTheWeek is slow or something. It's not about comparing constants behind the O(1) notation. It is about the fact, that the first query has to scan all rows, while the other one can first use the index on dates to find only these dates which are fridays, and then looup only those days in somewhere.

    This example is probably not so good as scanning every 1 of 7 rows is not quite fast in reality. But function getMonth -- now the rows which you are looking for form a nice slice on the disk.

    Above examples are purely fictional and I am not sure if in these cases join is really faster from using a function. The point is I believe there are situations in which this is faster, if the engine stores data column-wise etc.

  • Meep (unregistered) in reply to JM
    JM:
    With only 24 rows in the "HourOfDay" column, what do they do for daylight savings?

    Nothing, since there is no "daylight savings". It's "saving," singular.

  • Meep (unregistered) in reply to qbolec
    qbolec:
    I think that those who suggest using functions instead of such table, are missing one important thing : a table is a relation, not necessarly a function. You can use relation in many different ways, while functions are usually one-way. Consider a function getDayOfTheWeek(date) and comapre it to the relation between all dates and 7 days of week. Can you use the function to actually find all fridays?

    Are the following two queries equally fast: SELECT whatever FROM somewhere WHERE getDayOfTheWeek(day)=5

    SELECT whatever FROM somewhere, dates WHERE somewhere.day=dates.day AND dates.day=5

    No. And I do not mean that getDayOfTheWeek is slow or something. It's not about comparing constants behind the O(1) notation. It is about the fact, that the first query has to scan all rows, while the other one can first use the index on dates to find only these dates which are fridays, and then looup only those days in somewhere.

    Very astute observation, and that's why a proper relational database should implement functions as virtual tables. You would, of course, still have to implement the inverses, but that's usually not too hard. SQL, alas, isn't that clever.

    Above examples are purely fictional and I am not sure if in these cases join is really faster from using a function. The point is I believe there are situations in which this is faster, if the engine stores data column-wise etc.

    For a column-oriented system, you're looking for gains when doing scans of huge runs of data; it's a very different mentality than with typical transactional systems.

  • David (unregistered) in reply to billr
    billr:
    If the user has to dedicate large blocks of time to learn about technology for its own sake, we failed.

    Rubbish! IT is the only professional discipline where people expect extremely complex systems to "just work" without any effort whatsoever on their part. Sometimes this is fair, but often it is not. If technology has failed at all, it has failed to manage user expectations so that users have some idea whether their demands are reasonable, or even (in some extreme cases) possible.

    billr:
    Hmmm, I just can't make up my mind. Anyone have any snake oil?

    FTFY.

  • billr (unregistered) in reply to David
    David:
    billr:
    If the user has to dedicate large blocks of time to learn about technology for its own sake, we failed.

    Rubbish! IT is the only professional discipline where people expect extremely complex systems to "just work" without any effort whatsoever on their part. Sometimes this is fair, but often it is not. If technology has failed at all, it has failed to manage user expectations so that users have some idea whether their demands are reasonable, or even (in some extreme cases) possible.

    Oh I think you've driven a car, watched TV or played a video game? Fairly complex, those items. I tend to think that IT (app design, actually) may be the only place where people can say that something is hard to use because it does complicated things, and people still let them get away with it.

  • David, who is abject before billr (unregistered) in reply to billr
    billr:
    Oh I think you've driven a car...

    I have. It took months of lessons before I could pass my test. And I still wouldnt know how to rebuild a gearbox if it broke.

    If these things "just work" for you, you were evidently born with an intuitive knowledge of things that are even now almost beyond my ken. Truly I am an unworthy interlocutor.

  • Blorb (unregistered) in reply to JC

    I agree. This sort of stuff makes life easier in data warehousing (though I don't grok the point of all the seq fields).

    Hour of Day and Day of Week dimensions? Very useful for aggregate fact tables and such. Month dimensions too.

  • (cs)

    Even a table of numbers is extremely useful.

    By Adam Machanic, who is a SQL guru: http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/you-require-a-numbers-table.aspx

    The links from that page show cool uses of a numbers table. Date tables, as decribed in the original "wtf" here, have similar uses to a numbers table.

  • Neil (unregistered) in reply to Hegel
    Hegel:
    It seems obvious, that there is an equivalent need for a truth table with 3 rows in it.
    Well, the boring answer would be

    TRUE FALSE NULL

    But perhaps you were thinking of an alternative third row?

  • Jimmy (unregistered)

    To add to the previous posters, this is not a WTF. This is standard practice for data warehousing, as opposed OLTP.

  • anonymous (unregistered) in reply to PM

    Nothing wrong with having a Fact date table in a data warehouse. I tend to create the primary key as an int rather than datetime which makes it much easier to use. i.e. 20121128 is the pkid of the table.

    Having control of the Fact date makes reporting and populating a data warehouse much easier.

  • Uncle Al (unregistered) in reply to Jimmy
    Jimmy:
    To add to the previous posters, this is not a WTF. This is standard practice for data warehousing, as opposed OLTP.

    Finally, a new TDWTF meme to replace "this is standard practice in embedded systems"!

Leave a comment on “The Overloaded Time Table ”

Log In or post as a guest

Replying to comment #:

« Return to Article