Tuesday, May 5, 2009

Choosing SCT Type at Run Time

When a data warehouse is designed, the architect must choose a dimension type for each dimension in the warehouse.  The two must common types identified by Ralph Kimball are the Type 1 SCD, in which changes in dimensional attributes are overwritten.  In the Type 2 SCD, changes are not overwritten, but are recorded in multiple records for each object in the dimension table.  

After using Pentaho's PDI, it become clear that you can specify the dimension type at the attribute level as long as the underlying table is a Type 2 SCD.  Still these decisions need to be made when the dimension table's ETL is being designed.

What happens if you would rather delay the decision until run-time?  This flexibility can be provided with an auxiliary table for each dimension table.  The auxiliary table links each record in the dimension table with the most recent record for that object.  With that view defined, you can easily set up a view for a Type 2 dimension table that behaves like a Type 1 dimension.

I am surprised that I don't find any references to this in standard data warehouse references or in online articles.  If anyone can point out a reference to this, I would be grateful,.

No comments:

Post a Comment