Fact table consists of Measurements or facts of business process. It is centeralized table in star schema, called FACT. A fact table typically has two types of columns: those that contain facts and those that are foreign keys to dimension tables. The primary key of a fact table is usually a composite key that is made up of all of its foreign keys. Fact table contains the content of the Datawarehouse. A fact table might contain either detail level facts or facts that have been aggregated (fact tables that contain aggregated facts are often instead called summary tables). In the real world, it is possible to have a fact table that contains no measures or facts. These tables are called as Factless Fact tables.
Factless Fact means only the key available in the Fact and there are no measures available. Factless fact can have only keys or keys with count of occurrences/ events (For instance, no of accident in a month, no of policies has been closed in a month). It is used to support negative analysis report. For example a Store that did not sell a product for a given period. It is used to join the dimension tables.
Dimension table is a parent table which is connected with Fact table. Dimension has attributes which are normally descriptive and textual values. For instance, SALES table is fact table and possible dimension tables are TIME, PRODUCT, REGION, SALESPERSON, etc. Dimension never have foreign key in star schema data model.
Slowly changing dimension(SCD) applies to cases where the attribute for a record varies over time. We have three type of SCD, Type1, Type2 and Type3.
Type1: The new record replaces the original record. No trace of the old record exists
Type2: A new record is added into the customer dimension table. Therefore, the customer is treated essentially as two people.
Type3: The original record is modified to reflect the change
To learn more about SCD, please click SCD
Here is sample data model for star schema. In the below data model, the dimensions are products, customers, time and locations. The fact table is sales and it is connected with all dimension. Now we can see the fact data with different dimensions. The fact table Foreign key will be connected with primary key of dimension table. We never connect the one dimension table to another dimension table. Oracle optimizer understand the star query and generate the different execution plan to improve the performance.
Here is sample data model for snowflake schema.. Here customer dimension is normalized into another lookup table customertype. The same way, we can also normalize the customer dimension into another lookup table as country. Location dimension is normalized into territory. Product dimension can also be normalized into another lookup table as supplier...
Advantages of Snowflake...
1. No redundancy and hence more easy to maintain and change.
2. When we normalize the dimension, the dimension will not be sparsed.
3. It reduce the disk space since there is no redundancy. But in real world scenario, dimension is small table and disk space is not a major issue.
Disadvantages of Snowflake...
1. You need to join more tables when we write the query. It is less easy to understand.
2. In snowflake model, query process time will increase, since we are joining multiple tables.
When should we go for snowflake schema? Dimension has many columns, and the data are very sparse, most of the fields has no data, also disk space is a concern, then we can think of snowflake. Since dimension tables hold less space, snow flake schema approach may be avoided in some organization. We can go for snowflake model when we have small data mart or datawarehouse. Star schema model is good for bigger data mart/warehouse. To learn more about snowflake schema, please click.
To learn more about dimensional data modeling, please see these links. Link1 and Link2
No comments:
Post a Comment