Presenter: Erik Veerman
erik@solidq.com
Dimension Tables are one of the core components in a dimensional design and it is critical to design your dimension tables correctly in order to lay a solid foundation to a Business Intelligence system. This session dives into the dimension design techniques and considers the core components of a dimension table, surrogate keys, attributes, and hierarchies. In addition, we will consider advanced hierarchy types such as parent-child hierarchies, snowflake designs and unbalanced hierarchies. Finally, we will consider the best practices in tracking changes historically.
This session was intense so I just could not keep up with the slides. Here are just notes apart from the slides once I fell behind. Erik presents excellently and at a level appropriate for a 101 session without being condescending. Great job!
Dimensions (review from Dimensional Modeling 101 Session)
Dimensions – Qualitative
Facts (Measures) – Quantitative
- Dimensions (Qualitative information)
- Business perspective from which data is looked upon
- Collection of text attributes that are highly correlated
- e.g. Product, Store, Time, Manager of Store
- Conformity (How Dimensions relate to each other)
- Shared with multiple fact relationships
- Provides data correlation
- Attributes (many times is related to a table column, not always)
- Descriptive characteristics of an entity
- Building blocks of dimensions, describe each instance
- Usually text fields, with discrete values
- e.g., the flavor of a product, the size of a product
- Hierarchies
- drill-paths within the dimension
- Allows top-down analysis
- Natural groupings of data relationships
- e.g., date hierarchy is most common. Year to qtr to month to day for example)
- Dimension Keys
- Surrogate Keys (Primary key, DW created)
- Candidate Business Keys/Alternate key (Source system keys, uniqueness)
- Dimension Granularity
- Granularity in general is the level of detail of data contained in an entity (lowest level detail)
- A dimensions granularity is the lowest level object which uniquely identifies a member
- Typically the identifying name of a dimension
Name surrogate key with SK_
Dimension naming DIM_
Fact naming FACT_
http://www.codeplex.com/MSFTDBProdSamples to download the AdventureWorksDW example.
Hierarchies
ALL level is usually the top level
All -> Region -> Country -> City -> Office
Seeing one we could use would related to the various Bible translation project locations currently under way. Need to keep in mind project sensitivities.
A dimension can have multiple hierarchies.
Standard Hierarchies
- All levels have values.
Ragged Hierarchies
- Missing members at the mid-levels.
- All -> Country -> State -> City and then you might be missing a state is the country is Israel since there are not states there.
Unbalanced Hierarchies
- Multiple grains in the hierarchy
- May not always be able to drill down to the lowest detail level since the branch may stop short. (Think Org Chart)
Demo: SSMS connected to a SSAS Cube
Product – Standard hierarchy
Sales Territory – example of unbalanced hierarchy
Employee – another example of an unbalanced hierarchy
Dimension Keys
Business Keys
- Column(s) identify the unique instance of a business record form the source system.
- You may have multiple records with the same Business Key. This is right as it allows historical tracking.
- Used in the process that ties fact records with dimension members
- Business key is used to find the right surrogate key reference
Surrogate Keys
-
- Defined the dimension’s primary key
- Usually an integer
- Important to pick the right column width.
- 2,4,8 byte
- SmallInt – 2 bytes go for up to around 10k records
- BigInt – 8 bytes, for REALLY large dimensions
- Consolidate multi-value business keys
- Allows tracking of dimension history
- Standardizes dimension tables
- All are structured in the same way
- Business Key
- Surrogate key
- All are structured in the same way
Don’t put attributes and business keys in your fact table.
Design Practices
- Avoid smart keys
- Avoid production keys
- production may decide to reuse keys
- the company may acquire a competitor and thereby change the key building rules changed record, but deliberately not changed key
Types of Dimensions
Different dimensions types should be used in different scenarios.
Basic Dimension Types
- Standard Star dimension
- Single table
- Usually date
- Snowflake
- More than one table in a cascading
- Parent Child (unbalanced hierarchy)
- Self referencing surrogate to parent surrogate key relationship
- Self referencing business key relationship
- e.g. Org Chart, Chart of Accounts
Advanced Types
- Degenerate (Seldom used. Be careful)
- Dimensions business key with no corresponding dimension table
- Are embedded in the fact table
- Typically use only when your Dim record count is about the same as the fact table record count.
- Usually in line item oriented tables like sales tables
- Profile or Junk Dimensions
- Convenient grouping of flags and attributes to get them out of the fact table into a useful dimensional framework.
- Good for one-off lookups. Put them into a single dimension rather than a bunch of small dimensions
- True/False type attributes
- Role Playing Dimension
- A single dimension used for multiple purposes
- Typical example is the date dimension or the geography dimension (outrigger)
- Time Dimension
- Multiple calendars
- Fiscal
- Natural
- Usually a single table
- Multiple calendars
Tracking History
Industry best practices.
Changing Dimensions
- Slowly changing Dimensions
- o – No change
- Birthdate
- 1 – Not interested in history (Updating a row/record)
- 2 – Slow changes. Adds new row/record
- 3 – Fast changes. Adds new column
- o – No change
- Rapidly changing Dimensions
- Large dimensions
- Limit type 2’s
- Large dimensions
Good example graphical slides of Type 1 versus Type 2
Getting There
- Understand dimension hierarchies and drill-paths
- Confirm historically tracked attribute req’s.
- Don’t be afraid to push back a little
- Check source system data integrity, cleanliness
- Review current reports
Popularity: 6% [?]

