SSWUG vConf – Fact Table Design 101


2580276686_9e502d4b5ePresenter: Erik Veerman
erik@solidq.com

In a Business Intelligence solution, the fact tables hold the core data that you are analyzing – facts (also called measures). Therefore, fact tables are a critical component to get right the first time. Poor fact table design will lead to poor performance and difficult calculations. This session dives into fact table and considers the basic column types, measure aggregation types, fact table types, and volume considerations.

Facts

  • The fact itself
    • The “measure” that is being tracked. The thing
    • Quantity, count, amount, percent
    • Most always numerical, continuous values
      • e.g., price of a product, quantity sold, budget value, count of customers
  • Facts (or measures) can be classified by…
    • Numerical data type
    • Aggregation type
    • Additive nature
    • Granularity – level of detail stored
  • Fact tables
    • Capture measures/facts
    • Association with dimensions (using surrogate key as foreign key in fact table)
      • No dimensions attributes!
    • Some tracking information included
  • Different types of fact tables
    • Transactional – Additive facts tracking events over time (Star Schema)
      • Most common type of fact table
        • Track the occurrence of events, each detailed event is captured into a row in the fact table
        • Measures are typically additive across all dimensions
        • Common transactional fact table types
          • Sales, Visits, Web-page hits, Account transactions
    • Snapshot or inventory – Pictures in time of levels or balances
      • Periodic
      • Accumulating
      • Known as inventory level fact tables
        • Time dimension used to identify grain
        • Non additive measures across time, but typically additive across all other dimensions
        • Common transactional fact table types
          • Inventory levels, Event booking levels, Chart of account balance levels
    • Factless – Dimensionality relationships
      • No measured facts!
        • Are useful to describe events and coverage
        • Information that something has or has not happened
          • Often used to represent many-to-many relationships
          • Contain only dimension keys
          • Common factless fact tables:
        • Class attendance, Event tracking, Coverage tables, Promotion or campaign facts
  • Fact Table Granularity
    • Never mix the grain of the table!
    • The level of detail of data contained in the fact table
    • The description of a single instance (a record) of the fact table
    • Typically includes a time level and a distinct combinations of other dimensions
      • e.g. Daily item totals by product, by store, Weekly snapshot of store inventory by product

Maybe include ETL load date/time in the fact table.

Measures – Additive Nature

  • Additive: Facts that can be summed up/aggregated across all of the dimensions in the fact table
    • e.g. discrete numerical measures of activity, i.e. quantity sold, dollars sold
  • Semi-Additive: Facts that can be summed up for some of the dimensions in the fact table, but not the others
    • e.g. numerical measures of intensity, i.e. account balance, inventory level, distinct counts
  • Non-Additive: Facts that cannot be summed up for any of the dimensions present in the fact table.
    • e.g. room temp

Aggregations

  • Aggregation (Aggs): A summarization of base-level fact table records
    • Common aggregation scenarios
      • Category product by store by day
      • District store by product by day
      • Monthly sales by product by store
      • Category product by store district by day
      • Category product by store district by month
  • Aggregations need to account for the additive nature of the measures
    • Aggregations can be created on-the-fly or by the process of pre-aggregation
    • Common aggregations
      • Sum
      • Count, Distinct Count
      • Max, Min
      • Average
      • Semi-additive: Last Child, Last Non-empty Child

http://www.microsoft.com/fasttrack
SQL Server Fast Track Data Warehouse accelerates your data warehouse roadmap with new SQL Server 2008 Enterprise scalable reference architectures for HP, Dell and Bull. Reduce costs, save time and reduce risk with reliable, pre-tested hardware and best practices for warehousing. Read more.

Design with Additive in Mind

  • Think dimensionally!
  • Complex requirements don’t need to be designed with complex queries
  • Many times new fact tables can be designed that can answer specific questions, such as date attributes and ranges

Getting Started

  • Step 1: Identify high-value business process to model (orders, invoices, shipments, inventory)
    • Confirm data source availability
    • Understand value vs. complexity
  • Step 2: Identify reporting grain of the business process
    • The grains is the level of detail at which the data should be represented for analytics
    • This may not be the same grain as the source!
    • For snapshot facts, determine what time level will be captured for each snapshot (daily, weekly, monthly)
  • Step 3: Identify dimensionality that will apply to each fact table
    • Time, product, customer, store, etc.
    • Some dimensions are not grain-identifying
    • Validate the source can associate to the fact table
  • Step 4: Identify measured facts that will populate fact table
    • Validate the base measures are identifiable from the source
    • Some measures may be derived
    • Measure examples: product count, quantity sold, dollars sold, inventory quantity
  • Identify business questions:
    • How much total business did my newly remodeled stores do compared with the chain average?
    • How did leather goods items costing less than $5 do with my most frequent shoppers?
    • What was the revenue comparison of non-holiday weekend days to holiday weekend days?
  • Analyze questions to assist design!

I really like these “Getting There” bullets. Erik does a good job summarize and giving you something practical to walk away with. Nice stuff.

Popularity: 3% [?]

About Tom

Christ follower, husband, father, technology and photography enthusiast. Attempting to live life out as a light in this world and stumbling at times in this fallen world. Got a topic you want to have me look into? Did I miss something in a post? Let me know. Just add a comment below.