Fact and Dimension Tables

Fact and Dimension Tables are the core components of dimensional modeling in a Data Warehouse. They are commonly used in Star Schema and Snowflake Schema designs.

Understanding these two tables is very important in Data Engineering and Business Intelligence.

What is a Fact Table?

A Fact Table stores quantitative (numeric) data related to business events.

It contains measurable values and foreign keys that connect to dimension tables.

Key Characteristics of Fact Table

  • Contains numeric metrics (sales_amount, quantity, profit)
  • Contains foreign keys
  • Large in size
  • Grows continuously over time
  • Used for calculations and aggregations

Example: Sales Fact Table

sale_idcustomer_idproduct_iddate_idregion_idquantityamount

Here:

  • quantity and amount are facts (measures)
  • customer_id, product_id, date_id, region_id are foreign keys

Types of Fact Tables

  1. Transaction Fact Table
    Stores individual transactions (e.g., each sale).
  2. Snapshot Fact Table
    Stores data at a specific time (e.g., daily inventory).
  3. Accumulating Fact Table
    Tracks process milestones (e.g., order lifecycle).

What is a Dimension Table?

A Dimension Table stores descriptive information about the business.

It provides context to the numeric data in the fact table.

Key Characteristics of Dimension Table

  • Contains descriptive attributes
  • Smaller than fact tables
  • Used for filtering, grouping, and labeling
  • Usually contains a primary key

Example: Customer Dimension Table

customer_idnamecitysegmentage_group

This table describes the customer.

Common Dimension Tables

  • Customer
  • Product
  • Date
  • Region
  • Employee

Fact vs Dimension Table

FeatureFact TableDimension Table
Data TypeNumeric (measures)Descriptive (attributes)
SizeLargeSmaller
KeysForeign keysPrimary key
PurposeCalculationsFiltering & grouping
ExampleSales amountCustomer details

How They Work Together

Example Question:

What is the total sales amount by region?

Step 1: Fact table provides sales amount
Step 2: Dimension table provides region name
Step 3: Join them to calculate total sales by region

Fact Table + Dimension Tables = Business Insights

Real-World Example

In a retail company:

Fact Table:

  • Sales transactions

Dimension Tables:

  • Customer details
  • Product information
  • Store location
  • Time (date, month, year)

This structure is commonly used in tools like:

  • Snowflake
  • Amazon Redshift
  • Google BigQuery

Interview Answer (Short Version)

A Fact Table stores measurable business data and foreign keys, while a Dimension Table stores descriptive attributes that provide context to the facts.

Final Summary

Fact Table = Numbers (What happened?)
Dimension Table = Description (Who? What? When? Where?)

Together, they form the foundation of dimensional data modeling in a Data Warehouse.

Home » PYTHON FOR DATA ENGINEERING (PYDE) > Data Warehousing Concepts > Fact and Dimension Tables