3.1.2. Dimensional Model Preparation
💡 First Principle: Dimensional models (star schemas) optimize data for analytical queries by pre-joining and denormalizing—trading storage efficiency for query speed. Think of it like meal prep: you could cook each ingredient separately every time, or you could pre-combine common combinations so dinner is ready faster.
Scenario: An OLTP database stores orders in normalized tables (Orders, OrderItems, Customers, Products). Analytical queries must join all tables, causing slow performance. A dimensional model pre-joins and denormalizes for fast queries.
Star Schema Components
| Component | Purpose | Example |
|---|---|---|
| Fact Table | Measurable business events | FactSales: SalesAmount, Quantity |
| Dimension Table | Descriptive attributes | DimCustomer: CustomerName, Region |
| Surrogate Key | Artificial unique identifier | CustomerKey (not CustomerID) |
| Natural Key | Business identifier | CustomerID (from source) |
Visual: Star Schema Structure
Data Transformation for Dimensional Models
- Extract from source systems
- Clean data quality issues
- Conform dimensions (standardize attributes)
- Generate surrogate keys
- Load dimensions first, then facts
⚠️ Exam Trap: Using natural keys in fact tables creates problems when source keys change (customer ID format updates, product SKU restructuring). Natural keys can change; surrogate keys provide stable references. Always generate surrogate keys for dimension tables.