Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

3.2.2. Transformation Tools: Dataflows, Notebooks, KQL, T-SQL

💡 First Principle: Each transformation tool has a sweet spot—match the tool to the data volume, transformation complexity, and team skills. Using Dataflow Gen2 for petabyte-scale transformations will fail; using notebooks for simple column renames is overkill.

Comparative Table: Transformation Tools
| Tool | Engine | Best For | Limitations |
|---|---|---|---|
| Dataflow Gen2 | Power Query (M) | Low-code ETL, small-to-medium data | Not suited to massive datasets |
| Notebook | Apache Spark | Complex transformations, ML, big data | Requires coding skills |
| T-SQL | SQL engine | Warehouse transformations, set-based logic | Limited to relational data |
| KQL | Kusto | Time-series analysis, real-time queries | Specialized for time-series and log data |

T-SQL Transformation Example

-- Create transformed table from staging
CREATE TABLE dbo.FactSales AS
SELECT 
    d.DateKey,
    p.ProductKey,
    c.CustomerKey,
    s.Quantity,
    s.UnitPrice,
    s.Quantity * s.UnitPrice AS SalesAmount
FROM staging.Sales s
JOIN dbo.DimDate d ON s.OrderDate = d.Date
JOIN dbo.DimProduct p ON s.ProductID = p.ProductID
JOIN dbo.DimCustomer c ON s.CustomerID = c.CustomerID;

Spark Transformation Example

# Import the column helper and read source data
from pyspark.sql.functions import col

df = spark.read.format("delta").load("Tables/staging_sales")

# Transform: Add calculated column, filter nulls
df_transformed = df \
    .withColumn("SalesAmount", col("Quantity") * col("UnitPrice")) \
    .filter(col("CustomerID").isNotNull())

# Write to lakehouse
df_transformed.write.format("delta").mode("overwrite").save("Tables/fact_sales")
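KQL Transformation Example

For comparison, the same shaping step expressed in KQL (a sketch only; the source table name `StagingSales` is hypothetical and would be an Eventhouse/KQL database table in your workspace):

```kusto
// Drop rows without a customer, then add a computed SalesAmount column
StagingSales
| where isnotnull(CustomerID)
| extend SalesAmount = Quantity * UnitPrice
| project OrderDate, ProductID, CustomerID, Quantity, UnitPrice, SalesAmount
```

Note the pipeline style: each `|` stage transforms the result of the previous one, which is why KQL shines for interactive, real-time exploration of streaming and time-series data rather than batch warehouse loads.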
Written by Alvin Varughese, Founder (15 professional certifications)