Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

4.3.2. Lakehouse Table Optimization (OPTIMIZE, V-Order)

💡 First Principle: Delta Lake accumulates small files during streaming and incremental writes. File consolidation and V-Order optimization dramatically improve query performance—like defragmenting a hard drive.

Scenario: A Delta table received millions of small writes from streaming ingestion. Queries that once took 10 seconds now take 5 minutes because Spark must open thousands of small files.

OPTIMIZE Command

  • Purpose: Consolidate small files into larger ones
  • Benefit: Reduce file count, improve query performance
  • Frequency: Schedule regularly for streaming tables

-- Consolidate small files
OPTIMIZE lakehouse.sales;

-- Optimize with Z-ordering on frequently filtered columns
OPTIMIZE lakehouse.sales ZORDER BY (region, date);
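
To confirm that compaction worked, Delta's DESCRIBE DETAIL reports the table's current file count and size; the table name follows the scenario above:

```sql
-- Inspect file count before and after running OPTIMIZE
-- (numFiles should drop sharply after compaction)
DESCRIBE DETAIL lakehouse.sales;
```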

V-Order

  • Concept: Microsoft's write-time optimization that applies special sorting and encoding to Parquet files
  • Benefit: Better compression and faster reads, particularly for Power BI and SQL engine workloads
  • Implementation: Enabled by default in Fabric; can be explicitly applied

-- Apply V-Order during optimization
OPTIMIZE lakehouse.sales VORDER;

Visual: File Optimization Impact
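
Where V-Order needs to be controlled explicitly, Fabric exposes a session setting and a Delta table property. The exact property keys below are as documented for Fabric Spark at the time of writing and have varied across runtime versions, so verify them against your runtime before relying on them:

```sql
-- Enable V-Order writing for the current Spark session (Fabric)
SET spark.sql.parquet.vorder.enabled = true;

-- Or pin it at the table level via a Delta table property
ALTER TABLE lakehouse.sales
  SET TBLPROPERTIES ('delta.parquet.vorder.enabled' = 'true');
```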

⚠️ Exam Trap: Skipping OPTIMIZE on streaming tables guarantees gradual performance degradation. Small-file accumulation is inevitable with streaming writes, so schedule OPTIMIZE regularly (hourly or daily, depending on write volume).
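A scheduled maintenance job (for example, a nightly notebook or pipeline step) for a streaming target table typically pairs compaction with file cleanup; the retention window below is illustrative, not a recommendation:

```sql
-- Nightly maintenance for a streaming ingestion target (sketch)
OPTIMIZE lakehouse.sales ZORDER BY (region, date);

-- Reclaim storage from files no longer referenced by the table.
-- 168 hours = 7 days retention (illustrative; must cover your time travel needs)
VACUUM lakehouse.sales RETAIN 168 HOURS;
```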

Written by Alvin Varughese
Founder • 15 professional certifications