3.2.4. Mirroring: Database vs. Metadata
💡 First Principle: Mirroring replicates external data into Fabric, but what gets replicated depends on the source. Database mirroring copies actual data rows; metadata mirroring copies only the catalog information (table names, schemas). Understanding this distinction is critical for choosing the right approach.
Scenario: Your organization uses Azure SQL Database for operational systems and Databricks for data science. You want both available in Fabric without duplicating data unnecessarily.
Mirroring Types
| Type | What's Replicated | Example Sources |
|---|---|---|
| Database Mirroring | Actual data (CDC-based) | Azure SQL, Cosmos DB, Snowflake |
| Metadata Mirroring | Catalog information only | Azure Databricks Unity Catalog |
Mirroring Selection
- Azure Cosmos DB: Database mirroring (replicates documents)
- Azure SQL Database: Database mirroring (replicates tables)
- Azure Databricks: Metadata mirroring (replicates catalog, not data)
Database Mirroring: How It Works
Database mirroring uses Change Data Capture (CDC) to continuously replicate changes:
Source Database → CDC captures INSERT/UPDATE/DELETE → Fabric syncs changes → OneLake Delta tables
| Aspect | Behavior |
|---|---|
| Initial sync | Full snapshot of selected tables |
| Ongoing sync | Near real-time CDC replication |
| Data format | Converted to Delta tables in OneLake |
| Latency | Typically seconds to minutes |
| Schema changes | Generally propagated automatically; supported changes vary by source |
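
The snapshot-then-changes flow above can be illustrated with a small sketch. This is a toy model, not Fabric's implementation: `ChangeEvent` and `apply_changes` are hypothetical names, the replica is a plain dictionary keyed by primary key, and each event is assumed to carry the full new row image.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

Row = Dict[str, object]


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change (hypothetical shape, for illustration only)."""
    op: str                    # "INSERT", "UPDATE", or "DELETE"
    key: int                   # primary key of the affected row
    row: Optional[Row] = None  # full new row image; None for DELETE


def apply_changes(snapshot: Dict[int, Row],
                  events: Iterable[ChangeEvent]) -> Dict[int, Row]:
    """Replay captured changes, in order, on top of an initial snapshot."""
    # Copy the snapshot so the caller's data is left untouched.
    replica = {key: dict(row) for key, row in snapshot.items()}
    for event in events:
        if event.op in ("INSERT", "UPDATE"):
            if event.row is None:
                raise ValueError(f"{event.op} event needs a row image")
            replica[event.key] = dict(event.row)
        elif event.op == "DELETE":
            # Deleting a key that is already gone is treated as a no-op.
            replica.pop(event.key, None)
        else:
            raise ValueError(f"Unknown operation: {event.op}")
    return replica


# Initial sync: full snapshot of the selected table.
initial = {1: {"name": "Ada"}, 2: {"name": "Grace"}}

# Ongoing sync: changes captured after the snapshot was taken.
events = [
    ChangeEvent("INSERT", 3, {"name": "Linus"}),
    ChangeEvent("UPDATE", 1, {"name": "Ada L."}),
    ChangeEvent("DELETE", 2),
]

replica = apply_changes(initial, events)
```

The point of the sketch is that the replica holds real rows: after the replay, the data exists in two places, which is the defining trait of database mirroring.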
Metadata Mirroring: How It Works
Metadata mirroring syncs the catalog without copying data:
Databricks Unity Catalog → Fabric syncs table/view definitions → Queries route to Databricks
| Aspect | Behavior |
|---|---|
| What syncs | Table names, schemas, locations |
| What doesn't sync | Actual data rows |
| Query execution | Runs against Databricks compute |
| Benefit | Unified catalog without data duplication |
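
For contrast, a catalog-only sync can be sketched as below. Again this is a toy model under stated assumptions: `TableDefinition`, `SourceTable`, and `sync_catalog` are hypothetical names, and the storage location is a made-up placeholder string. It does not reflect how Unity Catalog or Fabric represent metadata internally.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TableDefinition:
    """Catalog entry only: name, column schema, and where the data lives."""
    name: str
    columns: Tuple[Tuple[str, str], ...]  # (column name, data type) pairs
    location: str                         # placeholder storage path


@dataclass
class SourceTable:
    """A table in the source system: its definition plus its data rows."""
    definition: TableDefinition
    rows: List[dict]


def sync_catalog(source: Dict[str, SourceTable]) -> Dict[str, TableDefinition]:
    """Copy table definitions only; the rows are never read or copied."""
    return {name: table.definition for name, table in source.items()}


features = SourceTable(
    definition=TableDefinition(
        name="ml_features",
        columns=(("customer_id", "bigint"), ("churn_score", "double")),
        location="example-storage/ml_features",  # hypothetical path
    ),
    rows=[{"customer_id": 1, "churn_score": 0.12}],
)

mirrored = sync_catalog({"ml_features": features})
```

The mirrored catalog knows the table's name, schema, and location but holds no rows, so any query still has to reach back to where the data is stored.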
When to Use Which
| Scenario | Mirroring Type | Why |
|---|---|---|
| Need data in Fabric for fast queries | Database mirroring | Data is local |
| Want unified catalog, data stays in place | Metadata mirroring | Avoids duplication |
| Source is Azure SQL/Cosmos | Database mirroring | Only option available |
| Source is Databricks | Metadata mirroring | Only option available |
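
The source-driven rows of this table reduce to a simple lookup. The sketch below is a study aid only: `MirroringType` and `choose_mirroring` are hypothetical names, and the mapping covers just the sources named in this section.

```python
from enum import Enum


class MirroringType(Enum):
    DATABASE = "Database mirroring"
    METADATA = "Metadata mirroring"


# Only the sources mentioned in this section; not an exhaustive list.
_SOURCE_TO_TYPE = {
    "azure sql database": MirroringType.DATABASE,
    "azure cosmos db": MirroringType.DATABASE,
    "snowflake": MirroringType.DATABASE,
    "azure databricks": MirroringType.METADATA,
}


def choose_mirroring(source: str) -> MirroringType:
    """Return the mirroring type this section associates with a source."""
    key = source.strip().lower()
    try:
        return _SOURCE_TO_TYPE[key]
    except KeyError:
        raise ValueError(f"Source not covered in this section: {source!r}") from None


for name in ("Azure SQL Database", "Azure Databricks"):
    print(f"{name}: {choose_mirroring(name).value}")
```

Because the source determines the type, the practical decision is usually which source to mirror, not which mirroring type to pick.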
⚠️ Exam Trap: Databricks uses metadata mirroring, not database mirroring. The data stays in Databricks storage; only the Unity Catalog metadata is synchronized to Fabric. Questions that mention "mirroring Databricks data" are testing whether you understand this distinction.
⚠️ Common Pitfall: Expecting locally stored, real-time data with metadata mirroring. Only the catalog is synced, so every query still reaches back to the source system at run time. If Databricks is slow, your queries are slow.
Reflection Question: Your company wants to use data from both Azure Cosmos DB (operational data) and Azure Databricks (ML features). What mirroring type would you use for each?