3.2.3. Shortcuts: Accessing Data Without Duplication
💡 First Principle: Shortcuts provide virtual access to external data without copying it, much as a library catalog tells you where a book is shelved instead of photocopying the whole book. This avoids data duplication, but your queries then depend on the external source being available and fast enough.
Scenario: Your organization has 50 TB of historical data in AWS S3 that's rarely accessed. Creating shortcuts lets Fabric query this data without copying it into OneLake, saving storage costs and avoiding sync complexity.
Shortcut Types
| Type | Source | Use Case |
|---|---|---|
| Internal | Other OneLake locations | Cross-workspace data sharing |
| ADLS Gen2 | Azure Data Lake Storage Gen2 | Existing Azure data lakes |
| S3 | Amazon S3 | Multi-cloud scenarios |
| GCS | Google Cloud Storage | Multi-cloud scenarios |
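To make the table concrete, here is a minimal sketch of what a request to create an S3 shortcut might look like. It only builds the URL and JSON body and sends nothing. The endpoint path and the `amazonS3` target fields reflect my understanding of the OneLake shortcuts REST API and should be checked against the current documentation; the function name and every ID and bucket value are hypothetical placeholders.

```python
import json

# Assumed shape of the OneLake "create shortcut" endpoint; verify before use.
CREATE_SHORTCUT_URL = (
    "https://api.fabric.microsoft.com/v1/"
    "workspaces/{workspace_id}/items/{item_id}/shortcuts"
)


def build_s3_shortcut_request(workspace_id, item_id, name, bucket_url,
                              subpath, connection_id, shortcut_folder="Files"):
    """Return (url, body) for a create-shortcut call. Nothing is sent."""
    url = CREATE_SHORTCUT_URL.format(workspace_id=workspace_id, item_id=item_id)
    body = {
        "path": shortcut_folder,   # folder in the lakehouse that will hold the shortcut
        "name": name,              # display name of the shortcut
        "target": {
            "amazonS3": {
                "location": bucket_url,         # bucket endpoint URL
                "subpath": subpath,             # folder inside the bucket
                "connectionId": connection_id,  # existing Fabric connection to S3
            }
        },
    }
    return url, body


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    url, body = build_s3_shortcut_request(
        workspace_id="<workspace-id>",
        item_id="<lakehouse-id>",
        name="historical_archive",
        bucket_url="https://<bucket>.s3.<region>.amazonaws.com",
        subpath="/history",
        connection_id="<connection-id>",
    )
    print(url)
    print(json.dumps(body, indent=2))
```

Note that the body carries only a pointer to the bucket plus a connection reference. No data moves when a shortcut is created, which is the point of the catalog analogy.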
Accelerated vs. Non-Accelerated Shortcuts (KQL Database)
- Accelerated: Caches data in Eventhouse for fast KQL queries; consumes storage; has sync delay
- Non-Accelerated: Queries source directly; always current; slower performance
Decision Factors:
- Query Frequency: High-frequency queries benefit from accelerated shortcuts
- Latency Requirements: Sub-second response targets usually call for acceleration
- Data Size: Very large datasets may be cost-prohibitive to cache
- Data Freshness: Accelerated shortcuts have a sync delay; non-accelerated are always current
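The four decision factors above can be sketched as a small rule-based helper. This is an illustration only: the class, the function, and both thresholds (1,000 queries per day and 10 TB) are hypothetical values chosen for the example, not Fabric limits or guidance.

```python
from dataclasses import dataclass


@dataclass
class ShortcutWorkload:
    queries_per_day: int
    needs_subsecond_latency: bool
    data_size_tb: float
    tolerates_sync_delay: bool


def recommend_acceleration(workload,
                           high_frequency_threshold=1000,  # hypothetical cutoff
                           max_cacheable_tb=10.0):         # hypothetical cutoff
    """Return 'accelerated' or 'non-accelerated' for a KQL shortcut."""
    # Data freshness: if the sync delay is unacceptable, acceleration is out.
    if not workload.tolerates_sync_delay:
        return "non-accelerated"
    # Data size: very large datasets may be cost-prohibitive to cache.
    if workload.data_size_tb > max_cacheable_tb:
        return "non-accelerated"
    # Latency and query frequency: either one argues for caching.
    if (workload.needs_subsecond_latency
            or workload.queries_per_day >= high_frequency_threshold):
        return "accelerated"
    return "non-accelerated"


if __name__ == "__main__":
    dashboard = ShortcutWorkload(5000, True, 0.5, True)
    archive = ShortcutWorkload(10, False, 50.0, True)
    print(recommend_acceleration(dashboard))
    print(recommend_acceleration(archive))
```

The ordering is a design choice: freshness and cost act as vetoes before frequency and latency are considered, which mirrors how the trade-offs are usually framed, but a real decision would weigh them against actual pricing and SLAs.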
Shortcuts vs. Copying: Decision Framework
| Factor | Use Shortcut | Copy the Data |
|---|---|---|
| Data ownership | External team owns it | You own and transform it |
| Update frequency | Source updates frequently | Point-in-time snapshot needed |
| Query performance | Occasional queries OK | High-frequency, fast queries |
| Transformation needed | Read as-is | Need to transform/cleanse |
| Storage costs | Want to avoid duplication | Performance worth the cost |
| Source availability | Source is highly available | Source has maintenance windows |
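One way to apply the framework is to treat each row as a vote and tally the result. The sketch below does exactly that; the signal names are hypothetical labels for the "Copy the Data" column, and weighting every factor equally is an assumption made for simplicity.

```python
# One label per table row, named after the "Copy the Data" column.
COPY_SIGNALS = frozenset({
    "you_own_and_transform",
    "need_point_in_time_snapshot",
    "high_frequency_fast_queries",
    "need_transform_or_cleanse",
    "performance_worth_storage_cost",
    "source_has_maintenance_windows",
})


def shortcut_or_copy(signals):
    """Tally the rows that favor copying against the rows that favor a shortcut.

    `signals` holds the COPY_SIGNALS labels that apply to the workload;
    every row not listed is counted as a vote for using a shortcut.
    """
    signals = set(signals)
    unknown = signals - COPY_SIGNALS
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    copy_votes = len(signals)
    shortcut_votes = len(COPY_SIGNALS) - copy_votes
    if copy_votes > shortcut_votes:
        return "copy"
    if shortcut_votes > copy_votes:
        return "shortcut"
    return "review"  # evenly split, so decide case by case


if __name__ == "__main__":
    print(shortcut_or_copy({"need_transform_or_cleanse"}))
```

In practice a single factor, such as a hard transformation requirement, can outweigh all the others, so the tally is best read as a prompt for discussion rather than a verdict.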
What Breaks with Shortcuts?
- Source goes offline → Your queries fail
- Source is slow → Your queries are slow
- Source schema changes → Your queries may break
- Source has network latency → Added query latency
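The first two failure modes above can be handled defensively in client code. The following is a minimal sketch, assuming you keep a local copy to fall back on; `primary` and `fallback` are hypothetical caller-supplied functions, and the five-second slowness threshold is an arbitrary example value.

```python
import time


def query_with_fallback(primary, fallback, slow_after_seconds=5.0):
    """Run `primary` (the shortcut query); use `fallback` if it raises.

    Returns (result, status) so callers can log which path served the data.
    """
    started = time.monotonic()
    try:
        result = primary()
    except Exception as exc:  # broad on purpose: any source failure triggers fallback
        return fallback(), f"fallback: primary failed ({exc})"
    elapsed = time.monotonic() - started
    # Flag slow sources so the degradation is visible before it becomes an outage.
    status = "primary (slow)" if elapsed > slow_after_seconds else "primary"
    return result, status


if __name__ == "__main__":
    def offline_source():
        raise ConnectionError("source offline")

    print(query_with_fallback(offline_source, lambda: ["cached row"]))
```

A fallback like this only helps if a copy exists, which is the trade-off the decision framework weighs: guaranteed availability costs duplicated storage.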
Best Practice: Use shortcuts for reference data, historical archives, and cross-team data sharing. Copy data when you need transformation, guaranteed availability, or high-frequency queries.
⚠️ Exam Trap: Shortcuts provide access only; the data remains in its original format and location. If a question mentions needing transformation, shortcuts alone don't solve it. Also remember that accelerated shortcuts consume storage and have sync delays.
⚠️ Common Pitfall: Creating shortcuts to data you'll query thousands of times per day. The network overhead adds up, so consider copying frequently accessed data locally.
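A quick back-of-the-envelope calculation shows how the overhead accumulates. The figures below (5,000 queries per day and 200 ms of extra latency per query) are hypothetical, chosen only to illustrate the arithmetic.

```python
def daily_overhead_seconds(queries_per_day, extra_latency_ms):
    """Total added wait per day from per-query network overhead, in seconds."""
    return queries_per_day * extra_latency_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical workload: 5,000 queries/day, each 200 ms slower via the shortcut.
    seconds = daily_overhead_seconds(5000, 200)
    print(f"{seconds:.0f} s per day (about {seconds / 60:.1f} minutes)")
    # prints "1000 s per day (about 16.7 minutes)"
```

The per-query cost looks negligible in isolation; it is the multiplication by query volume that makes copying worth considering.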