Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.
3.2.5. Pipeline Ingestion and Continuous Integration
💡 First Principle: Pipelines orchestrate data movement from sources to destinations. The Copy Data activity handles basic movement; more complex scenarios require additional activities. For bulk loading into warehouses, the T-SQL COPY statement dramatically outperforms row-by-row INSERTs.
Copy Data Activity
- Purpose: Move data from source to destination
- Capabilities:
  - Supports 100+ connectors
  - Basic column mapping
  - Schema drift handling
- Limitations: No complex transformations (use Dataflow Gen2 or Notebook)
COPY Statement (T-SQL)
- Purpose: Bulk load data into warehouse tables
- Advantages:
  - High performance for large files
  - Supports wildcards for multiple files
  - Minimal transaction overhead
-- Load multiple Parquet files using wildcard
COPY INTO dbo.Sales
FROM 'https://storage.blob.core.windows.net/data/sales/*.parquet'
WITH (
    FILE_TYPE = 'PARQUET'
);
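The same statement also handles delimited text. The following is a minimal sketch, assuming headered CSV files in a private container secured with a shared access signature; the storage account, container, and token placeholders are hypothetical and must be replaced with real values.

```sql
-- Sketch only: <storage-account>, <container>, and <sas-token> are placeholders (assumptions)
COPY INTO dbo.Sales
FROM 'https://<storage-account>.blob.core.windows.net/<container>/sales/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    FIRSTROW = 2,              -- skip the header row in each matched file
    FIELDTERMINATOR = ',',     -- column delimiter
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>')
);
```

The wildcard still matches every file in the folder, so one statement loads the whole batch.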
⚠️ Exam Trap: Using individual INSERT statements for bulk loading creates excessive transaction overhead. The COPY statement is optimized for bulk operations—questions about "loading millions of rows" or "loading multiple files" typically require COPY, not INSERT.
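To make the trap concrete, the sketch below shows the row-by-row pattern next to its bulk replacement. The column names (SaleId, Amount) and the storage placeholders are assumptions for illustration only.

```sql
-- Anti-pattern: every statement is a separate round trip and a separate transaction
INSERT INTO dbo.Sales (SaleId, Amount) VALUES (1, 19.99);
INSERT INTO dbo.Sales (SaleId, Amount) VALUES (2, 5.00);
-- ...repeated once per row

-- Preferred: a single bulk operation over all matching files
-- (<storage-account> and <container> are placeholders)
COPY INTO dbo.Sales
FROM 'https://<storage-account>.blob.core.windows.net/<container>/sales/*.parquet'
WITH (FILE_TYPE = 'PARQUET');
```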
Written by Alvin Varughese
Founder • 15 professional certifications