Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

2.3.2. Scheduled Ingestion with MWAA and Glue Triggers

💡 First Principle: Scheduling becomes complex when pipelines have dependencies: Job B can't run until Job A finishes, and Job C needs both. Simple cron scheduling can't express these dependencies, which is why MWAA (Apache Airflow) and Glue workflows exist. They model pipelines as graphs of dependent tasks, not just individual scheduled jobs.
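To make the graph idea concrete, here is a minimal sketch using only Python's standard-library `graphlib` module (Python 3.9+). It is not how Airflow or Glue schedule work internally; the job names are illustrative and simply mirror the Job A / Job B / Job C example above.

```python
from graphlib import TopologicalSorter

# Map each job to the set of jobs it must wait for (names are illustrative).
pipeline = {
    "job_a": set(),
    "job_b": {"job_a"},
    "job_c": {"job_a", "job_b"},
}


def execution_waves(graph):
    """Group jobs into waves; a job appears only after all its upstream jobs are done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are all finished
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so downstream jobs unlock
    return waves


waves = execution_waves(pipeline)
print(waves)  # [['job_a'], ['job_b'], ['job_c']]
```

A cron entry can only say "start at 02:00"; the graph says "start when everything upstream has finished", which is the information an orchestrator needs.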

Amazon MWAA (Managed Workflows for Apache Airflow) runs Apache Airflow — the industry-standard workflow orchestrator — as a managed service. You write DAGs (Directed Acyclic Graphs) in Python that define tasks and their dependencies. MWAA manages the Airflow web server, scheduler, workers, and metadata database.

Airflow excels when: pipelines have complex dependencies (fan-out, fan-in, conditional branching), you need visibility into task-level status and retries, your team already knows Airflow, or you're orchestrating non-AWS services alongside AWS services. The exam signals MWAA with phrases like "complex workflow dependencies," "task-level monitoring," or "existing Airflow DAGs."
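A DAG with fan-out, fan-in, and task-level retries might look like the sketch below. It assumes an Airflow 2.x environment (2.4 or later, where `schedule` and `airflow.operators.empty.EmptyOperator` are available); the DAG ID, task names, and retry settings are hypothetical, and `EmptyOperator` stands in for real work such as Glue or Redshift operators.

```python
from datetime import datetime, timedelta

# Hypothetical task names mapped to their upstream tasks:
# "extract" fans out to two transforms, which fan back in to "load".
UPSTREAMS = {
    "extract": [],
    "transform_orders": ["extract"],
    "transform_customers": ["extract"],
    "load": ["transform_orders", "transform_customers"],
}


def build_dag():
    """Build the DAG. Airflow is imported here so the module loads without it."""
    from airflow import DAG
    from airflow.operators.empty import EmptyOperator

    with DAG(
        dag_id="scheduled_ingestion_sketch",  # hypothetical name
        start_date=datetime(2026, 1, 1),
        schedule="@daily",
        catchup=False,
        # Applied to every task: retry twice, five minutes apart.
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        tasks = {name: EmptyOperator(task_id=name) for name in UPSTREAMS}
        for name, upstream_names in UPSTREAMS.items():
            for upstream in upstream_names:
                tasks[upstream] >> tasks[name]  # upstream must succeed first
    return dag
```

In a real DAG file placed in the environment's S3 DAGs folder, the result would be assigned at module level (for example `dag = build_dag()`) so the Airflow scheduler can discover it.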

Glue workflows provide simpler orchestration specifically for Glue jobs and crawlers. A Glue workflow defines triggers (schedule or event), crawlers, and jobs in a visual graph. It's less flexible than Airflow but requires zero code and integrates natively with Glue.
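The visual designer is not the only route: the same structure can also be created through the Glue API. The sketch below builds keyword arguments for boto3's `create_trigger` call, with one scheduled trigger to start the first job and one conditional trigger per later job. The workflow name, job names, and cron expression are hypothetical, and the helper functions are illustrative rather than part of any AWS library.

```python
def sequential_triggers(workflow, jobs, cron):
    """Return create_trigger keyword arguments that run `jobs` one after another."""
    triggers = [{
        "Name": f"{workflow}-start",
        "WorkflowName": workflow,
        "Type": "SCHEDULED",
        "Schedule": cron,
        "Actions": [{"JobName": jobs[0]}],
        "StartOnCreation": True,
    }]
    for previous, current in zip(jobs, jobs[1:]):
        triggers.append({
            "Name": f"{workflow}-after-{previous}",
            "WorkflowName": workflow,
            "Type": "CONDITIONAL",
            # Fire only when the previous job has succeeded.
            "Predicate": {"Conditions": [{
                "LogicalOperator": "EQUALS",
                "JobName": previous,
                "State": "SUCCEEDED",
            }]},
            "Actions": [{"JobName": current}],
            "StartOnCreation": True,
        })
    return triggers


def create_workflow(workflow, triggers):
    """Create the workflow and its triggers; needs boto3 and AWS credentials."""
    import boto3

    glue = boto3.client("glue")
    glue.create_workflow(Name=workflow)
    for trigger in triggers:
        glue.create_trigger(**trigger)


# Hypothetical names: three Glue jobs chained on a 02:00 UTC daily schedule.
triggers = sequential_triggers(
    "nightly-etl", ["job_a", "job_b", "job_c"], "cron(0 2 * * ? *)"
)
```

A crawler step would use `CrawlerName` in `Actions` (and `CrawlerName` with `CrawlState` in the condition) instead of the job fields.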

| Feature | EventBridge Scheduler | Glue Workflows | Amazon MWAA |
| --- | --- | --- | --- |
| Complexity | Simple schedules, single triggers | Linear/parallel Glue pipelines | Complex DAGs with any acyclic dependency pattern |
| Targets | Any AWS service | Glue jobs and crawlers only | Any system (AWS, external, custom) |
| Code required | Minimal (schedule configuration) | None (visual designer) | Python DAGs |
| Dependencies | None (individual triggers) | Sequential, parallel | Full DAG dependencies: fan-out, fan-in, conditional branching |
| Best for | Simple scheduled triggers | Glue-only ETL pipelines | Complex multi-service orchestration |

āš ļø Exam Trap: MWAA is the most powerful orchestration option but also the most expensive and operationally complex. If a question describes a simple pipeline with 2–3 Glue jobs running sequentially, MWAA is overkill — Glue workflows or Step Functions are simpler. The exam rewards matching complexity to need.

Reflection Question: Your pipeline runs three Glue jobs sequentially, then a crawler, then loads data into Redshift. No external services are involved. Is MWAA the right choice?

Written by Alvin Varughese
Founder • 15 professional certifications