1.5. Reflection Checkpoint
Key Takeaways
Before proceeding, ensure you can:
- Explain why ML engineering is fundamentally different from traditional software engineering, focusing on data-driven failure modes
- Trace a business problem through all four ML lifecycle stages and identify the AWS services involved at each stage
- Map any SageMaker component to its lifecycle stage and articulate when an external AWS service is a better choice
- Evaluate cost-performance-latency trade-offs to select the appropriate endpoint type for a given scenario
- Position a problem on the managed-to-custom spectrum and choose the right level of AWS service
Connecting Forward
In the next phase, you'll dive into Domain 1: Data Preparation for ML—the highest-weighted exam domain at 28%. You'll learn how to ingest data from various AWS sources, transform it for ML readiness, engineer features that improve model performance, and ensure data integrity through bias detection and quality validation. The mental models from Phase 1—especially the lifecycle loop and the trade-off triangle—will guide every decision.
Self-Check Questions
- An ML model trained on customer purchase data from Q1 performs poorly when deployed in Q4. No code changes were made. Using the four-stage lifecycle model, which stage detects this problem, what's the most likely root cause, and which AWS service would you use to investigate?
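As a hint for reasoning about this question, the statistic a monitoring tool such as SageMaker Model Monitor surfaces can be illustrated with a toy Population Stability Index calculation. The Q1/Q4 distributions below are made up for illustration; this is a sketch of the idea, not Model Monitor's actual implementation:

```python
import math

def psi(expected, actual):
    """Population Stability Index between two binned distributions.
    Both inputs are lists of bin proportions that each sum to 1."""
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, 1e-6)  # floor to avoid log(0)
        a = max(a, 1e-6)
        total += (a - e) * math.log(a / e)
    return total

# Hypothetical purchase-amount distributions (binned proportions)
q1 = [0.10, 0.40, 0.35, 0.15]   # training data (Q1)
q4 = [0.30, 0.30, 0.20, 0.20]   # live traffic (Q4, holiday season)

score = psi(q1, q4)
# A common rule of thumb treats PSI > 0.25 as significant drift
print(f"PSI = {score:.3f}")
```

A score well above 0.25 here would flag data drift: the model is unchanged, but the inputs no longer resemble what it was trained on.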
- A team is deciding between Amazon Comprehend (a pre-trained AI service) and training a custom text classification model. What three factors would push you toward one option over the other?
- A company needs to serve predictions for a recommendation engine. During business hours, they receive 1,000 requests/second. At night, traffic drops to near zero. Using the cost-performance-latency triangle, which endpoint configuration would you recommend and why?
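For spiky traffic that drops to near zero, one candidate answer is a serverless endpoint, which scales to zero and avoids paying for idle capacity overnight. A sketch of the request body you might pass to the SageMaker `create_endpoint_config` API; the names and sizing are illustrative placeholders, not a one-size-fits-all recommendation:

```python
# Illustrative request body for sagemaker.create_endpoint_config (boto3).
# Model name, memory size, and concurrency cap are hypothetical values.
endpoint_config = {
    "EndpointConfigName": "recommender-serverless",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "recommender-model",  # placeholder model name
            "ServerlessConfig": {
                "MemorySizeInMB": 4096,   # memory allocated per instance
                "MaxConcurrency": 200,    # cap on concurrent invocations
            },
        }
    ],
}
```

Note the trade-off the question is probing: serverless eliminates idle cost but adds cold-start latency and caps concurrency, so a sustained 1,000 req/s daytime peak may instead favor a real-time endpoint with auto scaling; the right choice depends on the scenario's latency tolerance and quotas.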