4.4. Reflection Checkpoint
Key Takeaways
Before proceeding, ensure you can:
- Select the correct SageMaker endpoint type (real-time, serverless, async, batch) based on latency, payload, traffic, and cost requirements
- Match compute instance families (general, compute-optimized, GPU, Inferentia) to model types
- Choose between pre-built containers, Script Mode, and BYOC based on framework support and customization needs
- Identify when SageMaker Neo and edge deployment are appropriate
- Configure auto scaling policies (target tracking, step, scheduled) with ML-relevant metrics
- Distinguish between CloudFormation (declarative) and CDK (programmatic) for infrastructure as code
- Configure VPC mode vs. network isolation for SageMaker security
- Integrate CodePipeline, CodeBuild, and CodeDeploy into an ML CI/CD workflow
- Distinguish between SageMaker Pipelines (ML workflow) and CodePipeline (CI/CD delivery)
- Select deployment strategies (blue/green, canary, linear) based on risk tolerance
- Design automated testing and retraining pipelines with quality gates
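To make the auto scaling takeaway concrete, here is a minimal sketch of a target-tracking policy for a SageMaker endpoint variant. The endpoint and variant names are hypothetical; the dict shapes follow the Application Auto Scaling API (`register_scalable_target` / `put_scaling_policy`), but the live boto3 calls are left commented out.

```python
# Target-tracking auto scaling for a SageMaker endpoint variant.
# Endpoint/variant names are hypothetical placeholders.

resource_id = "endpoint/my-endpoint/variant/AllTraffic"  # hypothetical

scalable_target = {
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "MinCapacity": 1,
    "MaxCapacity": 4,
}

scaling_policy = {
    "PolicyName": "invocations-target-tracking",
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        # Scale out when average invocations per instance exceed the target.
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": 300,  # scale in conservatively
        "ScaleOutCooldown": 60,  # react quickly to traffic spikes
    },
}

# With AWS credentials configured, these would be passed to boto3:
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**scalable_target)
#   client.put_scaling_policy(**scaling_policy)
```

Note the asymmetric cooldowns: scaling out fast protects latency, while scaling in slowly avoids thrashing when traffic is bursty.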
Connecting Forward
In the next phase, you'll learn how to monitor deployed models, optimize infrastructure costs, and secure ML resources. Deployment without monitoring is like launching a satellite without ground control—you won't know if something goes wrong until it's too late. Phase 5 closes the lifecycle loop by connecting monitoring signals back to data preparation and retraining.
Self-Check Questions
- A company processes insurance claims by analyzing uploaded document images (average 20 MB). They need results within 30 minutes and process about 500 claims per day during business hours. Which endpoint type should they use, and how should they configure scaling?
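As a hint for working through this scenario, the sketch below shows the shape of an asynchronous inference endpoint configuration, which suits large payloads and relaxed latency requirements. All names (config, model, bucket) are hypothetical; the structure mirrors the `CreateEndpointConfig` request with an `AsyncInferenceConfig` block.

```python
# Asynchronous inference endpoint configuration (hypothetical names).
# Async endpoints accept large payloads via S3, queue requests, and can
# scale to zero when the backlog is empty.

endpoint_config = {
    "EndpointConfigName": "claims-async-config",   # hypothetical
    "ProductionVariants": [{
        "VariantName": "AllTraffic",
        "ModelName": "claims-document-model",      # hypothetical
        "InstanceType": "ml.g4dn.xlarge",
        "InitialInstanceCount": 1,
    }],
    "AsyncInferenceConfig": {
        "OutputConfig": {
            # Results are written to S3; clients poll or subscribe via SNS.
            "S3OutputPath": "s3://claims-results-bucket/outputs/",
        },
        "ClientConfig": {
            # Throttle concurrent requests handled per instance.
            "MaxConcurrentInvocationsPerInstance": 4,
        },
    },
}
```

Scaling for async endpoints is typically driven by a backlog metric (queued requests per instance) rather than invocations, which is what allows scale-to-zero outside business hours.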
- A team has a CodePipeline that trains a model, evaluates it, and deploys to a SageMaker endpoint. Currently deployment is all-at-once. After a bad deployment caused a 2-hour outage, they need a safer strategy with instant rollback. What changes should they make?
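For reference while reasoning about this question, here is a sketch of the `DeploymentConfig` structure SageMaker accepts on `UpdateEndpoint` for a blue/green canary rollout with alarm-triggered rollback. The alarm name is a hypothetical placeholder.

```python
# Blue/green deployment with canary traffic shifting (SageMaker
# UpdateEndpoint DeploymentConfig shape). Alarm name is hypothetical.

deployment_config = {
    "BlueGreenUpdatePolicy": {
        "TrafficRoutingConfiguration": {
            "Type": "CANARY",
            # Shift 10% of capacity first, then bake before full cutover.
            "CanarySize": {"Type": "CAPACITY_PERCENT", "Value": 10},
            "WaitIntervalInSeconds": 600,
        },
        # Keep the old (blue) fleet briefly so rollback is instant.
        "TerminationWaitInSeconds": 300,
    },
    "AutoRollbackConfiguration": {
        # Any alarm firing during the bake window reverts traffic.
        "Alarms": [{"AlarmName": "claims-endpoint-5xx-errors"}],  # hypothetical
    },
}
```

The key rollback property comes from the blue fleet staying alive during `TerminationWaitInSeconds`: reverting is a traffic shift, not a redeploy.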
- An ML pipeline automatically retrains weekly and deploys the new model. Describe the minimum set of automated tests that should exist between training and deployment to prevent deploying a degraded model.
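One building block of such a pipeline is a metric-comparison quality gate. The sketch below is a minimal, illustrative version assuming higher-is-better metrics; the metric names, baseline values, and regression tolerance are all hypothetical.

```python
def passes_quality_gate(candidate: dict, baseline: dict,
                        max_regression: float = 0.01) -> bool:
    """Return False if any tracked metric regresses beyond tolerance.

    Both dicts map metric names (higher-is-better) to values. A metric
    missing from the candidate fails closed: no data means no deploy.
    """
    for metric, base_value in baseline.items():
        if metric not in candidate:
            return False
        if candidate[metric] < base_value - max_regression:
            return False
    return True


baseline = {"accuracy": 0.91, "f1": 0.88}          # hypothetical values
good_model = {"accuracy": 0.92, "f1": 0.885}
degraded_model = {"accuracy": 0.84, "f1": 0.80}

assert passes_quality_gate(good_model, baseline)
assert not passes_quality_gate(degraded_model, baseline)
```

In SageMaker Pipelines this logic would typically live behind a ConditionStep that gates the model registration and deployment steps.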