Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

4.2.3. Container Management with ECR, ECS, and EKS

💡 First Principle: SageMaker handles container orchestration for most ML workloads, but when you need custom container management—multiple microservices, complex networking, or non-SageMaker inference—you use ECS or EKS. ECR stores the container images regardless of which orchestrator you use.

| Service | What It Does | Use for ML When |
| --- | --- | --- |
| ECR | Container image registry | Storing BYOC images for SageMaker, custom inference containers |
| ECS | AWS-native container orchestration | Deploying inference as microservices, non-SageMaker hosting |
| EKS | Managed Kubernetes | Team already uses Kubernetes, multi-cloud, complex orchestration needs |

ECR in the ML workflow: Every SageMaker training and inference job pulls a container image. For pre-built algorithms, SageMaker manages the ECR image. For Bring Your Own Container (BYOC), you push your custom image to ECR and reference it in the training/inference configuration. ECR supports image scanning for vulnerabilities — a Domain 4 security detail the exam can test. Images should be tagged with immutable version identifiers (not just "latest") so that training jobs are reproducible.
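To make the BYOC reference pattern concrete, here is a minimal sketch of building a version-pinned ECR image URI and the corresponding CreateTrainingJob parameters. The account ID, region, repository, bucket, and role ARN are placeholders; in practice the dict would be passed to boto3's SageMaker client (`sagemaker_client.create_training_job(**training_job_params)`), which is omitted here so the sketch stands alone.

```python
# Sketch: referencing a version-tagged BYOC image from ECR in a SageMaker
# training job. All identifiers below are hypothetical placeholders.

def ecr_image_uri(account_id: str, region: str, repo: str, tag: str) -> str:
    """ECR image URI format: <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>."""
    if tag == "latest":
        # Enforce the reproducibility rule from the text: pin a version tag.
        raise ValueError("Use an immutable version tag, not 'latest'")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repo}:{tag}"

image = ecr_image_uri("123456789012", "us-east-1", "my-training-image", "v1.3.0")

# Parameters for sagemaker_client.create_training_job(**training_job_params)
training_job_params = {
    "TrainingJobName": "byoc-train-v1-3-0",
    "AlgorithmSpecification": {
        "TrainingImage": image,  # the custom BYOC image stored in ECR
        "TrainingInputMode": "File",
    },
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 50,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output/"},
}
```

The same URI pattern is what you would reference for a BYOC inference container in a SageMaker Model definition.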

ECS vs. EKS decision matrix: ECS is simpler — AWS handles the orchestration layer, and you define tasks and services. Choose ECS when you need container-based inference outside SageMaker but don't have existing Kubernetes expertise. EKS runs Kubernetes on AWS and is operationally heavier. Choose EKS when your team already manages Kubernetes clusters, when you need portability across cloud providers, or when the inference service requires fine-grained pod scheduling, custom autoscaling logic, or service mesh integration that SageMaker endpoints don't support.
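To show what the ECS side looks like in practice, the sketch below constructs the parameters for registering a task definition that runs an inference container on Fargate. The names, port, CPU/memory sizes, and image URI are illustrative assumptions; the dict would be passed to boto3's ECS client (`ecs_client.register_task_definition(**task_def)`), which is omitted here.

```python
# Sketch: ECS task definition parameters for a non-SageMaker inference
# container on Fargate. All names and sizes are hypothetical placeholders.

task_def = {
    "family": "ml-inference",
    "requiresCompatibilities": ["FARGATE"],
    "networkMode": "awsvpc",  # required network mode for Fargate tasks
    "cpu": "1024",            # 1 vCPU
    "memory": "2048",         # 2 GB
    "containerDefinitions": [
        {
            "name": "inference",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/inference:v2.0.1",
            "essential": True,
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            # Example of passing the model location to the container at runtime
            "environment": [
                {"name": "MODEL_S3_URI", "value": "s3://my-bucket/model.tar.gz"}
            ],
        }
    ],
    "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
}
```

Note that the image still comes from ECR and is version-pinned, exactly as in the BYOC case — the registry does not change when the orchestrator does.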

SageMaker + EKS integration: SageMaker Operators for Kubernetes let you submit SageMaker training and hosting jobs directly from a Kubernetes cluster using custom resource definitions (CRDs). This keeps the Kubernetes workflow intact while offloading compute-intensive ML tasks to SageMaker's managed infrastructure — a pattern the exam may describe when the scenario involves an existing EKS-based platform adding ML capabilities.
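For a sense of what this looks like from the Kubernetes side, here is the rough shape of a TrainingJob custom resource as a Python dict mirroring the YAML manifest you would `kubectl apply`. The `apiVersion` and field names below follow the ACK-based SageMaker controller as an assumption — check the CRDs installed in your cluster for the exact schema.

```python
# Sketch: a SageMaker TrainingJob custom resource submitted from Kubernetes.
# apiVersion and field names are assumptions based on the ACK SageMaker
# controller; all identifiers are hypothetical placeholders.

training_job_cr = {
    "apiVersion": "sagemaker.services.k8s.aws/v1alpha1",  # assumption: ACK controller
    "kind": "TrainingJob",
    "metadata": {"name": "xgboost-train"},
    "spec": {
        "trainingJobName": "xgboost-train",
        "algorithmSpecification": {
            "trainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:v1.3.0",
            "trainingInputMode": "File",
        },
        "roleARN": "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
        "resourceConfig": {
            "instanceType": "ml.m5.xlarge",
            "instanceCount": 1,
            "volumeSizeInGB": 50,
        },
        "stoppingCondition": {"maxRuntimeInSeconds": 3600},
        "outputDataConfig": {"s3OutputPath": "s3://my-bucket/output/"},
    },
}
```

Applying a manifest of this shape hands the spec to the operator running in the cluster, which calls the SageMaker CreateTrainingJob API on your behalf — the training compute runs on SageMaker-managed instances, not on your EKS nodes.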

⚠️ Exam Trap: For most ML inference, SageMaker endpoints are the answer—not ECS or EKS. Only choose ECS/EKS when the question explicitly mentions existing Kubernetes infrastructure, multi-container microservices beyond SageMaker's capabilities, or non-ML services that need to co-deploy with ML inference.

Reflection Question: A company runs their entire application on EKS. They want to add ML inference. Should they deploy the model as a SageMaker endpoint or as an EKS pod? What factors influence this decision?

Written by Alvin Varughese
Founder, 15 professional certifications