Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

6.2. Quick Reference Decision Trees and Cheat Sheets

🎯 Quick Reference: Endpoint Selection

| Question | If Yes → | If No → |
| --- | --- | --- |
| Need real-time predictions (<100 ms)? | Real-time endpoint | Continue ↓ |
| Processing large payloads (>6 MB) or long inference (>60 s)? | Async endpoint | Continue ↓ |
| Traffic is intermittent (<1 req/min) and cold starts are OK? | Serverless endpoint | Continue ↓ |
| Processing large datasets without real-time need? | Batch Transform | Re-evaluate requirements |
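
The questions above are evaluated top to bottom, and the first "yes" wins. A minimal sketch of that ordering follows. The `Workload` class, the `choose_endpoint` function, and their field names are hypothetical and exist only for illustration; they are not part of any AWS SDK. The thresholds are the ones quoted in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workload:
    """Hypothetical description of an inference workload (illustration only)."""
    latency_budget_ms: Optional[float]  # None means no real-time latency requirement
    payload_mb: float
    inference_seconds: float
    requests_per_minute: float
    cold_starts_ok: bool
    large_dataset: bool


def choose_endpoint(w: Workload) -> str:
    """Walk the decision tree in table order and return the first match."""
    if w.latency_budget_ms is not None and w.latency_budget_ms < 100:
        return "Real-time endpoint"
    if w.payload_mb > 6 or w.inference_seconds > 60:
        return "Async endpoint"
    if w.requests_per_minute < 1 and w.cold_starts_ok:
        return "Serverless endpoint"
    if w.large_dataset:
        return "Batch Transform"
    return "Re-evaluate requirements"


# Example: a 50 MB payload with no latency requirement falls through to the async branch.
video_job = Workload(None, 50.0, 10.0, 5.0, False, False)
print(choose_endpoint(video_job))
```

Because the checks are ordered, a workload that needs sub-100 ms responses is routed to a real-time endpoint even if its traffic is also intermittent, which mirrors how the table is meant to be read.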

🎯 Quick Reference: Managed vs. Custom Decision

🎯 Quick Reference: Data Prep Service Selection

| Scenario | Service | Why |
| --- | --- | --- |
| Visual data exploration and transformation with minimal code | SageMaker Data Wrangler | Built-in visualizations, no Spark required |
| Large-scale ETL on petabyte data | AWS Glue | Serverless Spark, scales massively |
| Simple data transformations with recipe-based UI | AWS Glue DataBrew | 250+ built-in transformations, no code |
| Custom Spark processing on large datasets | Amazon EMR | Full Spark/Hadoop cluster control |
| Streaming data transformation | Amazon Kinesis + Lambda or Managed Flink | Real-time processing |
| Feature storage and reuse across teams | SageMaker Feature Store | Online (real-time) + offline (batch) stores |

🎯 Quick Reference: Monitoring Service Selection

| What to Monitor | Service | Trigger |
| --- | --- | --- |
| Feature distributions vs. baseline | Model Monitor (Data Quality) | Scheduled job |
| Model accuracy vs. baseline | Model Monitor (Model Quality) | Requires ground truth |
| Fairness metrics for protected groups | Model Monitor (Bias Drift) via Clarify | Scheduled job |
| Endpoint latency, CPU, memory | CloudWatch Metrics + Alarms | Threshold-based |
| Error logs, stack traces | CloudWatch Logs + Logs Insights | Query-based |
| Cross-service request tracing | AWS X-Ray | On-demand |
| API call audit trail | AWS CloudTrail | Continuous |
| PII in training data | Amazon Macie | Scheduled scan |
| Resource configuration compliance | AWS Config | Rule-based |
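
As an illustration of the threshold-based row for endpoint latency, here is a hedged sketch that builds the arguments for a CloudWatch alarm on the SageMaker `ModelLatency` metric. The helper name `latency_alarm_kwargs`, the alarm naming scheme, and the endpoint and variant names are assumptions made for this example. The dictionary keys follow the CloudWatch `PutMetricAlarm` parameters, and the actual API call is left as a comment so the block runs without AWS credentials.

```python
from typing import Any, Dict


def latency_alarm_kwargs(endpoint_name: str, variant_name: str,
                         threshold_ms: float, periods: int = 3) -> Dict[str, Any]:
    """Build PutMetricAlarm arguments for average model latency on one variant."""
    return {
        "AlarmName": f"{endpoint_name}-{variant_name}-model-latency",  # hypothetical naming scheme
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Average",
        "Period": 60,                      # evaluate one-minute data points
        "EvaluationPeriods": periods,      # consecutive breaching periods before alarming
        "Threshold": threshold_ms * 1000,  # ModelLatency is reported in microseconds
        "ComparisonOperator": "GreaterThanThreshold",
    }


kwargs = latency_alarm_kwargs("my-endpoint", "AllTraffic", threshold_ms=200)
print(kwargs["Threshold"])

# With credentials configured, the alarm could be created like this:
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**kwargs)
```

The unit conversion is the detail most easily missed: a 200 ms budget has to be expressed as 200,000 microseconds for this metric.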

🎯 Quick Reference: Security Controls

| Protection Layer | Service | What It Controls |
| --- | --- | --- |
| Identity (who) | IAM roles, policies, SageMaker Role Manager | What actions users and services can perform |
| Network (where) | VPC, security groups, network isolation, PrivateLink | What resources can communicate |
| Data (what) | KMS, SSE, Secrets Manager | Whether data is readable if intercepted |
| Audit (when/how) | CloudTrail, Config, Macie | Whether policies are followed |
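
To make the identity, network, and data layers concrete, the sketch below checks a SageMaker `CreateTrainingJob` request dictionary for the fields that correspond to each layer. The function name `missing_security_controls` and the choice of which fields to require are assumptions for illustration, not an AWS-provided check. The field names themselves (`RoleArn`, `VpcConfig`, `EnableNetworkIsolation`, `OutputDataConfig.KmsKeyId`, `ResourceConfig.VolumeKmsKeyId`) are taken from the `CreateTrainingJob` request syntax.

```python
from typing import Any, Dict, List


def missing_security_controls(request: Dict[str, Any]) -> List[str]:
    """Return the protection layers a training-job request leaves unset."""
    missing = []
    # Identity: the execution role the job assumes.
    if not request.get("RoleArn"):
        missing.append("identity: RoleArn")
    # Network: run inside a VPC with explicit subnets and security groups.
    vpc = request.get("VpcConfig") or {}
    if not (vpc.get("Subnets") and vpc.get("SecurityGroupIds")):
        missing.append("network: VpcConfig")
    # Network: block outbound calls from the training container.
    if not request.get("EnableNetworkIsolation", False):
        missing.append("network: EnableNetworkIsolation")
    # Data: encrypt model artifacts and attached storage volumes with KMS.
    if not (request.get("OutputDataConfig") or {}).get("KmsKeyId"):
        missing.append("data: OutputDataConfig.KmsKeyId")
    if not (request.get("ResourceConfig") or {}).get("VolumeKmsKeyId"):
        missing.append("data: ResourceConfig.VolumeKmsKeyId")
    return missing


# An empty request is missing every control checked here.
print(missing_security_controls({}))
```

The audit layer is not represented in the request itself; CloudTrail, Config, and Macie observe the account from outside the job.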

🎯 Quick Reference: Cost Optimization by Workload

| ML Workload | Pricing Strategy | Additional Savings |
| --- | --- | --- |
| Training jobs (fault-tolerant) | Spot Instances (up to 90% off) | Use managed spot training with checkpointing |
| Production endpoint (24/7 steady) | SageMaker Savings Plans (up to 64% off) | Rightsizing with Inference Recommender |
| Production endpoint (variable) | On-Demand + auto scaling | Scale to zero during off-hours |
| Low-traffic endpoint (<1 req/min) | Serverless endpoint | Pay only per invocation |
| Batch scoring (weekly) | Batch Transform | No persistent endpoint cost; pay only while the job runs (managed spot applies to training jobs, not Batch Transform) |
| Development/notebooks | On-Demand + lifecycle configs | Auto-stop idle notebooks |
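
For the fault-tolerant training row, a minimal sketch of the request fields involved in managed spot training with checkpointing, plus the savings arithmetic, might look like this. The helper names `spot_training_settings` and `spot_savings_percent` and the S3 URI are hypothetical. The keys (`EnableManagedSpotTraining`, `StoppingCondition`, `CheckpointConfig`) come from the `CreateTrainingJob` request, and the savings figure is derived from the `TrainingTimeInSeconds` and `BillableTimeInSeconds` values that `DescribeTrainingJob` returns.

```python
from typing import Any, Dict


def spot_training_settings(max_runtime_s: int, max_wait_s: int,
                           checkpoint_s3_uri: str) -> Dict[str, Any]:
    """Build the spot-related fragment of a CreateTrainingJob request."""
    # The wait time includes time spent waiting for Spot capacity,
    # so it must be at least as long as the allowed runtime.
    if max_wait_s < max_runtime_s:
        raise ValueError("max_wait_s must be >= max_runtime_s")
    return {
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_s,
            "MaxWaitTimeInSeconds": max_wait_s,
        },
        # Checkpoints let an interrupted job resume instead of restarting.
        "CheckpointConfig": {"S3Uri": checkpoint_s3_uri},
    }


def spot_savings_percent(training_seconds: float, billable_seconds: float) -> float:
    """Savings relative to paying for the full training time."""
    return (1 - billable_seconds / training_seconds) * 100


settings = spot_training_settings(3600, 7200, "s3://example-bucket/checkpoints/")
print(spot_savings_percent(3600, 900))  # 75.0
```

The training code still has to write and reload checkpoints itself; the configuration only tells SageMaker where they are synchronized.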
Written by Alvin Varughese
Founder, 15 professional certifications