Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

4.1.2. Compute Selection: CPU vs. GPU and Instance Families

💡 First Principle: GPUs excel at parallel matrix operations (neural network inference, image processing). CPUs excel at general-purpose computation (tree-based models, feature lookup). Choosing a GPU for an XGBoost model pays a premium for parallelism the model does not use, while choosing a CPU for a ResNet model starves the workload of the parallelism it needs. The exam tests this alignment.

| Instance Family | Optimized For | When to Use | Example |
|---|---|---|---|
| ml.m5 (General purpose) | Balanced CPU/memory | Small models, preprocessing, general workloads | Lightweight inference, data processing |
| ml.c5 (Compute optimized) | CPU-intensive computation | Tree models (XGBoost), ensemble inference | High-throughput tabular model inference |
| ml.r5 (Memory optimized) | Large memory footprint | Large feature stores, embedding lookups | NLP models with large vocabularies |
| ml.p3/p4 (GPU accelerated) | Parallel computation | Neural network training and inference | Image classification, NLP transformers |
| ml.g4dn/g5 (GPU inference) | Cost-effective GPU inference | Deep learning inference at scale | Real-time image/video processing |
| ml.inf1/inf2 (Inferentia) | ML inference (AWS custom chip) | High-throughput, cost-optimized inference | Serving transformer models at scale |
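
The table can be read as a lookup from workload type to a starting instance family. A minimal Python sketch of that idea follows; the `suggest_family` helper and the workload labels are hypothetical names chosen for illustration, not SageMaker API values, and the result is only a first guess to validate with load testing.

```python
# Hypothetical mapping distilled from the table above.
# The workload labels are illustrative, not SageMaker API values.
FAMILY_BY_WORKLOAD = {
    "small_model": "ml.m5",
    "tree_model": "ml.c5",
    "embedding_lookup": "ml.r5",
    "deep_learning_training": "ml.p3/p4",
    "deep_learning_inference": "ml.g4dn/g5",
    "transformer_serving_at_scale": "ml.inf1/inf2",
}


def suggest_family(workload: str) -> str:
    """Return a starting-point instance family for a workload label."""
    try:
        return FAMILY_BY_WORKLOAD[workload]
    except KeyError:
        # Unknown labels are rejected rather than silently defaulted.
        raise ValueError(f"unknown workload label: {workload!r}") from None


print(suggest_family("tree_model"))               # XGBoost-style tabular model
print(suggest_family("deep_learning_inference"))  # ResNet-style image model
```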

SageMaker Inference Recommender automates instance selection by running load tests across multiple instance types and recommending the best cost-performance combination for your specific model. Use it instead of guessing—the exam tests whether you know this tool exists.
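
As a rough illustration of what starting such a job might look like, the sketch below only builds the request for boto3's `create_inference_recommendations_job` call and does not contact AWS. The helper name `build_recommender_request`, the job name, and both ARNs are placeholders invented for this example. It assumes the model is already registered as a model package; a real Advanced job would typically also set a traffic pattern and stopping conditions.

```python
def build_recommender_request(job_name, role_arn, model_package_arn,
                              instance_types=None):
    """Build keyword arguments for create_inference_recommendations_job.

    Hypothetical helper: with no instance types it requests a Default job
    (SageMaker picks the candidates); with instance types it requests an
    Advanced job restricted to those candidates.
    """
    input_config = {"ModelPackageVersionArn": model_package_arn}
    job_type = "Default"
    if instance_types:
        job_type = "Advanced"
        input_config["EndpointConfigurations"] = [
            {"InstanceType": instance_type} for instance_type in instance_types
        ]
    return {
        "JobName": job_name,
        "JobType": job_type,
        "RoleArn": role_arn,
        "InputConfig": input_config,
    }


# All identifiers below are placeholders, not real resources.
request = build_recommender_request(
    job_name="xgb-recommender-demo",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    model_package_arn=(
        "arn:aws:sagemaker:us-east-1:123456789012:model-package/example-group/1"
    ),
    instance_types=["ml.c5.xlarge", "ml.m5.xlarge"],
)

# With AWS credentials configured, the job could then be started with:
#   import boto3
#   boto3.client("sagemaker").create_inference_recommendations_job(**request)
```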

⚠️ Exam Trap: The correct answer is the cheapest instance that meets the latency requirement, not the most powerful one. If a question describes an XGBoost model serving 100 requests/second with sub-200ms latency, an ml.c5.xlarge might suffice; an ml.p3.2xlarge would work but costs more than 10× as much per hour for no benefit. Always match compute to model type.
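
The selection rule in this trap can be sketched in a few lines of Python: filter candidates by the latency and throughput requirement, then take the cheapest survivor. The `Candidate` class, the `cheapest_meeting_sla` function, and every price and benchmark figure below are made-up placeholders for illustration, not quoted AWS rates or measured results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    instance_type: str
    hourly_cost_usd: float  # placeholder price, not a quoted AWS rate
    p99_latency_ms: float   # placeholder load-test latency
    max_rps: float          # placeholder sustained requests/second


def cheapest_meeting_sla(candidates: List[Candidate],
                         max_latency_ms: float,
                         required_rps: float) -> Optional[Candidate]:
    """Return the lowest-cost candidate that satisfies both requirements."""
    viable = [
        c for c in candidates
        if c.p99_latency_ms <= max_latency_ms and c.max_rps >= required_rps
    ]
    # default=None covers the case where no candidate meets the requirement.
    return min(viable, key=lambda c: c.hourly_cost_usd, default=None)


# Illustrative numbers only.
candidates = [
    Candidate("ml.m5.large", 0.10, 260.0, 80.0),
    Candidate("ml.c5.xlarge", 0.20, 120.0, 150.0),
    Candidate("ml.p3.2xlarge", 3.80, 40.0, 900.0),
]

choice = cheapest_meeting_sla(candidates, max_latency_ms=200.0, required_rps=100.0)
print(choice.instance_type)  # prints "ml.c5.xlarge"
```

With these placeholder figures the GPU instance is the fastest option, but it is never selected because a cheaper instance already satisfies the requirement.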

Reflection Question: You need to deploy a BERT-based text classification model and an XGBoost tabular model to production. Would you use the same instance type for both? Why or why not?

Written by Alvin Varughese
Founder, 15 professional certifications