Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

3.3.2. Model Bias Detection with SageMaker Clarify

💡 First Principle: A model can amplify biases present in training data, producing predictions that systematically disadvantage certain groups. SageMaker Clarify detects these biases both before training (data-level) and after training (prediction-level), providing specific metrics and explanations. The exam tests both when to use Clarify and how to interpret its outputs.

Post-Training Bias Metrics (different from the pre-training metrics in 2.3.1):

| Metric | What It Measures | Concern If |
|---|---|---|
| Disparate Impact (DI) | Ratio of positive outcomes between groups | Far from 1.0 |
| Conditional Demographic Disparity (CDD) | Disparity conditioned on other attributes | Significant after controlling for legitimate factors |
| Counterfactual Fliptest | Whether changing a protected attribute changes the prediction | High flip rate |
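Two of these metrics are simple enough to compute by hand. The sketch below (toy data and a hypothetical model — not the Clarify API) computes Disparate Impact and a counterfactual flip rate:

```python
# Toy illustration (not the Clarify API): computing two post-training
# bias metrics by hand on hypothetical model outputs.

def disparate_impact(preds, groups, favored="A"):
    """Ratio of positive-prediction rates: disfavored group / favored group."""
    rate = {}
    for g in set(groups):
        members = [p for p, grp in zip(preds, groups) if grp == g]
        rate[g] = sum(members) / len(members)
    disfavored = next(g for g in rate if g != favored)
    return rate[disfavored] / rate[favored]

def flip_rate(model, rows, attr_index, swap):
    """Counterfactual fliptest: fraction of rows whose prediction changes
    when only the protected attribute is swapped."""
    flips = 0
    for row in rows:
        counterfactual = list(row)
        counterfactual[attr_index] = swap[row[attr_index]]
        flips += model(tuple(counterfactual)) != model(row)
    return flips / len(rows)

# Hypothetical predictions: Group A approved 8/10, Group B approved 4/10.
preds = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6
groups = ["A"] * 10 + ["B"] * 10
print(disparate_impact(preds, groups))  # 0.4 / 0.8 = 0.5, far from 1.0

# A model that keys directly on the protected attribute flips every time.
biased = lambda row: 1 if row[0] == "A" else 0
rows = [("A", 30), ("B", 30), ("A", 45), ("B", 45)]
print(flip_rate(biased, rows, 0, {"A": "B", "B": "A"}))  # 1.0
```

A DI of 0.5 and a flip rate of 1.0 would both be strong red flags in a Clarify report.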

SHAP Values (SHapley Additive exPlanations): Clarify uses SHAP to explain individual predictions—which features contributed most and in which direction. This is critical for interpretability: you can tell a loan applicant not just that they were rejected, but which factors (income, credit history, employment length) drove the decision and by how much.
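For intuition, Shapley values have an exact closed form for a linear model with independent features: each feature's contribution is its weight times its deviation from the dataset mean. A minimal sketch with hypothetical feature names and weights (not from any real Clarify output):

```python
# SHAP on a linear model, where Shapley values have a closed form:
# phi_i = w_i * (x_i - mean_i). All numbers here are made up.

weights = {"income": 0.5, "credit_history": 0.3, "employment_length": 0.2}
bias = 0.1
means = {"income": 0.6, "credit_history": 0.7, "employment_length": 0.5}

def predict(x):
    return bias + sum(weights[f] * x[f] for f in weights)

def shap_values(x):
    """Per-feature contribution relative to the average prediction."""
    return {f: weights[f] * (x[f] - means[f]) for f in weights}

applicant = {"income": 0.2, "credit_history": 0.4, "employment_length": 0.5}
phi = shap_values(applicant)
print(phi)  # income and credit_history pushed this score down

# SHAP's additivity property: baseline + sum of contributions = prediction.
baseline = predict(means)
assert abs(baseline + sum(phi.values()) - predict(applicant)) < 1e-9
```

This is exactly the kind of per-applicant breakdown described above: here, low income contributes −0.2 to the score and short credit history −0.09, while employment length contributes nothing.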

Partial Dependence Plots (PDPs): Show how a single feature's value affects predictions on average, across the entire dataset. Useful for understanding the learned relationship between a feature and the target.
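The PDP computation itself is straightforward: fix one feature at each grid value in every row, average the model's predictions, and plot the averages. A minimal sketch (illustrative, not the Clarify API; the model and dataset are hypothetical):

```python
# Minimal partial-dependence computation: for each grid value v of one
# feature, set that feature to v in every row and average the predictions.

def partial_dependence(model, rows, feature, grid):
    pd = []
    for v in grid:
        preds = [model({**row, feature: v}) for row in rows]
        pd.append(sum(preds) / len(preds))
    return pd

# Hypothetical scoring model and two-row dataset.
model = lambda x: 0.5 * x["income"] + 0.2 * x["age_norm"]
rows = [{"income": 0.2, "age_norm": 0.1},
        {"income": 0.8, "age_norm": 0.9}]
print(partial_dependence(model, rows, "income", [0.0, 0.5, 1.0]))
# [0.1, 0.35, 0.6] -> prediction rises linearly with income, on average
```

Plotting the grid against these averages gives the PDP curve for that feature.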

⚠️ Exam Trap: Clarify's pre-training bias detection (data-level) and post-training bias detection (model-level) use different metrics. Pre-training uses CI and DPL (data distribution metrics). Post-training uses DI and CDD (prediction outcome metrics). A question about "bias in the training data" points to pre-training metrics. A question about "biased predictions" points to post-training metrics.
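The two pre-training metrics can also be computed by hand. This sketch follows the formulas in the Clarify documentation — CI = (n_a − n_d) / (n_a + n_d) and DPL = q_a − q_d, where q is each group's positive-label proportion — on made-up data:

```python
# Hand computation of the two pre-training (data-level) metrics.
# Formulas per the SageMaker Clarify docs; the dataset is made up.

def class_imbalance(groups, advantaged="A"):
    """CI = (n_a - n_d) / (n_a + n_d): how lopsided the group sizes are."""
    n_a = sum(g == advantaged for g in groups)
    n_d = len(groups) - n_a
    return (n_a - n_d) / len(groups)

def dpl(labels, groups, advantaged="A"):
    """DPL = q_a - q_d: difference in positive-label proportions."""
    q = {}
    for g in set(groups):
        group_labels = [l for l, grp in zip(labels, groups) if grp == g]
        q[g] = sum(group_labels) / len(group_labels)
    disadvantaged = next(g for g in q if g != advantaged)
    return q[advantaged] - q[disadvantaged]

# Group A is larger (CI = 0.2), but both groups have the same positive-label
# rate (DPL = 0) -- the labels themselves are balanced.
groups = ["A"] * 6 + ["B"] * 4
labels = [1, 1, 1, 0, 0, 0] + [1, 1, 0, 0]
print(class_imbalance(groups))  # (6 - 4) / 10 = 0.2
print(dpl(labels, groups))      # 0.5 - 0.5 = 0.0
```

Note these operate on raw labels, not predictions — that is the exam distinction: CI/DPL need no model at all, while DI/CDD require model outputs.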

Reflection Question: A lending model has DPL of 0.0 (training data is perfectly balanced between demographic groups) but Disparate Impact of 0.6 (the model approves loans for Group A at roughly 1.67× the rate of Group B). How is this possible, and what does it mean?
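One way to make the scenario concrete is a proxy feature. In this toy sketch (entirely made up), both groups have identical positive-label rates in the training data, yet a model that approves based on a feature correlated with group membership produces unequal approval rates:

```python
# Toy demonstration: DPL = 0 in the data, yet DI far from 1.0 in the
# model's predictions, because the model learned a proxy for group.

# Each row: (group, zip_proxy, label). Both groups have a 50% positive
# label rate, but zip_proxy correlates with group, not with the label.
data = [
    ("A", 1, 1), ("A", 1, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 0, 1), ("B", 0, 0), ("B", 0, 1), ("B", 1, 0),
]

# DPL: positive-label rate per group in the training data.
rate = lambda g: sum(l for grp, _, l in data if grp == g) / 4
print(rate("A") - rate("B"))  # 0.0 -> the data looks balanced

# The model ignores the labels' structure and approves on the proxy alone.
model = lambda zip_proxy: zip_proxy  # approve iff zip_proxy == 1

approve = lambda g: sum(model(z) for grp, z, _ in data if grp == g) / 4
print(approve("B") / approve("A"))  # DI = 0.25 / 0.75, far from 1.0
```

The takeaway: balanced labels do not guarantee unbiased predictions, which is why Clarify measures bias at both the data level and the prediction level.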
