4.6.1. Regression Metrics (MAE, MSE, R-squared)
First Principle: Regression metrics fundamentally quantify the accuracy of continuous numerical predictions by measuring the discrepancy between predicted and actual values, guiding model selection and tuning for quantitative forecasting.
For regression problems, where the goal is to predict a continuous numerical value, different metrics are used to evaluate how close the model's predictions are to the actual values.
Key Regression Metrics:
- Mean Absolute Error (MAE):
- Formula:
MAE = (1/n) * Σ|y_i - ŷ_i|
(average of the absolute differences between predicted and actual values).
- Interpretation: Represents the average magnitude of the errors in a set of predictions, without considering their direction.
- Strengths: Easy to understand and interpret. Less sensitive to outliers than MSE because it doesn't square the errors.
- Weaknesses: Does not penalize large errors as much as MSE.
- Mean Squared Error (MSE):
- Formula:
MSE = (1/n) * Σ(y_i - ŷ_i)^2
(average of the squared differences).
- Interpretation: Measures the average of the squares of the errors.
- Strengths: Penalizes larger errors more heavily (due to squaring), which can be desirable in some contexts. Mathematically convenient for optimization (differentiable).
- Weaknesses: Sensitive to outliers. The unit of MSE is the square of the unit of the target variable, making it less intuitive to interpret.
- Root Mean Squared Error (RMSE):
- Formula:
RMSE = √MSE
(square root of MSE).
- Interpretation: The square root of the average of squared errors. It is in the same units as the target variable, making it more interpretable than MSE.
- Strengths: Combines the benefits of MSE (penalizes large errors) with interpretability (same units as target).
- Weaknesses: Still sensitive to outliers.
- R-squared (Coefficient of Determination):
- Formula:
R² = 1 - (SS_res / SS_tot)
(where SS_res is the sum of squared residuals and SS_tot is the total sum of squares).
- Interpretation: Represents the proportion of the variance in the dependent variable that is predictable from the independent variables. Typically ranges from 0 to 1, though it can be negative for models that fit worse than simply predicting the mean. A higher R-squared indicates a better fit.
- Strengths: Provides a relative measure of fit, easy to interpret as a percentage of variance explained.
- Weaknesses: Can increase with the addition of more features, even if they are not truly predictive. Does not indicate bias.
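The four formulas above map directly onto a few lines of NumPy. The following is a minimal sketch with made-up actual and predicted values, purely for illustration:

```python
import numpy as np

y = np.array([3.0, 5.0, 2.5, 7.0])      # actual values y_i
y_hat = np.array([2.5, 5.0, 4.0, 8.0])  # predicted values ŷ_i

mae = np.abs(y - y_hat).mean()           # (1/n) * Σ|y_i - ŷ_i|
mse = ((y - y_hat) ** 2).mean()          # (1/n) * Σ(y_i - ŷ_i)^2
rmse = np.sqrt(mse)                      # √MSE

ss_res = ((y - y_hat) ** 2).sum()        # sum of squared residuals
ss_tot = ((y - y.mean()) ** 2).sum()     # total sum of squares
r2 = 1 - ss_res / ss_tot                 # 1 - (SS_res / SS_tot)

print(mae)   # 0.75 — average error in the target's own units
```

Note that `rmse` comes out in the same units as `y`, while `mse` is in squared units, which is exactly why RMSE is usually the one reported.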
Choosing the Right Metric:
- MAE: When outliers should not heavily influence the error, or when interpretability in original units is paramount.
- MSE/RMSE: When large errors are particularly undesirable and should be penalized more. RMSE is generally preferred over MSE for interpretability.
- R-squared: For a general understanding of how well the model explains the variance in the target variable.
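The outlier-sensitivity trade-off in the guidance above can be seen in a small sketch (the error values are hypothetical): adding one large error inflates RMSE far more than MAE, because squaring amplifies it.

```python
import numpy as np

errors = np.array([1.0, 1.0, 1.0, 1.0])         # four uniform errors
with_outlier = np.array([1.0, 1.0, 1.0, 10.0])  # one error replaced by an outlier

mae_small = np.abs(errors).mean()                # 1.0
mae_big = np.abs(with_outlier).mean()            # 3.25
rmse_small = np.sqrt((errors ** 2).mean())       # 1.0
rmse_big = np.sqrt((with_outlier ** 2).mean())   # ≈ 5.07

# MAE grows 3.25x, but RMSE grows over 5x from the single outlier.
```

If large individual errors are costly to the business, that amplification is a feature; if the outliers are noise, it is a liability, and MAE is the safer choice.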
AWS Tools:
- SageMaker Processing Jobs: Can be used to run custom evaluation scripts that calculate these metrics.
- SageMaker Automatic Model Tuning (HPO): Allows you to specify one of these metrics (e.g., validation:rmse) as the objective to optimize.
- SageMaker Model Monitor: Can track these metrics in production if ground truth data is available.
Scenario: You have built a model to predict house prices. You need to evaluate its performance. Your stakeholders want to know the average absolute difference between predicted and actual prices, and also how much of the price variation your model explains.
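The stakeholders' two questions map to MAE and R-squared respectively. A sketch using scikit-learn's metric functions, with invented house prices for illustration:

```python
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical actual vs. predicted sale prices, in thousands of dollars
actual = [250, 320, 410, 180, 520]
predicted = [240, 335, 390, 200, 510]

mae = mean_absolute_error(actual, predicted)  # average absolute dollar error
r2 = r2_score(actual, predicted)              # fraction of price variance explained

print(f"Predictions are off by ${mae:.0f}k on average; "
      f"the model explains {r2:.0%} of the price variation.")
```

Reporting both numbers answers each stakeholder question in its own terms: MAE in dollars, R-squared as a percentage of variance.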
Reflection Question: How do regression metrics like MAE (for average absolute error), MSE/RMSE (for penalizing larger errors), and R-squared (for explained variance) fundamentally quantify the accuracy of continuous numerical predictions, guiding model selection and tuning for quantitative forecasting?
💡 Tip: Always consider the business context when choosing a regression metric. A small RMSE might still be unacceptable if the business impact of large individual errors is high.