Phase 2: Machine Learning Fundamentals (Questions 11-18)
Question 11: You need to predict the probability of diabetes based on age and body fat percentage. Which ML model should you use?
- A. Hierarchical clustering
- B. Linear regression
- C. Logistic regression
- D. Multiple linear regression ✓
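Rationale: Two features (age, body fat) predicting a numeric value (probability) = Multiple linear regression. Here, linear regression (B) means a single-feature model, logistic regression (C) is a classification algorithm, and hierarchical clustering (A) is unsupervised.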
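
To make the "two features, one numeric label" idea concrete, here is a minimal sketch of a multiple linear regression fit in plain Python. The data values, the function names (`fit_multiple_linear_regression`, `predict`), and the choice of solving the normal equations directly are all assumptions for illustration, not part of the exam material.

```python
def fit_multiple_linear_regression(rows, labels):
    """Least-squares fit of label = w0 + w1*x1 + w2*x2 + ... via the normal equations."""
    X = [[1.0] + list(row) for row in rows]  # prepend 1.0 for the intercept term
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, labels))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(weights, row):
    """Apply the fitted intercept and per-feature weights to one row."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))


# Made-up (age, body fat %) rows and made-up probability labels.
features = [(25, 30), (35, 18), (45, 35), (55, 22), (65, 28)]
labels = [0.55, 0.41, 0.85, 0.69, 0.91]

weights = fit_multiple_linear_regression(features, labels)
estimate = predict(weights, (50, 25))
print(weights, estimate)
```

Note that a linear model's output is not constrained to the range 0 to 1, so this sketch only illustrates the feature/label structure the question is testing.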
Question 12: Which type of machine learning algorithm predicts a numeric label based on item features?
- A. Classification
- B. Clustering
- C. Regression ✓
- D. Unsupervised learning
Rationale: Predicting numeric values = Regression.
Question 13: A retailer wants to group customers with similar attributes for targeted marketing. Which ML type is this?
- A. Classification
- B. Multiclass classification
- C. Clustering ✓
- D. Regression
Rationale: Grouping similar items without predefined labels = Clustering (unsupervised).
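
As an illustration of grouping without labels, the sketch below runs a tiny k-means loop in plain Python. K-means is just one possible clustering algorithm, chosen here as an assumption; the customer data (age, monthly spend) and the naive "first k points" initialization are also made up for the example.

```python
def nearest_index(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
    )


def kmeans(points, k, iterations=10):
    centroids = list(points[:k])  # naive initialization: the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    assignments = [nearest_index(p, centroids) for p in points]
    return centroids, assignments


# Hypothetical customers as (age, monthly spend). No labels are supplied.
customers = [(22, 300), (25, 350), (24, 280), (58, 1500), (61, 1650), (60, 1400)]
centroids, groups = kmeans(customers, k=2)
print(groups)
```

The algorithm never sees a "correct" group for any customer; the groups emerge only from how close the feature values are to each other.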
Question 14: Which ML algorithm finds optimal groupings WITHOUT relying on labeled training data?
- A. Classification
- B. Regression
- C. Clustering ✓
- D. Logistic regression
Rationale: No labels required = Unsupervised learning = Clustering.
Question 15: For an e-scooter rental prediction model, which attributes are FEATURES?
Weather includes: temperature, weekday/weekend
Predictions: battery levels, distance traveled, number of hires
- A. Battery levels and distance traveled
- B. Weather temperature and weekday/weekend ✓
- C. Number of hires and battery levels
- D. Distance traveled and temperature
Rationale: Features = inputs (temperature, day type). Labels = predictions (hires, battery, distance).
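
The feature/label split can be sketched as a simple column selection. The column names and row values below are hypothetical, chosen only to mirror the e-scooter scenario.

```python
# Hypothetical daily records for the e-scooter service.
rows = [
    {"temperature_c": 12, "is_weekend": 0, "hires": 80, "avg_distance_km": 2.1, "battery_drop_pct": 30},
    {"temperature_c": 24, "is_weekend": 1, "hires": 210, "avg_distance_km": 3.4, "battery_drop_pct": 55},
    {"temperature_c": 18, "is_weekend": 0, "hires": 120, "avg_distance_km": 2.6, "battery_drop_pct": 40},
]

FEATURE_COLUMNS = ["temperature_c", "is_weekend"]                # inputs to the model
LABEL_COLUMNS = ["hires", "avg_distance_km", "battery_drop_pct"]  # values to predict


def split_features_labels(rows, feature_columns, label_columns):
    """Return (X, y): one list of feature values and one list of label values per row."""
    X = [[row[c] for c in feature_columns] for row in rows]
    y = [[row[c] for c in label_columns] for row in rows]
    return X, y


X, y = split_features_labels(rows, FEATURE_COLUMNS, LABEL_COLUMNS)
print(X[0], y[0])
```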
Question 16: Which Azure Machine Learning Designer module is used to TRAIN a model?
- A. Clean Missing Data
- B. Select Columns in Dataset
- C. Linear Regression ✓
- D. Evaluate Model
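Rationale: Linear Regression is the only algorithm module listed, and algorithm modules define the model that gets trained. Clean Missing Data and Select Columns in Dataset are data preparation modules. Evaluate Model measures performance.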
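
The roles of the four modules can be sketched as plain Python functions. This is only an analogy: none of these functions are the Azure Machine Learning API, and the data, column names, and single-feature least-squares fit are assumptions made for the example.

```python
def clean_missing_data(rows):
    """Data prep: drop rows that contain a missing (None) value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def select_columns(rows, columns):
    """Data prep: keep only the named columns."""
    return [{c: r[c] for c in columns} for r in rows]


def train_linear_regression(xs, ys):
    """Training: fit y = intercept + slope * x by least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def evaluate_model(model, xs, ys):
    """Evaluation: mean absolute error of the model's predictions."""
    slope, intercept = model
    return sum(abs(intercept + slope * x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical raw data; one row has a missing label.
raw = [
    {"station_id": "A", "temperature_c": 10, "hires": 50},
    {"station_id": "B", "temperature_c": 15, "hires": 75},
    {"station_id": "C", "temperature_c": 20, "hires": 100},
    {"station_id": "D", "temperature_c": 25, "hires": None},
    {"station_id": "E", "temperature_c": 30, "hires": 150},
]

prepared = select_columns(clean_missing_data(raw), ["temperature_c", "hires"])
xs = [r["temperature_c"] for r in prepared]
ys = [r["hires"] for r in prepared]
model = train_linear_regression(xs, ys)
# Evaluated on the training rows only to keep the sketch short;
# a real workflow would score held-out data.
mae = evaluate_model(model, xs, ys)
print(model, mae)
```

Only the training step produces a model; the other steps either shape the data going in or score the model coming out.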
Question 17: Which assumption must multiple linear regression satisfy to avoid misleading predictions?
- A. Features are dependent on each other
- B. Features are independent of each other ✓
- C. Labels are categorical
- D. Training data must be unlabeled
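Rationale: Features must be independent of each other. Strongly correlated features (multicollinearity) make the fitted coefficients unreliable, which leads to misleading predictions.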
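
One simple way to spot dependent features is to compute the Pearson correlation between each pair. The sketch below is an assumption about how such a check might look; the data is made up, and `age_months` is a deliberately redundant feature added to show a perfectly correlated pair.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up feature columns.
age = [25, 35, 45, 55, 65]
body_fat = [30, 18, 35, 22, 28]
age_months = [a * 12 for a in age]  # redundant: carries the same information as age

r_age_fat = pearson(age, body_fat)         # weakly related features
r_age_months = pearson(age, age_months)    # perfectly correlated features
print(r_age_fat, r_age_months)
```

A correlation close to 1 or -1 between two features suggests that one of them should be dropped or combined before fitting a multiple linear regression.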
Question 18: What should a model include to predict diabetes probability from age and body fat percentage?
- A. Three features
- B. Two labels and one feature
- C. Two features and one label ✓
- D. One feature and two labels
Rationale: Age and body fat = 2 features. Diabetes probability = 1 label.