7.4. Practice Questions: Machine Learning
Test your understanding of machine learning concepts, techniques, and Azure ML services.
Domain 2: Machine Learning Fundamentals (Questions 11-20)
Question 11: A model predicts whether a customer will purchase a product (yes/no) based on their browsing history. Which technique is this?
- A. Regression
- B. Classification ✓
- C. Clustering
- D. Anomaly Detection
Rationale: Predicting yes/no categories is classification. Regression would predict a numeric value.
Question 12: You need to predict house prices based on size and location. Which ML technique?
- A. Clustering
- B. Classification
- C. Regression ✓
- D. Anomaly Detection
Rationale: Predicting continuous numeric values (prices) is regression.
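To make the idea of predicting a continuous value concrete, here is a minimal sketch of regression with a single feature, fitted by ordinary least squares in plain Python. The sizes and prices are made-up illustration data, `fit_line` is a hypothetical helper name, and location is left out to keep the example short.

```python
# Hypothetical training data: house size in square metres -> sale price.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150000.0, 210000.0, 250000.0, 290000.0]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)

# The prediction is a number on a continuous scale, not a category.
predicted_price = slope * 90.0 + intercept
print(round(predicted_price))
```

The output is a price, which is what separates regression from classification: there is no fixed set of answers to choose from.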
Question 13: A retailer wants to group customers by similar purchasing behaviors for targeted marketing, without predefined categories. Which technique?
- A. Regression
- B. Classification
- C. Clustering ✓
- D. Logistic Regression
Rationale: Grouping similar items without predefined labels is clustering, an unsupervised technique.
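As a rough illustration of grouping without labels, the sketch below runs a tiny k-means loop on one numeric feature. The spend figures, the `k_means_1d` helper, and the choice of starting centroids are all assumptions made for the example. K-means is only one of several clustering algorithms.

```python
# Hypothetical monthly spend per customer; no labels are provided.
monthly_spend = [10.0, 12.0, 11.0, 95.0, 100.0, 98.0]


def k_means_1d(values, centroids, iterations=10):
    """Tiny k-means for one numeric feature; returns (assignments, centroids)."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(values, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignments, centroids


# Starting from the smallest and largest value is an arbitrary choice here.
start = [min(monthly_spend), max(monthly_spend)]
assignments, centroids = k_means_1d(monthly_spend, start)
print(assignments)
```

The groups emerge from the data alone. Nobody told the algorithm which customers are "low spend" or "high spend".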
Question 14: Logistic regression is used to predict:
- A. Continuous numeric values
- B. Categories ✓
- C. Time series data
- D. Image classifications
Rationale: Despite its name, logistic regression is a CLASSIFICATION algorithm. It estimates the probability of a category (typically binary yes/no) and assigns the class from that probability.
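The following sketch shows why logistic regression ends up producing a category: a weighted sum of the features is squashed into a probability by the sigmoid function, and a threshold turns that probability into a class. The weights, bias, and 0.5 threshold are hypothetical values chosen for illustration, not the result of training.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(features, weights, bias, threshold=0.5):
    """Return (probability, class label) for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(z)
    label = 1 if probability >= threshold else 0
    return probability, label


# Hypothetical parameters; a real model would learn these from data.
weights = [0.8, -0.5]
bias = -0.2

probability, label = predict_class([2.0, 1.0], weights, bias)
print(round(probability, 3), label)
```

The intermediate probability is numeric, but the final answer is a category, which is why the algorithm is classed as classification.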
Question 15: A company wants to predict diabetes probability based on age and body fat percentage. What should the model include?
- A. Three features
- B. Two features and one label ✓
- C. One feature and two labels
- D. Three labels
Rationale: Age and body fat percentage are the inputs (features). The diabetes outcome the model learns to predict is the output (label). That makes two features and one label.
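One way to picture the features and the label is as columns of a training table. The sketch below separates them into the conventional `X` (features) and `y` (label). The column names and the patient rows are invented for illustration.

```python
# Hypothetical training rows; the values are illustrative only.
rows = [
    {"age": 45, "body_fat_pct": 31.0, "has_diabetes": 1},
    {"age": 29, "body_fat_pct": 18.5, "has_diabetes": 0},
    {"age": 61, "body_fat_pct": 27.2, "has_diabetes": 1},
]

feature_names = ["age", "body_fat_pct"]  # two inputs
label_name = "has_diabetes"              # one output to predict

# X holds one list of feature values per row; y holds the matching labels.
X = [[row[name] for name in feature_names] for row in rows]
y = [row[label_name] for row in rows]

print(len(feature_names), "features,", 1, "label")
```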
Question 16: For a weather prediction model, which is a FEATURE?
- A. Tomorrow's temperature
- B. Today's humidity ✓
- C. The prediction output
- D. The model accuracy
Rationale: Features are inputs used to make predictions. Today's humidity is an input. Tomorrow's temperature is what we're predicting (the label).
Question 17: Why do we split data into training and validation sets?
- A. To reduce data storage costs
- B. To test if the model generalizes to new data ✓
- C. To speed up training
- D. To increase the amount of data
Rationale: Validation data is held back from training, so it shows whether the model learned patterns that generalize to new data rather than just memorizing the training examples.
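A minimal sketch of such a split is shown below, assuming an 80/20 ratio and a fixed random seed. Both are common but arbitrary choices, and `train_validation_split` is a hypothetical helper name.

```python
import random


def train_validation_split(rows, validation_fraction=0.2, seed=42):
    """Shuffle a copy of the rows, then hold back a fraction for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


rows = list(range(10))  # stand-in for ten labeled examples
train_rows, validation_rows = train_validation_split(rows)

# The model is fitted on train_rows only; validation_rows stay unseen
# until it is time to check how well the model generalizes.
print(len(train_rows), len(validation_rows))
```

Shuffling before splitting avoids holding back a block of rows that happen to share some ordering in the source data.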
Question 18: Which Azure ML capability automatically selects the best algorithm?
- A. Azure Machine Learning Designer
- B. Automated Machine Learning ✓
- C. Azure Cognitive Services
- D. Azure Data Factory
Rationale: Automated ML (AutoML) automatically tries different algorithms and selects the best one.
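The core idea of trying several candidates and keeping the best one can be sketched in a few lines. This is a conceptual illustration only: it does not use the Azure Machine Learning SDK, and the two candidate models, the data, and the choice of mean absolute error as the metric are all assumptions made for the example.

```python
# Hypothetical (feature, label) pairs, already split into two sets.
train = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
validation = [(5.0, 11.0), (6.0, 13.0)]


def fit_mean(data):
    """Baseline candidate: always predict the average training label."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_linear(data):
    """Second candidate: least-squares line through the training data."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_absolute_error(model, data):
    return sum(abs(model(x) - y) for x, y in data) / len(data)


candidates = {"mean baseline": fit_mean, "linear fit": fit_linear}

# Train every candidate, score it on validation data, keep the lowest error.
scores = {name: mean_absolute_error(fit(train), validation)
          for name, fit in candidates.items()}
best_name = min(scores, key=scores.get)
print(best_name)
```

An automated service does the same kind of comparison across a much larger set of algorithms and settings.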
Question 19: Multiple linear regression requires features to be:
- A. Dependent on each other
- B. Independent of each other ✓
- C. Categorical only
- D. Numeric labels
Rationale: Multiple linear regression assumes the features are independent of each other. Strongly correlated (dependent) features make the fitted coefficients unstable and hard to interpret, which can lead to misleading results.
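A quick way to spot dependent features is to compute the Pearson correlation between each pair. The sketch below uses made-up housing features: two of them carry the same information in different units, so their correlation is essentially 1. The `pearson` helper and all values are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical features for four houses.
size_sqm = [50.0, 80.0, 100.0, 120.0]
size_sqft = [s * 10.764 for s in size_sqm]  # same quantity, different unit
distance_km = [12.0, 3.0, 9.0, 5.0]

r_redundant = pearson(size_sqm, size_sqft)
r_other = pearson(size_sqm, distance_km)

# A value near +1 or -1 suggests one of the two features is redundant.
print(round(r_redundant, 3), round(r_other, 3))
```

Here one of the two size columns would normally be dropped before fitting, while the distance feature adds separate information.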
Question 20: Which machine learning algorithm module in Azure ML Designer trains regression models?
- A. Clean Missing Data
- B. Select Columns in Dataset
- C. Linear Regression ✓
- D. Evaluate Model
Rationale: Linear Regression is the algorithm module for regression models; in a Designer pipeline it is typically connected to a Train Model step. The other options are data transformation or evaluation components.