1.1.2. First Principle: Supervised, Unsupervised, and Reinforcement Learning
First Principle: The type of machine learning is defined by the nature of the data and the learning process: supervised learning uses labeled data, unsupervised learning uses unlabeled data, and reinforcement learning uses environmental feedback.
These three paradigms cover the vast majority of ML problems.
- Supervised Learning:
  - Concept: The model is "supervised" by being trained on a dataset containing both the input features and the correct output labels. It learns to map inputs to outputs.
  - Analogy: Learning with a teacher or an answer key.
  - Examples: Predicting house prices (regression), classifying emails as spam/not spam (classification).
- Unsupervised Learning:
  - Concept: The model is given a dataset without any explicit labels or correct outputs. Its task is to find hidden patterns, structures, or groupings on its own.
  - Analogy: Learning by observation and finding similarities.
  - Examples: Customer segmentation, identifying topics in news articles.
- Reinforcement Learning (RL):
  - Concept: An "agent" learns by interacting with an "environment." It performs actions and receives rewards (for good actions) or penalties (for bad actions), with the goal of maximizing its total reward over time.
  - Analogy: Training a pet with treats.
  - Examples: Training a bot to play a game, a self-driving car learning to navigate traffic.
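The agent, environment, action, and reward loop described for reinforcement learning can be sketched with a toy example. The code below is a minimal illustration, not a production RL method: it assumes a made-up "slot machine" environment with three actions, and a simple epsilon-greedy agent. The function name `run_bandit`, the win rates, and the step count are all hypothetical choices made for this sketch.

```python
import random


def run_bandit(true_win_rates, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy environment (illustrative assumption).

    true_win_rates[a] is the hidden probability that action `a` pays off.
    The agent never sees these numbers; it only sees rewards.
    """
    rng = random.Random(seed)
    n_actions = len(true_win_rates)
    estimates = [0.0] * n_actions  # agent's running estimate of each action's reward
    counts = [0] * n_actions       # how many times each action has been tried
    total_reward = 0.0

    for _ in range(steps):
        # Agent chooses an action: explore at random with probability epsilon,
        # otherwise exploit the action that currently looks best.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])

        # Environment responds with a reward (+1) or a penalty (-1).
        reward = 1.0 if rng.random() < true_win_rates[action] else -1.0

        # Agent updates its estimate for that action (incremental average).
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward


estimates, counts, total_reward = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("action tried most often:", counts.index(max(counts)))
print("estimated reward per action:", [round(e, 2) for e in estimates])
print("total reward collected:", total_reward)
```

No labeled "correct action" is ever supplied here. The agent discovers which action pays best purely from the rewards and penalties it receives, which is what separates this paradigm from supervised learning.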
Scenario: A colleague asks you to explain the difference between a system that predicts customer churn (yes/no) and a system that groups customers into different personas.
Reflection Question: How would you use the distinction between "labeled" and "unlabeled" data as the key differentiator to explain why churn prediction is supervised learning and customer segmentation is unsupervised learning?
Tip: The first question to ask when faced with an ML problem is: "Do I have the correct answers (labels) in my historical data?" If yes, it's likely supervised. If no, it's likely unsupervised. If it's about learning a task through trial and error, it's reinforcement learning.
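To make the churn and persona scenario concrete, here is a small sketch that runs both approaches on the same customer table. Everything in it is assumed for illustration: the six customer records, the two features (monthly spend and support tickets), and the choice of a nearest-centroid classifier and a basic k-means loop as stand-ins for "a supervised model" and "an unsupervised model."

```python
# Hypothetical customer records: (monthly_spend, support_tickets).
features = [(20, 5), (25, 4), (22, 6), (80, 1), (85, 0), (90, 1)]
# Labels exist only because past outcomes were recorded: 1 = churned, 0 = stayed.
churned = [1, 1, 1, 0, 0, 0]


def mean_point(points):
    """Coordinate-wise average of a list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Supervised: the labels decide which rows belong to each class.
def fit_nearest_centroid(X, y):
    return {label: mean_point([x for x, t in zip(X, y) if t == label])
            for label in set(y)}


def predict(centroids, x):
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


# Unsupervised: k-means sees only the features and forms its own groups.
def kmeans(X, k, iterations=10):
    centers = list(X[:k])  # naive initialisation, acceptable for a toy example
    assignment = [0] * len(X)
    for _ in range(iterations):
        assignment = [min(range(k), key=lambda c: sq_dist(centers[c], x))
                      for x in X]
        for c in range(k):
            members = [x for x, a in zip(X, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = mean_point(members)
    return assignment


model = fit_nearest_centroid(features, churned)  # uses features AND labels
print("churn prediction for (30, 4):", predict(model, (30, 4)))

segments = kmeans(features, k=2)                 # uses features only
print("segment id per customer:", segments)
```

The supervised model returns a value with a known meaning (churned or stayed) because that meaning came from the labels. The segment ids from k-means are arbitrary numbers; a person still has to inspect each group and decide what persona it represents.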