Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.
3.1.3. Clustering: Finding Natural Groups
Clustering finds natural groupings in data without pre-defined categories. It is a form of unsupervised learning: the training data carries no labels, so the algorithm groups items purely by how similar they are to one another.
Key characteristics:
- Groups similar items together
- NO labels in training data (unsupervised)
- Discovers natural patterns
- Number of clusters can be specified up front (as in k-means) or discovered by the algorithm (as in density-based methods)
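To make these characteristics concrete, here is a minimal sketch of k-means, one common clustering algorithm, written with only the Python standard library. The function name `k_means`, the sample points, and the choice to seed centroids with the first `k` points are all illustrative assumptions, not part of any particular product or exam syllabus. Note that the input contains only features, never labels.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Group unlabeled points into k clusters (plain Lloyd's algorithm)."""
    # Assumption: seed centroids with the first k points so the run is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return assignments, centroids


# Hypothetical 2-D data with no labels attached.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
assignments, centroids = k_means(points, k=2)
print(assignments)
```

Here the number of clusters (`k=2`) is specified by the caller. The cluster indices that come back are arbitrary identifiers, not meaningful category names.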
How clustering differs from classification:
| Aspect | Classification | Clustering |
|---|---|---|
| Labels | Required (supervised) | Not used (unsupervised) |
| Categories | Pre-defined | Discovered |
| Goal | Predict which category | Find natural groups |
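The contrast in the table can be sketched in a few lines. The example below is a deliberately simplified illustration: `classify` is a 1-nearest-neighbor classifier that needs labeled examples, while `group_by_proximity` is a hypothetical single-pass grouping routine (not a production clustering algorithm) that sees only the features. All names, data, and the `threshold` value are assumptions made for the sketch.

```python
from math import dist

# Classification: every training example carries a label from a pre-defined set.
labeled = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((9.0, 9.0), "large"),
    ((8.8, 9.1), "large"),
]


def classify(point):
    """Predict the label of the nearest labeled example (1-nearest-neighbor)."""
    _, label = min(labeled, key=lambda pair: dist(point, pair[0]))
    return label


# Clustering: the same features, with the labels stripped away.
unlabeled = [features for features, _ in labeled]


def group_by_proximity(points, threshold):
    """Place a point in the first group that has a member within `threshold`."""
    groups = []
    for p in points:
        for g in groups:
            if any(dist(p, q) <= threshold for q in g):
                g.append(p)
                break
        else:
            groups.append([p])  # no nearby group: start a new one
    return groups


predicted = classify((1.1, 0.9))  # one of the pre-defined labels
groups = group_by_proximity(unlabeled, threshold=2.0)  # anonymous groups
print(predicted, len(groups))
```

The classifier can only ever answer with a label it was given, whereas the grouping routine returns unnamed groups that a person must interpret afterwards.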
Common scenarios:
- Customer segmentation for marketing (group by behavior)
- Grouping similar documents for organization
- Anomaly detection (finding outliers that don't fit any cluster)
- Grouping products that tend to be bought together (related to market basket analysis, which is more commonly done with association-rule mining)
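As one illustration of the anomaly-detection scenario, the sketch below flags points that sit far from every cluster center. The centroids are assumed to come from some earlier clustering step, and the function name `flag_outliers`, the sample data, and the `max_distance` cutoff are hypothetical choices for this example.

```python
from math import dist


def flag_outliers(points, centroids, max_distance):
    """Return the points whose nearest centroid is farther than max_distance."""
    return [
        p for p in points
        if min(dist(p, c) for c in centroids) > max_distance
    ]


# Hypothetical centroids, assumed to be the output of a prior clustering run.
centroids = [(1.0, 1.0), (9.0, 9.0)]
points = [(1.2, 0.8), (8.8, 9.1), (5.0, 5.0), (0.9, 1.1)]

outliers = flag_outliers(points, centroids, max_distance=2.0)
print(outliers)
```

In practice the cutoff would need tuning per dataset, since a value that is too small flags ordinary points and one that is too large misses real anomalies.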
Example: A retailer wants to segment customers but doesn't know what segments exist. Clustering analyzes purchase patterns and discovers groups like "weekend shoppers," "bulk buyers," and "premium customers"—categories the retailer didn't define in advance.
⚠️ Exam Tip: If a question mentions "no labels" or "discovering natural groups"—that's clustering. If categories are already defined, that's classification.
The following diagram helps you choose the right ML technique:
Written by Alvin Varughese
Founder • 15 professional certifications