Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

3.1.3. Clustering: Finding Natural Groups

Clustering finds natural groupings in data without pre-defined categories. This is unsupervised learning—no labels are provided. The algorithm discovers patterns on its own.

Key characteristics:
  • Groups similar items together
  • NO labels in training data (unsupervised)
  • Discovers natural patterns
  • Number of clusters can be specified or discovered

How clustering differs from classification:

Aspect      | Classification          | Clustering
------------+-------------------------+-------------------------
Labels      | Required (supervised)   | Not used (unsupervised)
Categories  | Pre-defined             | Discovered
Goal        | Predict which category  | Find natural groups
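
To make the contrast concrete, here is a minimal Python sketch that uses only the standard library. The function names (train_classifier, classify, cluster), the sample points, and the distance threshold are all hypothetical choices for illustration. The classifier is a simple nearest-centroid rule and the clustering routine is a basic threshold ("leader") grouping, not any particular product's algorithm. The point to notice is the inputs: the classifier cannot be trained without labels, while the clustering function receives only the data.

```python
from math import dist

def train_classifier(points, labels):
    """Supervised: needs one label per point; returns a centroid per known category."""
    by_label = {}
    for point, label in zip(points, labels):
        by_label.setdefault(label, []).append(point)
    return {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in by_label.items()
    }

def classify(centroids, point):
    """Predict which pre-defined category a new point falls into."""
    return min(centroids, key=lambda label: dist(centroids[label], point))

def cluster(points, threshold):
    """Unsupervised: no labels. A point joins the first group whose
    first member lies within `threshold`; otherwise it starts a new group."""
    groups = []
    for point in points:
        for group in groups:
            if dist(group[0], point) <= threshold:
                group.append(point)
                break
        else:
            groups.append([point])  # no nearby group, so start a new one
    return groups

# Hypothetical two-feature data points.
points = [(1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (8.3, 7.9)]

# Classification: the categories "low" and "high" are supplied up front.
model = train_classifier(points, ["low", "low", "high", "high"])
print(classify(model, (7.5, 8.5)))  # prints "high"

# Clustering: same points, no labels; the groups are discovered.
groups = cluster(points, threshold=2.0)
print(len(groups))  # prints 2
```

Here the number of groups was not specified at all. It fell out of the threshold, which is one way the number of clusters can be discovered rather than chosen in advance.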

Common scenarios:
  • Customer segmentation for marketing (group by behavior)
  • Grouping similar documents for organization
  • Anomaly detection (finding outliers that don't fit any cluster)
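  • Market basket analysis (which products are bought together; a related unsupervised task that is more often handled with association rule mining than with clustering)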
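
The anomaly detection scenario can be sketched in a few lines of Python. This is an illustration under stated assumptions: the centroids are hypothetical values assumed to come from an earlier clustering run, and find_outliers, the sample transactions, and the max_distance cutoff are invented for the example. A point is treated as an outlier when it is far from every cluster center.

```python
from math import dist

def find_outliers(points, centroids, max_distance):
    """Return the points that lie farther than max_distance from every centroid."""
    return [
        point for point in points
        if min(dist(point, center) for center in centroids) > max_distance
    ]

# Hypothetical cluster centers, assumed to come from a previous clustering step.
centroids = [(2.0, 2.0), (9.0, 9.0)]

# Hypothetical observations; the third one sits far from both centers.
transactions = [(2.1, 1.8), (8.7, 9.2), (5.5, 0.5), (9.1, 8.8)]

outliers = find_outliers(transactions, centroids, max_distance=1.5)
print(outliers)  # prints [(5.5, 0.5)]
```

The cutoff is a judgment call. A smaller max_distance flags more points, a larger one flags fewer.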

Example: A retailer wants to segment customers but doesn't know what segments exist. Clustering analyzes purchase patterns and discovers groups like "weekend shoppers," "bulk buyers," and "premium customers"—categories the retailer didn't define in advance.
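
A rough sketch of this retailer example, using a plain k-means loop in standard-library Python, might look like the following. Everything specific here is an assumption made for illustration: the two features (store visits per month, average basket value), the customer values, the choice of k = 3, and the naive initialization that takes the first k points as starting centers. Production libraries typically use random or k-means++ initialization and usually scale the features first.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Plain k-means. Returns (centroids, assignments), where assignments[i]
    is the cluster index of points[i]."""
    centroids = list(points[:k])  # naive, deterministic init (an assumption)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda i: dist(point, centroids[i]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                centroids[i] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, assignments

# Hypothetical customers: (visits per month, average basket value in dollars).
# No segment labels are supplied anywhere.
customers = [
    (8, 20), (2, 150), (3, 400),
    (9, 25), (7, 18),
    (1, 160), (2, 140),
    (4, 420), (3, 390),
]

centroids, assignments = kmeans(customers, k=3)
print(assignments)  # prints [0, 1, 2, 0, 0, 1, 1, 2, 2]
```

The algorithm only returns numbered groups (0, 1, 2). Names such as "bulk buyers" or "premium customers" are added afterward by a person who inspects what the members of each group have in common.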

⚠️ Exam Tip: If a question mentions "no labels" or "discovering natural groups"—that's clustering. If categories are already defined, that's classification.

The following diagram helps you choose the right ML technique:

Written by Alvin Varughese
Founder, 15 professional certifications