3.2.4. Custom Vision: When Pre-built Isn't Enough
The Critical Service Selection Question:
The exam frequently tests whether you know when to use pre-built Image Analysis versus Custom Vision. This is one of the most important distinctions in the Computer Vision domain.
Decision Framework:
| Scenario | Service | Rationale |
|---|---|---|
| Detect cars, people, animals | Image Analysis (Pre-built) | Generic objects already covered by the pre-built model's training data |
| Detect YOUR company's logo | Custom Vision | Proprietary/unique to your business |
| Detect manufacturing defects | Custom Vision | Domain-specific; not covered by generic training data |
| Identify famous landmarks | Image Analysis (Pre-built) | Specialized domain model exists |
| Classify your product SKUs | Custom Vision | Your specific products |
| General object detection | Image Analysis (Pre-built) | Common objects |
| Detect cracks in YOUR circuit boards | Custom Vision | Unique defect patterns |
The Key Question to Ask:
"Is this object/defect/item something that would be in a general-purpose training dataset, or is it unique to my organization?"
- General (cars, dogs, text, faces) → Pre-built Image Analysis
- Specific (your products, your defects, your logos) → Custom Vision
Custom Vision Project Types:
| Project Type | What It Does | Output |
|---|---|---|
| Classification | Assigns labels to entire images | Category + confidence |
| Object Detection | Locates your custom objects | Bounding boxes + labels |
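To make the two output shapes concrete, here is a minimal sketch that filters predictions by confidence and converts an object-detection bounding box to pixels. It assumes the JSON layout of the Custom Vision prediction response (a `predictions` list whose entries carry `tagName` and `probability`, plus a `boundingBox` with `left`, `top`, `width`, and `height` normalized to the 0-1 range for object detection). The helper names, the 0.5 threshold, and the sample values are hypothetical and only for illustration.

```python
from typing import Any


def confident_predictions(response: dict[str, Any], threshold: float = 0.5) -> list[dict[str, Any]]:
    """Keep predictions at or above the threshold, highest probability first."""
    kept = [p for p in response.get("predictions", []) if p["probability"] >= threshold]
    return sorted(kept, key=lambda p: p["probability"], reverse=True)


def box_to_pixels(box: dict[str, float], image_width: int, image_height: int) -> tuple[int, int, int, int]:
    """Convert a normalized bounding box to (left, top, width, height) in pixels."""
    return (
        round(box["left"] * image_width),
        round(box["top"] * image_height),
        round(box["width"] * image_width),
        round(box["height"] * image_height),
    )


# Illustrative sample data, not real service output.
classification_response = {
    "predictions": [
        {"tagName": "defective", "probability": 0.92},
        {"tagName": "ok", "probability": 0.08},
    ]
}
detection_response = {
    "predictions": [
        {
            "tagName": "crack",
            "probability": 0.81,
            "boundingBox": {"left": 0.25, "top": 0.5, "width": 0.1, "height": 0.2},
        }
    ]
}

# Classification: one label (plus confidence) for the whole image.
for p in confident_predictions(classification_response):
    print(p["tagName"], p["probability"])

# Object detection: a label, a confidence, and a location per object.
for p in confident_predictions(detection_response):
    print(p["tagName"], box_to_pixels(p["boundingBox"], image_width=800, image_height=600))
```

The practical difference to remember: a classification result describes the whole image, while each object detection result also says where in the image the object sits.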
Custom Vision Workflow:
- Create the Custom Vision resources (one for training, one for prediction)
- Create a project (Classification or Object Detection)
- Upload and tag training images
- Train the model
- Evaluate the trained iteration's performance
- Publish the iteration to the prediction endpoint
- Call the published model from your application
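The final step, calling the published model, can be sketched with the standard library alone. The example below builds (but does not send) a prediction request. It assumes the v3.0 prediction REST URL pattern and the `Prediction-Key` header; verify both against the current API reference before relying on them. The endpoint, project ID, iteration name, and key are placeholders, and `build_prediction_request` is a hypothetical helper, not part of any SDK.

```python
import urllib.request


def build_prediction_request(
    endpoint: str,
    project_id: str,
    published_name: str,
    prediction_key: str,
    image_bytes: bytes,
    project_type: str = "classify",
) -> urllib.request.Request:
    """Build a POST request for a published Custom Vision iteration.

    project_type is "classify" for Classification projects and
    "detect" for Object Detection projects (assumed URL segments).
    """
    if project_type not in ("classify", "detect"):
        raise ValueError("project_type must be 'classify' or 'detect'")
    url = (
        f"{endpoint.rstrip('/')}/customvision/v3.0/Prediction/"
        f"{project_id}/{project_type}/iterations/{published_name}/image"
    )
    headers = {
        "Prediction-Key": prediction_key,  # key of the prediction resource
        "Content-Type": "application/octet-stream",  # raw image bytes in the body
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


# Placeholder values; substitute your own resource details.
request = build_prediction_request(
    endpoint="https://example-resource.cognitiveservices.azure.com",
    project_id="00000000-0000-0000-0000-000000000000",
    published_name="Iteration1",
    prediction_key="<your-prediction-key>",
    image_bytes=b"",  # normally the bytes read from an image file
)
print(request.full_url)

# To actually send it, you would call urllib.request.urlopen(request)
# and parse the JSON body of the response.
```

Note that the request uses the prediction key, not the training key. Training and prediction are separate resources with separate keys, which is why the workflow starts by creating both.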
Minimum Training Requirements:
| Project Type | Minimum Images |
|---|---|
| Classification | 5 images per tag |
| Object Detection | 15 images per tag |
These are the minimums needed to train at all. More images, and more varied ones, generally improve accuracy. Microsoft recommends 50+ images per tag for production.
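As a quick self-check before training, a sketch like the following compares a dataset's per-tag image counts against the thresholds above. The numbers come from this section; the function name, status strings, and sample counts are hypothetical.

```python
# Thresholds taken from the table and recommendation above.
MIN_IMAGES_PER_TAG = {"classification": 5, "object_detection": 15}
RECOMMENDED_IMAGES_PER_TAG = 50


def check_training_set(project_type: str, images_per_tag: dict[str, int]) -> dict[str, str]:
    """Return a status for each tag: below minimum, below recommended, or ok."""
    minimum = MIN_IMAGES_PER_TAG[project_type]
    report = {}
    for tag, count in images_per_tag.items():
        if count < minimum:
            report[tag] = f"below minimum ({count}/{minimum})"
        elif count < RECOMMENDED_IMAGES_PER_TAG:
            report[tag] = f"trainable, below recommended ({count}/{RECOMMENDED_IMAGES_PER_TAG})"
        else:
            report[tag] = "ok"
    return report


# Illustrative counts for a hypothetical object detection project.
counts = {"crack": 12, "scratch": 20, "solder_bridge": 75}
for tag, status in check_training_set("object_detection", counts).items():
    print(f"{tag}: {status}")
```

The same 12 images that fall short for object detection would clear the minimum for a classification project, which is the distinction the table is testing.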
⚠️ Critical Exam Pattern:
"A manufacturing company needs to detect hairline cracks in circuit boards that are unique to their production process. Which service should they use?"
Answer: Custom Vision. The defects are specific to this manufacturer's process and would not appear in any general-purpose training data.
"A traffic management system needs to detect vehicles on highways. Which service should they use?"
Answer: Image Analysis (Pre-built). Vehicles are generic objects that the pre-built models already recognize.
Cost Consideration:
- Pre-built Image Analysis: pay per API call only
- Custom Vision: pay for training time, image storage, and prediction calls