6.1.2. ALM for Microsoft Foundry Agents and Custom Models
💡 First Principle: Model ALM extends traditional software ALM with two additional dimensions: the model artifact (weights, configuration, fine-tuning parameters) and the data artifact (training data, evaluation data, embeddings). Both must be versioned, tracked, and reproducible.
Model Lifecycle Stages:
| Stage | Activities | Artifacts to Version |
|---|---|---|
| Development | Model selection, prompt engineering, RAG configuration, fine-tuning | Prompt templates, RAG config, fine-tuning datasets, hyperparameters |
| Evaluation | Benchmark testing, bias assessment, safety testing | Evaluation datasets, test results, quality scores |
| Deployment | Model deployment, endpoint configuration, content safety setup | Deployment configuration, endpoint settings, safety filters |
| Monitoring | Drift detection, quality tracking, performance monitoring | Baseline metrics, alert thresholds, monitoring dashboards |
| Retirement | Model deprecation, data cleanup, documentation | Retirement records, data deletion certificates |
Model Registry:
Microsoft Foundry's model registry tracks model versions, their training data, evaluation metrics, and deployment history. The architect designs the registry's structure: naming conventions, a versioning scheme (typically semantic versioning), approval gates between lifecycle stages, and retention policies for retired model versions.
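The registry structure described above can be sketched in code. This is a minimal illustration, not the actual Foundry registry schema — the class, field names, and stage-transition map are assumptions chosen to show how versioning, metadata, and approval gates fit together:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical registry entry; fields are illustrative, not Foundry's schema.
@dataclass
class ModelRegistryEntry:
    name: str                 # naming convention, e.g. <domain>-<task>-<base-model>
    version: str              # semantic version, e.g. "2.1.0"
    stage: str                # one of the lifecycle stages from the table above
    training_dataset: str     # dataset version the model was trained on
    eval_metrics: dict = field(default_factory=dict)
    approved_by: Optional[str] = None  # approval gate: must be set before promotion

# Lifecycle stages advance strictly in order.
ALLOWED_TRANSITIONS = {
    "Development": "Evaluation",
    "Evaluation": "Deployment",
    "Deployment": "Monitoring",
    "Monitoring": "Retirement",
}

def promote(entry: ModelRegistryEntry) -> ModelRegistryEntry:
    """Advance a model one stage, enforcing the approval gate."""
    if entry.approved_by is None:
        raise PermissionError(
            f"{entry.name} v{entry.version}: approval required before promotion")
    entry.stage = ALLOWED_TRANSITIONS[entry.stage]
    entry.approved_by = None  # each stage requires a fresh approval
    return entry
```

The key design point is that promotion is impossible without a recorded approver, which gives the audit trail the registry exists to provide.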
Deployment Pipelines:
AI deployment pipelines include steps that traditional CI/CD pipelines don't: evaluation gate (model must meet accuracy thresholds before deployment), safety gate (content safety tests must pass), and canary deployment (new model serves a small percentage of traffic to validate production behavior before full rollout).
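The three AI-specific pipeline steps can be sketched as gate functions. The thresholds, function names, and return strings below are illustrative assumptions, not a real Foundry pipeline API:

```python
# Illustrative pipeline gates; thresholds and names are assumptions.
EVAL_THRESHOLD = 0.85   # minimum quality score per evaluation dataset
CANARY_TRAFFIC = 0.05   # new model serves 5% of traffic before full rollout

def evaluation_gate(scores: dict) -> bool:
    """Model must meet the accuracy threshold on every evaluation dataset."""
    return all(s >= EVAL_THRESHOLD for s in scores.values())

def safety_gate(safety_results: dict) -> bool:
    """Every content-safety test category must pass."""
    return all(safety_results.values())

def pipeline(scores: dict, safety_results: dict) -> str:
    """Run the gates in order; a failed gate blocks deployment."""
    if not evaluation_gate(scores):
        return "blocked: evaluation gate failed"
    if not safety_gate(safety_results):
        return "blocked: safety gate failed"
    return f"canary: routing {CANARY_TRAFFIC:.0%} of traffic to new model"
```

For example, `pipeline({"qa-benchmark": 0.91}, {"hate": True, "self-harm": True})` would reach the canary step, while a single failing safety category blocks the rollout entirely.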
Troubleshooting Scenario: A data science team deploys a custom model from Foundry to production. Three months later, a regulatory audit asks: "What training data was used for the model currently serving customers?" The team can't answer because they tracked model versions but not data lineage — they know which model is deployed but not which dataset version trained it. This is a compliance failure, not just a process gap. Foundry ALM must include: model registry (version, metadata, performance metrics), data lineage (which dataset version, preprocessing steps, feature engineering), and deployment tracking (which model version serves which endpoint).
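The audit failure in this scenario comes down to a missing join between deployment tracking and data lineage. A minimal sketch of the record that would have answered the auditor — field names and the checksum approach are assumptions, not a Foundry feature:

```python
import hashlib
import json

# Sketch of a data-lineage record linking a deployed model version back to
# its training data; field names are illustrative.
def lineage_record(model_version: str, dataset_version: str,
                   preprocessing_steps: list, endpoint: str) -> dict:
    record = {
        "model_version": model_version,
        "dataset_version": dataset_version,
        "preprocessing": preprocessing_steps,
        "endpoint": endpoint,
    }
    # Content hash makes the record tamper-evident for auditors.
    record["checksum"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    return record

def answer_audit(records: list, endpoint: str) -> str:
    """Which dataset version trained the model currently serving this endpoint?"""
    for r in records:
        if r["endpoint"] == endpoint:
            return r["dataset_version"]
    return "unknown - lineage gap (compliance failure)"
```

With such records written at deployment time, the regulatory question "what training data is behind the model serving customers?" becomes a lookup rather than an unanswerable gap.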
⚠️ Exam Trap: Model versioning alone isn't sufficient for AI ALM. Without data lineage, you can reproduce the model artifact but can't explain or audit its behavior — a requirement in regulated industries.
Reflection Question: A Foundry model deployed in production is three versions behind the latest. The data science team wants to update it. What ALM steps must the architect ensure are completed before the new version reaches production?