2.1.3. Organizing Business Data for AI Consumption
Raw business data is rarely ready for AI consumption. The architect must plan how data will be structured, connected, and exposed so that agents across the organization can access it reliably. This isn't a one-time activity — it's an ongoing data operations discipline.
The key principle: organize data for discoverability, not just storage. Traditional data architectures optimize for transaction processing (writes) or analytics (reads). AI-ready data architectures must also optimize for semantic retrieval — finding the right information based on meaning, not just keywords or table joins.
Three Strategies for Making Business Data AI-Ready:
1. Unified Data Layer with Dataverse: For organizations using Dynamics 365 and Power Platform, Dataverse provides a centralized, governed data store that agents can access natively. Data from Dataverse-based D365 apps such as Sales and Customer Service lives in a common schema, and finance and operations data can be surfaced alongside it (for example through virtual tables or dual-write), making cross-app agent queries straightforward.
2. Federated Access via Connectors and MCP: When data lives in multiple systems that can't be consolidated (legacy systems, third-party SaaS, on-premises databases), expose each source through connectors or MCP servers. The agent accesses data where it lives without requiring data migration.
3. Indexed Knowledge Bases: For unstructured data (documents, wikis, emails), create searchable indexes using Azure AI Search or Copilot Studio's built-in knowledge features. Chunking documents into retrievable segments, generating embeddings, and maintaining freshness are critical architectural decisions.
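To make the chunking and freshness decisions in the third strategy concrete, here is a minimal Python sketch. It is not how Azure AI Search or Copilot Studio chunk content internally. The function names, the word-based splitting, and the default sizes are all assumptions chosen for illustration, and embedding generation is left out because it depends on the model in use.

```python
import hashlib
from typing import Optional


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words.

    Hypothetical helper: real pipelines often split on tokens, headings,
    or sentences instead of whitespace-separated words.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def content_fingerprint(text: str) -> str:
    """Return a stable hash of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reindex(text: str, stored_fingerprint: Optional[str]) -> bool:
    """Freshness check: re-index when the document is new or has changed."""
    return stored_fingerprint is None or content_fingerprint(text) != stored_fingerprint
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of a larger index. The fingerprint comparison is one simple way to avoid re-embedding documents that have not changed since the last indexing run.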
| Strategy | Best For | Trade-offs |
|---|---|---|
| Unified (Dataverse) | D365/Power Platform ecosystems | Strong governance and a single schema, but data outside the ecosystem must be migrated in |
| Federated (Connectors/MCP) | Multi-system environments | Higher latency; each source must be maintained; harder to ensure consistency |
| Indexed (Knowledge Bases) | Unstructured content | Requires chunking strategy; indexing pipeline; refresh management |
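One way to read the table is as a per-source decision rule. The sketch below is a simplified heuristic, not an official decision framework. The `DataSource` fields and the example sources are hypothetical, and a real assessment would also weigh latency, governance, and cost in more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """Hypothetical description of one business data source."""
    name: str
    structured: bool              # rows and columns vs. documents, wikis, emails
    in_dataverse_ecosystem: bool  # already lives in D365 / Power Platform
    migration_affordable: bool    # consolidating it is realistic for the business


def suggest_strategy(source: DataSource) -> str:
    """Map a source to 'unified', 'federated', or 'indexed' (rough heuristic)."""
    if not source.structured:
        return "indexed"      # unstructured content needs a searchable index
    if source.in_dataverse_ecosystem or source.migration_affordable:
        return "unified"      # governed, single-schema store
    return "federated"        # leave the data in place behind a connector or MCP server


# Illustrative sources only; none of these come from a real system.
examples = [
    DataSource("CRM accounts", True, True, True),
    DataSource("Legacy order database", True, False, False),
    DataSource("Policy wiki", False, False, False),
]
for src in examples:
    print(f"{src.name}: {suggest_strategy(src)}")
```

Because the rule is applied per source, one agent can end up combining strategies. The three are not mutually exclusive.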
Cross-System Data Availability: When agents serve business processes that span multiple applications — a sales agent that needs CRM data, inventory data, and pricing data — the architect must ensure all data sources are accessible with consistent response times. A slow or unavailable data source doesn't just degrade one answer; it can break the agent's reasoning chain entirely.
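A common way to keep one slow source from stalling the whole answer is to query the sources concurrently with a per-source timeout, then let the agent work with whatever arrived. The following is a minimal sketch using Python's `asyncio`. The function names, the source names, and the 2-second default are assumptions for illustration and do not describe how Copilot Studio or any specific connector behaves.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

Fetcher = Callable[[], Awaitable[Any]]


async def fetch_with_timeout(
    name: str, fetch: Fetcher, timeout_s: float
) -> tuple[str, Any, Optional[str]]:
    """Call one source; return (name, data, error) instead of raising."""
    try:
        data = await asyncio.wait_for(fetch(), timeout=timeout_s)
        return name, data, None
    except asyncio.TimeoutError:
        return name, None, "timed out"
    except Exception as exc:  # any other source failure
        return name, None, f"failed: {exc}"


async def gather_sources(
    fetchers: dict[str, Fetcher], timeout_s: float = 2.0
) -> tuple[dict[str, Any], dict[str, str]]:
    """Query all sources concurrently; split results into data and errors."""
    results = await asyncio.gather(
        *(fetch_with_timeout(name, fetch, timeout_s) for name, fetch in fetchers.items())
    )
    data = {name: value for name, value, err in results if err is None}
    errors = {name: err for name, _value, err in results if err is not None}
    return data, errors
```

The caller can then decide whether a missing source is fatal (for example, no pricing data means no quote) or whether the agent should answer with an explicit caveat about what it could not retrieve.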
Exam Trap: The exam may present a scenario where an organization wants "all data in one place for AI." Don't recommend migrating everything to Dataverse if the data sources are diverse and the migration cost is prohibitive. Federated access via MCP or connectors is often the pragmatic answer for heterogeneous environments.
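The federated idea, leaving data where it lives and giving the agent one uniform way to reach it, can be pictured with a small in-memory sketch. This does not model real connectors or the MCP protocol. The `FederatedCatalog` class, its methods, and the sample sources are hypothetical stand-ins for whatever connector or MCP server sits in front of each system.

```python
from typing import Callable

Record = dict[str, str]
Lookup = Callable[[str], list[Record]]


class FederatedCatalog:
    """Hypothetical facade: one query interface over sources that stay in place."""

    def __init__(self) -> None:
        self._sources: dict[str, Lookup] = {}

    def register(self, name: str, lookup: Lookup) -> None:
        """Expose a source through a lookup function (stand-in for a connector)."""
        self._sources[name] = lookup

    def query(self, name: str, term: str) -> list[Record]:
        """Query a single named source."""
        if name not in self._sources:
            raise KeyError(f"unknown source: {name}")
        return self._sources[name](term)

    def query_all(self, term: str) -> dict[str, list[Record]]:
        """Ask every registered source; no data is copied into a central store."""
        return {name: lookup(term) for name, lookup in self._sources.items()}


# Illustrative in-memory sources; real ones would call out to live systems.
legacy_orders = [{"id": "1001", "item": "valve"}, {"id": "1002", "item": "pump"}]
saas_tickets = [{"id": "T-7", "item": "pump"}]

catalog = FederatedCatalog()
catalog.register("legacy_erp", lambda term: [r for r in legacy_orders if r["item"] == term])
catalog.register("saas_helpdesk", lambda term: [r for r in saas_tickets if r["item"] == term])
```

Each registered lookup still has to be maintained and may respond at a different speed, which is the trade-off the federated row of the table describes.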
Reflection Question: A manufacturing company has ERP data in D365 Finance, quality data in a custom SQL database, and supplier documentation in SharePoint. They want an agent that can answer questions spanning all three sources. Which data organization strategy would you recommend?