1.1. How Large Language Models Actually Work
💡 First Principle: A language model does not "know" facts the way a database does — it predicts the most statistically likely next word given everything before it. This is why outputs can be fluent and wrong simultaneously.
You've probably noticed that Copilot can write a polished, professional email in seconds — but occasionally states something confidently that is simply not true. Both behaviors flow from the same source: the mechanics of how language models work.
At its core, a Large Language Model (LLM) is trained on enormous amounts of text — articles, books, documentation, web pages — and learns statistical patterns in language. Given a sequence of words, the model predicts which word should come next, then the next, and so on until it completes a response. The model is not retrieving stored facts; it is generating text that statistically fits the pattern of correct-sounding answers to questions like yours.
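This next-word-prediction loop can be sketched in miniature. The toy model below is not a real LLM (which uses a neural network over tokens, not word counts) — it simply counts, in a tiny made-up corpus, how often each word follows each other word, then generates text by repeatedly emitting the statistically most likely next word. The corpus, function names, and output length are all illustrative assumptions.

```python
from collections import Counter, defaultdict

# Toy illustration only (not how a real LLM is built): learn which words
# tend to follow which from a tiny hypothetical corpus, then generate text
# by always picking the most frequent continuation.
corpus = (
    "the model predicts the next word "
    "the model predicts the next token "
    "the next word follows the pattern"
).split()

# Count how often each word is followed by each other word.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def generate(start, length=6):
    """Greedily extend `start` with the statistically most likely next word."""
    words = [start]
    for _ in range(length):
        counts = following.get(words[-1])
        if not counts:          # no continuation ever observed — stop
            break
        words.append(counts.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))
```

Notice that the generator never "knows" anything about its subject; it only reproduces patterns in its training text. A real LLM does the same thing at vastly larger scale, with probabilities learned by a neural network rather than raw counts — which is exactly why its output can sound fluent while being factually wrong.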
This has a profound implication: fluency is not accuracy. A model can generate beautifully structured prose that contains made-up statistics, wrong dates, or invented citations — because statistically, those outputs "look right" even when they are wrong. The term for this is fabrication (also called hallucination), and it is not a bug that will be patched — it is a structural characteristic of how LLMs work.
Why does this matter for the exam? The AB-730 exam tests whether you understand the inherent limitations of AI tools — not just how to use their features. Questions about responsible AI, verification steps, and when not to rely on Copilot all trace back to understanding this fundamental mechanism.
⚠️ Exam Trap: Many people assume fabrications are rare edge cases they are unlikely to encounter. In reality, fabrications can occur on any topic, especially when the model is answering without grounding context (a document, a file, a web search). Treat any ungrounded Copilot output as requiring verification.
Reflection Question: If a language model always produces fluent, confident-sounding text, what is the only reliable way to know whether a specific Copilot output is accurate?