9. Comprehensive Glossary
This glossary provides concise definitions for the technical terms and Azure services covered throughout this guide. It serves as a quick reference for clarifying concepts and ensuring you understand the specific terminology used in the AI-102 exam. Reviewing these terms regularly will strengthen your technical vocabulary and improve your performance on terminology-heavy questions.
A
Active Learning: A Custom Question Answering feature that improves knowledge bases from user interaction patterns by suggesting new Q&A pairs or alternative phrasings for existing ones.
Agent: An AI system combining an LLM with tools, memory, and reasoning to perform multi-step tasks autonomously.
AutoGen: A framework for building multi-agent conversational systems where autonomous agents collaborate through structured dialogue.
Azure AI Search: A cloud search service that provides AI enrichment, indexing, and retrieval capabilities for knowledge mining.
AzureKeyCredential: A class used for API key authentication in Azure SDKs, typically found in the azure.core.credentials package.
B
BLEU Score: A metric that measures machine translation quality by comparing model output with reference translations; a score between 40 and 59 is generally considered high quality.
C
Chunking Strategy: The approach for splitting large documents into smaller pieces for embedding (e.g., fixed-size, semantic, or sentence-based).
CLU (Conversational Language Understanding): A customizable service for identifying user intents and extracting entities from natural language.
Composed Model: A Document Intelligence model that combines multiple custom models under a single model ID to handle diverse forms.
Confidence Score: A numeric value (0-1) indicating how certain a model is in its prediction or extraction result.
Confidence Threshold: The minimum confidence score required for a model's output to be returned or acted upon.
Content Safety: An Azure service for detecting and filtering harmful content across categories like hate, violence, and self-harm.
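As an illustration of the Chunking Strategy entry above, the following is a minimal sketch of fixed-size chunking with overlap. The function name, the character-based window, and the default sizes are assumptions chosen for illustration; real pipelines often count tokens rather than characters.

```python
def chunk_fixed_size(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: sizes are in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_fixed_size("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

The overlap repeats the tail of one chunk at the head of the next, which can help keep a sentence that straddles a boundary retrievable from either chunk.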
D
DALL-E: An Azure OpenAI model designed to generate high-quality images from natural language text prompts.
Data Source: An Azure AI Search component that defines the connection and query for reading data from storage (e.g., Blob, SQL).
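DefaultAzureCredential: A class in the azure.identity package that tries a chain of credential sources in order (such as environment variables, managed identity, and developer tool sign-ins), so the same code can authenticate in local development and in production.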
Deployment: A named instance of a specific model version in Azure OpenAI or Language services, ready for inferencing.
Document Intelligence: A service for extracting structured data, layouts, and key-value pairs from documents (formerly Form Recognizer).
E
Embeddings: Vector representations of text or images that capture semantic meaning for search and similarity tasks.
Entity: A specific data element extracted from text, such as a person, location, date, or custom domain term.
F
F1 Score: The harmonic mean of precision and recall, providing a single metric to balance model performance evaluation.
Fine-tuning: The process of training a pre-trained model on domain-specific data to improve its performance on specialized tasks. Fine-tuning requires labeled training data and is used when pre-built models don't meet accuracy requirements for your domain.
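The F1 Score entry above, together with the Precision and Recall entries later in this glossary, can be made concrete with a short calculation. This is a minimal sketch; the function name and the choice to return 0.0 when a denominator is zero are assumptions for illustration.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from raw counts.

    tp: true positives, fp: false positives, fn: false negatives.
    """
    # Precision: correct positive predictions / all positive predictions.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: correct positive predictions / all actual positives.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller value, a model with high precision but low recall (or the reverse) still receives a modest F1.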
G
Grounding: The practice of anchoring an LLM's responses in factual, external data (often via RAG) to minimize hallucinations.
GroupChat: An AutoGen construct that enables multiple autonomous agents to participate in a shared conversation.
GroupChatManager: The orchestrator in an AutoGen GroupChat that manages dialogue flow and selects the next speaker.
H
Hub: A top-level Azure AI Foundry resource that manages shared connections, compute, and policies for multiple projects.
Hybrid Search: A search strategy that combines traditional full-text keyword matching with modern vector-based semantic search.
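One common way to merge a keyword ranking and a vector ranking into a single hybrid result list is Reciprocal Rank Fusion (RRF), which Azure AI Search documents as its merging method for hybrid queries. The sketch below is a plain-Python illustration rather than the service's implementation; the function name and the constant k=60 are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one scored list.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    keyword_results = ["a", "b", "c"]  # hypothetical full-text ranking
    vector_results = ["b", "d", "a"]   # hypothetical vector ranking
    for doc_id, score in reciprocal_rank_fusion([keyword_results, vector_results]):
        print(doc_id, round(score, 5))
```

Documents that rank well in both lists rise to the top, while a document found by only one method still appears in the fused results.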
I
Index: The primary data structure in Azure AI Search that stores searchable content, metadata, and vector representations.
Indexer: A search component that automates the ingestion, cracking, and indexing of data from supported sources.
Intent: The primary goal or action expressed in a user's utterance, such as "BookFlight" or "CancelOrder."
K
Knowledge Base: The collection of Q&A pairs, URLs, and files used by the Custom Question Answering service.
Knowledge Store: A feature of Azure AI Search that persists AI-enriched data into tables, objects, or files for external analysis.
M
Managed Identity: A Microsoft Entra ID (formerly Azure AD) identity that allows resources to authenticate to services securely without storing credentials in code.
Model Reflection: A technique where an AI model evaluates its own responses to identify errors or areas for improvement.
Multi-turn: A conversation capability that maintains context across a series of related questions and answers.
O
OCR (Optical Character Recognition): The process of extracting printed or handwritten text from images and scanned documents.
P
Planner: A Semantic Kernel component that automatically decomposes complex goals into a sequence of executable steps.
Plugin: An extension for Semantic Kernel that adds specific callable functions (tools) to an AI agent's capabilities.
Precision: A performance metric representing the ratio of correct positive predictions to the total number of positive predictions.
Private Endpoint: A network interface that gives an AI service a private IP address inside a specific VNet, so clients in that VNet reach the service over a private connection instead of the public internet.
Project: An Azure AI Foundry workspace dedicated to a specific application, containing models, data, and prompt flows.
Projection: The specific format (Table, Object, or File) used to save enriched data from a skillset into a Knowledge Store.
Prompt Engineering: The iterative process of designing and refining inputs to guide LLMs toward more accurate and useful outputs.
Prompt Shield: A Content Safety feature designed to detect and block malicious prompt injection and jailbreak attempts.
R
RAG (Retrieval-Augmented Generation): A pattern that retrieves relevant documents to provide context to an LLM before generation.
Recall: A performance metric representing the ratio of correct positive predictions to the total number of actual positive examples.
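The RAG entry above describes a retrieve-then-generate flow. The following toy sketch shows only the retrieval and prompt-assembly steps, using word overlap in place of a real search index and stopping before any model call. All function names, the scoring rule, and the prompt wording are assumptions for illustration.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_terms = set(query.lower().split())
    scored = [(len(query_terms & set(doc.lower().split())), doc) for doc in documents]
    # Keep only documents that share at least one word with the query.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_grounded_prompt(query: str, documents: list[str], top_k: int = 2) -> str:
    """Assemble a prompt that places retrieved context ahead of the question."""
    context_block = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    docs = [
        "The index stores searchable content",
        "An indexer automates ingestion",
        "SSML controls speech pitch",
    ]
    print(build_grounded_prompt("what does an indexer do", docs))
```

In a real solution the retriever would typically be a search index queried with keyword, vector, or hybrid search, and the assembled prompt would be sent to a deployed chat model.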
S
Scoring Profile: An Azure AI Search configuration used to boost the relevance ranking of documents based on specific field values.
Semantic Configuration: A search setting that enables ML-based semantic ranking to improve the relevance of search results.
Semantic Kernel: An open-source SDK for orchestrating AI services through plugins, planners, memory, and persona management.
Semantic Search: A search approach that understands the intent and meaning of a query rather than just matching keywords.
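Severity Level: A numeric rating (0-6) in Content Safety indicating the intensity of potentially harmful content; by default, results are reported as 0, 2, 4, or 6, with higher values indicating more severe content.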
Skillset: A collection of AI enrichment steps (skills) applied by an indexer to extract information from unstructured content.
SSML (Speech Synthesis Markup Language): An XML-based language used to control speech attributes like pitch, rate, and speaking style.
STT (Speech-to-Text): Converting spoken audio into written text using speech recognition. In Azure, implemented via the SpeechRecognizer class. Also called speech recognition or transcription.
Synonym: An alternative word or phrase defined in Question Answering or Search to help match user queries with different wording.
System Message: A persistent instruction provided to a chat model that defines its persona, rules, and behavioral boundaries.
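To make the SSML entry above more tangible, the sketch below builds a small SSML document as a string and checks that it is well-formed XML. The helper name, the default voice, and the rate and pitch values are assumptions for illustration; the available voices and supported attributes depend on the Speech service configuration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice: str = "en-US-JennyNeural",
               rate: str = "-10%", pitch: str = "+5%") -> str:
    """Wrap text in speak/voice/prosody elements (hypothetical helper)."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">'
        f'<voice name="{voice}">'
        # escape() protects characters such as & and < inside the spoken text.
        f'<prosody rate="{rate}" pitch="{pitch}">{escape(text)}</prosody>'
        "</voice></speak>"
    )


if __name__ == "__main__":
    ssml = build_ssml("Welcome to the AI-102 study guide.")
    ET.fromstring(ssml)  # raises ParseError if the markup is malformed
    print(ssml)
```

A string like this would typically be passed to a speech synthesizer's SSML method in place of plain text.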
T
Temperature: A parameter that controls the randomness of model output; lower values are more deterministic, higher are more creative.
Token: The basic unit of text processing for LLMs, typically representing a word or a sub-word (~4 characters of English text).
Token Limit: The maximum number of tokens a model can process in a single request, including the prompt and completion.
Top-p (Nucleus Sampling): A parameter that limits token selection to the most probable subset whose cumulative probability reaches a threshold.
TTS (Text-to-Speech): Converting written text into spoken audio using speech synthesis. In Azure, implemented via the SpeechSynthesizer class. Also called speech synthesis.
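The Temperature and Top-p entries above can be illustrated on a toy probability distribution. This is a conceptual sketch of the two ideas, not how any particular model implements sampling; the function names and example values are assumptions.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0 in this sketch")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_p_filter(probs: list[float], top_p: float) -> list[int]:
    """Return indices of the most probable tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    return kept


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]
    print(softmax_with_temperature(logits, temperature=0.5))
    print(softmax_with_temperature(logits, temperature=2.0))
    print(top_p_filter([0.5, 0.3, 0.2], top_p=0.7))
```

At a low temperature most of the probability mass concentrates on the top token, while top-p trims the low-probability tail before a token is sampled.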
U
Utterance: A specific example of natural language input from a user, used to train intent and entity recognition in CLU.
V
Vector Dimensions: The length of the numeric array produced by an embedding model (e.g., 1536 for text-embedding-ada-002).
Vector Search: A search methodology that uses vector similarity (e.g., cosine similarity) to find content with related meanings.
Visual Features: The specific types of information (e.g., tags, captions, objects, OCR) requested from the Image Analysis API.
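The Vector Search entry above mentions cosine similarity. A minimal sketch of that calculation follows, using short made-up vectors in place of real embeddings; the function name and the zero-vector handling are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query_vector = [1.0, 2.0, 3.0]  # stand-in for a query embedding
    doc_vector = [2.0, 4.0, 6.0]    # stand-in for a document embedding
    print(cosine_similarity(query_vector, doc_vector))
```

Values near 1 indicate vectors pointing in nearly the same direction, values near 0 indicate unrelated directions, and the vector lengths themselves do not affect the result.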
W
WER (Word Error Rate): The standard metric for speech recognition accuracy, calculated from substitution, insertion, and deletion errors.
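The WER entry above is commonly expressed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below computes it with a word-level edit distance; the function name and the whitespace tokenization are simplifying assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Lower values are better, and because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.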