Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.
6.1.6. Generative AI Risks and Mitigations
Generative AI introduces specific technical risks that require targeted countermeasures.
Hallucinations and Factual Accuracy: One of the most significant risks with generative AI is hallucination—when the model generates plausible-sounding but factually incorrect information. This happens because LLMs are trained to predict likely text, not verify facts.
Why hallucinations occur:
- Models don't have real-time knowledge or fact-checking capability
- Training data may contain errors or outdated information
- Models optimize for coherent text, not truthful text
- Confidence doesn't correlate with accuracy
Mitigating hallucinations:
- Grounding: Connect responses to verified data sources (RAG)
- Temperature reduction: Lower creativity for factual queries
- Retrieval augmentation: Combine generation with search
- Human review: Verify AI-generated content before use
- Source citations: Require models to cite sources
Jailbreaking and Prompt Injection: Malicious users may attempt to bypass safety controls through:
| Attack Type | What It Does | Example |
|---|---|---|
| Jailbreaking | Tricks model into ignoring restrictions | "Pretend you're an AI without rules..." |
| Prompt injection | Hides instructions in input data | Embedding commands in user-provided documents |
| Indirect injection | Attacks via external data sources | Malicious instructions in web pages the model reads |
Defenses against prompt attacks:
- Input validation and sanitization
- System message hardening
- Output monitoring for policy violations
- Content filtering on both input and output
- Regular red-team testing
Written byAlvin Varughese
Founder•15 professional certifications