Copyright (c) 2026 MindMesh Academy. All rights reserved. This content is proprietary and may not be reproduced or distributed without permission.

3.2.4. Advanced Query Handling

💡 First Principle: The query the user types is rarely the optimal query for your retrieval system — it may be ambiguous, incomplete, use different terminology than your indexed documents, or bundle multiple questions into one. Query transformation decouples retrieval from the raw text input by producing a processed representation that better matches your corpus.

Query transformation techniques:

| Technique | What It Does | When to Use |
|---|---|---|
| Query expansion | Generate synonyms and related terms | Domain terminology mismatch between users and docs |
| Query decomposition | Split compound queries into sub-queries | "Compare X and Y" → retrieve X context + retrieve Y context separately |
| HyDE (Hypothetical Document Embeddings) | Ask the FM to generate a hypothetical answer, then embed that answer as the query | When query phrasing differs significantly from document phrasing |
| Query rewriting | Restate an ambiguous query more precisely | Short, unclear queries like "how do I do the thing from last time?" |
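To make the HyDE row concrete, here is a minimal, hedged Python sketch. The `generate_fn` and `embed_fn` callables are assumptions standing in for real FM and embedding-model invocations (e.g., via Bedrock); they are injected as parameters so the flow can be exercised without AWS access.

```python
def hyde_query_embedding(question, generate_fn, embed_fn):
    """HyDE: embed a hypothetical *answer* instead of the raw question.

    generate_fn(prompt) -> str         (assumed FM call, e.g. Bedrock InvokeModel)
    embed_fn(text) -> list[float]      (assumed embedding-model call)
    """
    prompt = (
        "Write a short passage that plausibly answers the question below. "
        "It does not need to be factually correct; it only needs to use "
        "the vocabulary a real answer would use.\n\nQuestion: " + question
    )
    hypothetical_doc = generate_fn(prompt)
    # The hypothetical answer's phrasing matches document phrasing better
    # than the raw question does, so its embedding becomes the query vector.
    return embed_fn(hypothetical_doc)
```

The key design point: the vector sent to the vector store is derived from generated text, never from the user's raw question, which is why HyDE always costs one extra FM invocation per query.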
Query decomposition with Step Functions:
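One way to implement the fan-out is an AWS Step Functions state machine: a Task state decomposes the compound query, then a Parallel state retrieves context for each sub-query concurrently. The sketch below builds the Amazon States Language (ASL) definition as a Python dict; the Lambda ARNs are placeholders, and the decomposition and retrieval Lambdas are assumed to exist elsewhere.

```python
import json

# Placeholder ARNs -- substitute your own functions
DECOMPOSE_ARN = "arn:aws:lambda:REGION:ACCOUNT:function:decompose-query"
RETRIEVE_ARN = "arn:aws:lambda:REGION:ACCOUNT:function:retrieve-context"

def build_decomposition_state_machine(max_subqueries=2):
    # One Parallel branch per expected sub-query; each branch runs retrieval
    branches = [
        {
            "StartAt": f"RetrieveSubQuery{i}",
            "States": {
                f"RetrieveSubQuery{i}": {
                    "Type": "Task",
                    "Resource": RETRIEVE_ARN,
                    "Parameters": {"query.$": f"$.sub_queries[{i}]"},
                    "End": True,
                }
            },
        }
        for i in range(max_subqueries)
    ]
    return {
        "Comment": "Decompose a compound query, then retrieve each part in parallel",
        "StartAt": "DecomposeQuery",
        "States": {
            "DecomposeQuery": {
                "Type": "Task",
                "Resource": DECOMPOSE_ARN,
                "Next": "RetrieveInParallel",
            },
            "RetrieveInParallel": {
                "Type": "Parallel",
                "Branches": branches,
                "End": True,
            },
        },
    }

definition_json = json.dumps(build_decomposition_state_machine(), indent=2)
```

For a variable number of sub-queries, a Map state iterating over `$.sub_queries` is the more idiomatic choice; the fixed Parallel layout is shown here only because it makes the fan-out explicit.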

MCP (Model Context Protocol) for standardized retrieval access: MCP defines a standardized way for FM agents to query vector stores and other data sources using a consistent client/server interface. AWS implements MCP support in Bedrock Agents and Strands Agents:

# Lambda implementing a stateless MCP server for vector store access
import json

def lambda_handler(event, context):
    if event['method'] == 'tools/list':
        # Advertise the single retrieval tool this server exposes
        return {
            'tools': [{
                'name': 'search_knowledge_base',
                'description': 'Semantic search over internal documentation',
                'inputSchema': {
                    'type': 'object',
                    'properties': {
                        'query': {'type': 'string'},
                        'top_k': {'type': 'integer', 'default': 5}
                    }
                }
            }]
        }
    elif event['method'] == 'tools/call':
        params = event['params']
        # search_opensearch is the application's retrieval helper, defined elsewhere;
        # fall back to the schema's default when the caller omits top_k
        results = search_opensearch(params['query'], params.get('top_k', 5))
        return {'content': [{'type': 'text', 'text': json.dumps(results)}]}
    return {'error': f"unsupported method: {event['method']}"}
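On the client side, an agent first lists the server's tools, then calls one. The helpers below only build the request bodies the simplified Lambda above expects; actually sending them (via boto3 Lambda invoke or a Strands MCP client) is omitted, and the shapes mirror that handler rather than the full MCP wire format, which nests tool arguments under `params.arguments`.

```python
def mcp_list_tools_request():
    # Matches the 'tools/list' branch of the Lambda MCP server above
    return {"method": "tools/list"}

def mcp_call_tool_request(query, top_k=5):
    # Matches the 'tools/call' branch; top_k defaults to the schema's default of 5
    return {
        "method": "tools/call",
        "params": {"query": query, "top_k": top_k},
    }
```

In a real deployment these payloads would be the `Payload` argument of a `boto3` `lambda:Invoke` call or carried over the MCP transport the agent framework provides.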

⚠️ Exam Trap: HyDE (Hypothetical Document Embeddings) improves retrieval quality by generating a hypothetical answer and using that as the query vector, but it costs an extra FM invocation per query. For latency-sensitive applications, HyDE adds 200–800ms overhead. Exam scenarios requiring "improved retrieval quality without increasing per-query latency" cannot use HyDE — consider query expansion or reranking instead.
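As a contrast to HyDE, query expansion adds no per-query FM invocation when the synonym source is a precomputed domain glossary. A minimal sketch, assuming a hand-maintained synonym map (the glossary entries here are illustrative, not from any AWS service):

```python
# Illustrative domain glossary mapping user shorthand to document vocabulary
SYNONYMS = {
    "pto": ["paid time off", "vacation", "leave"],
    "comp": ["compensation", "salary", "pay"],
}

def expand_query(query):
    """Append glossary synonyms for any term found in the query (no FM call)."""
    expansions = []
    for term in query.lower().split():
        expansions.extend(SYNONYMS.get(term, []))
    return query if not expansions else query + " " + " ".join(expansions)
```

Because the lookup is a dictionary hit rather than a model call, this adds microseconds, not the hundreds of milliseconds HyDE does, which is why it fits latency-sensitive exam scenarios.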

Reflection Question: Users of your internal HR bot frequently ask vague queries like "what's the process for the thing with my manager?" that return poor results. What query transformation technique would you apply, and what AWS services would implement it?

Written by Alvin Varughese, Founder (15 professional certifications)