1.3. Thinking Like a Data Engineer
First Principle: Good data engineering is not about picking the most powerful service; it's about making the right trade-offs. Every design decision sits at the intersection of cost, performance, and reliability, and the exam tests your ability to navigate these trade-offs under real-world constraints.
Without this mindset, teams end up with over-engineered pipelines that cost 10x what a simpler solution would, like using a cargo ship to cross a river when a bridge would do. Consider a startup that picks EMR because "it scales" but never processes more than 50 GB: it wastes thousands every month on cluster management that Glue would handle for a fraction of the cost.
Do you optimize for the lowest monthly bill, even if queries take longer? Do you maximize query speed, even if it costs 3x more? Do you prioritize durability and fault tolerance, even though it adds complexity? In practice, you balance all three, and the exam scenarios force you to identify which trade-off the question is testing.
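One way to make this balancing act concrete is a weighted decision matrix. The sketch below is only an illustration: the option names (`serverless_job`, `managed_cluster`), the 1-to-5 scores, and the weights are all invented for the example and do not describe any real service. The point is that the same two candidates rank differently once the scenario changes which axis matters most.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """A candidate design, scored 1 (poor) to 5 (excellent) on each axis."""
    name: str
    cost: int         # 5 = cheapest to run
    performance: int  # 5 = fastest
    reliability: int  # 5 = most fault tolerant


def rank(options, weights):
    """Return options sorted best-first by weighted score.

    `weights` maps each axis name to its importance; the weights encode
    which trade-off the scenario is really testing.
    """
    def score(opt):
        return sum(w * getattr(opt, axis) for axis, w in weights.items())

    return sorted(options, key=score, reverse=True)


# Hypothetical candidates with invented scores, for illustration only.
candidates = [
    Option("serverless_job", cost=5, performance=3, reliability=4),
    Option("managed_cluster", cost=2, performance=5, reliability=4),
]

# Scenario A: the question stresses the lowest monthly bill.
cost_first = rank(candidates, {"cost": 0.6, "performance": 0.2, "reliability": 0.2})

# Scenario B: the question stresses query speed above all else.
speed_first = rank(candidates, {"cost": 0.2, "performance": 0.6, "reliability": 0.2})

print(cost_first[0].name)
print(speed_first[0].name)
```

With these made-up numbers, the cost-weighted scenario favors the cheaper option and the speed-weighted scenario favors the faster one. On the exam you will not compute scores, but the habit is the same: work out which axis the question weights most heavily before comparing the answer choices.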
This section builds the reasoning frameworks you'll apply throughout the rest of the guide. When you encounter an unfamiliar exam question, these mental models give you a structured way to evaluate the options even if you haven't memorized the specific service details.