5.1.1. IAM Fundamentals: Users, Roles, and Policies
First Principle: IAM policies are JSON documents that define which actions are allowed or denied on which resources. Every AWS API call is evaluated against the caller's policies. The core logic: an explicit Deny always wins, then explicit Allows are checked, and anything not explicitly allowed is implicitly denied. Understanding this evaluation chain lets you debug access issues and design least-privilege policies.
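The evaluation chain can be sketched in a few lines of Python. This is a simplified model for study purposes, not how AWS implements it: it looks only at identity-based statements and ignores Condition blocks, resource-based policies, permissions boundaries, and service control policies. The function name `evaluate` and the bucket name are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM lets Action and Resource be a single string or a list."""
    return value if isinstance(value, list) else [value]


def _matches(patterns, value, ignore_case=False):
    """True if value matches any pattern ('*' and '?' wildcards only)."""
    if ignore_case:
        value = value.lower()
    for pattern in _as_list(patterns):
        if ignore_case:
            pattern = pattern.lower()
        if fnmatchcase(value, pattern):
            return True
    return False


def evaluate(statements, action, resource):
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one request."""
    decision = "ImplicitDeny"  # default when nothing matches
    for stmt in statements:
        # Action names are case-insensitive in IAM; ARNs are compared as-is here.
        applies = _matches(stmt["Action"], action, ignore_case=True) and _matches(
            stmt["Resource"], resource
        )
        if not applies:
            continue
        if stmt["Effect"] == "Deny":
            return "ExplicitDeny"  # an explicit Deny wins regardless of order
        decision = "Allow"
    return decision


statements = [
    {"Effect": "Allow", "Action": "s3:*", "Resource": "arn:aws:s3:::example-bucket/*"},
    {"Effect": "Deny", "Action": "s3:DeleteObject", "Resource": "arn:aws:s3:::example-bucket/*"},
]

obj = "arn:aws:s3:::example-bucket/data.csv"
print(evaluate(statements, "s3:GetObject", obj))
print(evaluate(statements, "s3:DeleteObject", obj))
print(evaluate(statements, "glue:GetTable", "arn:aws:glue:us-east-1:123456789012:table/db/t"))
```

The three calls exercise the three outcomes: a matching Allow, a Deny that overrides the broader Allow, and a request that no statement mentions.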
IAM entities: Users (human identities with long-term credentials), Groups (collections of users that share policies), and Roles (temporary credential containers that services or users assume). For data engineering, roles are used everywhere: Glue jobs assume a service role to access S3 and the Glue Catalog, Lambda functions assume an execution role, and Redshift uses a role to access S3 for COPY/UNLOAD.
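What makes a role assumable by a service is its trust policy, which names the service as the principal. The sketch below builds that document for the three services mentioned above; the helper name `service_trust_policy` is hypothetical, and the permissions the role grants would live in separate policies attached to it.

```python
import json


def service_trust_policy(service_principal):
    """Build a trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# One trust policy per service that needs its own role.
glue_trust = service_trust_policy("glue.amazonaws.com")
lambda_trust = service_trust_policy("lambda.amazonaws.com")
redshift_trust = service_trust_policy("redshift.amazonaws.com")

print(json.dumps(glue_trust, indent=2))
```

A trust policy answers "who may assume this role"; the attached permissions policies answer "what the role may do once assumed".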
Policy types: AWS managed policies (pre-built by AWS, like AmazonS3ReadOnlyAccess), customer managed policies (custom policies you create), and inline policies (embedded directly in a user, group, or role). The exam prefers customer managed policies for production workloads because they're reusable, version-controlled, and auditable.
Policy structure: Effect (Allow/Deny), Action (which API calls), Resource (which AWS resources, using ARNs), and optionally Condition (when the statement applies, e.g., only from specific IP ranges, only with MFA, only for specific tags).
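As an illustration, here is one statement that uses all four elements, written as a Python dict and serialized to the JSON that IAM expects. The bucket name, Sid, and IP range are placeholders chosen for the example.

```python
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadRawZoneFromOfficeWithMfa",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            # ListBucket targets the bucket ARN; GetObject targets object ARNs.
            "Resource": [
                "arn:aws:s3:::example-raw-bucket",
                "arn:aws:s3:::example-raw-bucket/*",
            ],
            # Both condition operators must match for the statement to apply.
            "Condition": {
                "IpAddress": {"aws:SourceIp": "203.0.113.0/24"},
                "Bool": {"aws:MultiFactorAuthPresent": "true"},
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Note that the bucket and its objects are separate resources with separate ARNs, a common source of access denied errors when only one of the two is listed.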
Service-linked roles are predefined roles created by AWS services (Lake Formation, Glue, EMR) that contain the permissions the service needs. You can't edit the permissions on these roles, and only the linked service can assume them. The exam may reference them when asking about permissions that a service "automatically" has.
Identity federation allows external identities (from a corporate directory or any SAML or OIDC identity provider) to assume IAM roles without creating an IAM user for each person. For data engineering, federation lets corporate SSO users access Athena, QuickSight, and Redshift without managing separate AWS credentials. This is the exam-preferred approach for large organizations: don't create individual IAM users when federation is available.
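For SAML federation, the role's trust policy names the identity provider as a federated principal instead of an IAM user or service. A minimal sketch, assuming a hypothetical SAML provider called ExampleCorpIdP registered in account 123456789012 and users signing in through the AWS console endpoint:

```python
import json


def saml_trust_policy(provider_arn):
    """Trust policy letting users authenticated by a SAML IdP assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithSAML",
                # Restrict to assertions issued for the AWS sign-in endpoint.
                "Condition": {
                    "StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}
                },
            }
        ],
    }


# Placeholder account ID and provider name.
analyst_trust = saml_trust_policy(
    "arn:aws:iam::123456789012:saml-provider/ExampleCorpIdP"
)
print(json.dumps(analyst_trust, indent=2))
```

The role then carries the Athena, QuickSight, or Redshift permissions, so access is granted per role rather than per person.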
⚠️ Exam Trap: A Glue job that accesses S3, the Glue Data Catalog, and CloudWatch Logs needs permissions for all three in its service role. If any permission is missing, the job fails with an access denied error. When troubleshooting Glue permission errors, check the role's policies for each service the job touches, not just S3.
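That troubleshooting habit can be sketched as a small checker that reports which services a role's Allow statements leave uncovered. The required-action list is an illustrative assumption, not an official minimum for Glue, and the check ignores Resource scoping, Deny statements, and Conditions.

```python
from fnmatch import fnmatchcase

# Illustrative actions per service a typical Glue job might call (assumption).
REQUIRED = {
    "s3": ["s3:GetObject", "s3:PutObject"],
    "glue": ["glue:GetDatabase", "glue:GetTable"],
    "logs": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
}


def _as_list(value):
    return value if isinstance(value, list) else [value]


def missing_permissions(statements, required=REQUIRED):
    """Return {service: [actions]} not covered by any Allow statement."""
    granted = [
        pattern.lower()
        for stmt in statements
        if stmt["Effect"] == "Allow"
        for pattern in _as_list(stmt["Action"])
    ]
    missing = {}
    for service, actions in required.items():
        gaps = [
            action
            for action in actions
            if not any(fnmatchcase(action.lower(), g) for g in granted)
        ]
        if gaps:
            missing[service] = gaps
    return missing


# A role that covers S3 and the Data Catalog but forgot CloudWatch Logs.
role_statements = [
    {"Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject"], "Resource": "*"},
    {"Effect": "Allow", "Action": "glue:Get*", "Resource": "*"},
]

print(missing_permissions(role_statements))
```

Here the result points at the logs actions only, which matches the trap: the S3 policy looks fine, yet the job still fails.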
Reflection Question: A Glue ETL job needs to read from an S3 bucket in Account A and write to a Redshift cluster in Account B. What IAM configuration enables this cross-account access?