AWS-MLS-C01 & AWS CERTIFICATION | Access Control (IAM Policies, Resource Policies) - AWS Certified Machine Learning

5.4.3. Access Control (IAM Policies, Resource Policies)

First Principle: Access control ensures that only authorized users and services can reach specific data and ML resources. It enforces the principle of least privilege and supports security and compliance.

Access control defines who can perform what actions on which resources. For machine learning workloads, this means controlling access to data, models, SageMaker services, and underlying compute/storage resources.

Key AWS Mechanisms for Access Control in ML:

AWS Identity and Access Management (IAM):
- What it is: The service that enables you to securely control access to AWS services and resources. You use IAM to manage users, groups, and roles, and their permissions.
- IAM Users: Individual identities for people or applications.
- IAM Groups: Collections of IAM users, making it easier to manage permissions for multiple users.
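- IAM Roles: Identities that a principal assumes to obtain temporary security credentials. Used by AWS services (e.g., a SageMaker execution role that lets SageMaker access S3), EC2 instances, or users in other accounts.
- IAM Policies: JSON documents that define permissions. Each statement specifies an Effect (Allow or Deny), one or more Actions, the Resources they apply to, and optional Conditions. Identity-based policies are attached to users, groups, or roles.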
- Principle of Least Privilege: Grant only the permissions required to perform a task. This minimizes the blast radius in case of a security breach.
- Use Cases for ML:
  - Granting data scientists permissions to create and manage SageMaker notebooks.
  - Defining an IAM role for SageMaker training jobs that allows reading data from specific S3 buckets and writing model artifacts to others.
  - Controlling who can invoke a SageMaker endpoint.
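
The training-job use case above can be sketched as an identity-based policy. This is a minimal illustration, not a complete execution role: the bucket names are hypothetical, and a real SageMaker training role would also need permissions outside S3 (for example, for logging), which are omitted here.

```python
import json

# Hypothetical bucket names, used only for illustration.
TRAINING_DATA_BUCKET = "example-ml-training-data"
MODEL_ARTIFACT_BUCKET = "example-ml-model-artifacts"


def build_training_role_policy(data_bucket: str, artifact_bucket: str) -> dict:
    """Return an identity-based policy limited to the S3 access a training job needs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself, not on its objects.
                "Sid": "ListTrainingDataBucket",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{data_bucket}",
            },
            {
                # Read-only access to the training data objects.
                "Sid": "ReadTrainingData",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{data_bucket}/*",
            },
            {
                # Write-only access to the artifact bucket.
                "Sid": "WriteModelArtifacts",
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{artifact_bucket}/*",
            },
        ],
    }


policy = build_training_role_policy(TRAINING_DATA_BUCKET, MODEL_ARTIFACT_BUCKET)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Scoping each statement to a single action and a single bucket is what keeps the role within least privilege: the job cannot overwrite training data or read back artifacts it has no reason to read.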

Resource Policies:
- What it is: Policies attached directly to a resource (e.g., an S3 bucket, an SQS queue, an SNS topic, a KMS key). They define who can access that specific resource.
- Amazon S3 Bucket Policies: Control access to objects within an S3 bucket. Can be used in conjunction with IAM policies to create robust access controls.
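- AWS KMS Key Policies: Define who can use or manage a specific KMS encryption key. Every KMS key has exactly one key policy, and IAM policies can grant access to the key only if the key policy permits it. Crucial for controlling access to encrypted ML data and models.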
- Use Cases for ML:
  - Ensuring only specific SageMaker roles can read/write to a particular S3 bucket containing sensitive training data.
  - Restricting access to a KMS key used to encrypt model artifacts, ensuring only authorized services/roles can decrypt them.
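
One common way to express "only this role may touch the objects" is a bucket policy with a deny statement that applies to every principal except the allowed role. The sketch below assumes a hypothetical bucket name, account ID, and role name. Note that the deny statement does not grant anything by itself, so the role still needs an allow in its own identity-based policy. A policy like this also blocks administrators from reading or writing objects, so treat it as an illustration rather than a template.

```python
import json

# Hypothetical names for illustration only.
BUCKET = "example-sensitive-training-data"
ALLOWED_ROLE_ARN = "arn:aws:iam::111122223333:role/ExampleSageMakerTrainingRole"


def build_bucket_policy(bucket: str, allowed_role_arn: str) -> dict:
    """Return a bucket policy that denies object access to everyone but one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyObjectAccessExceptTrainingRole",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # The deny applies whenever the caller is NOT the allowed role.
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": allowed_role_arn}
                },
            }
        ],
    }


bucket_policy = build_bucket_policy(BUCKET, ALLOWED_ROLE_ARN)
print(json.dumps(bucket_policy, indent=2))
```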
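
AWS Lake Formation: (As discussed in 2.4.2) Provides a simplified, centralized way to manage fine-grained access control (database, table, column, and row level) for data lakes built on S3 and cataloged in the AWS Glue Data Catalog. Rather than rewriting IAM or S3 bucket policies, it keeps its own grant/revoke permission model that works alongside IAM, and integrated services such as Athena and AWS Glue obtain temporary credentials from Lake Formation to read the registered S3 locations.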

Scenario: You need to configure permissions for a new ML project. Data scientists should only be able to read data from a specific S3 bucket for training, and SageMaker training jobs should only be able to write model artifacts to another specific S3 bucket. All data must be encrypted with a KMS key that only authorized personnel can manage.
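
For the KMS part of this scenario, a key policy can separate the people who manage the key from the roles that use it. The following is a minimal sketch under stated assumptions: the account ID and both role names are hypothetical, and the action lists are deliberately short. It also omits the statement that delegates key access to IAM policies in the account, which many key policies include, so only the two named roles would have any access.

```python
import json

# Hypothetical account and role names for illustration only.
ACCOUNT_ID = "111122223333"
KEY_ADMIN_ROLE_ARN = f"arn:aws:iam::{ACCOUNT_ID}:role/ExampleKeyAdminRole"
TRAINING_ROLE_ARN = f"arn:aws:iam::{ACCOUNT_ID}:role/ExampleSageMakerTrainingRole"


def build_key_policy(admin_arn: str, user_arn: str) -> dict:
    """Return a key policy that splits key administration from key usage."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Administrators manage the key but cannot encrypt or decrypt with it.
                "Sid": "AllowKeyAdministration",
                "Effect": "Allow",
                "Principal": {"AWS": admin_arn},
                "Action": [
                    "kms:DescribeKey",
                    "kms:EnableKey",
                    "kms:DisableKey",
                    "kms:GetKeyPolicy",
                    "kms:PutKeyPolicy",
                    "kms:ScheduleKeyDeletion",
                    "kms:CancelKeyDeletion",
                ],
                "Resource": "*",  # In a key policy, "*" refers to this key.
            },
            {
                # The training role uses the key but cannot change its policy.
                "Sid": "AllowKeyUse",
                "Effect": "Allow",
                "Principal": {"AWS": user_arn},
                "Action": [
                    "kms:Encrypt",
                    "kms:Decrypt",
                    "kms:GenerateDataKey",
                    "kms:DescribeKey",
                ],
                "Resource": "*",
            },
        ],
    }


key_policy = build_key_policy(KEY_ADMIN_ROLE_ARN, TRAINING_ROLE_ARN)
print(json.dumps(key_policy, indent=2))
```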

Reflection Question: How do IAM policies (attached to users/roles) and resource policies (attached to S3 buckets or KMS keys) fundamentally ensure that only authorized users and services can access specific data and ML resources, upholding the principle of least privilege and maintaining security and compliance throughout the ML lifecycle?

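💡 Tip: Remember that both IAM policies and resource policies can grant or deny access. An explicit deny in any applicable policy always overrides an allow. Within a single account, an allow in either the identity-based policy or the resource policy is generally enough; for cross-account access, both the identity-based policy in the caller's account and the resource policy must allow the action.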
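
How identity-based and resource policies combine can be modeled with a small toy evaluator. This is a simplified sketch, not the full AWS evaluation logic: the `evaluate` function and its inputs are hypothetical, and it ignores service control policies, permissions boundaries, session policies, and service-specific rules such as KMS key policies.

```python
from typing import Iterable


def evaluate(
    identity_effects: Iterable[str],
    resource_effects: Iterable[str],
    same_account: bool = True,
) -> str:
    """Combine the effects of matching statements into a final decision.

    Each input lists the "Allow"/"Deny" effects of the statements that
    match the request in the identity-based and resource policies.
    """
    identity = list(identity_effects)
    resource = list(resource_effects)

    # An explicit deny anywhere always wins.
    if "Deny" in identity or "Deny" in resource:
        return "Deny"

    identity_allows = "Allow" in identity
    resource_allows = "Allow" in resource

    if same_account:
        # Same account: an allow from either policy type is sufficient.
        allowed = identity_allows or resource_allows
    else:
        # Cross-account: both sides must allow the request.
        allowed = identity_allows and resource_allows

    # With no applicable allow, the request is implicitly denied.
    return "Allow" if allowed else "Deny"


print(evaluate(["Allow"], []))                      # same account, identity allow
print(evaluate(["Allow"], ["Deny"]))                # explicit deny wins
print(evaluate(["Allow"], [], same_account=False))  # cross-account needs both
```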