4.1.2. data Blocks
💡 First Principle: A data block lets your configuration depend on facts it doesn't own — the ID of an existing VPC, the latest AMI, a secret from Vault — so you can wire managed resources to the surrounding environment without hard-coding values.
A data source is declared like a resource but with the data keyword, and its results are referenced with a data. prefix:
data "aws_ami" "latest" {
  most_recent = true
  owners      = ["amazon"]
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.latest.id # reads the looked-up value
  instance_type = "t3.micro"             # required argument; the size here is illustrative
}
Data sources are read during plan (or deferred to apply when their arguments depend on values that are not yet known) and never create, update, or destroy anything. They're how you keep configuration dynamic: instead of pasting an AMI ID that goes stale, you query for "the latest Amazon-owned AMI" on every run.
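In practice, "latest Amazon-owned AMI" with no other constraint can match images for any operating system or architecture, so lookups are usually narrowed with filter blocks. The following is a minimal sketch, assuming an AWS provider is already configured; the block name al2023 and the name pattern are illustrative assumptions and should be checked against the AMI names actually published in your region.

```hcl
# Sketch: narrow the AMI lookup so "most recent" is chosen from a known family.
# The name pattern below is an assumption for illustration, not a verified value.
data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-2023.*-x86_64"] # assumed naming pattern
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# Expose the resolved ID so the result of the lookup is visible after plan/apply.
output "resolved_ami_id" {
  value = data.aws_ami.al2023.id
}
```

Because the lookup runs again on each plan, a newly published image that matches the filters changes the resolved ID, and any resource that references it will show a planned change.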
| | resource | data |
|---|---|---|
| Terraform manages lifecycle? | Yes (create/update/destroy) | No (read-only) |
| Reference prefix | aws_instance.web.id | data.aws_ami.latest.id |
| Effect of removing the block | Destroys the real object on the next apply | Stops the lookup only |
| Can it provision? | Yes | Never |
⚠️ Exam Trap: Referencing a data source's attribute creates a dependency — Terraform reads the data source before the resources that use it. But a data source still never modifies anything. The exam may pair these facts: data sources create ordering dependencies and are strictly read-only.
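The pairing of "ordering dependency" and "read-only" can be sketched with an existing VPC that this configuration does not own. This is a hedged example, assuming an AWS provider is already configured and that a VPC tagged Name = "shared-network" exists; the tag value and the block names shared and web are hypothetical.

```hcl
# Read-only lookup: Terraform queries the existing VPC but never modifies it.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-network" # hypothetical tag on a VPC managed elsewhere
  }
}

# Managed resource: Terraform creates, updates, and destroys this object.
resource "aws_security_group" "web" {
  name        = "web-sg"
  description = "Security group placed in the looked-up VPC"

  # This reference is the implicit dependency: data.aws_vpc.shared is read
  # before aws_security_group.web is created.
  vpc_id = data.aws_vpc.shared.id
}
```

Removing the data block here would only stop the lookup (and break the reference), while removing the resource block would plan the destruction of the security group; the VPC itself is untouched in both cases.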
Reflection Question: You replace a hard-coded subnet ID with a data lookup. What capability do you gain, and what does Terraform now do before creating resources that reference it?