Introduction:
Infrastructure-as-Code (IaC) tools like Terraform make managing cloud infrastructure predictable, repeatable, and auditable. However, cloud resources managed by Terraform sometimes drift from the desired state, whether because of manual changes, corruption, or other unexpected behavior. In such cases you rarely want to destroy the entire infrastructure; instead, you want to selectively “refresh” or recreate only the problematic resources. This is where Terraform’s taint and untaint commands and the -replace planning option come into play.
Problem:
In real-world cloud environments, resources can become unhealthy or misconfigured over time. For example:
A deployed virtual machine (VM) might start showing degraded performance, security vulnerabilities, or inconsistent behavior, even though the Terraform configuration hasn’t changed.
There could be manual changes by operations teams, or drift caused by external processes, leading to state divergence from the desired configuration.
Even if the Terraform code is correct, the real-world resource may be “corrupted” or otherwise in an undesirable state, and it has to be recreated from scratch.
Under such circumstances, simply running terraform apply won’t fix the issue: Terraform sees no configuration change, so it assumes the existing resource is fine. Without a mechanism to force recreation, you may end up stuck with a problematic resource. This is especially serious in production or shared environments, where reliability and consistency are critical.
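The "no changes" situation described above can be confirmed from the command line. The following is a minimal sketch, assuming an initialized working directory; it relies on the -detailed-exitcode flag of terraform plan, which exits with 0 for an empty diff, 2 for a non-empty diff, and 1 on error. The helper name plan_status is purely illustrative and not part of Terraform.

```shell
# Sketch: report whether Terraform believes anything needs to change.
# plan_status is a hypothetical helper name, not a Terraform command.
plan_status() {
  rc=0
  # -detailed-exitcode: 0 = empty diff, 1 = error, 2 = non-empty diff
  terraform plan -detailed-exitcode -input=false >/dev/null || rc=$?
  case $rc in
    0) echo "no changes: Terraform considers the resources fine" ;;
    2) echo "changes pending" ;;
    *) echo "plan failed"; return 1 ;;
  esac
}
```

If plan_status reports no changes while a VM is visibly unhealthy, that is exactly the case where a forced recreation is needed.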
Prerequisites:
To illustrate how taint, untaint, and -replace work, this article uses a simple example with two cloud resources (EC2 instances in AWS). The setup includes:
resource "aws_instance" "my_vm_1" {
  ami           = var.ami # e.g. Ubuntu AMI
  instance_type = var.instance_type
  tags = {
    Name = "VM 1"
  }
}

resource "aws_instance" "my_vm_2" {
  ami           = var.ami
  instance_type = var.instance_type
  tags = {
    Name = "VM 2"
  }
}

When this configuration is applied for the first time:
Terraform checks the remote backend state.
Because the EC2 instances don’t exist yet, Terraform creates them (VM 1 and VM 2).
Terraform then updates its state file with information about these instances for future runs.
This simple setup is used as the baseline to demonstrate how to mark a resource for recreation without affecting the rest of the infrastructure.
Solutions:
The terraform taint command marks a resource in Terraform’s state as “tainted”, meaning it is considered degraded or damaged and is flagged for destruction and recreation during the next apply. This is useful when the resource’s real-world state is faulty or undesirable even though the Terraform configuration hasn’t changed.
Note, however, that the taint command has been deprecated since Terraform v0.15.2. The recommended way to force recreation is now the -replace option of terraform plan and terraform apply.
Example of using terraform taint
Suppose VM 1 starts behaving poorly (slow performance, security issues, etc.) while VM 2 works fine. Instead of touching the configuration or destroying both VMs, you can taint only VM 1:
terraform taint aws_instance.my_vm_1
Terraform responds:
Resource instance aws_instance.my_vm_1 has been marked as tainted.
Then, running terraform plan will show that VM 1 is scheduled for destruction and recreation on the next apply.
Finally, terraform apply will destroy VM 1 and create a new one, leaving VM 2 untouched.
terraform untaint
If you tainted a resource by mistake or change your mind, you can revert it with terraform untaint <resource> and the resource will be preserved in the next run.
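As a concrete illustration of the revert, the sketch below removes the taint from a resource and then runs a plan so you can confirm that nothing is scheduled for replacement. The helper name undo_taint is an assumption made for this example.

```shell
# Sketch: undo an accidental taint, then re-plan to confirm the result.
# undo_taint is a hypothetical helper name, not a Terraform command.
undo_taint() {
  # $1 is the resource address, e.g. aws_instance.my_vm_1
  terraform untaint "$1" && terraform plan
}
```

For the example in this article the call would be undo_taint aws_instance.my_vm_1, after which the plan should no longer list VM 1 for destruction and recreation.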
-replace (the modern alternative)
Because taint is deprecated (since Terraform v0.15.2), the recommended approach is to pass the -replace option to terraform apply. This plans and performs the replacement of the specified resources in a single run, without the gap between tainting and applying.
Example:
terraform apply -replace="aws_instance.my_vm_2"
This schedules VM 2 for destruction and recreation in the same operation.
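If you prefer to review the replacement before anything is destroyed, the same option can be given to terraform plan and the result saved to a plan file. The sketch below assumes a local working directory; the helper names and the tfplan file name are illustrative choices, not requirements.

```shell
# Sketch: preview a forced replacement, then apply exactly the reviewed plan.
# plan_replace, apply_saved_plan and the tfplan file name are hypothetical.
plan_replace() {
  # $1 is the resource address, e.g. aws_instance.my_vm_2
  terraform plan -replace="$1" -out=tfplan
}

apply_saved_plan() {
  # Applies the saved plan file as-is, without planning again.
  terraform apply tfplan
}
```

Running plan_replace aws_instance.my_vm_2, reading the output, and then running apply_saved_plan keeps the review step while still avoiding a separate taint.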
However, when using Terraform Enterprise (TFE), runs execute remotely and you do not have a terminal in which to pass CLI flags during execution.
To force the replacement of a resource in TFE, you can pass the -replace option through an environment variable:
Open your TFE workspace and go to the Variables page.
Under Environment Variables, add a new variable:
Key: TF_CLI_ARGS_plan
Value: -replace=aws_instance.my_vm_1
Save the variable and confirm that its category is Environment Variable.
Queue a new plan run. The plan should now show destroy and create actions for the targeted resource.
Approve and apply the run to complete the replacement.
Remove the variable once the run has finished; otherwise every later plan in the workspace will keep forcing the same replacement.
This achieves the same outcome as running terraform apply -replace="..." locally, but entirely within Terraform Enterprise, so you can safely recreate only the faulty resource while keeping the rest of the infrastructure unchanged.
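The variable can also be created without the UI, through the workspace variables API. The following is only a sketch under stated assumptions: TFE_HOST, TFE_TOKEN, and WORKSPACE_ID are hypothetical environment variables you would set yourself, the helper names are illustrative, and the resource address is assumed to contain no double quotes (an indexed address such as one with a string key would need JSON escaping).

```shell
# Sketch: create the TF_CLI_ARGS_plan environment variable via the API.
# TFE_HOST, TFE_TOKEN and WORKSPACE_ID are assumed to be set by the caller.

replace_var_payload() {
  # $1 is the resource address, e.g. aws_instance.my_vm_1 (no double quotes)
  printf '{"data":{"type":"vars","attributes":{"key":"TF_CLI_ARGS_plan","value":"-replace=%s","category":"env","hcl":false,"sensitive":false}}}\n' "$1"
}

create_replace_var() {
  # Sends the payload to the workspace variables endpoint.
  replace_var_payload "$1" | curl --silent --show-error --fail \
    --header "Authorization: Bearer $TFE_TOKEN" \
    --header "Content-Type: application/vnd.api+json" \
    --request POST \
    --data @- \
    "https://$TFE_HOST/api/v2/workspaces/$WORKSPACE_ID/vars"
}
```

Calling create_replace_var aws_instance.my_vm_1 and then queuing a run should have the same effect as the UI steps; check the API documentation for your TFE version before relying on it.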
Outcome:
By using taint/untaint or the newer -replace approach, you get fine-grained control over your infrastructure lifecycle. You can selectively recreate only problematic resources without touching the rest. This helps in:
Resolving resource corruption, misconfiguration, or performance/security issues.
Avoiding unnecessary destruction of unaffected resources.
Maintaining infrastructure stability and minimizing downtime.
However, the taint workflow does come with a challenge: marking a resource as tainted modifies the state file immediately, so if another developer runs a plan or apply in the meantime, the tainted resource might be replaced unintentionally.
Using -replace reduces this risk by combining the mark and apply steps into a single operation.
Reference: