Problem
Terraform Enterprise fails to start, and the Replicated app status output reports the following error:
[root@ip-10-0-19-49 ~]# replicatedctl app status
[
    {
        "AppID": "4b97c9a63fe44edb4cd16ca99d97da59",
        "Sequence": 776,
        "PatchSequence": 0,
        "State": "stopped",
        "DesiredState": "started",
        "Error": "Container tfe-bootstrap failed: Container 527a3d6bb094d9e7dd0f22264496bd59cb5b57af162cef77a3a95e3eae1a1e43 exited with non-zero exit status 4: ",
        "IsCancellable": false,
        "IsTransitioning": false,
        "LastModifiedAt": "2024-07-01T16:32:21.152313367Z"
    }
]
The tfe-bootstrap container exits with the following error:
Fatal: can't open lock file /run/xtables.lock: Read-only file system
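To confirm this error on an affected host, the container ID can be pulled out of the "Error" field shown above and passed to docker logs. The following is a minimal sketch, not an official procedure: the helper name bootstrap_container_id is hypothetical, and because tfe-bootstrap is ephemeral, its logs may no longer be available if the container has already been removed.

```shell
# Hypothetical helper: read `replicatedctl app status` output on stdin and
# print the 64-character container ID found in the "Error" field, if any.
bootstrap_container_id() {
  grep -oE 'Container [0-9a-f]{64} exited' | awk '{print $2}' | head -n 1
}

# Example usage (run as root on the Terraform Enterprise host):
#   id=$(replicatedctl app status | bootstrap_container_id)
#   [ -n "$id" ] && docker logs "$id"
```

If the helper prints nothing, the status output did not contain an exited container ID in the expected form.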
Prerequisites
- Terraform Enterprise v202407-1 to latest (Replicated deployment)
- restrict_worker_metadata_access setting enabled
Cause
When the restrict_worker_metadata_access setting is enabled, an ephemeral container named tfe-bootstrap creates an iptables rule that prevents access to 169.254.169.254 from any container on the tfe_terraform_isolation Docker network, where the Terraform worker containers are started. A change introduced in v202305-1 (Replicated deployment), which modifies the filesystem properties of this container, has been known to cause the iptables command to fail to obtain a lock and exit with the error above.
Solution
This has been identified as a bug which is currently being addressed by HashiCorp engineering. As a workaround, leave the restrict_worker_metadata_access setting disabled and manually create the iptables rule that blocks Terraform workers from connecting to the instance metadata endpoint, using the steps outlined in this article (summarized below):
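# Look up the subnet of the Docker network used by Terraform worker containers.
cidr=$(docker network inspect tfe_terraform_isolation -f '{{range .IPAM.Config}}{{.Subnet}}{{end}}')
# Drop traffic from that subnet to the instance metadata endpoint.
iptables -I DOCKER-USER \
    -s "${cidr}" \
    -d 169.254.169.254 \
    -m comment \
    --comment "prevent terraform runners from talking to instance metadata" \
    -j DROP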
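Running the commands above more than once inserts a duplicate rule each time. The sketch below wraps the same rule in a check-then-insert function using iptables -C, which tests whether a matching rule already exists. The function name block_metadata_access is hypothetical and is not part of Terraform Enterprise. The sketch also assumes it is run as root on the host.

```shell
# Hypothetical wrapper: insert the metadata-blocking rule only if it is missing.
# Optional first argument: Docker network name (defaults to tfe_terraform_isolation).
block_metadata_access() {
  network="${1:-tfe_terraform_isolation}"

  # Look up the subnet assigned to the isolation network.
  cidr=$(docker network inspect "$network" -f '{{range .IPAM.Config}}{{.Subnet}}{{end}}') || return 1
  if [ -z "$cidr" ]; then
    echo "could not determine subnet for network $network" >&2
    return 1
  fi

  # Same rule specification as the manual workaround.
  set -- DOCKER-USER -s "$cidr" -d 169.254.169.254 \
    -m comment --comment "prevent terraform runners from talking to instance metadata" \
    -j DROP

  # -C checks for an existing matching rule; -I inserts at the top of the chain.
  if iptables -C "$@" 2>/dev/null; then
    echo "rule already present for $cidr"
  else
    iptables -I "$@" && echo "rule added for $cidr"
  fi
}

# Example usage:
#   block_metadata_access
```

Rules added this way are not persisted by iptables itself, so the function would need to be called again after a reboot or after the firewall rules are reloaded.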