Problem
Terraform Enterprise fails to start when either the /var mount or the Docker root directory mount uses the noexec option.
To identify your Docker root directory, run the following command.
$ docker info -f '{{ .DockerRootDir}}'
Prerequisites
- Terraform Enterprise (self-hosted installation)
Cause
The noexec option is set for the /var or Docker root directory mount in /etc/fstab.
For example, your /etc/fstab file may contain entries similar to these.
UUID=920cd144-09bc-487c-ad56-0f09fdd53cb9 /var xfs nodev,noexec,nosuid 0 2
UUID=cd2822cd-6ed5-4bda-9912-eb013f62d763 /var/lib/docker xfs nodev,noexec,nosuid 0 2
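Checking for entries like these can be scripted. The following is a minimal sketch rather than an official tool: the function names has_noexec and list_noexec_mounts are hypothetical, and it assumes a POSIX shell and a file with whitespace-separated fstab fields.

```shell
# Hypothetical helper: succeed (exit 0) if a comma-separated
# mount option string contains the noexec option.
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the mount point of every entry in an
# fstab-style file whose options include noexec.
# Fields: device, mount point, filesystem type, options, dump, pass.
list_noexec_mounts() {
  while read -r device mountpoint fstype options _; do
    # Skip blank lines and comments.
    case "$device" in ''|'#'*) continue ;; esac
    if has_noexec "$options"; then
      printf '%s\n' "$mountpoint"
    fi
  done < "$1"
}
```

Running list_noexec_mounts /etc/fstab lists the configured mounts that carry noexec. Because /proc/mounts uses the same field order, list_noexec_mounts /proc/mounts shows the options currently in effect.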
If the noexec option is present before you install Terraform Enterprise, the installation fails during the bootstrapping phase with a Permission denied error.
## tfe-bootstrap/tfe-bootstrap.sh: line 12: /build-worker-metadata-firewall.sh: Permission denied
## TFE startup failed
[ { "AppID": "0129182acab44ab5534b4b6205257550", "Sequence": 722, "PatchSequence": 0, "State": "stopped", "DesiredState": "started", "Error": "Container tfe-bootstrap failed: Container 31da3bdbdcbdb79ef7513eda30cdd9f968e46b84e26c6d20be13a39144696ea8 exited with non-zero exit status 126: ", "IsCancellable": false, "IsTransitioning": false, "LastModifiedAt": "2023-07-28T14:17:30.278240957Z" }]
If the noexec option is applied after you install Terraform Enterprise, multiple containers fail to start, including tfe-vault, tfe-health-check, tfe-state-parser, tfe-backup-restore, and tfe-base-startup. The application startup eventually times out, leaving it in a failed state.
## tfe-vault
## ...+ starting vault...
## Vault is already initialized
## + killing vault with pid 23
## ==> Vault shutdown triggered
## + vault has exited
## + exiting vault setup with 0
## + Retrieving Vault unseal key
## + Retrieving Vault root token
## + Setting IPC lock...
## /usr/bin/vault-start: line 182: /gosu/gosu: Permission denied
## + Starting vault
## tfe-health-check
## /usr/bin/setup-ca-certificates.sh: exec: line 41: /gosu/gosu: Permission denied
## tfe-state-parser
## /usr/bin/setup-ca-certificates.sh: exec: line 41: /gosu/gosu: Permission denied
## tfe-backup-restore
## INFO: Vault token retrieval timeout not yet reached
## ERROR: Operation timed out waiting for vault token
## tfe-base-startup
## INFO: Vault token retrieval timeout not yet reached
## ERROR: Operation timed out waiting for vault token
## TFE startup failed
[ { "AppID": "6e613d8ecae148c3642c6588fe75b597", "Sequence": 722, "PatchSequence": 0, "State": "stopped", "DesiredState": "started", "Error": "Container tfe-base-startup failed: Container 56bd2bc2dcd8f523e281923354fe41fd913e0c12f9f569fdd82c3a5115eb0d23 exited with non-zero exit status 1: ", "IsCancellable": false, "IsTransitioning": false, "LastModifiedAt": "2023-07-19T15:37:12.954203171Z" }]
Solutions
Solution 1: Remount the Partition Without noexec
- Stop the Terraform Enterprise application.
  # replicatedctl app stop
- Verify the application has stopped before proceeding.
  # replicatedctl app status
- Stop the Replicated services.
  # sudo systemctl stop replicated replicated-operator replicated-ui
- Stop the Docker service, socket, and containerd.
  # sudo systemctl stop docker
  # sudo systemctl stop docker.socket
  # sudo systemctl stop containerd
- Edit the /etc/fstab file and remove the noexec option from the /var or Docker root directory mount.
  # sudo vim /etc/fstab
  For example, change this line:
  UUID=cd2822cd-6ed5-4bda-9912-eb013f62d763 /var/lib/docker xfs nodev,noexec,nosuid 0 2
  To this:
  UUID=cd2822cd-6ed5-4bda-9912-eb013f62d763 /var/lib/docker xfs nodev,nosuid 0 2
- Save the file and remount the partition you edited (/var/lib/docker in this example).
  # sudo mount -o remount /var/lib/docker
- Start the Docker service.
  # sudo systemctl start docker
- Verify that Replicated is ready before proceeding.
  # replicatedctl system status
- Start the Terraform Enterprise application.
  # replicatedctl app start
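The option change made in the /etc/fstab step of Solution 1 can be sketched as a small helper. This is an illustrative sketch only: the function name strip_noexec is hypothetical, it assumes a POSIX shell, and it only prints the new option string without touching /etc/fstab. Falling back to "defaults" when noexec was the only option is an assumption based on the fstab convention that the options field cannot be empty.

```shell
# Hypothetical helper: print a comma-separated mount option string
# with the noexec option removed.
strip_noexec() {
  result=
  old_ifs=$IFS
  IFS=,
  # Split the option string on commas and keep everything except noexec.
  for opt in $1; do
    [ "$opt" = noexec ] && continue
    result="${result:+$result,}$opt"
  done
  IFS=$old_ifs
  # Assumption: use "defaults" if nothing is left, since the
  # fstab options field must not be empty.
  printf '%s\n' "${result:-defaults}"
}
```

For the example entry above, strip_noexec nodev,noexec,nosuid prints nodev,nosuid, which matches the edited line.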
Solution 2: Create a Dedicated Partition for Replicated Operator
If your internal policies require mounting the /var partition with the noexec option, you must create a separate partition for /var/lib/replicated-operator. This new partition must be at least 2 GB and must not use the noexec option. The Docker root directory (for example, /var/lib/docker) must also be on a partition that does not use the noexec option.
Outcome Validation
The Terraform Enterprise application should now start successfully. You can monitor the containers and check logs to confirm the resolution.
- Watch the Docker containers as they are launched.
  # watch docker ps -f network=tfe_services
- Check the logs of previously failed containers, such as tfe-vault, to confirm the issue is resolved.
  $ docker logs -f tfe-vault
  ## ...+ Configuring Vault
  ## + mounting vault transit backend
  ## + tuning vault lease TTLs
  ## Success! Data written to: sys/mounts/auth/token/tune
  ## + adding vault policy
  ## Success! Uploaded policy: atlas
  ## + Creating vault token for use by services
  ## + Successfully created vault token
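The log check can also be automated by searching for the Permission denied message that characterized the failures described in the Cause section. This is a rough sketch under stated assumptions: the function name logs_are_clean is hypothetical, it assumes a POSIX shell, and a clean result only means that this one symptom is absent, not that the container is healthy.

```shell
# Hypothetical helper: read log text on standard input and fail
# (exit 1) if it contains a "Permission denied" error.
logs_are_clean() {
  if grep -q 'Permission denied'; then
    echo "Permission denied errors found"
    return 1
  fi
  echo "no Permission denied errors found"
}
```

A possible invocation is docker logs tfe-vault 2>&1 | logs_are_clean, repeated for each container that previously failed.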