Problem
When running Terraform Enterprise (TFE) in a Kubernetes environment, pods may enter a crash loop if the TFE image has been improperly customized. If the image is modified to run as the root user instead of the built-in terraform-enterprise user, the nginx component may fail to start due to permission errors when accessing log files.
You may observe the following symptoms:
- TFE pods enter a crash loop during startup.
- Logs from the pods contain permission denied errors:
    {"component":"nginx","log":"nginx: [alert] could not open error log file: open() \"/var/log/terraform-enterprise/nginx.log\" failed (13: Permission denied)"}
- The nginx process exits unexpectedly:
    INFO exited: nginx (exit status 1; not expected)
- Logs show that supervisord is running as root instead of the expected terraform-enterprise user:
    Not running as builtin tfe user, will attempt to create scratch directories but skipping ownership changes...
    CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file.
- Checking the running user inside the container with the id command shows that the process is running as root (UID 0) instead of terraform-enterprise (UID 1000), which indicates an issue with the image configuration.
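The log signatures listed above can be searched for in a single pass. The following is a minimal sketch, assuming the pod logs have already been saved to a local file; the function name tfe_root_symptoms and the file name tfe.log are hypothetical and not part of TFE or kubectl.

```shell
# Hypothetical helper: report which of the symptoms above appear in a saved log file.
# Example usage (the file name is an assumption):
#   kubectl logs <TFE pod name> -n <TFE namespace> > tfe.log
#   tfe_root_symptoms tfe.log
tfe_root_symptoms() {
    log_file="$1"
    found=0
    # Each pattern is a fixed string copied from the log lines shown above.
    for pattern in \
        "could not open error log file" \
        "exited: nginx (exit status 1; not expected)" \
        "Supervisor is running as root" \
        "Not running as builtin tfe user"
    do
        if grep -qF -- "$pattern" "$log_file"; then
            echo "found: $pattern"
            found=$((found + 1))
        fi
    done
    echo "matched $found of 4 signatures"
    # Succeed only if at least one signature matched.
    [ "$found" -gt 0 ]
}
```

If the container has already restarted, adding the --previous flag to kubectl logs retrieves the output of the prior container instance.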
Cause
The root cause is an improperly customized TFE image that runs the application as the root user instead of the built-in terraform-enterprise user. TFE is designed to operate as the non-root terraform-enterprise user for security and proper permission management.
When customizing a TFE image, such as to add certificates, you may need to switch to the root user to perform privileged operations. If the Dockerfile does not include a USER instruction to switch back to terraform-enterprise, subsequent commands and the container's runtime will default to the root user, leading to these permission issues.
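The condition described above can be checked before an image is built. The following is a rough sketch, assuming a single-stage Dockerfile in which the user is referenced by name; the function names last_dockerfile_user and check_dockerfile_user are hypothetical helpers, not part of Docker or TFE.

```shell
# Hypothetical helper: print the user named by the last USER instruction
# in a Dockerfile. Prints nothing if there is no USER instruction.
last_dockerfile_user() {
    # Dockerfile instructions are case-insensitive, so compare in upper case.
    awk 'toupper($1) == "USER" { user = $2 } END { if (user != "") print user }' "$1"
}

# Hypothetical helper: fail if the final USER instruction names a user
# other than terraform-enterprise.
check_dockerfile_user() {
    user="$(last_dockerfile_user "$1")"
    case "$user" in
        "")
            echo "OK: no USER instruction, so the base image's user is kept" ;;
        terraform-enterprise)
            echo "OK: final USER is terraform-enterprise" ;;
        *)
            echo "WARNING: final USER is '$user', expected terraform-enterprise"
            return 1 ;;
    esac
}
```

Running check_dockerfile_user Dockerfile against a file that ends with USER root reports a warning and returns a non-zero status. The sketch does not handle multi-stage builds, numeric UIDs, or the user:group form.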
Solution
To resolve this issue, ensure your customized TFE image switches back to the terraform-enterprise user after performing any operations that require root privileges.
Update your Dockerfile to include USER terraform-enterprise after the privileged commands complete. For example:
    USER root
    # ... (your custom commands here)
    RUN update-ca-certificates
    USER terraform-enterprise
After updating the Dockerfile, rebuild the Docker image and redeploy Terraform Enterprise using the new image. The TFE pods should now start successfully.
To confirm the change, check the running user inside the TFE container. The output of the id command should show the terraform-enterprise user (UID 1000) rather than root (UID 0).
$ kubectl exec <TFE pod name> -n <TFE namespace> -- id
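For a scripted check, the numeric UID is easier to compare than the full id output. The following is a minimal sketch that relies on the UIDs stated in this article (1000 for terraform-enterprise, 0 for root); the function name check_tfe_uid is hypothetical.

```shell
# Hypothetical helper: interpret the output of `id -u` from the TFE container.
# Example usage:
#   check_tfe_uid "$(kubectl exec <TFE pod name> -n <TFE namespace> -- id -u)"
check_tfe_uid() {
    case "$1" in
        1000)
            echo "OK: running as terraform-enterprise (UID 1000)" ;;
        0)
            echo "FAIL: running as root (UID 0)"
            return 1 ;;
        *)
            echo "UNEXPECTED: UID '$1'"
            return 1 ;;
    esac
}
```

The function returns a non-zero status for any UID other than 1000, so it can gate a deployment script.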