Problem
Terraform Enterprise fails to start, and the terraform-enterprise process exits prematurely. This causes Fluent Bit to fail with the error [error] could not open configuration file, aborting.
2024-05-10 15:01:43,677 INFO stopped: terraform-enterprise (terminated by SIGTERM)
2024-05-10 15:01:44,680 INFO success: fluent-bit entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
[2024/05/10 15:01:45] [error] could not open configuration file, aborting.
2024-05-10 15:01:45,523 INFO stopped: fluent-bit (exit status 1)
A TLS verification error is logged to the /var/log/terraform-enterprise/terraform-enterprise.log file inside the container, referencing the configured external object storage endpoint's certificate.
To view the log, first copy it from the container.
# docker compose cp tfe:/var/log/terraform-enterprise/terraform-enterprise.log .
The log file contains a panic with the error tls: failed to verify certificate: x509: certificate signed by unknown authority.
panic: error loading config: failed detecting s3 prefix: could not list objects: operation error S3: ListObjectsV2, exceeded maximum number of attempts, 3, https response error StatusCode: 0, RequestID: , HostID: , request send failed, Get "https://tfe-minio:9000/tfe?delimiter=%2F&list-type=2": tls: failed to verify certificate: x509: certificate signed by unknown authority

goroutine 1 [running]:
main.main()
    /home/runner/work/terraform-enterprise/terraform-enterprise/tools/service-setup/main.go:37 +0x396
Prerequisites
- Terraform Enterprise v202404-2
- External S3-compatible object storage using a self-signed or non-publicly trusted CA certificate that has been added to the CA bundle configuration.
Cause
This issue is caused by a bug introduced in Terraform Enterprise v202404-2. When Terraform Enterprise starts, the terraform-enterprise process sets up its trusted certificates using the contents of the CA bundle setting. In v202404-2, changes to the startup order cause a request to be sent to the configured S3-compatible object storage endpoint before this certificate setup completes. As a result, Terraform Enterprise cannot verify the endpoint's certificate, and startup fails.
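The ordering problem described above, modeled as an added Go example (not code from Terraform Enterprise): the same HTTPS request fails with the x509 unknown-authority error when it runs before the CA is in the trust pool and succeeds afterwards. The local httptest server, the listObjects and startupOrder helpers, and the empty pool standing in for the not-yet-loaded CA bundle are assumptions made for illustration.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// listObjects stands in for the startup S3 ListObjectsV2 call: one HTTPS GET
// whose server certificate is verified against the given trust pool.
func listObjects(url string, roots *x509.CertPool) error {
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig:   &tls.Config{RootCAs: roots},
		DisableKeepAlives: true,
	}}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// isUnknownAuthority reports whether err is the x509 failure shown in the panic.
func isUnknownAuthority(err error) bool {
	var ua x509.UnknownAuthorityError
	return errors.As(err, &ua)
}

// startupOrder issues the same request before and after the CA is trusted.
func startupOrder() (before, after error) {
	// Constructed endpoint: local HTTPS server with a certificate no public CA signed.
	srv := httptest.NewTLSServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) { w.WriteHeader(http.StatusOK) }))
	defer srv.Close()
	before = listObjects(srv.URL, x509.NewCertPool()) // CA bundle not loaded yet
	loaded := x509.NewCertPool() // CA bundle applied
	loaded.AddCert(srv.Certificate())
	after = listObjects(srv.URL, loaded)
	return before, after
}

func main() {
	before, after := startupOrder()
	fmt.Println("before CA setup:", isUnknownAuthority(before), before)
	fmt.Println("after CA setup: ", after)
}
```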
Solutions
The recommended solution is to upgrade to a fixed version. If you cannot upgrade immediately, you can use a temporary workaround.
Solution 1: Upgrade to a Fixed Version
This bug is fixed in Terraform Enterprise v202405-1. Upgrading to this release or a later version resolves the issue.
Solution 2: Pin the Previous Release (Flexible Deployments)
For Flexible Deployment Options, you can work around this issue by pinning the previous release. Modify the image tag in your configuration to use the version prior to v202404-2.
Solution 3: Repave with the Previous Release (Replicated)
For Replicated deployments, you can work around this issue by repaving the server using the previous release version.
These workarounds are effective because the application will not have progressed far enough into the startup sequence to run the database migrations required for v202404-2, allowing it to start successfully on the previous version.