Problem
When starting Terraform Enterprise, the application may fail and report an error with the ptfe_base_startup container. For Terraform Enterprise versions 202205-1 and later, this container is named tfe-base-startup.
The ptfe_base_startup container waits for dependent services, such as the internal Vault service, to become available. If the Vault service fails to start or provide a required token within the timeout period, the ptfe_base_startup container will fail, stopping the application startup process.
If you have the admin dashboard enabled, it will report that the container failed.
Alternatively, running the replicatedctl app status command will show a similar error.
[
  {
    "AppID": "cf2420c1fb6c43957c238b0bec5255e0",
    "Sequence": 576,
    "PatchSequence": 0,
    "State": "stopped",
    "DesiredState": "started",
    "Error": "Container ptfe_base_startup failed: Container 72b4ef0621d5acd05a325d0e00f88f335af925a4eefc07ed7e1ca2ab85f425ff exited with non-zero exit status 1: ",
    "IsCancellable": false,
    "IsTransitioning": false,
    "LastModifiedAt": "2021-10-22T04:26:14.687941937Z"
  }
]
Check the logs for the ptfe_base_startup (or tfe-base-startup) container to confirm that a timeout occurred.
$ docker logs ptfe_base_startup
Example output showing a timeout.
INFO: Vault token retrieval timeout not yet reached
INFO: Vault token retrieval timeout not yet reached
INFO: Vault token retrieval timeout not yet reached
ERROR: Operation timed out waiting for vault token
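The two checks described in the Problem section (the status error, then the log timeout) can be scripted. The following is a minimal sketch, not an official tool: the function names failed_container and has_vault_token_timeout are hypothetical, and the patterns assume the status and log output look like the samples in this article.

```shell
# Hypothetical helper: print the container named in the "Error" field of
# `replicatedctl app status` output read from stdin. Assumes the error text
# has the form "Container <name> failed: ..." as in the sample above.
failed_container() {
  sed -n 's/.*"Error": *"Container \([^ ]*\) failed.*/\1/p'
}

# Hypothetical helper: succeed (exit 0) if the log text on stdin contains
# the timeout error quoted in this article.
has_vault_token_timeout() {
  grep -q 'ERROR: Operation timed out waiting for vault token'
}

# Possible usage on the Terraform Enterprise host (not run here):
#   replicatedctl app status | failed_container
#   docker logs ptfe_base_startup 2>&1 | has_vault_token_timeout \
#     && echo "startup timed out waiting for the Vault token"
```

The 2>&1 redirect is there because docker logs replays the container's stderr stream on stderr, so error lines could otherwise bypass the pipe.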
Solutions
To identify the root cause, check the logs for the internal Vault service, ptfe_vault (or tfe-vault for newer versions).
$ docker logs ptfe_vault
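Because the container names differ between releases, it can help to resolve the name that exists on the host before reading its logs. This is a sketch under stated assumptions: resolve_container is a hypothetical helper, and it expects one container name per line on stdin, for example from docker ps -a --format '{{.Names}}'.

```shell
# Hypothetical helper: print the first candidate name (given as arguments)
# that appears as a whole line in the container list read from stdin.
# Returns non-zero if none of the candidates is present.
resolve_container() {
  names=$(cat)
  for candidate in "$@"; do
    # -x matches the whole line, -F treats the name as a fixed string.
    if printf '%s\n' "$names" | grep -qxF -- "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Possible usage on the Terraform Enterprise host (not run here):
#   vault=$(docker ps -a --format '{{.Names}}' | resolve_container ptfe_vault tfe-vault)
#   docker logs "$vault"
```

The same helper works for the startup container by passing ptfe_base_startup and tfe-base-startup as the candidates.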
Solution 1: Correct the Encryption Password
Cause
The internal Vault service may fail to unseal if the encryption password (enc_password) provided in the configuration is incorrect. This prevents other services from retrieving the tokens they need to start. The ptfe_vault logs will show an error similar to the following.
get unseal: could not decrypt unseal key: crypto: could not decrypt ciphertext: chacha20poly1305: message authentication failed
This error indicates that the current enc_password value does not match the one used during the initial installation.
Resolution
Restore the enc_password value to the one that was used when you initially installed the instance. The unseal keys stored in the database can only be decrypted with the correct original password, even if the instance has been re-created.
Solution 2: Increase Disk I/O Throughput
Cause
Insufficient disk I/O throughput can cause high contention during application startup, leading to service timeouts. In this scenario, the ptfe_vault container logs will not show an explicit error, but a comparison of log timestamps reveals that the ptfe_base_startup container timed out before the Vault service was ready.
The ptfe_base_startup container waits up to 60 seconds for the token. If the Vault service takes longer than that to initialize, the startup process will fail.
Example ptfe_vault logs showing a slow startup.
...
[INFO] identity: groups restored
[INFO] expiration: lease restore complete
[INFO] core: usage gauge collection is disabled
[INFO] core: post-unseal setup complete
+ Retrying to create vault token
+ Successfully created vault token
Example ptfe_base_startup logs showing the timeout.
...
execing command; /usr/bin/wait-for-token [-- true]
INFO: Vault token retrieval timeout not yet reached
...
INFO: Vault token retrieval timeout not yet reached
ERROR: Operation timed out waiting for vault token
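The timestamp comparison can be sketched as follows. The sample logs above carry no timestamps, so this assumes they are obtained with docker logs --timestamps, that they are RFC 3339 values in UTC, and that GNU date is available. The function names are hypothetical, and treating the wait-for-token line as the start of the wait and the "Successfully created vault token" line as the point where Vault was ready is an interpretation of the sample logs, not documented behavior.

```shell
# Hypothetical helper: convert a UTC RFC 3339 timestamp such as
# 2021-10-22T04:26:14.687941937Z to epoch seconds. Fractional seconds
# are dropped. Assumes GNU date.
to_epoch() {
  ts=${1%%.*}   # drop the fractional part and everything after it
  ts=${ts%Z}    # drop a trailing Z when there was no fractional part
  date -u -d "$(printf '%s' "$ts" | tr 'T' ' ')" +%s
}

# Print the whole seconds elapsed from the first timestamp to the second.
seconds_between() {
  echo $(( $(to_epoch "$2") - $(to_epoch "$1") ))
}

# Succeed if the elapsed time is longer than the 60-second wait
# described in this article.
exceeded_wait() {
  [ "$(seconds_between "$1" "$2")" -gt 60 ]
}

# Possible usage on the Terraform Enterprise host (not run here):
#   start=$(docker logs --timestamps ptfe_base_startup 2>&1 \
#     | grep 'wait-for-token' | head -n 1 | cut -d' ' -f1)
#   ready=$(docker logs --timestamps ptfe_vault 2>&1 \
#     | grep 'Successfully created vault token' | tail -n 1 | cut -d' ' -f1)
#   exceeded_wait "$start" "$ready" && echo "Vault was ready after the wait expired"
```

If Vault was ready only after the wait expired and the ptfe_vault logs show no explicit error, slow disk I/O is a likely cause.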
Resolution
The resolution depends on your operating environment and available disk types. We recommend using SSD-based storage and monitoring disk IOPS and throughput during application startup. Burstable throughput may not always be available, which can cause this issue to occur intermittently.
For more details, refer to the documentation on Capacity and Performance: Disk I/O.
Solution 3: Disable Dynatrace OneAgent
Cause
The Dynatrace OneAgent is known to prevent Terraform Enterprise from starting successfully. The agent scans files used by the startup process, which can leave those files in a busy state when startup functions attempt to access them.
Resolution
Disable or uninstall the Dynatrace OneAgent from the Terraform Enterprise host machine.