Problem
After the initial Terraform Enterprise (TFE) Flexible Deployment Options (FDO) deployment on Kubernetes, a number of "Readiness probe failed" warnings are displayed in the logs.
Example:
Warning Unhealthy Pod/terraform-enterprise-xxxxxxxxxx-xxxxx Readiness probe failed: Get "http://xx.xxx.x.x:8080/_health_check": dial tcp xx.xxx.x.x:8080: connect: connection refused
Warning Unhealthy Pod/terraform-enterprise-xxxxxxxxxx-xxxxx Readiness probe failed: HTTP probe failed with statuscode: 502
Warning Unhealthy Pod/terraform-enterprise-xxxxxxxxxx-xxxxx Readiness probe failed: Get "http://xx.xxx.x.x:8080/_health_check": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Apart from these warnings, everything else appears to work properly: the container is in the Running state, the UI loads, and Terraform runs can be executed.
Cause
These warnings are expected because the health_check process starts earlier than the TFE FDO application itself. PID 1 in the TFE FDO container is a supervisorctl process that manages all the other processes, so the container appears in the Running state minutes before the actual application processes are ready to serve requests.
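The gap between Running and Ready described above can be observed directly. The following is a minimal sketch, not an official procedure: the function name pod_phase_and_ready is hypothetical, and the namespace and pod name are passed in as arguments. It prints the pod phase next to the status of the Ready condition, so during startup a result such as "Running False" would be consistent with the warnings.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the pod phase and the status of its Ready condition.
# Usage: pod_phase_and_ready <NAMESPACE> <POD_NAME>
pod_phase_and_ready() {
  # The JSONPath filter selects the condition whose type is "Ready".
  kubectl get pod "$2" -n "$1" \
    -o jsonpath='{.status.phase} {.status.conditions[?(@.type=="Ready")].status}'
}
```

Once the application processes have finished starting, the second value is expected to change to True without any intervention.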
Solution
As a sanity check, run the following commands to confirm that an OK response is received:
# Using FQDN
kubectl exec -t -n <NAMESPACE> <POD_NAME> -- curl -s https://<FQDN>/_health_check
# Using Localhost
kubectl exec -t -n <NAMESPACE> <POD_NAME> -- curl -s http://127.0.0.1:8080/_health_check
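Because the application can take several minutes to become ready, the localhost check above may need to be repeated. The sketch below wraps it in a retry loop. The function name wait_for_tfe_health and the default of 30 attempts with a 10-second delay are illustrative assumptions, not values taken from the TFE documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry the in-pod health check until it succeeds.
# Usage: wait_for_tfe_health <NAMESPACE> <POD_NAME> [ATTEMPTS] [DELAY_SECONDS]
wait_for_tfe_health() {
  local namespace="$1" pod="$2" attempts="${3:-30}" delay="${4:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # curl -f exits non-zero on HTTP errors such as 502, and a refused
    # connection or timeout also produces a non-zero exit status.
    if kubectl exec -n "$namespace" "$pod" -- \
        curl -sf -o /dev/null http://127.0.0.1:8080/_health_check; then
      echo "health check passed on attempt $i"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "health check still failing after $attempts attempts" >&2
  return 1
}
```

For example, wait_for_tfe_health my-namespace my-pod 60 5 would poll for up to about five minutes. A non-zero return after that long suggests a problem beyond the normal startup delay.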
These warnings also disappear from the output of kubectl events -n <NAMESPACE> and kubectl describe pod <POD_NAME> -n <NAMESPACE> after a few hours.
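To check whether readiness warnings are still being recorded for a given pod, the event list can be narrowed with a field selector. This is a sketch under the assumption that the warnings carry the reason Unhealthy, as in the example output in the Problem section. The function name unhealthy_events is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list only "Unhealthy" events for a single pod.
# Usage: unhealthy_events <NAMESPACE> <POD_NAME>
unhealthy_events() {
  kubectl get events -n "$1" \
    --field-selector "reason=Unhealthy,involvedObject.name=$2"
}
```

If no new events with this reason appear after the application has started, the earlier warnings can be treated as startup noise.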
Outcome
TFE FDO is in a fully operational state.