Problem
The Terraform Enterprise (TFE) application does not start. The Replicated console shows the application as starting with the status "waiting on components to start", then ultimately fails with the error "Container ptfe_postgresql_setup failed: Timeout waiting for event Postgres setup".
Running replicatedctl app status from the command line on the TFE server may also display the error.
Cause
- The ptfe-postgresql-setup container logs show:
psql: error: could not translate host name "tfe-postgres" to address: Name or service not known
- This error suggests that the database server name configured in TFE is not resolvable from the TFE server. TFE expects a reachable IP address or a resolvable fully qualified domain name (FQDN) for the Postgres database server.
- This can also be a sign that Docker's internal DNS is not resolving names from inside the containers.
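To check whether the setup container is failing for this reason, its logs can be searched for the resolution error. The following is a minimal sketch: the function name is_dns_failure is hypothetical, and the container name in the usage comment is the one used in this article and may differ in your installation.

```shell
# Hypothetical helper: succeeds (exit 0) when the log text read from stdin
# contains the host name resolution error quoted in this article.
is_dns_failure() {
  grep -q 'could not translate host name'
}

# Example, run on the TFE server (container name is an assumption):
#   docker logs ptfe-postgresql-setup 2>&1 | is_dns_failure && echo "DNS resolution failure"
```

Any other psql error, such as a refused connection or failed authentication, makes the helper return non-zero, which points away from DNS as the cause.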
Solutions:
Solution 1
- If the Postgres server and port are correct, test connectivity from the TFE server to the Postgres server listed in the error message. In the examples below, replace <tfe-postgres> with the IP or FQDN of the Postgres server.
nc -vz <tfe-postgres> 5432
nslookup <tfe-postgres>
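The two checks above can be combined into one helper that reports which step fails. This is a sketch rather than a definitive tool: check_pg_host is a hypothetical name, and it assumes that getent (glibc) and nc are installed on the TFE server.

```shell
# Hypothetical helper: check name resolution first, then TCP reachability.
# Usage: check_pg_host <host> [port]   (port defaults to 5432)
check_pg_host() {
  host="$1"
  port="${2:-5432}"
  # getent ahosts resolves through the system resolver (including
  # /etc/hosts) and also accepts a literal IP address.
  if ! getent ahosts "$host" >/dev/null 2>&1; then
    echo "DNS: cannot resolve $host"
    return 1
  fi
  # -z: probe the port without sending data; -w 5: give up after 5 seconds.
  if ! nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
    echo "TCP: cannot reach $host:$port"
    return 2
  fi
  echo "OK: $host resolves and port $port is reachable"
}
```

A return code of 1 points to name resolution, while 2 points to a network path or firewall problem between the TFE server and the database.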
Solution 2
- Confirm that name resolution works from inside the container by connecting to PostgreSQL from the container with the steps below:
- SSH to the TFE instance and connect into ptfe-postgresql-setup
docker exec -it ptfe-postgresql-setup bash
- Capture the database connection settings
env | grep -i "database"
- Connect to the database with psql
psql -h <hostname without port> -p <port> -d <database name> -U <user>
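The psql command takes the host and port as separate arguments. If the database setting captured in the previous step holds them together as host:port (an assumption about how the value is stored, suggested by the "hostname without port" placeholder), it can be split with shell parameter expansion. split_hostport is a hypothetical helper and does not handle IPv6 literals.

```shell
# Hypothetical helper: print "<host> <port>" for a "host[:port]" value.
# Falls back to 5432, the PostgreSQL default port, when no port is given.
split_hostport() {
  value="$1"
  case "$value" in
    *:*) echo "${value%%:*} ${value##*:}" ;;
    *)   echo "$value 5432" ;;
  esac
}

# Example: load both parts into $1 and $2, then pass them to psql.
#   set -- $(split_hostport "tfe-postgres:5432")
#   psql -h "$1" -p "$2" -d <database name> -U <user>
```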
- If the psql connection still fails, check that /etc/resolv.conf in the container lists the DNS resolver 127.0.0.11. In some cases the Docker service needs to be restarted before the containers can use DNS properly:
1. Stop the Replicated services
replicatedctl app stop
systemctl stop replicated replicated-ui replicated-operator
2. Stop the Docker service
systemctl stop docker
3. Start the Docker service
systemctl start docker
4. Start the Replicated services
systemctl start replicated replicated-ui replicated-operator
replicatedctl app start
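Before restarting Docker, the /etc/resolv.conf check can be scripted from inside the container. This is a minimal sketch with a hypothetical helper name, uses_docker_dns. 127.0.0.11 is the address of Docker's embedded DNS server for containers on user-defined networks.

```shell
# Hypothetical helper: succeeds when the given resolv.conf file (default
# /etc/resolv.conf) lists 127.0.0.11 as a nameserver.
uses_docker_dns() {
  grep -Eq '^[[:space:]]*nameserver[[:space:]]+127\.0\.0\.11([[:space:]]|$)' "${1:-/etc/resolv.conf}"
}

# Example, run inside the container:
#   uses_docker_dns || echo "Docker embedded DNS is not configured"
```

If the resolver is listed but lookups still fail, the restart sequence above is the next step.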
Outcome
Once the TFE server can resolve the FQDN of the database server, the TFE components should start and the application will become available.