Problem
Terraform Enterprise Flexible Deployment Options (FDO) on Kubernetes requires connectivity to external PostgreSQL, Redis, and S3-compatible object storage services to function correctly. If Terraform Enterprise cannot reach one of these services during startup, the container may exit without clear, persistent logs, making it difficult to diagnose the root cause on the pod itself.
You may see errors in the pod logs related to the specific service, or the pod may stop without any detailed error messages.
Example log errors for an S3 bucket connection failure:
{
"component": "terraform-enterprise",
"log": "2024-06-25T09:44:28.282Z [ERROR] terraform-enterprise: startup: error=\"failed detecting s3 prefix: could not list objects: operation error S3: ListObjectsV2, https response error StatusCode: 301, RequestID: 9HZJNGGAF12GNK53, HostID: 0EhgDLGAIeV1Cs6VwsWCohCiPRhcMFlACVS1VH7ertWu8UD1oh0=, api error PermanentRedirect: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.\""
}
For PostgreSQL or Redis, the pod may stop without writing specific connection errors to the logs.
Prerequisites
- Access to the Kubernetes cluster where Terraform Enterprise is deployed.
- The kubectl command-line tool configured to communicate with your cluster.
- The namespace of your Terraform Enterprise deployment.
- Your Terraform Enterprise Helm values, which contain the connection details for the external services.
To retrieve your Helm values, run the following command, replacing <namespace> with your deployment's namespace.
$ helm -n <namespace> get values terraform-enterprise
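The full values output can be long. As a minimal sketch, the helper below narrows it to the lines that mention the external services. The function name `tfe_connection_settings` is hypothetical, and the `TFE_DATABASE_`, `TFE_REDIS_`, and `TFE_OBJECT_STORAGE_` prefixes are an assumption about how the settings are named in your values; adjust the pattern if your keys differ.

```shell
# Hypothetical helper: keep only the lines that mention external-service settings.
# Assumes the keys start with TFE_DATABASE_, TFE_REDIS_, or TFE_OBJECT_STORAGE_.
tfe_connection_settings() {
  grep -E 'TFE_(DATABASE|REDIS|OBJECT_STORAGE)_'
}

# Usage (requires helm and cluster access):
#   helm -n <namespace> get values terraform-enterprise | tfe_connection_settings
```

The filtered output can still contain passwords and keys, so avoid pasting it into tickets or chat without redacting it first.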
Solutions
To diagnose connectivity issues, you can launch temporary pods within the same Kubernetes namespace as Terraform Enterprise. This allows you to test the connection to each external service using the same network environment.
Solution 1: Verify S3 Bucket Connectivity
These commands test connectivity to your S3-compatible object storage.
AWS S3 Buckets
If you are using an instance profile for authentication (TFE_OBJECT_STORAGE_S3_USE_INSTANCE_PROFILE: true), run this command to test access.
$ kubectl -n <namespace> run s3test --rm --restart=Never --image=amazon/aws-cli -i -- s3 ls <bucket_name>
If you are using an access key and secret key for authentication, run this command, replacing the placeholders with your values.
$ kubectl -n <namespace> run s3test --rm --restart=Never \
    --env=AWS_ACCESS_KEY_ID=<value> \
    --env=AWS_SECRET_ACCESS_KEY=<value> \
    --image=amazon/aws-cli -i -- s3 ls <bucket_name>
Expected Outcome
A successful connection lists the contents of the bucket.
PRE archivistterraform/
A failed connection returns an error.
An error occurred (NoSuchBucket) when calling the ListObjectsV2 operation: The specified bucket does not exist
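A `PermanentRedirect` (HTTP 301) error like the one in the Problem section usually means the request went to an endpoint in a different region from the bucket. The sketch below asks S3 which region the bucket is in, so you can compare it with the region configured for Terraform Enterprise. The function name `s3_bucket_region` and the pod name `s3region` are hypothetical. It assumes instance-profile authentication and that the credentials allow `s3:GetBucketLocation`; with access keys, add the same `--env` flags as in the access-key command above.

```shell
# Hypothetical helper: print the location S3 reports for a bucket,
# using a temporary aws-cli pod in the Terraform Enterprise namespace.
# Arguments: <namespace> <bucket_name>
s3_bucket_region() {
  kubectl -n "$1" run s3region --rm --restart=Never \
    --image=amazon/aws-cli -i -- \
    s3api get-bucket-location --bucket "$2"
}

# Usage (requires kubectl and cluster access):
#   s3_bucket_region <namespace> <bucket_name>
```

The command returns a `LocationConstraint` value. Buckets in `us-east-1` report it as null.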
Azure Storage Containers
If you are using Azure Blob Storage, run this command to test access, replacing the placeholders with your values.
$ kubectl -n <namespace> run az-test --rm --restart=Never \
    --image=mcr.microsoft.com/azure-cli:cbl-mariner2.0 -i -- \
    az storage blob list --account-name <TFE_OBJECT_STORAGE_AZURE_ACCOUNT_NAME> \
    --container-name <TFE_OBJECT_STORAGE_AZURE_CONTAINER> \
    --account-key "<TFE_OBJECT_STORAGE_AZURE_ACCOUNT_KEY>" \
    --query "[].name"
Expected Outcome
A successful connection lists the names of blobs in the container.
[
  "archivistterraform/json-plans/71d4662d/plan-tyxDE7QybJV62ePH",
  "archivistterraform/json-provider-schemas/5a9ab298/plan-tyxDE7QybJV62ePH",
  "archivistterraform/logs/apply-izPpEd9Kc6irViPY"
]
A failed connection returns an error code.
ErrorCode:ContainerNotFound
Solution 2: Verify PostgreSQL Connectivity
Use a temporary postgres pod to test the connection to your PostgreSQL database. Replace the placeholders with your connection details.
$ kubectl -n <namespace> run psqltest --rm --env=PGPASSWORD='<password_postgres>' \
    --restart=Never --image=postgres -i -- \
    psql -h <postgresql_hostname> -U <username> -d <database_name> -c "SELECT 1;"
Expected Outcome
- A successful connection returns a query result.
- A failed connection returns an error.
psql: error: could not translate host name "***************" to address: Name or service not known
Solution 3: Verify Redis Connectivity
Use a temporary redis pod to test the connection to your Redis instance. Replace the placeholders with your connection details.
$ kubectl -n <namespace> run redistest --rm --restart=Never --image=redis -i -- \
    redis-cli -h <redis_hostname> -p <port> ping
Expected Outcome
A successful connection returns PONG.
PONG
A failed connection returns an error.
Could not connect to Redis at **********************: Name or service not known
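The PostgreSQL and Redis failure examples above are both name-resolution errors, so it can help to test DNS on its own before looking at ports or credentials. The following is a minimal sketch that runs `nslookup` from a temporary `busybox` pod. The function name `dns_check` and the pod name `dnstest` are hypothetical, and it assumes your cluster can pull the `busybox` image.

```shell
# Hypothetical helper: resolve a hostname from a temporary pod in the given namespace.
# Arguments: <namespace> <hostname>
dns_check() {
  kubectl -n "$1" run dnstest --rm --restart=Never \
    --image=busybox -i -- nslookup "$2"
}

# Usage (requires kubectl and cluster access):
#   dns_check <namespace> <redis_hostname>
```

If the name does not resolve from the pod, check the hostname in your Helm values and the cluster DNS configuration before testing the service itself.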
Outcome
By successfully running these tests, you can confirm that the Kubernetes cluster has the necessary network connectivity and credentials to reach the external services required by Terraform Enterprise. If any test fails, you can focus your troubleshooting efforts on the specific network path, firewall rules, or credentials for that service.
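To rerun several of these tests after a change, a small wrapper that reports a one-line result per service can be convenient. This is a sketch, not part of Terraform Enterprise: the function name `run_check` is hypothetical, and it assumes that `kubectl run` exits with a non-zero status when the test command inside the pod fails.

```shell
# Hypothetical wrapper: run one check, hide its output, and report PASS or FAIL.
# Arguments: <label> <command> [args...]
run_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $label"
  else
    echo "FAIL: $label"
    return 1
  fi
}

# Usage (requires kubectl and cluster access; replace the placeholders):
#   run_check redis kubectl -n <namespace> run redistest --rm --restart=Never \
#     --image=redis -i -- redis-cli -h <redis_hostname> -p <port> ping
```

Because the wrapper discards the command output, rerun any failing test directly to see the error message.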