Problem
After a backup of a Terraform Enterprise (TFE) installation is restored through the Restore API endpoint, TFE fails to start and reports the following error in its logs:
[ERROR] tfe-task-worker: error running database migrations: err="migrate: no migration found for version X: read down for version X migrations: file does not exist"
Cause
This occurs when the backup was taken from a version of TFE that differs from the one currently provisioned. It most often happens when a user attempts to roll back from a version of TFE that they upgraded to.
Solution
Manually rename the existing database (DB) on the PostgreSQL (postgres) server that TFE is configured to connect to. This allows TFE to recreate what it needs on the postgres server during the next startup attempt, after which the backup can be restored successfully.
- Execute
  replicatedctl app stop
  to stop the TFE application, if not done already.
- Execute
  watch replicatedctl app status
  until the value for State is stopped.
- Create a local backup of the current state of the DB using the following command:
  pg_dump postgres://<PG_USERNAME>:<PG_PASSWORD>@<PG_NETLOCATION>:5432/<PG_DBNAME>?sslmode=require > tfe_pgdump.backup
  - The values for all of these variables can be identified through the output of
    replicatedctl app-config export --hidden
  - The PostgreSQL client tools (pg_dump and psql) are required to execute this command. Install them as follows, depending on the OS:
    - Debian/Ubuntu:
      apt-get install postgresql-client
    - Red Hat family (the client tools ship in the postgresql package):
      yum install postgresql
- Sign in to the postgres server as the user that TFE is configured to use. Because a DB cannot be renamed from a session that is connected to it, connect through a different DB on the same server (such as postgres) rather than the one set as pg_dbname in the output of replicatedctl app-config export, by using this command:
  psql postgres://<PG_USERNAME>:<PG_PASSWORD>@<PG_NETLOCATION>:5432/<POSTGRES_OR_OTHER>?sslmode=require
  - The values for all of these variables can be identified through the output of
    replicatedctl app-config export --hidden
- Attempt to rename the DB that TFE is configured to use by adding an _old suffix, using this command:
  ALTER DATABASE "CurrentDBName" RENAME TO "CurrentDBName_old";
  - If the rename fails because there are active sessions using the database, execute this command, replacing DatabaseName with the name of the DB, and then retry the rename:
    SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'DatabaseName';
- Create a new DB with the name of the original DB:
  CREATE DATABASE "OriginalDBName";
- Close the psql connection by typing exit (or \q on psql versions older than 11).
- Execute
  replicatedctl app start
  to start the TFE application back up.
  - If the application still fails to start, generate and upload a new support bundle for HashiCorp Support to review, and inform HashiCorp Support that the failure was at step 8 of this article.
- Execute an API request against the restore endpoint of the backup API again to restore the backup of TFE.
- Restart the application by first executing
  replicatedctl app stop
- Execute
  watch replicatedctl app status
  until you see the value for State as stopped.
- Execute
  replicatedctl app start
  to start the TFE application back up.
  - If the application still fails to start, generate and upload a new support bundle for HashiCorp Support to review, and inform HashiCorp Support that the failure was at step 12 of this article.
- Verify through the UI that the postgres data was restored successfully.
- Sign back into the postgres server.
- Delete the _old variant of the DB, if desired, using
  DROP DATABASE "CurrentDBName_old";
- Close the psql connection by typing exit.
- Delete the local backup of the DB that you created, tfe_pgdump.backup, if desired.
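As a rough illustration of the database steps above (the local dump, the rename, and the recreation), the following shell sketch only prints the commands and SQL so that they can be reviewed before being run by hand. The function names pg_uri, rename_sql, and print_plan are hypothetical, and the PG_* shell variables are assumed to have been set manually from the output of replicatedctl app-config export --hidden. None of this is part of TFE or replicatedctl.

```shell
#!/bin/sh
# Sketch only: prints the commands for the DB rename-and-recreate steps.
# Nothing is executed against the postgres server.
# Assumption: PG_USERNAME, PG_PASSWORD, PG_NETLOCATION, and PG_DBNAME were set
# by hand from the output of `replicatedctl app-config export --hidden`.

# Build a connection URI for the database name given as $1.
pg_uri() {
  printf 'postgres://%s:%s@%s:5432/%s?sslmode=require' \
    "$PG_USERNAME" "$PG_PASSWORD" "$PG_NETLOCATION" "$1"
}

# Print the SQL for the DB named in $1: end its sessions, rename it with an
# _old suffix, then create an empty DB under the original name.
rename_sql() {
  printf "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '%s';\n" "$1"
  printf 'ALTER DATABASE "%s" RENAME TO "%s_old";\n' "$1" "$1"
  printf 'CREATE DATABASE "%s";\n' "$1"
}

# Print the whole sequence in the order used by the steps above.
print_plan() {
  printf '# 1. Local backup of the current DB:\n'
  printf "pg_dump '%s' > tfe_pgdump.backup\n" "$(pg_uri "$PG_DBNAME")"
  printf '# 2. Connect through a different DB (postgres is assumed to exist):\n'
  printf "psql '%s'\n" "$(pg_uri postgres)"
  printf '# 3. Run inside that psql session:\n'
  rename_sql "$PG_DBNAME"
}

# Example with made-up values:
#   PG_USERNAME=tfe
#   PG_PASSWORD=secret
#   PG_NETLOCATION=db.example.internal
#   PG_DBNAME=tfe_db
#   print_plan
```

The printed plan contains the password in clear text, so it should not be written to a shared log. If the password contains characters such as @ or /, it would need to be percent-encoded before being placed in the connection URI.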
Outcome
After the restore operation, TFE starts successfully with all of the backup data present.