Identification
- The tfe-atlas container repeatedly prints the message "waiting on database to reach 20230628000004" followed by the date.
- The tfe-migrations container logs the following error:

    2023/07/28 22:20:10 execing command; /usr/local/bundle/bin/bundle [exec ruby /app/atlas-migration-wrapper.rb]
    rake aborted!
    StandardError: An error has occurred, all later migrations canceled:
    === Dangerous operation detected #strong_migrations ===
    Adding an index concurrently can cause silent data corruption in Postgres 14.0 to 14.3.
    Upgrade Postgres before adding new indexes, or wrap this step in a safety_assured { ... } block
    to accept the risk.
    /app/db/migrate/20230628000001_add_index_for_fks.rb:9:in `change'
    /app/config/initializers/active_record_migrations.rb:15:in `exec_migration'
    /usr/local/bundle/gems/bundler-2.3.25/lib/bundler/cli/exec.rb:58:in `load'
    [...stack trace cut for brevity...]

Context
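As the error message indicates, PostgreSQL versions 14.0 through 14.3 contain a bug that can cause silent data corruption in indexes built with the CONCURRENTLY option (CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY). The 20230628000001_add_index_for_fks.rb migration adds indexes concurrently, so the strong_migrations library flags the operation as dangerous and aborts the migration before any corruption can occur. Because the migrations do not complete, the tfe-atlas container appears to keep waiting for the database to reach the expected migration version.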
PostgreSQL announcement of the 14.4 release and this bug: https://www.postgresql.org/message-id/165473835807.573551.1512237163040609764%40wrigleys.postgresql.org
PostgreSQL 14.4 release notes: https://www.postgresql.org/about/news/postgresql-144-released-2470/
Resolution
Upgrade PostgreSQL to version 14.4 or a later 14.x release, or to major version 15. Once the server is outside the affected 14.0 to 14.3 range, the migration should no longer be blocked.
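
To check whether a given server falls in the affected range, one option is to run SHOW server_version; against the database and compare the result with the 14.0 to 14.3 range named in the error message. The following is a minimal sketch of that comparison, not part of Terraform Enterprise; the function names parse_server_version and is_affected are hypothetical, and it assumes the version string begins with a "major.minor" pair, such as "14.2" or "14.2 (Debian 14.2-1.pgdg110+1)".

```python
import re

# Affected range taken from the error message: PostgreSQL 14.0 to 14.3.
AFFECTED_MAJOR = 14
FIRST_FIXED_MINOR = 4


def parse_server_version(text):
    """Return (major, minor) from a string that starts with 'major.minor'.

    Hypothetical helper: assumes input like the output of SHOW server_version.
    Pre-release strings such as '15beta1' are not handled and raise ValueError.
    """
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(text):
    """Return True if the version lies in the 14.0 to 14.3 range."""
    major, minor = parse_server_version(text)
    return major == AFFECTED_MAJOR and minor < FIRST_FIXED_MINOR


if __name__ == "__main__":
    for version in ("14.2", "14.4", "15.3"):
        status = "affected" if is_affected(version) else "not affected"
        print(f"{version}: {status}")
```

If is_affected returns True for the server's version string, the upgrade described above applies.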