Replication to secondary cluster disconnects when primary cluster elects a new leader

Introduction

A leader election in a primary cluster can break replication to its secondary clusters.

Scenario

During a leadership change on a primary cluster, two RPC clients on a secondary cluster (WAL streaming and another activity, such as heartbeating) can race to authenticate a new token. The non-WAL-streaming RPC authenticates first, but a bug in the WAL stream error handling can then wipe the new auth token, while the primary cluster still believes that the connection has a token. The end result is a secondary cluster that cannot maintain any replication activity until either:

1) Replication is restarted on either the primary or secondary cluster

2) A leadership change happens on either cluster
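
To see whether a secondary is affected, it can help to look at what the cluster reports about its own replication state. The sketch below is illustrative only: it assumes the JSON returned by GET /v1/sys/replication/status has already been fetched and parsed, and that each replication type ("dr", "performance") may carry "mode", "state" and "connection_state" fields. Which fields are present, and which values indicate this specific failure, depends on the Vault version, so missing fields are simply reported as None. The function name summarize_replication is hypothetical.

```python
def summarize_replication(status):
    """Reduce a parsed sys/replication/status response to a few fields.

    Field names are assumptions based on the documented response shape;
    any field the server does not return is reported as None.
    """
    # Vault wraps the payload in "data"; fall back to the top level if absent.
    data = status.get("data", status)
    summary = {}
    for kind in ("dr", "performance"):
        section = data.get(kind) or {}
        summary[kind] = {
            "mode": section.get("mode"),
            "state": section.get("state"),
            "connection_state": section.get("connection_state"),
        }
    return summary


if __name__ == "__main__":
    # Hand-written sample, not captured from a real cluster.
    sample = {
        "data": {
            "dr": {"mode": "disabled"},
            "performance": {"mode": "secondary", "state": "stream-wals"},
        }
    }
    for kind, fields in summarize_replication(sample).items():
        print(kind, fields)
```

Comparing this summary before and after a leadership change on the primary is one way to notice that a secondary has stopped replicating.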

Recommendation

There are 3 workarounds for this issue:

1) First is to perform a leadership election in either cluster.

2) Second is to send a POST request to the sys/replication/recover endpoint if an election does not fix replication between clusters.

3) The third and most involved workaround is to follow the update-primary procedure to re-establish replication between clusters.
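
The three workarounds above map onto Vault HTTP API calls, which can be sketched roughly as follows. This is a minimal illustration, not an official procedure. It assumes that a leadership election is forced with sys/step-down on the active node, that the recovery call uses the path documented in the Vault API as sys/replication/recover, and that performance replication is in use (DR replication uses a different path and parameters, not shown). The address, tokens, and helper names (vault_request, step_down, recover, update_primary, send) are hypothetical. The builder functions only construct requests; nothing is sent until send is called.

```python
import json
import urllib.request


def vault_request(addr, token, method, path, payload=None):
    """Build (but do not send) an authenticated Vault API request."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        url=f"{addr.rstrip('/')}/v1/{path.lstrip('/')}",
        data=data,
        method=method,
    )
    req.add_header("X-Vault-Token", token)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def step_down(addr, token):
    # Workaround 1: ask the active node to step down, which triggers an election.
    return vault_request(addr, token, "PUT", "sys/step-down")


def recover(addr, token):
    # Workaround 2: ask Vault to attempt replication recovery.
    return vault_request(addr, token, "POST", "sys/replication/recover")


def update_primary(addr, token, activation_token):
    # Workaround 3 (performance replication only): point the secondary at the
    # primary again, using a secondary activation token generated on the primary.
    return vault_request(
        addr,
        token,
        "POST",
        "sys/replication/performance/secondary/update-primary",
        {"token": activation_token},
    )


def send(req, timeout=10):
    """Send a prepared request and return the parsed JSON body, if any."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = resp.read()
        return json.loads(body) if body else None
```

A typical sequence would be send(step_down(addr, token)) against the active node, followed by send(recover(addr, token)) only if replication does not resume, and update_primary as the last resort. Keeping request construction separate from sending makes it possible to review exactly what will be issued before touching a production cluster.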

Additional Information
