Consul uses a gossip protocol to manage membership and broadcast messages to the cluster. It maintains two gossip pools that serve different purposes: a LAN pool, which contains all agents (clients and servers) in a datacenter, and a WAN pool, which contains only servers and spans datacenters. Both are provided by the embedded Serf library.
Gossip is done over UDP with a configurable but fixed fanout and interval. This keeps each node's network usage constant regardless of the number of nodes in the cluster. In addition, each node periodically performs a complete state exchange with a random node over TCP, though much less often than it sends gossip messages. Because the full state is exchanged and merged, this increases the likelihood that the membership list converges properly. The interval between full state exchanges is configurable, and the exchanges can be disabled entirely.
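To illustrate why a fixed fanout keeps per-node traffic constant, here is a minimal push-gossip simulation. It is a simplified model and not Serf's actual implementation: the function name `simulate_gossip`, the default fanout of 3, and the rule that only already-informed nodes send are all assumptions made for the sketch.

```python
import random


def simulate_gossip(num_nodes, fanout=3, seed=0, max_rounds=1000):
    """Spread one update from node 0 by push gossip (simplified model).

    Returns (rounds, max_sends, informed_count), where max_sends is the
    largest number of messages any single node sent in one round.
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    max_sends = 0
    while len(informed) < num_nodes and rounds < max_rounds:
        rounds += 1
        newly_informed = set()
        for node in sorted(informed):
            # Pick up to `fanout` distinct random peers, never the sender itself.
            k = min(fanout, num_nodes - 1)
            picks = rng.sample(range(num_nodes - 1), k)
            targets = [p + 1 if p >= node else p for p in picks]
            max_sends = max(max_sends, len(targets))
            newly_informed.update(targets)
        informed |= newly_informed
    return rounds, max_sends, len(informed)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        rounds, max_sends, informed = simulate_gossip(n)
        print(f"nodes={n} rounds={rounds} "
              f"max sends per node per round={max_sends} informed={informed}")
```

In this model the number of rounds needed grows slowly with cluster size, while the number of messages any one node sends per round never exceeds the fanout.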
From the point of view of the gossip protocol, a node passes through three states: Alive -> Suspected of Failure -> Dead.
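These states can be pictured as a small state machine. The sketch below is illustrative only: the event names (`probe_failed`, `refuted`, `suspicion_timeout`) are hypothetical labels chosen for this example, not identifiers from Consul or Serf. It also includes the path by which a suspected node that disputes the suspicion returns to Alive.

```python
from enum import Enum


class State(Enum):
    ALIVE = "alive"
    SUSPECT = "suspect"
    DEAD = "dead"


# Hypothetical event names chosen for this sketch.
TRANSITIONS = {
    (State.ALIVE, "probe_failed"): State.SUSPECT,
    (State.SUSPECT, "refuted"): State.ALIVE,
    (State.SUSPECT, "suspicion_timeout"): State.DEAD,
}


def next_state(state, event):
    """Return the new state, or the unchanged state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


if __name__ == "__main__":
    state = State.ALIVE
    for event in ("probe_failed", "suspicion_timeout"):
        state = next_state(state, event)
        print(event, "->", state.value)
```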
Failure detection is done by periodic random probing at a configurable interval. If the probed node fails to ack within a reasonable time (typically some multiple of RTT), an indirect probe is attempted: the probing node asks a configurable number of random nodes to probe the same node, in case network issues are causing only its own probe to fail. If both the direct probe and the indirect probes fail within a reasonable time, the node is marked "suspicious" and this knowledge is gossiped to the cluster. In the Consul logs, this looks something like the following example:
Log of consul1:
[INFO] memberlist: Suspect consul2 has failed, no acks received
Log of consul2:
[WARN] memberlist: Refuting a suspect message (from: consul1)
This means consul2 is still running and is refuting a suspect message that originated from consul1.
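The direct and indirect probe sequence that leads to a suspect message can be sketched as follows. This is a minimal model under stated assumptions, not the memberlist code: `probe` and the `ping(src, dst)` callable are hypothetical, with `ping` standing in for "send a UDP ping and receive an ack before the timeout", and the default of 3 indirect helpers is an assumed value.

```python
import random


def probe(local, target, peers, ping, indirect_checks=3, rng=None):
    """Return "alive" or "suspect" for `target`, as seen from `local`.

    `ping(src, dst)` is a hypothetical callable that returns True when
    `src` gets an ack from `dst` within the timeout.
    """
    rng = rng or random.Random()
    # Step 1: direct probe.
    if ping(local, target):
        return "alive"
    # Step 2: ask a few random peers to probe the target on our behalf.
    candidates = [p for p in peers if p not in (local, target)]
    helpers = rng.sample(candidates, min(indirect_checks, len(candidates)))
    if any(ping(helper, target) for helper in helpers):
        return "alive"
    # Step 3: nobody got an ack, so the target becomes a suspect.
    return "suspect"


if __name__ == "__main__":
    # Only the path from consul1 to consul2 is broken.
    broken_links = {("consul1", "consul2")}

    def ping(src, dst):
        return (src, dst) not in broken_links

    print(probe("consul1", "consul2", ["consul3", "consul4"], ping))  # alive
```

The usage example shows why indirect probes matter: when only the path between consul1 and consul2 is broken, another node can still reach consul2, so it is not marked as a suspect.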
A suspicious node is still considered a member of the cluster. If the suspect member does not dispute the suspicion within a configurable period of time, the node is finally considered dead, and this state is then gossiped to the cluster.
In the large majority of cases, suspect and refute messages like these indicate a UDP routing issue: other nodes are unable to ping consul2, or consul2's responses are being lost (packet loss). It is therefore important to make sure that all the required ports are open (by default, 8301 for LAN gossip and 8302 for WAN gossip, over both TCP and UDP) and that no network firewall is blocking traffic between the Consul nodes. Additionally, you can look into tuning the LAN and WAN gossip parameters in the agent configuration (the gossip_lan and gossip_wan options), depending on your environment and Consul’s use case, but be warned that tuning them improperly may cause Consul to fail in unexpected ways.
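To get a feel for how long a suspect node has to refute before it is declared dead, and how tuning affects that window, the sketch below estimates the suspicion timeout. It is modeled on the scaling used by the underlying memberlist library (a multiplier times a logarithmic function of cluster size times the probe interval); treat the exact formula, the function name `suspicion_timeout`, and the example values of 4 and 1 second as assumptions, and note that newer versions may also stretch this timeout dynamically, which the sketch ignores.

```python
import math


def suspicion_timeout(suspicion_mult, num_nodes, probe_interval_s):
    """Estimate seconds a suspect node has to refute before being marked dead.

    Assumed formula: suspicion_mult * max(1, log10(num_nodes)) * probe_interval.
    """
    node_scale = max(1.0, math.log10(max(1, num_nodes)))
    return suspicion_mult * node_scale * probe_interval_s


if __name__ == "__main__":
    # Assumed values: multiplier of 4 and a 1 second probe interval.
    for n in (10, 100, 1000):
        print(f"nodes={n} timeout={suspicion_timeout(4, n, 1.0):.1f}s")
```

Under these assumptions, raising the multiplier or the probe interval gives a node on a lossy network more time to refute, at the cost of slower detection of real failures.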