Introduction

Scenarios

Reboot the Nomad server cluster hosts first, and move on to the Nomad client nodes only after all servers are back up.
Stagger the server reboots, one server at a time, so the cluster keeps quorum. Use the Nomad Upgrade Process as a template.
Take a Nomad snapshot before any activity that may cause an outage.
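
As a rough sketch of the pre-reboot checks for a server, the function below saves a snapshot and then lists the servers and Raft peers so you can confirm a leader exists before rebooting the next host. The function name, the SNAPSHOT_DIR variable, and the file naming scheme are assumptions made for illustration; adjust them to your environment.

```shell
# Hypothetical helper: snapshot the cluster and show its state before a server reboot.
# SNAPSHOT_DIR (default /tmp) and the file name pattern are illustrative assumptions.
pre_server_reboot_check() {
    snapshot_dir="${SNAPSHOT_DIR:-/tmp}"
    snapshot_file="${snapshot_dir}/nomad-$(date +%Y%m%d-%H%M%S).snap"

    # Save a point-in-time snapshot of the server state.
    nomad operator snapshot save "$snapshot_file" || return 1

    # List servers and Raft peers to confirm that all servers are alive
    # and a leader exists before rebooting the next one.
    nomad server members || return 1
    nomad operator raft list-peers || return 1

    echo "snapshot saved to ${snapshot_file}"
}
```

Run it once before the first server reboot, and rerun the two listing commands after each server comes back before moving on to the next.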

You can reboot one client node at a time or several at once. The steps for a single node come first, followed by the steps for multiple nodes.
Before rebooting a Nomad client node, drain it so that all of its allocations migrate to other nodes and Nomad job tasks are not disrupted.
Every Nomad job has a Migrate Stanza that controls how its allocations move during a drain. Its parameters default to:
- max_parallel (int: 1)
- health_check (string: "checks")
- min_healthy_time (string: "10s")
- healthy_deadline (string: "5m")
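
To make these defaults explicit in a job file, a small shell helper can print the corresponding migrate block. This is only a sketch: the function name is hypothetical, and the values are simply the defaults listed above.

```shell
# Hypothetical helper: print a migrate block that spells out the default values.
# The output can be pasted into the job or group block of a job file.
print_default_migrate_block() {
    cat <<'EOF'
migrate {
  max_parallel     = 1
  health_check     = "checks"
  min_healthy_time = "10s"
  healthy_deadline = "5m"
}
EOF
}
```

With max_parallel at 1, a drain moves one allocation of a task group at a time, so raising it speeds up drains at the cost of more simultaneous task restarts.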

Rebooting a single client node:
1. Drain the client node (nomad node drain -enable -yes <Node ID>).
2. Check the client node status (nomad node status <Node ID>).
3. Stop Nomad gracefully (sudo systemctl stop nomad).
4. Reboot the client node's host OS.
5. Check the client node status.
6. Restore client node eligibility (nomad node drain -disable -yes <Node ID>).
   - If you get "Error toggling drain mode: Unexpected response code: 500 (no servers)", run the command from a Nomad Server host.
7. Check the client node status.
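
The single-node steps above could be wrapped in two shell functions, one for each side of the reboot. This is a minimal sketch, not a tested runbook: the function names are hypothetical, and it assumes the nomad CLI can reach the cluster and that the agent runs as the systemd unit "nomad".

```shell
# Hypothetical helpers for the single-node procedure.
# Run prepare_node_for_reboot before the reboot and
# restore_node_after_reboot once the host is back up.

prepare_node_for_reboot() {
    node_id="$1"
    [ -n "$node_id" ] || { echo "usage: prepare_node_for_reboot <Node ID>" >&2; return 2; }

    # By default the drain command waits until the drain completes.
    nomad node drain -enable -yes "$node_id" || return 1
    nomad node status "$node_id" || return 1

    # Stop the agent only after the drain has finished.
    sudo systemctl stop nomad || return 1
    echo "node ${node_id} is drained and stopped; reboot the host now"
}

restore_node_after_reboot() {
    node_id="$1"
    [ -n "$node_id" ] || { echo "usage: restore_node_after_reboot <Node ID>" >&2; return 2; }

    nomad node status "$node_id" || return 1
    # Disabling the drain also makes the node eligible for scheduling again.
    nomad node drain -disable -yes "$node_id" || return 1
    nomad node status "$node_id"
}
```

Splitting the work this way keeps the reboot itself a manual step, so you can confirm the drain finished before the host goes down.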

Rebooting multiple client nodes: see the "Drain multiple nodes" section of Workload Migration for details. It includes shell scripting that can help automate operating on multiple nodes at once.
1. Check the node status (nomad node status -allocs).
2. Set the client nodes ineligible (nomad node eligibility -disable <Node ID>).
3. Check the node status.
4. Drain the client nodes (nomad node drain -enable -yes <Node ID>).
5. Check the node status.
6. Stop Nomad gracefully on each drained node (sudo systemctl stop nomad).
7. Reboot the drained client nodes' host OS.
8. Check the node status.
9. Restore the nodes' eligibility (nomad node drain -disable -yes <Node ID>).
   - If you get "Error toggling drain mode: Unexpected response code: 500 (no servers)", run the command from a Nomad Server host.
10. Check the node status.
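
For several nodes, the per-node commands above can be looped over a list of node IDs. The functions below are a hedged sketch with hypothetical names; they take the node IDs as arguments and stop at the first failing command. Marking every node ineligible before draining any of them keeps allocations from migrating onto a node that is about to be drained.

```shell
# Hypothetical helpers: run the same nomad command for every node ID given as an argument.

disable_eligibility_all() {
    for node_id in "$@"; do
        nomad node eligibility -disable "$node_id" || return 1
    done
}

drain_all() {
    for node_id in "$@"; do
        # Nodes are drained one after another; each call waits for its drain to finish.
        nomad node drain -enable -yes "$node_id" || return 1
    done
}

restore_all() {
    for node_id in "$@"; do
        # Disabling the drain also restores scheduling eligibility.
        nomad node drain -disable -yes "$node_id" || return 1
    done
}
```

A typical sequence would be disable_eligibility_all and drain_all with the same node IDs before the reboots, and restore_all with those IDs afterwards, with nomad node status -allocs checks in between.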