Consul nodes can experience resource exhaustion, including exhaustion of disk I/O, memory, and CPU. The following Linux commands can help you determine whether a Consul node is running out of one of these resources.
Commands
top
$ top
top displays the running processes in an ordered list that refreshes in real time. It shows CPU usage, memory usage, swap, cache size, buffer size, PID, user, command, and more, which makes it easy to spot processes with high CPU or memory utilization.
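For scripted checks, top can also run non-interactively. The following is a minimal sketch, assuming procps-ng top with its default column order (PID first, %CPU and %MEM in columns 9 and 10, COMMAND last) and a process named consul. filter_top is a hypothetical helper, not part of top.

```shell
# filter_top NAME: read `top -b -n 1` output on stdin and print
# "PID %CPU %MEM COMMAND" for every process whose command is NAME.
# Assumes the default procps-ng column layout.
filter_top() {
    awk -v name="$1" '$1 ~ /^[0-9]+$/ && $NF == name { print $1, $9, $10, $NF }'
}

# -b is batch mode (no interactive screen), -n 1 takes a single snapshot.
if command -v top >/dev/null 2>&1; then
    top -b -n 1 2>/dev/null | filter_top consul || true
fi
```

If the columns have been customized in a local toprc, the field numbers in the awk program need adjusting.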
vmstat
$ vmstat
vmstat reports statistics about processes, memory, swap, I/O, and CPU activity. Run without arguments, it prints a single report of averages since the last reboot. Given a delay (for example vmstat 1), each further report covers the interval since the previous one.
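Swap activity and I/O wait are usually the columns of interest when memory is tight. The sketch below assumes the standard vmstat column order (si and so in columns 7 and 8, wa in column 16). swap_activity is a hypothetical helper that ignores the first sample, since that one reflects averages since boot.

```shell
# swap_activity: read vmstat output on stdin, skip the two header lines and
# the first (since-boot) sample, and report samples with swap-in or swap-out.
swap_activity() {
    awk 'NR > 3 && ($7 + $8) > 0 {
        printf "swap-in=%s swap-out=%s iowait=%s%%\n", $7, $8, $16
    }'
}

# Three samples, one second apart; only the last two are evaluated.
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 3 | swap_activity
fi
```

No output means no swapping was observed during the sampled interval.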
mpstat
$ mpstat -P ALL
mpstat writes activity statistics for each available processor to standard output, processor 0 being the first. The global average across all processors is reported as well.
sar
$ sar
sar collects, reports, and saves performance data, so it can show what a system has been doing over time. sar reports can also be generated and emailed to a system administrator. The statistics it reports include I/O transfer rates, paging activity, process-related activity, interrupts, network activity, memory and swap space utilization, CPU utilization, kernel activity, and TTY statistics, among others.
lsof
$ lsof
lsof lists open files and the processes that opened them. Open files include disk files, network sockets, pipes, and devices.
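When a process approaches its open-file limit, a quick count of its file descriptors is often more useful than the full lsof listing. A minimal sketch, assuming a Linux /proc filesystem. fd_count is a hypothetical helper, and the commented pidof lookup assumes the process is named consul.

```shell
# fd_count PID: number of file descriptors the process currently holds,
# counted from /proc/PID/fd (Linux only). Prints 0 if the directory is
# missing or unreadable.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Count the descriptors of the current shell as a demonstration.
echo "this shell ($$) has $(fd_count $$) open file descriptors"

# For a Consul agent (assumed process name), something like:
#   fd_count "$(pidof consul)"
#   lsof -p "$(pidof consul)"      # full listing for the same process
```

Reading another user's /proc/PID/fd directory generally requires root.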
tcpdump
$ tcpdump -i eth0
tcpdump is one of the most widely used command-line packet analyzers. It captures or filters the TCP/IP packets received or sent on a specific network interface, and it can save captured packets to a file for later analysis.
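On a busy host it helps to restrict the capture to the ports of interest. The sketch below builds a tcpdump filter expression from a list of ports. The ports shown are Consul's defaults for server RPC (8300), Serf LAN (8301), and Serf WAN (8302), which is an assumption about the deployment. port_filter is a hypothetical helper and the capture path is illustrative.

```shell
# port_filter PORT...: build a tcpdump filter expression matching any of the ports.
port_filter() {
    expr_out=""
    for p in "$@"; do
        if [ -z "$expr_out" ]; then
            expr_out="port $p"
        else
            expr_out="$expr_out or port $p"
        fi
    done
    printf '%s\n' "$expr_out"
}

port_filter 8300 8301 8302   # prints: port 8300 or port 8301 or port 8302

# Capture 1000 packets on those ports to a file (needs root):
#   tcpdump -i eth0 -c 1000 -w /tmp/consul.pcap "$(port_filter 8300 8301 8302)"
# Read the capture back later:
#   tcpdump -r /tmp/consul.pcap
```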
netstat
$ netstat -a | more
netstat is a command-line tool for monitoring incoming and outgoing network packet statistics as well as interface statistics. It is useful for monitoring network performance and troubleshooting network-related problems.
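A summary of TCP connections by state is often easier to read than the full listing, for example to spot a pile-up of TIME_WAIT or CLOSE_WAIT sockets. A minimal sketch, assuming the net-tools netstat output format in which the state is the sixth column. count_states is a hypothetical helper.

```shell
# count_states: read `netstat -ant` output on stdin and print the number of
# TCP sockets per state, most frequent first.
count_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# -a all sockets, -n numeric addresses, -t TCP only.
if command -v netstat >/dev/null 2>&1; then
    netstat -ant 2>/dev/null | count_states
fi
```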
htop
$ htop
htop is a more advanced, interactive, real-time process monitoring tool. It is similar to top but adds features such as a user-friendly interface for managing processes, shortcut keys, and vertical and horizontal views of the process list.
iotop
$ iotop
iotop is similar to top and htop, but it uses I/O accounting to monitor and display real-time disk I/O per process.
iostat
$ iostat
iostat reports CPU utilization together with device I/O statistics. To list an individual report, use -c for CPU utilization or -d for device utilization; -h makes the output human-readable. The CPU report breaks utilization down into user processes, system processes, I/O wait, and idle time.
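Where iostat is not installed, a comparable since-boot CPU breakdown can be derived from /proc/stat. This is a rough sketch rather than a replacement for iostat: cpu_breakdown is a hypothetical helper, and it assumes the usual Linux field order on the aggregate cpu line (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# cpu_breakdown: read /proc/stat-style input on stdin and print the share of
# CPU time spent in user, system, iowait and idle since boot.
cpu_breakdown() {
    awk '$1 == "cpu" {
        total = 0
        for (i = 2; i <= 9; i++) total += $i   # user .. steal
        if (total > 0)
            printf "user=%.1f%% system=%.1f%% iowait=%.1f%% idle=%.1f%%\n",
                   $2 * 100 / total, $4 * 100 / total, $6 * 100 / total, $5 * 100 / total
    }'
}

if [ -r /proc/stat ]; then
    cpu_breakdown < /proc/stat
fi
```

A persistently high iowait share points at disk or network storage as the bottleneck rather than CPU.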
df
$ df -h
df returns the mounted filesystems with their sizes and usage. The -h option prints sizes in human-readable units.
Example:
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            481M     0  481M   0% /dev
tmpfs            99M  776K   98M   1% /run
/dev/xvda1      7.7G  2.6G  5.2G  33% /
tmpfs           492M     0  492M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           492M     0  492M   0% /sys/fs/cgroup
/dev/loop0       18M   18M     0 100% /snap/amazon-ssm-agent/1480
/dev/loop2       90M   90M     0 100% /snap/core/8039
/dev/loop3       90M   90M     0 100% /snap/core/8268
tmpfs            99M     0   99M   0% /run/user/1000
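In output like the example above, the /dev/loop devices are read-only snap images and always show 100%, so they are noise when looking for a full disk. A minimal sketch that lists only real filesystems above a usage threshold. full_filesystems is a hypothetical helper, the 80% default is an arbitrary choice, and mount points containing spaces are not handled.

```shell
# full_filesystems [LIMIT]: read `df -P` output on stdin and print
# "mountpoint use%" for filesystems at or above LIMIT percent (default 80),
# ignoring /dev/loop* devices.
full_filesystems() {
    awk -v limit="${1:-80}" 'NR > 1 && $1 !~ /^\/dev\/loop/ {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit) print $6, $5
    }'
}

# -P forces the POSIX single-line-per-filesystem format, which is easier to parse.
df -P | full_filesystems 80
```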
ulimit and sysctl
The ulimit and sysctl programs let you limit resource use: ulimit for a shell and the processes it starts, sysctl for kernel-wide parameters. This can help a lot in system administration, for example when a user starts too many processes and makes the system unresponsive for other users.
Code Listing 1: ulimit example
# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
pending signals                 (-i) 8191
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 8191
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
ulimit provides control over the resources available to the shell and to processes started by it, on systems that allow such control. The soft limit is the value that the kernel enforces for the corresponding resource. The hard limit acts as a ceiling for the soft limit. An unprivileged process may set its soft limit only to a value in the range from 0 up to the hard limit, and may (irreversibly) lower its hard limit. A privileged process can make arbitrary changes to either limit value.
Setting a process limit
# ulimit -u 30
# ulimit -a
…
max user processes (-u) 30
…
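The soft and hard values can be inspected separately with the -S and -H options. The sketch below uses the open-files limit (-n) and lowers the soft limit only inside a command substitution, which runs in a subshell, so the calling shell keeps its own limits. The value 64 is an arbitrary illustration.

```shell
# Show the soft and hard open-file limits of the current shell.
echo "soft nofile: $(ulimit -Sn)"
echo "hard nofile: $(ulimit -Hn)"

# Lower the soft limit in a subshell only, then read it back there.
lowered=$(ulimit -Sn 64 && ulimit -Sn)
echo "subshell soft nofile: $lowered"

# The parent shell is unaffected by the change made in the subshell.
echo "soft nofile after subshell: $(ulimit -Sn)"
```

Because only the soft limit is touched, the change could be raised again up to the hard limit without privileges.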
sysctl is a similar tool: it lets you configure kernel parameters at runtime. To keep settings persistent across reboots, edit /etc/sysctl.conf. Be aware that wrong settings may break things in unforeseen ways.
Exploring sysctl variables
# sysctl -a
…
vm.swappiness = 60
…
Reducing swappiness
# sysctl vm.swappiness=0
vm.swappiness = 0
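Each sysctl key corresponds to a file under /proc/sys, with the dots replaced by slashes, so values can be read even where the sysctl binary is unavailable. A minimal sketch; sysctl_path is a hypothetical helper, and the value 10 in the commented persistence step is only an illustration.

```shell
# sysctl_path KEY: map a sysctl key such as vm.swappiness to its /proc/sys file.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

key=vm.swappiness
path=$(sysctl_path "$key")
if [ -r "$path" ]; then
    echo "$key = $(cat "$path")"
fi

# Persisting a value across reboots (needs root):
#   echo 'vm.swappiness = 10' >> /etc/sysctl.conf
#   sysctl -p          # reload /etc/sysctl.conf without rebooting
```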
prlimit
prlimit(1) is a utility available since util-linux 2.21. It is a flexible command-line interface to the prlimit(2) Linux syscall (supported since Linux 2.6.36).
prlimit(1) gets or sets one or more resource limits of the process with a given PID. When a command is given instead of a PID, prlimit(1) runs that command with the given limits.
The output is flexible like output from lsblk(8) or findmnt(8). You can define the output columns, use parsable output etc.
See the default output:
$ prlimit --pid $$
RESOURCE   DESCRIPTION                             SOFT      HARD UNITS
AS         address space limit                unlimited unlimited bytes
CORE       max core file size                         0 unlimited blocks
CPU        CPU time                           unlimited unlimited seconds
DATA       max data size                      unlimited unlimited bytes
FSIZE      max file size                      unlimited unlimited blocks
LOCKS      max number of file locks held      unlimited unlimited
MEMLOCK    max locked-in-memory address space     65536     65536 bytes
MSGQUEUE   max bytes in POSIX mqueues            819200    819200 bytes
NICE       max nice prio allowed to raise             0         0
NOFILE     max number of open files                1024      4096
NPROC      max number of processes                 1024     62809
RSS        max resident set size              unlimited unlimited pages
RTPRIO     max real-time priority                     0         0
RTTIME     timeout for real-time tasks        unlimited unlimited microsecs
SIGPENDING max number of pending signals          62809     62809
STACK      max stack size                       8388608 unlimited bytes
or redefine the output and ask for max number of open files only:
$ prlimit --nofile --output RESOURCE,SOFT,HARD --pid $$
RESOURCE SOFT HARD
NOFILE   1024 4096
and now let's modify the soft limit of maximal core file size and maximal number of open files:
$ prlimit --core=1000000: --nofile=100: --pid $$
the notation used for the limits is:
soft:hard    specify both limits
soft:        specify only the soft limit
:hard        specify only the hard limit
value        specify both soft and hard limits to the same value
and check the result:
$ prlimit --nofile --core --pid $$
RESOURCE DESCRIPTION                 SOFT      HARD UNITS
NOFILE   max number of open files     100      1024
CORE     max core file size       1000000 unlimited blocks
and revert the core file soft limit:
$ prlimit --core=unlimited: --pid $$
$ prlimit --core --pid $$
RESOURCE DESCRIPTION             SOFT      HARD UNITS
CORE     max core file size unlimited unlimited blocks
prlimit(1) can also restrict CPU time for a given command, such as sort(1), by passing the command in place of --pid.
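The CPU-time case can be sketched as follows. The 10-second cap and the sample input are illustrative assumptions. The limit applies only to the sort process that prlimit starts, not to the calling shell.

```shell
# Run sort with a CPU-time limit of 10 seconds (soft and hard).
# If sort exceeded the limit, the kernel would terminate it.
if command -v prlimit >/dev/null 2>&1; then
    sorted=$(printf '%s\n' banana apple cherry apple | prlimit --cpu=10 sort -u)
    printf '%s\n' "$sorted"
fi
```

With a real workload the command would typically read a large file, for example `prlimit --cpu=10 sort -u FILE`, where FILE is a placeholder.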
prlimit(2) is a follow-up to the writable /proc/<pid>/limits proposal, which was not integrated into the mainline kernel. prlimit(2), by contrast, is part of the mainline kernel.
You can set a system-wide file descriptor limit with sysctl -w fs.file-max=N and make the change persist across reboots by adding it to /etc/sysctl.conf.
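Before raising fs.file-max it is worth checking how close the system is to the current value. The kernel exposes this in /proc/sys/fs/file-nr as three numbers: allocated file handles, unused allocated handles, and the maximum. A minimal sketch; file_handle_usage is a hypothetical helper that reports the allocated share as a whole percentage.

```shell
# file_handle_usage: read a file-nr style line ("allocated unused max") on
# stdin and print allocated handles as an integer percentage of the maximum.
file_handle_usage() {
    awk '$3 > 0 { printf "%d\n", ($1 * 100) / $3 }'
}

if [ -r /proc/sys/fs/file-nr ]; then
    echo "file handles in use: $(file_handle_usage < /proc/sys/fs/file-nr)% of fs.file-max"
fi
```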