One of the factors that affects application performance is high I/O wait. High I/O wait can lead to an unexpected increase in load average. This happens when processes enter the 'D' state, i.e. uninterruptible sleep, where they are waiting for disk I/O to complete. The following commands can be used to monitor and detect an I/O bottleneck in the storage subsystem.
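
As a quick first check, the below one-liner (a minimal sketch) displays the load average next to a count of processes in each state. A persistently non-zero 'D' count combined with a rising load average points at I/O contention rather than CPU contention.

#watch [ -n 2 to refresh every 2 seconds ]
watch -n 2 "cat /proc/loadavg; ps -eo state= | sort | uniq -c"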

top

top

In the output of top, wa (I/O wait) should be close to 0.0% almost all of the time. Numbers consistently above 1% indicate that the storage device is too slow to keep up with requests.

top - 02:43:05 up 44 days,  6:15,  2 users,  load average: 0.00, 0.01, 0.05
Tasks: 149 total,   2 running, 147 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.0 us,  0.0 sy,  0.0 ni, 94.9 id,  3.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7994312 total,  2327256 free,   234672 used,  5432384 buff/cache
KiB Swap:  3145724 total,  3145452 free,      272 used.  7349164 avail Mem

In the above output, the I/O wait average is 3.1%, which is high, but this is the average I/O wait across all CPUs on the system. If you press 1 while top is running to expand the individual CPUs, the below output is observed.
[run ‘top’ > press ‘1’]

top - 02:41:25 up 44 days,  6:14,  2 users,  load average: 0.00, 0.01, 0.05
Tasks: 149 total,   1 running, 148 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.3 us,  0.3 sy,  0.0 ni, 96.4 id,  3.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.3 us,  0.0 sy,  0.0 ni, 99.6 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7994312 total,  2327320 free,   234608 used,  5432384 buff/cache
KiB Swap:  3145724 total,  3145452 free,      272 used.  7349228 avail Mem

The above output shows that I/O wait is high on one of the CPUs, whereas the second CPU is mostly idle. This means I/O wait is elevated, but not enough to starve the system of CPU resources. Had all CPUs shown high I/O wait (wa), the situation would be worse, because every CPU would be waiting for I/O requests to complete, starving other processes of CPU time.
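
If you need the same per-CPU breakdown non-interactively, for scripts or logging, the mpstat command from the sysstat package reports %iowait per CPU. The below sketch samples all CPUs three times at one-second intervals.

#mpstat [ -P ALL for every CPU, 1 3 for three samples at 1-second intervals ]
mpstat -P ALL 1 3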

Now that we know how to check I/O metrics, we can find out which application or process is reading/writing the most by running the iotop command.

iotop

#iotop [ -o to list only processes that are actually doing disk I/O ]
iotop -o

Total DISK READ: 100.02 M/s | Total DISK WRITE: 106.52 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
59250 be/4 appUUP      0.00 B/s    0.00 B/s  0.00 % 23.37 % myappUUP (LOCAL=NO)
58837 be/4 appUUP    580.84 K/s    0.00 B/s  0.00 %  5.29 % myappUUP (LOCAL=NO)
48839 be/4 appUUP      0.00 B/s    0.00 B/s  0.00 %  4.40 % app_lgwr_UUP
48823 be/4 appUUP      0.00 B/s    0.00 B/s  0.00 %  0.06 % app_dbw4_UUP
86058 be/4 myapp      3.09 K/s    9.27 K/s  0.00 %  0.03 % app_lgwr_UUrddb8
66442 be/4 mylinuxmon     0.00 B/s    6.18 K/s  0.00 %  0.00 % java -Xrs -DMy.name=MyMainAgent -DMy.logback.configur~rt 80 -dir /opt/mylinuxmon -ssl false -highSecurity false

From the above output we can see that the command myappUUP, owned by the user appUUP, is doing the highest percentage of I/O to the disks.
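
iotop can also run non-interactively, which is handy for catching I/O spikes over time. The below sketch (the log file path is just an example) records ten timestamped samples of the active I/O producers.

#iotop [ -b batch mode, -o active processes only, -t timestamps, -qqq suppress headers, -n 10 for ten iterations ]
iotop -botqqq -n 10 >> /tmp/iotop.log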

The below ps command will help us filter out all the 'D' state processes, i.e. processes in uninterruptible sleep waiting for disk I/O to complete.

ps aux | awk '$8 ~ /^D/{print}'
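
To also see which kernel function each blocked process is sleeping in, a variation of the same idea (a sketch) adds the wchan column:

#ps [ wchan:32 shows the kernel function the process is sleeping in ]
ps -eo pid,state,wchan:32,cmd | awk '$2 ~ /^D/'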

Now that we can find out which process is doing the highest I/O, we can also look at disk-specific statistics with the iostat command.

iostat

#iostat [ -x for extended statistics, -d to display device statistics only, -m for displaying r/w in MB/s ]
iostat -xdm

Linux 3.10.0-514.el7.x86_64 (unixutils)        05/28/2018      _x86_64_        (2 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.10    0.01    0.71     0.00     0.01    27.95     0.00    3.86    2.63    3.87   1.53   0.11
sdd               0.00     0.00    0.00    0.00     0.00     0.00    11.32     0.00    2.86    0.74    5.70   2.13   0.00
sdc               0.00     0.00    0.00    0.02     0.00     0.01   877.57     0.00  202.09    6.03  245.71   2.21   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00    10.20     0.00    0.32    0.32    0.00   0.28   0.00
sde               0.00     0.00    0.00    0.00     0.00     0.00    10.47     0.00    0.33    0.33    0.00   0.31   0.00
dm-0              0.00     0.00    0.01    0.81     0.00     0.01    24.57     0.00    5.26    2.78    5.28   1.36   0.11
dm-1              0.00     0.00    0.00    0.00     0.00     0.00    14.09     0.00    3.23    0.43   10.94   0.75   0.00
dm-2              0.00     0.00    0.00    0.02     0.00     0.01   795.23     0.00  187.26    6.07  222.83   2.00   0.00

#iostat [ -p to report statistics for a specific device and its partitions ]
iostat -xdm -p sda

Linux 3.10.0-327.el7.x86_64 (unixutils)     05/28/2018      _x86_64_        (2 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     1.31    0.05    1.30     0.00     0.02    25.81     0.00    0.38    0.61    0.37   0.23   0.03
sda1              0.00     0.00    0.00    0.00     0.00     0.00   140.60     0.00    0.65    0.67    0.00   0.24   0.00
sda2              0.00     1.31    0.05    1.30     0.00     0.02    25.73     0.00    0.38    0.61    0.37   0.23   0.03

NOTE: In the above output, the last column, %util, tells us the bandwidth utilization of the device. Device saturation occurs when this value is close to 100%, which means that the device is extremely busy and programs may have to wait before being able to read data from or write data to the disk. Keep in mind that for devices that serve requests in parallel, such as RAID arrays and SSDs, a %util close to 100% does not necessarily mean the device has reached its limit.
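
The numbers above are averages since boot. For live monitoring, you can pass an interval and a count; the first report still shows the since-boot averages, and each subsequent report covers the preceding interval.

#iostat [ 5 6 to print six reports at 5-second intervals ]
iostat -xdm 5 6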

sar
Historical data of I/O waits can be found with the sar command.

#sar [ -P for per-CPU statistics, -f for the location of the log file under /var/log/sa/ ]
sar -P ALL -f /var/log/sa/sa21

In the above command, -P ALL makes sar report usage for every CPU, and -f /var/log/sa/sa21 reads the activity log for the 21st day of the month. The output displays %iowait along with other CPU metrics at 10-minute intervals (the default sysstat collection interval) for the entire day. This information can be used to review historical data and generate reports for analysis and capacity planning.
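
To narrow the report to a specific time window, sar also accepts start and end times. The below sketch (the times are placeholders) pulls the same per-CPU data for the morning of the 21st only.

#sar [ -s start time, -e end time, in HH:MM:SS ]
sar -P ALL -f /var/log/sa/sa21 -s 09:00:00 -e 12:00:00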

Storage bottlenecks can be mitigated by following the below listed best practices.

  • Allocate storage based not only on the availability of free space but also on the project's performance needs. Make sure you have enough drives for the throughput or IOPS you need (a quick way to measure a device's baseline is sketched after this list).
  • Distribute the application workload evenly across disks to reduce the chance of hotspots.
  • Choose the correct RAID level based on the application's I/O profile.
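
To verify whether a device can actually deliver the IOPS a project needs, a synthetic benchmark helps. The below sketch assumes the fio package is installed and that /mnt/data/fio-test is a placeholder path on the filesystem under test; it measures 4 KiB random-read IOPS, a common worst-case profile.

#fio [ --direct=1 bypasses the page cache, --rw=randread for 4k random reads ]
fio --name=randread-test --filename=/mnt/data/fio-test \
    --rw=randread --bs=4k --size=1G --ioengine=libaio \
    --iodepth=32 --direct=1 --runtime=60 --time_based --group_reporting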