www.colben.cn/content/post/linux-io-random.md
2021-11-14 14:32:08 +08:00

189 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: "Linux 硬盘随机读写"
date: 2019-10-29T21:29:26+08:00
lastmod: 2019-10-29T21:29:26+08:00
keywords: []
tags: ["linux", "随机读", "随机写"]
categories: ["storage"]
---
# 前言
- 随机写会导致磁头不停地换道,造成效率的极大降低
- 顺序写磁头几乎不用换道,或者换道的时间很短
# fio 介绍
- fio的输出报告中的几个关键指标:
- slat: 是指从 I/O 提交到实际执行 I/O 的时长Submission latency
- clat: 是指从 I/O 提交到 I/O 完成的时长Completion latency
- lat: 指的是从 fio 创建 I/O 到 I/O 完成的总时长
- bw : 吞吐量
- iops: 每秒 I/O 的次数
# 同步写测试
### 同步随机写
- 使用strace工具查看系统调用
```bash
strace -f -tt -o /tmp/randwrite.log -D fio -name=randwrite -rw=randwrite \
-direct=1 -bs=4k -size=1G -numjobs=1 -group_reporting -filename=/tmp/test.db
```
- 提取关键信息
```
root@wilson-ubuntu:~# strace -f -tt -o /tmp/randwrite.log -D fio -name=randwrite -rw=randwrite \
> -direct=1 -bs=4k -size=1G -numjobs=1 -group_reporting -filename=/tmp/test.db
randwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.2.10
Starting 1 process
...
randwrite: (groupid=0, jobs=1): err= 0: pid=26882: Wed Aug 14 10:39:02 2019
write: io=1024.0MB, bw=52526KB/s, iops=13131, runt= 19963msec
clat (usec): min=42, max=18620, avg=56.15, stdev=164.79
lat (usec): min=42, max=18620, avg=56.39, stdev=164.79
...
bw (KB /s): min=50648, max=55208, per=99.96%, avg=52506.03, stdev=1055.83
...
Run status group 0 (all jobs):
WRITE: io=1024.0MB, aggrb=52525KB/s, minb=52525KB/s, maxb=52525KB/s, mint=19963msec, maxt=19963msec
Disk stats (read/write):
...
sda: ios=0/262177, merge=0/25, ticks=0/7500, in_queue=7476, util=36.05%
```
- 重点关注的信息
- clat: 平均时长56ms左右
- lat: 平均时长56ms左右
- bw: 吞吐量大概在52M左右
- 内核调用信息随机读每一次写入之前都要通过lseek去定位当前的文件偏移量
```
root@wilson-ubuntu:~# more /tmp/randwrite.log
...
26882 10:38:41.919904 lseek(3, 665198592, SEEK_SET) = 665198592
26882 10:38:41.919920 write(3, "\220\240@\6\371\341\277>\0\200\36\31\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.919969 lseek(3, 4313088, SEEK_SET) = 4313088
26882 10:38:41.919985 write(3, "\220\240@\6\371\341\277>\0\200\36\31\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920032 lseek(3, 455880704, SEEK_SET) = 455880704
26882 10:38:41.920048 write(3, "\220\240@\6\371\341\277>\0\200\36\31\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920096 lseek(3, 338862080, SEEK_SET) = 338862080
26882 10:38:41.920112 write(3, "\220\240@\6\371\341\277>\0\2402\24\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920161 lseek(3, 739086336, SEEK_SET) = 739086336
26882 10:38:41.920177 write(3, "\220\240@\6\371\341\277>\0\2402\24\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920229 lseek(3, 848175104, SEEK_SET) = 848175104
26882 10:38:41.920245 write(3, "\220\240@\6\371\341\277>\0\2402\24\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920296 lseek(3, 1060147200, SEEK_SET) = 1060147200
26882 10:38:41.920312 write(3, "\220\240@\6\371\341\277>\0\2402\24\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920362 lseek(3, 863690752, SEEK_SET) = 863690752
26882 10:38:41.920377 write(3, "\220\240@\6\371\341\277>\0\2402\24\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920428 lseek(3, 279457792, SEEK_SET) = 279457792
26882 10:38:41.920444 write(3, "\220\240@\6\371\341\277>\0\2402\24\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920492 lseek(3, 271794176, SEEK_SET) = 271794176
26882 10:38:41.920508 write(3, "\220\240@\6\371\341\277>\0\2402\24\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
26882 10:38:41.920558 lseek(3, 1067864064, SEEK_SET) = 1067864064
26882 10:38:41.920573 write(3, "\220\240@\6\371\341\277>\0\2402\24\0\0\0\0\202\2\7\320\343\6H\26P\340\277\370\330\30e\30"..., 4096) = 4096
...
```
### 同步顺序写
- 测试顺序写
```bash
strace -f -tt -o /tmp/write.log -D fio -name=write -rw=write \
-direct=1 -bs=4k -size=1G -numjobs=1 -group_reporting -filename=/tmp/test.db
```
- 可以看到吞吐量提升至70M左右
```
write: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.2.10
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/70432KB/0KB /s] [0/17.7K/0 iops] [eta 00m:00s]
write: (groupid=0, jobs=1): err= 0: pid=27005: Wed Aug 14 10:53:02 2019
write: io=1024.0MB, bw=70238KB/s, iops=17559, runt= 14929msec
clat (usec): min=43, max=7464, avg=55.95, stdev=56.24
lat (usec): min=43, max=7465, avg=56.15, stdev=56.25
...
bw (KB /s): min=67304, max=72008, per=99.98%, avg=70225.38, stdev=1266.88
...
Run status group 0 (all jobs):
WRITE: io=1024.0MB, aggrb=70237KB/s, minb=70237KB/s, maxb=70237KB/s, mint=14929msec, maxt=14929msec
Disk stats (read/write):
...
sda: ios=0/262162, merge=0/10, ticks=0/6948, in_queue=6932, util=46.49%
```
- 内核调用
```
root@wilson-ubuntu:~# more /tmp/write.log
...
27046 10:54:28.194508 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\360\t\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.194568 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.194627 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.194687 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.194747 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.194807 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.194868 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.194928 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.194988 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195049 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195110 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195197 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195262 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195330 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195426 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195497 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195567 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195637 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195704 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195757 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195807 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195859 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195910 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.195961 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196012 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196062 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0\220\24\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196112 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196162 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196213 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196265 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196314 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196363 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196414 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196472 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196524 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
27046 10:54:28.196573 write(3, "\0\0\23\0\0\0\0\0\0\300\16\0\0\0\0\0\0 \26\0\0\0\0\0\0\320\17\0\0\0\0\0"..., 4096) = 4096
...
```
- 顺序读不需要反复定位文件偏移量,所以能够专注于写操作
# 异步顺序写
- 前面 fio 的测试报告中没有发现slat那是由于上述都是同步操作对同步 I/O 来说,由于 I/O 提交和 I/O 完成是一个动作,所以 slat 实际上就是 I/O 完成的时间
- 异步顺序写测试命令
```bash
# 同步顺序写的命令添加 -ioengine=libaio
fio -name=write -rw=write -ioengine=libaio -direct=1 -bs=4k -size=1G -numjobs=1 -group_reporting -filename=/tmp/test.db
```
- slat指标出现lat 近似等于 slat + clat 之和avg平均值并且换成异步io之后吞吐量得到了极大的提升120M左右
```
write: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.2.10
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/119.3MB/0KB /s] [0/30.6K/0 iops] [eta 00m:00s]
write: (groupid=0, jobs=1): err= 0: pid=27258: Wed Aug 14 11:14:36 2019
write: io=1024.0MB, bw=120443KB/s, iops=30110, runt= 8706msec
slat (usec): min=3, max=70, avg= 4.31, stdev= 1.56
clat (usec): min=0, max=8967, avg=28.13, stdev=55.68
lat (usec): min=22, max=8976, avg=32.53, stdev=55.72
...
bw (KB /s): min=118480, max=122880, per=100.00%, avg=120467.29, stdev=1525.68
...
Run status group 0 (all jobs):
WRITE: io=1024.0MB, aggrb=120442KB/s, minb=120442KB/s, maxb=120442KB/s, mint=8706msec, maxt=8706msec
Disk stats (read/write):
...
sda: ios=0/262147, merge=0/1, ticks=0/6576, in_queue=6568, util=74.32%
```
# 参考
- [https://www.linuxidc.com/Linux/2019-08/160097.htm](https://www.linuxidc.com/Linux/2019-08/160097.htm)