…if you use an inappropriate block size (bs) option. See the man page of dd for details on this option.
Hard disks typically have a block size of 512 bytes. LVM, on the other hand, creates its block devices with a block size of 4096 bytes. So it’s easy to get confused: even if you know that disks should be tested with 512-byte blocks, you shouldn’t test LVM block devices with a 512-byte block size, but with a 4096-byte one.
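You don’t have to guess these values. Here is a minimal sketch of how to query them from the kernel; /dev/sdb is just an example device, and the “blockdev” utility from util-linux is assumed to be installed:

# Logical and physical sector sizes of the disk, in bytes
blockdev --getss --getpbsz /dev/sdb

# The same information is also exposed in sysfs
cat /sys/block/sdb/queue/logical_block_size
cat /sys/block/sdb/queue/physical_block_size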
What happens if you do a write performance test by writing directly to the raw block device and you use the wrong block size (bs) option?
If you look at the “iostat” statistics, they will show lots of read requests even though you are only writing, which is not what you would expect from a write-only test.
The problem comes from the fact that when you don’t use the proper block size for the raw block device, instead of writing whole blocks you are writing partial blocks. That is physically not possible: the block device can only write one whole block at a time. To update only a part of a block, the block first has to be read, modified in memory with the new partial data, and finally written back as a whole block.
The total performance drop was about 3x on the systems I tested: several hard disks and an Areca RAID-6 volume.
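If you want to reproduce such a comparison, a sketch like the following is enough; /dev/sdb1 is only an example (and will be overwritten), and it assumes GNU dd, which prints the achieved transfer rate when each run finishes:

# Partial-block writes: every 256-byte write forces a read-modify-write
dd if=/dev/zero of=/dev/sdb1 bs=256 count=5000000

# Whole-block writes: the same number of writes, now one full block at a time
dd if=/dev/zero of=/dev/sdb1 bs=512 count=5000000

# Compare the MB/s figures that dd prints at the end of each run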
So what’s the lesson here?
When you do sequential write performance tests with “dd” directly on the raw block device, make sure that you use the proper block size option, and verify that during the test you see only write requests in the “iostat” statistics.
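A simple way to verify this is to keep “iostat” running in a second terminal for the whole duration of the test; this is just a sketch, and the 1-second interval is arbitrary:

# Extended per-device statistics, refreshed every second;
# run this in a second terminal while "dd" is writing
iostat -x 1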
Physical hard disk example:
# Here is a bad example for a hard disk device
dd if=/dev/zero of=/dev/sdb1 bs=256 count=5000000

# Here is the proper usage, because the physical block size of /dev/sdb is 512 bytes
dd if=/dev/zero of=/dev/sdb1 bs=512 count=5000000
LVM block device example:
# Another bad example, this time for an LVM block device
dd if=/dev/zero of=/dev/sdb-vol/test bs=512 count=1000000

# Here is the proper usage, because the LVM block size is 4096 bytes
dd if=/dev/zero of=/dev/sdb-vol/test bs=4k count=1000000
Understanding the “iostat” output during a “dd” test:
Here is what “iostat” displays when you are not using the proper block size option (lots of read “r/s” and “rsec/s” requests):
Device:         rrqm/s   wrqm/s      r/s      w/s    rsec/s     wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00  5867.40  3573.20    46.40  28585.60   47310.40    20.97   110.38   30.61   0.28 100.00
sdb1              0.00     0.00     0.00     0.00      0.00       0.00     0.00     0.00    0.00   0.00   0.00
sdb2              0.00  5867.40  3572.80    46.40  28582.40   47310.40    20.97   110.38   30.61   0.28 100.00
dm-2              0.00     0.00  3572.80  5913.80  28582.40   47310.40     8.00 13850.92 1465.43   0.11 100.00
Here is what it should display (no read “r/s” or “rsec/s” requests at all):
Device:         rrqm/s   wrqm/s      r/s      w/s    rsec/s     wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sdb               0.00 16510.00     0.00   128.60      0.00  131686.40  1024.00   107.82  840.32   7.78 100.00
sdb1              0.00     0.00     0.00     0.00      0.00       0.00     0.00     0.00    0.00   0.00   0.00
sdb2              0.00 16510.00     0.00   128.60      0.00  131686.40  1024.00   107.82  840.32   7.78 100.00
dm-2              0.00     0.00     0.00 16640.00      0.00  133120.00     8.00 13674.86  823.73   0.06 100.00
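The dm-N entries in this output are device-mapper devices, which is how LVM volumes show up in “iostat”. If you are not sure which dm-N line corresponds to your logical volume, the LV device nodes are normally just symlinks to the dm-N nodes, so a sketch like this will tell you (the path is the example LV from above):

# The LV node usually points at the underlying dm-N device
ls -l /dev/sdb-vol/test

# Alternatively, list all device-mapper devices with their names and minor numbers
dmsetup info -c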
How to be safe?
Fortunately, file systems are smart enough to pay attention to the block size of the block device they are mounted on. So if you do a “dd” write performance test by writing to a file, you should be fine. In this case, though, there are some other complications, like journaling, commit intervals, barriers, mount options, etc.
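As a rough sketch of such a file-based test (the path and sizes are only examples), GNU dd’s conv=fdatasync option makes dd flush the data to disk before it reports the transfer rate, so the page cache doesn’t skew the result:

# Write a ~1 GB test file through the file system;
# conv=fdatasync forces the data onto the disk before dd exits and reports the rate
dd if=/dev/zero of=/mnt/test/ddfile bs=1M count=1000 conv=fdatasync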