I've recently been toying with a 4 drive gmirror (RAID1) array for a file system that does mostly random reads. One drive is currently synchronising, and the array itself is idle, so I thought I'd do a quick experiment and see which load balancing algorithm works best. Each algorithm is sampled using iostat with a period of 60 seconds. The results are surprising.
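For reference, each sample was gathered from iostat's extended statistics over a single 60-second interval; roughly like this (the exact flags may have differed, and the first since-boot report is discarded):

# one 60-second extended-statistics sample per algorithm
iostat -x -w 60 -c 2 ad12 ad14 ad16 ad18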
"split" algorithm, 4096 bytes. (This is the default if you do not explicitly configure an algorithm.)
                        extended device statistics
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
ad12     344.5   0.0 14467.1     0.0    0   3.3  73
ad14     344.4   0.0 14810.8     0.0    1   1.1  30
ad16     344.5   0.0 14811.5     0.0    2   1.7  44
ad18       0.0 347.9     0.0 44091.8    0   1.5  41
Note the high IOPS from each drive (presumably because of the 4096 byte split level), and that the destination drive is only 41% busy.
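For anyone wanting to repeat this: the balance algorithm (and, for "split", the slice size) can be changed on a live mirror with gmirror configure. A quick sketch, using the mirror name db0 from the dd test further down:

# select the balance algorithm and, for split, the slice size in bytes
gmirror configure -b split -s 4096 db0
# or, for the next test:
gmirror configure -b round-robin db0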
"round-robin" algorithm.extended device statistics device r/s w/s kr/s kw/s wait svc_t %b ad12 115.6 0.0 14798.9 0.0 0 10.0 85 ad14 115.6 0.0 14796.8 0.0 0 1.3 16 ad16 115.6 0.0 14794.7 0.0 1 2.3 25 ad18 0.0 350.2 0.0 44390.0 1 1.2 35
Similar overall performance to "split", except that the reads are not split into such small amounts. Why do the busy levels of the (identical) source drives vary so remarkably?
"load" algorithm.extended device statistics device r/s w/s kr/s kw/s wait svc_t %b ad12 12.0 0.0 1535.5 0.0 0 4.5 3 ad14 331.4 0.0 42418.3 0.0 0 0.9 27 ad16 133.8 0.0 17120.2 0.0 0 1.2 13 ad18 0.0 481.9 0.0 61076.4 2 3.0 92
Wow. We've just seen a nearly 50% improvement in sequential read speed. The destination drive is also pretty busy, which is good. This is the best-performing algorithm in this simple test.
"prefer" algorithm.extended device statistics device r/s w/s kr/s kw/s wait svc_t %b ad12 0.0 0.0 0.0 0.0 0 0.0 0 ad14 0.0 0.0 0.0 0.0 0 0.0 0 ad16 472.5 0.0 60483.8 0.0 1 0.9 37 ad18 0.0 477.2 0.0 60486.2 1 3.3 96
Reading from a single drive is faster than "split" and "round-robin", and almost as fast as "load."
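Worth noting for "prefer": all reads go to the active component with the highest priority, so which drive ends up doing the work depends on the per-component priorities rather than on load. Something like this (a sketch; the -p form of configure may only exist in releases newer than 7.1):

gmirror configure -b prefer db0
# newer releases can change the priority of an existing component, e.g.
gmirror configure -p 1 db0 ad16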
Are these results skewed by the unusual configuration of a 4 drive mirror and the fact that it's a rebuild rather than normal operation? Hmmm...
UPDATE: Array rebuild is complete. Doing a random read test (executing dd if=/dev/mirror/db0 of=/dev/null iseek=<random_number> bs=16k count=1 repeatedly) has produced even more confusing results. All algorithms are showing very similar numbers, with an effective random read rate of between 133.8 and 140.3 16k blocks per second. The highest value of 140.3/sec is actually from the prefer algorithm, reading from a single drive only! There appears to be zero benefit, for this synthetic test anyway, to having more than one drive to read the data from.
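The loop behind that test is nothing fancy; a sketch of it (the block count is derived from the device size rather than hard-coded, and no special effort is made to defeat drive caching):

# read one random 16k block from the mirror, forever
blocks=$(( $(diskinfo /dev/mirror/db0 | awk '{print $3}') / 16384 ))
while true; do
    dd if=/dev/mirror/db0 of=/dev/null bs=16k count=1 \
        iseek=$(jot -r 1 0 $((blocks - 1))) 2>/dev/null
done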
System: FreeBSD 7.1-RELEASE amd64, Gigabyte GA-EX38-DS4 mainboard, 4 x 2GB A-DATA DDR2-800 RAM, 4 x WD7500AAKS 750GB drives connected to onboard SATA ports in AHCI+native mode, gmirror configured to use 4 drives.
UPDATE #2: FreeBSD 8.0 has a much improved "load" algorithm which does balance correctly across the drives. I ran a 4 x 1TB drive RAID1 array for a few months before changing to RAID10 (which also benefits from the new algorithm).
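For completeness, the RAID10 setup is just gstripe over two gmirror pairs. A rough sketch with placeholder device names (ada0 through ada3) and an arbitrarily chosen stripe size:

# two mirrored pairs, striped together
gmirror label -v -b load m0 /dev/ada0 /dev/ada1
gmirror label -v -b load m1 /dev/ada2 /dev/ada3
gstripe label -v -s 131072 data /dev/mirror/m0 /dev/mirror/m1
newfs -U /dev/stripe/data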