[kwlug-disc] MDADM and RAID

Chris Irwin chris at chrisirwin.ca
Wed Mar 3 19:46:12 EST 2010

You're correct in pointing out some flaws in my initial test.
Filesystem caching skewed dd's reported throughput, which is why I ran
`time` around dd and sync together. As pointed out earlier, though, I
have no control over where on disk those writes landed. LVM may also
add some overhead, though since I had just moved all the blocks I would
assume it placed them in linear order.
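
For reference, the timing wrapper was along these lines (a sketch: the
real runs wrote 5000 MB straight to the md device, while this stand-in
writes a small temp file):

```shell
# Time dd and sync together, so the page cache can't flatter the
# number: dd alone returns once data is buffered, not once it's on
# disk. /tmp/ddtest is a stand-in target; the real runs used the
# array itself (e.g. of=/dev/md1 bs=1M count=5000).
time sh -c 'dd if=/dev/zero of=/tmp/ddtest bs=1M count=64 && sync'
rm -f /tmp/ddtest
```

GNU dd's conv=fsync does much the same thing, flushing the output
before dd exits so the rate it reports includes the flush.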

I did some testing using your method. I've migrated my PVs back to my
external disk and am writing directly to the array, so I shouldn't be
hindered by either ext4 or LVM this time. I wish I were testing on a
dual-Xeon box with SAS drives, but I'm still on SATA with my Athlon X2.

I did notice all the md*raid* processes were niced to -5, so they
should edge out anything else if I understand correctly.

It took a while, as the resync after creating each array took about 1.5
hours. I could have repartitioned the disks and made smaller arrays, but
I wanted to reflect my actual use pattern.
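
For anyone repeating this, each array was created along these lines (a
sketch with this system's device names; note that mdadm also accepts
--assume-clean to skip the initial resync, which is fine for a
throwaway write benchmark but not for data you care about):

```shell
# Build a 4-disk RAID5 array from the test partitions, then check
# resync progress. Device names are examples from this system.
mdadm --create /dev/md1 --level=5 --raid-devices=4 \
    /dev/sdb2 /dev/sdc2 /dev/sdd2 /dev/sde2
cat /proc/mdstat
```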

On Wed, Mar 3, 2010 at 09:29, John Van Ostrand <john at netdirect.ca> wrote:
> Then I created a RAID5 set with 4 of the disks and repeated.
>        md0 : active raid5 sde1[3] sdd1[2] sdc1[1] sdb1[0]
>              215045760 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
> The result was 142MB/s, total time 37 seconds.

This is the same RAID5 array that started the thread. I did this test
first, but I didn't think to capture the mdstat output for it. It was
whatever the defaults were based on the create command I showed earlier
in the thread...

I got 58.1 MB/s for a time of 1m30.359. dd used ~28% cpu, md1_raid5
used ~25%. 

> Those same disks as RAID0+1:
>        md0 : active raid0 sde1[3] sdd1[2] sdc1[1] sdb1[0]
>              286727680 blocks 64k chunks
> Performed the write at 513MB/s, total time 10 seconds

I did benchmarks for 4-disk raid10, as well as 2-disk raid1 and raid0.
I used the mdadm defaults, so from what I understand, raid10's default
"near" layout stripes between two mirrors, i.e. sda2 and sdb2 hold the
same bits, as do sdc2 and sdd2. Roughly:

    sda2     sdb2     sdc2     sdd2
    chunk 0  chunk 0  chunk 1  chunk 1
    chunk 2  chunk 2  chunk 3  chunk 3
    ...      ...      ...      ...

Raid10: 85.8 MB/s for a time of 1m3.185. dd used ~40% cpu, md1_raid10
maxed at 9% but mostly stayed around 4%.

Raid0: 86.6 MB/s for a time of 1m0.595. dd used ~40% cpu again, there
does not appear to be an md1_raid0 process.

Raid1: 44.5 MB/s for a time of 1m57.960. dd used around 25% cpu,
md2_raid1 used about 5%.

From my brief look, it appears RAID10 is as fast as RAID0 here. RAID1
takes a near-50% hit relative to both, coming in behind even RAID5,
which has to do extra parity calculations. That at least makes some
sense for writes: a mirrored write isn't finished until both members
have it, so a RAID1 pair can't write faster than its slower disk.

>        time dd if=/dev/zero of=/dev/sdb1 bs=1M count=5000
> The result was 131MB/s, total time 40 seconds.

In true "WTF" fashion, I tested the bare disks last. dd used about 40%
cpu for each disk.

sda2: 38.0 MB/s for a time of 2m17.913.
sdb2: 108 MB/s for a time of 0m50.240.
sdc2: 38.4 MB/s for a time of 2m16.397.
sdd2: 102 MB/s for a time of 0m51.536.
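
The per-disk numbers came from the same dd invocation repeated per
device, which can be sketched as a loop (with temp files standing in
for the partitions here, since writing to /dev/sdX2 destroys their
contents):

```shell
# Run the same sequential-write test against each target and keep
# dd's final throughput line. Temp files are a safe stand-in; on
# the real system the targets were /dev/sda2 through /dev/sdd2.
for target in /tmp/bench-a /tmp/bench-b; do
    echo "== $target =="
    sh -c "dd if=/dev/zero of=$target bs=1M count=16 && sync" 2>&1 | tail -n 1
done
rm -f /tmp/bench-a /tmp/bench-b
```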

So it appears I have two sub-par disks. I checked the make and model,
and I have two pairs of disks. Apparently I staggered them when I built
the system originally.

sda and sdc are both Hitachi "HDP725050GLA360", which apparently only
have a 16MB cache. There may be other performance issues related to
them as well. sdb and sdd are both Seagate "ST3500320AS" with 32MB cache.

Just for fun, I tried two more RAID0 arrays: md1 on the Hitachi
drives, md2 on the Seagates.

Hitachi RAID0: 86.5 MB/s in 1m0.632
Seagate RAID0: 202 MB/s in 0m25.948

So, after waiting through all those array syncs, it turns out half my
disks are just really slow. Well, crap.

Chris Irwin
<chris at chrisirwin.ca>
