
View Full Version : [ubuntu] 10.04 MDADM Block Device and Partition Confusion



kurrazyman
October 23rd, 2010, 12:58 PM
I have been trying to get to grips with the software RAID tool MDADM and I keep hitting the same hurdle, one which I haven't been able to explain or understand even after reading pages and pages of forums across the land!

The problem is that I have two SATA drives, let's say sda and sdb. The size of the drives is unimportant as I've seen the same problem occur with 500GB and 1TB drives.

My goal is to make these two drives into a RAID 1 mirror, so I create a Linux RAID Autodetect (fd) partition on each drive spanning the entire drive (from the first cylinder to the last). This all goes to plan and I end up with /dev/sda1 and /dev/sdb1.
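For reference, the partitioning step boils down to something like this (shown as an sfdisk one-liner purely as an illustration; I actually do it interactively in fdisk with n, then t and type fd, then w):

# one Linux RAID Autodetect (fd) partition spanning each whole drive
echo ',,fd' | sudo sfdisk /dev/sda
echo ',,fd' | sudo sfdisk /dev/sdb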

I then proceed to create an MDADM RAID 1 block device from the two partitions (md1 for argument's sake), and this is where the strange behaviour starts to occur.
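The creation command itself is roughly this (the array name md1 and the member partitions are just my setup):

sudo mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1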

I do all the necessary steps, like letting the RAID sync, and I add the UUID to the mdadm.conf file. All seems to have gone well, but I subsequently have issues upon rebooting where the RAID device doesn't activate; instead it shows as inactive and mdstat shows something like this:

md1 : inactive sda1[0] sdb[0](S)
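(For reference, by 'the necessary steps' I mean roughly the following; the exact ARRAY line that gets appended will obviously differ for your array:)

# watch the initial sync until it finishes
cat /proc/mdstat
# then record the array (an ARRAY ... UUID=... line) in mdadm.conf
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf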

I was curious as to why MDADM was trying to use the entire second drive (sdb) rather than the primary partition (sdb1) for the MDADM device. So I ran the examine command on the sdb drive and on the primary partition (sdb1), and both return the same information (same UUID, the lot) as if they were one and the same thing.

Yet if I examine the sda drive I get the message 'No md superblock detected on /dev/sda', while examining the primary partition (sda1) returns the expected info (UUID, etc.). The behaviour of the sda drive is what I would expect, but what is going on with the sdb drive?
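In other words, these are the commands I'm comparing (results summarised in the comments):

sudo mdadm --examine /dev/sdb    # unexpectedly reports a superblock (same UUID as sdb1)
sudo mdadm --examine /dev/sdb1   # reports the superblock, as expected
sudo mdadm --examine /dev/sda    # "No md superblock detected on /dev/sda"
sudo mdadm --examine /dev/sda1   # reports the superblock, as expected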

Surely the superblock should be located on the partition, not the drive itself (as seen with sda)? It seems mdadm is getting itself into a pickle, hence why my RAID device fails.

The only reliable way around this I have found is to make the partitions (sda1, sdb1) a couple of cylinders shorter than the entire length of the drive; this seems to ensure the above confusion never occurs.
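Concretely, that just means not letting the fd partition run right to the end of the disk, e.g. in fdisk (the cylinder numbers here are only an illustration):

sudo fdisk /dev/sdb
# n -> primary partition 1, first cylinder: default
#      last cylinder: a couple below the maximum (e.g. 121598 rather than 121601)
# t -> fd (Linux RAID Autodetect)
# w -> write and exit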

I'm hoping someone out there can clear this behaviour up and hopefully explain what is going on!

kurrazyman
October 28th, 2010, 05:58 PM
Is there no one out there who can help with this? It's causing me to lose sleep... there's a biscuit in it: http://static.which.net/media/images/in-content/baby-food---rusks-61736.jpg

Carlbc18
November 4th, 2010, 07:25 PM
Hopefully I can help. I too recently went through the software RAID setup. I'm sure this is a redundant question, but did you follow this guide: https://help.ubuntu.com/10.04/serverguide/C/advanced-installation.html ? If you're installing Desktop there is a similar doc.

I had some major issues choosing EXT4 for my filesystem. I needed to change to EXT3. Not sure if it was the drives, hardware, server, etc. Didn't care; EXT3 has been around since dirt was invented and is proven, so I was okay with my decision.

Also, after my install I never needed to go into mdadm.conf and add the ARRAY -.... UUID stuff; it was already there.

Sorry if this isn't too helpful, but it took me many tries before I switched to EXT3 to get things working right.

kurrazyman
November 5th, 2010, 08:27 AM
Hi Carlbc18, many thanks for responding, I was beginning to lose hope!

It hasn't been plain sailing for me either getting MDADM to work... the typical Linux paradox of having to trawl through forums for some usable/relevant information as the minutes turn to hours, and hours to days...

I used to use dmraid but didn't find it very good when a drive failure occurred, so I took the plunge and changed over to mdadm. If it wasn't for this issue of the UUID appearing on the block device as well as the first (and only) partition, it would have been running sweet. One other point to note if you previously used dmraid with the same drives: make sure you remove the dmraid metadata from the disks and/or stop dmraid loading at boot time, as it will 'steal' your drives if you're not careful.
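In case it's useful to anyone, what I mean by removing the metadata and stopping dmraid at boot is along these lines (double-check device names before erasing anything):

# show any fakeRAID metadata dmraid can still see
sudo dmraid -r
# erase that metadata from the member disks
sudo dmraid -rE
# and/or take dmraid out of the boot process altogether
sudo apt-get remove dmraid
sudo update-initramfs -u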

Back to the topic here: yeah, I followed the installation guides, but I am using LVM on top of my mdadm RAID-1 array, so although I've used EXT4 that isn't what's causing the issue. The partitions on my drives are set to type fd (Linux RAID Autodetect), the partitions on the /dev/md* block devices are set to LVM, and my EXT4 filesystems are created on the logical volumes within my volume group. I came up with a workaround that solves the problem I have been experiencing, but I would still like to know exactly what is going on.
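For context, the stack looks roughly like this (a simplified sketch; the volume group and logical volume names/sizes are just placeholders, and in my case I actually partition the md device first, but the idea is the same):

sudo pvcreate /dev/md1                # the md array becomes the LVM physical volume
sudo vgcreate vg0 /dev/md1            # volume group on top of it
sudo lvcreate -L 100G -n data vg0     # a logical volume within the group
sudo mkfs.ext4 /dev/vg0/data          # EXT4 sits on the LV, not on the raw drives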

My workaround was to modify /etc/mdadm/mdadm.conf and change the DEVICE line from 'partitions' to 'sd[a-z][1-9]', which, for SATA drives with no more than 9 partitions, stops MDADM from checking the block devices themselves (i.e. sd[a-z]) as listed in /proc/partitions and makes it scan only the partitions. Once updated you need to run 'sudo update-initramfs -u' so that the modified mdadm.conf makes it into the boot image (assuming mdadm is loaded at boot time).
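i.e. in /etc/mdadm/mdadm.conf the DEVICE line ends up as something like the below (the /dev/ prefix and the exact glob are what suit my drive naming; adjust to your setup):

# /etc/mdadm/mdadm.conf
# was:  DEVICE partitions
DEVICE /dev/sd[a-z][1-9]

# then rebuild the initramfs so the boot image picks up the change
sudo update-initramfs -u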