This epic saga continues. First, thanks to everyone who helped me in this thread; I would have been useless without all these hints. I have now recovered most of the more important part of my data, but would like some advice on whether I can recover any more.
I was able to get the backup file recovered and re-assembled the array with the --invalid-backup param in case parts of it had been corrupted. I was able to get my array back up and running for a while. First I took one external drive and copied off the smallest and most immediately important stuff. At this point I had what was originally a synced raid5, which was growing/reshaping to a raid6 onto a new drive. As it was happening, at least one of the drives was still making the beeping noises periodically, and I knew it was only a matter of time before it went out. Eventually /dev/sdd totally failed and got marked F in /proc/mdstat. The array was still operational and still showed 4 of 6 components active, with the fifth still being resynced.
After getting the most important small stuff copied to one location, I started an rsync -avH of the entire array contents to another array. This was running while the array was reshaping, since I didn't have faith the array would live long enough for the 5 day reshape to complete, which would have gotten me up to 5/6 components of the raid6, at which point I would have added another drive for 6/6. The reason I didn't immediately add the extra drive is that I really felt this array and its drives were a time bomb, and I was more interested in copying the data off right away than in trying to get the full raid6 running (again, if it was going to take 10 days and I expected another drive to fail within a few).
Anyway, quite a bit of my data got rsynced off. This morning I woke up to find that another drive (/dev/sda) in the array had failed, as I expected. I am really confused about what kind of state the array is in, and I am hoping for some advice about what is going on and whether it's finally time to give up and take my baby off life support.
First:
Code:
cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdb[0] sdc[7] sda[5](F) sde[4] sdd[2](F) sdf[6]
7813531648 blocks super 1.2 level 6, 512k chunk, algorithm 18 [6/3] [UU_U__]
unused devices: <none>
So judging from the above, I would expect the array to be totally offline/broken. 3 out of 6 drives in a raid6? But my array is still "active" and I still have it mounted. I can still see the folder structure and many files; it's just that certain folders give I/O errors when you try to ls them. How is this possible? Why do I have some files but not others? Is it wrong to expect all or nothing of a filesystem on md0 (ext4)?
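To sanity-check my reading of that status line, here is a rough Python sketch that pulls the `[6/3] [UU_U__]` part apart. It assumes the first number is the total number of slots and the second is the number of members currently in sync, and that raid6 needs at least n-2 members; the variable names are just mine.

```python
import re

# Status line copied from the /proc/mdstat output above.
status_line = "7813531648 blocks super 1.2 level 6, 512k chunk, algorithm 18 [6/3] [UU_U__]"

# Assumption: "[total/in_sync]" is followed by one flag per slot (U = up, _ = not in sync).
match = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", status_line)
raid_devices = int(match.group(1))
in_sync = int(match.group(2))
slot_map = match.group(3)

# raid6 keeps two parity blocks per stripe, so it tolerates at most two missing members.
min_needed = raid_devices - 2
missing_slots = [i for i, flag in enumerate(slot_map) if flag == "_"]

print(f"slots: {raid_devices}, in sync: {in_sync}, minimum for raid6: {min_needed}")
print(f"slots not in sync: {missing_slots}")
print(f"short of the minimum by: {max(0, min_needed - in_sync)}")
```

If that reading is right, the array is one in-sync member short of what a plain raid6 would need, which is exactly why I'm surprised it is still mounted.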
My theory: The array had been 40% through the reshape onto the new drive (even with 4/5 of the 'old' components), then 1 more of the old components died, so I am at 3/5 of the old components PLUS a new drive with 40% of the stripes. So right now, in essence, I have 40% of the 4th drive needed for minimum raid6 operation, and thus can see 40% of the files. I don't know if mdadm is capable of that kind of magic, but I can't otherwise explain how my array is even assembled right now. Can anyone tell me if this is indeed possible/accurate? If it is something like this, why doesn't mdstat at least say something other than "active", like "degraded"?
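To put a number on the "40%", here is a small Python sketch using figures from the --examine output for /dev/sdc further down in this post. The units are my assumption, inferred from the GiB values mdadm prints next to each figure: Array Size and Reshape pos'n look like KiB, while Used Dev Size and Recovery Offset look like 512-byte sectors.

```python
# Figures copied from the --examine output for /dev/sdc later in this post.
# Assumed units (inferred from the GiB values printed beside them):
# Array Size and Reshape pos'n in KiB, device size and recovery offset in 512-byte sectors.
array_size_kib = 7813531648
reshape_pos_kib = 3277520896
used_dev_size_sectors = 3906765824
recovery_offset_sectors = 1638760448

reshape_fraction = reshape_pos_kib / array_size_kib
recovery_fraction = recovery_offset_sectors / used_dev_size_sectors

# A 6-device raid6 has 4 data chunks per stripe, so the per-device recovery
# offset (converted to KiB) times 4 should land on the array-wide reshape position.
data_disks = 6 - 2
recovery_as_array_kib = recovery_offset_sectors // 2 * data_disks

print(f"reshape progress:  {reshape_fraction:.1%}")
print(f"recovery progress: {recovery_fraction:.1%}")
print(f"recovery offset maps to reshape position: {recovery_as_array_kib == reshape_pos_kib}")
```

Both fractions come out at about 42%, and the recovery offset on the new drive lines up exactly with the reshape position, so at least the "roughly 40% of the new drive is populated" part of my theory matches the superblocks. Whether that explains which files are readable is the part I can't confirm.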
The real question: right now Disk Utility says /dev/sda is "green" but with 1 bad sector (I don't really know what this means except that it's bad). What I'm asking is what everyone always asks in these threads: is there any way I can bring my array back to full functionality?
Code:
sudo mdadm --examine /dev/sd[a-f]
/dev/sda:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : 8bc78af0:d9a981e3:73549f21:2f76cd24
Name : mainframe:vault (local to host mainframe)
Creation Time : Wed Aug 15 21:57:14 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : f18da9cc:27f5eee4:61ba900e:dd6ca8b9
Reshape pos'n : 3277520896 (3125.69 GiB 3356.18 GB)
New Layout : left-symmetric
Update Time : Sun Apr 21 06:27:31 2013
Checksum : 75147b14 - correct
Events : 755496
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 4
Array State : AA.AAA ('A' == active, '.' == missing)
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : 8bc78af0:d9a981e3:73549f21:2f76cd24
Name : mainframe:vault (local to host mainframe)
Creation Time : Wed Aug 15 21:57:14 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 004a89c7:bd03e0fe:b6ea3ab9:76e5e5e0
Reshape pos'n : 3277520896 (3125.69 GiB 3356.18 GB)
New Layout : left-symmetric
Update Time : Sun Apr 21 13:41:12 2013
Checksum : 5bc7638b - correct
Events : 759402
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 0
Array State : AA.A.A ('A' == active, '.' == missing)
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x6
Array UUID : 8bc78af0:d9a981e3:73549f21:2f76cd24
Name : mainframe:vault (local to host mainframe)
Creation Time : Wed Aug 15 21:57:14 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
Recovery Offset : 1638760448 sectors
State : clean
Device UUID : 0d8ddf14:2601f343:0b7e182f:cc8358e9
Reshape pos'n : 3277520896 (3125.69 GiB 3356.18 GB)
New Layout : left-symmetric
Update Time : Sun Apr 21 13:41:12 2013
Checksum : ce2e55b3 - correct
Events : 759402
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 5
Array State : AA.A.A ('A' == active, '.' == missing)
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : 8bc78af0:d9a981e3:73549f21:2f76cd24
Name : mainframe:vault (local to host mainframe)
Creation Time : Wed Aug 15 21:57:14 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 1df1fd17:592f431a:f3f05592:fbfccdcd
Reshape pos'n : 3277520896 (3125.69 GiB 3356.18 GB)
New Layout : left-symmetric
Update Time : Sun Apr 21 13:41:12 2013
Checksum : 8da25408 - correct
Events : 759402
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 3
Array State : AA.A.A ('A' == active, '.' == missing)
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : 8bc78af0:d9a981e3:73549f21:2f76cd24
Name : mainframe:vault (local to host mainframe)
Creation Time : Wed Aug 15 21:57:14 2012
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 3906767024 (1862.89 GiB 2000.26 GB)
Array Size : 7813531648 (7451.56 GiB 8001.06 GB)
Used Dev Size : 3906765824 (1862.89 GiB 2000.26 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 15dcad1e:3808a229:7409b3aa:4e03ae1b
Reshape pos'n : 3277520896 (3125.69 GiB 3356.18 GB)
New Layout : left-symmetric
Update Time : Sun Apr 21 13:41:12 2013
Checksum : 9ee36b5 - correct
Events : 759402
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 1
Array State : AA.A.A ('A' == active, '.' == missing)
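Since the output above is long, here is a rough Python sketch that boils it down to the fields I think matter for comparing members: the event count and the device role. The embedded text is an abbreviated copy of the output above (only those two fields per device), and the parsing is just my own quick approach, not anything mdadm provides.

```python
import re

# Abbreviated copy of the `mdadm --examine` output above: only the fields compared here.
examine_text = """\
/dev/sda:
Events : 755496
Device Role : Active device 4
/dev/sdb:
Events : 759402
Device Role : Active device 0
/dev/sdc:
Events : 759402
Device Role : Active device 5
/dev/sde:
Events : 759402
Device Role : Active device 3
/dev/sdf:
Events : 759402
Device Role : Active device 1
"""

# Collect "key : value" fields under each "/dev/...:" header.
members = {}
current = None
for line in examine_text.splitlines():
    line = line.strip()
    header = re.fullmatch(r"(/dev/\S+):", line)
    if header:
        current = header.group(1)
        members[current] = {}
        continue
    if current and " : " in line:
        key, value = line.split(" : ", 1)
        members[current][key.strip()] = value.strip()

# Compare every member's event count against the newest one seen.
newest = max(int(fields["Events"]) for fields in members.values())
for dev, fields in sorted(members.items(), key=lambda kv: kv[1]["Device Role"]):
    behind = newest - int(fields["Events"])
    print(f"{dev}: {fields['Device Role']}, events {fields['Events']}, behind newest by {behind}")

stale = [dev for dev, fields in members.items() if int(fields["Events"]) < newest]
print(f"members behind the newest event count: {stale}")
```

The only member that is behind is /dev/sda, by 3906 events, which fits with its Update Time being about seven hours older than the others. /dev/sdd does not show up at all in the --examine output.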