
Thread: After RAID disk crashes: mdadm saying -1 drives and 1 spare not enough to start

  1. #1
    Join Date
    Dec 2007
    Location
    Dallas, TX
    Beans
    151
    Distro
    Ubuntu Development Release

    After RAID disk crashes: mdadm saying -1 drives and 1 spare not enough to start

    Howdy all,

    I've been poking around the Internet for a number of days and haven't been able to nail down the next steps to take. Here's the summary:

    • Three partitions, each set up with RAID 1. LVM2 on top of that. Two partitions are ext3, and the 3rd is xfs.
    • Started taking errors (bad sectors) on one drive. After a reboot, the other drive appears dead (it spins up and the BIOS sees it, but something low-level is wrong because Linux won't even create a device to let me access it).
    • ddrescued the one with bad sectors to another drive. About 43 MBytes unrecoverable.
    • Successfully got the most important ext3 partition up (mdadm, vgscan, vgchange, e2fsck), mounted it on another machine, and extracted the data (rough sequence sketched just after this list).
    • Struggling to do the same with the xfs partition. I think the main problem is that mdadm sees the partition marked as "spare" and won't assemble it. There is also a checksum mismatch.
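
    For reference, here's a rough sketch of the sequence that worked for the ext3 partition (the md device, VG, and LV names below are placeholders, since only md2/big_vg1 appear later in this thread):

    Code:
    # assemble the degraded mirror from the surviving member
    mdadm --assemble --run /dev/md0 /dev/sdb2
    # find and activate the volume group sitting on top of it
    vgscan
    vgchange -ay
    # check the filesystem, then mount read-only and copy the data off
    e2fsck -f /dev/mapper/root_vg-root_lv
    mount -o ro /dev/mapper/root_vg-root_lv /mnt/recovered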

    Questions:
    1. Ideas on how to get mdadm to ignore the checksum error and assemble the array even though the member is marked "spare"?
    2. Is there some way to repair/recreate the superblock? (is there a backup superblock?)
    3. If those aren't feasible, how about ignoring the RAID problem and trying to mount directly? How do I get the LVM2 up?

    Here are some of the more interesting details:
    Code:
    # mdadm --assemble --force /dev/md2
    mdadm: failed to add /dev/sdb4 to /dev/md2: Invalid argument
    mdadm: /dev/md2 assembled from -1 drives and 1 spare - not enough to start the array.
    
    # mdadm --assemble --run /dev/md2
    mdadm: failed to add /dev/sdb4 to /dev/md2: Invalid argument
    mdadm: failed to RUN_ARRAY /dev/md2: Invalid argument
    mdadm: Not enough devices to start the array.
    
    root@media:~# mdadm --examine /dev/sdb4
    /dev/sdb4:
              Magic : a92b4efc
            Version : 0.90.00
               UUID : 823846d8:30459e5c:4ef1a930:4c9fb5d5
      Creation Time : Sun Jan 13 04:47:42 2008
         Raid Level : raid1
      Used Dev Size : 488239360 (465.62 GiB 499.96 GB)
         Array Size : 488239360 (465.62 GiB 499.96 GB)
       Raid Devices : 2
      Total Devices : 1
    Preferred Minor : 2
    
        Update Time : Mon Apr 14 06:21:19 2014
              State : clean
     Active Devices : 1
    Working Devices : 1
     Failed Devices : 1
      Spare Devices : 0
           Checksum : b24910e7 - expected b24910b5
             Events : 26176880
    
    
          Number   Major   Minor   RaidDevice State
    this     0       0        0        0      spare
    
       0     0       8       36        0      active sync
       1     1       0        0        1      faulty removed
    #
    Relevant dmesg output:
    Code:
    [    1.624713] md: invalid superblock checksum on sdb4
    [    1.624716] md: sdb4 does not have a valid v0.90 superblock, not importing!
    [    1.624722] md: md_import_device returned -22
    [    1.624749] md: md2 stopped.
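
    (One approach to questions 1 and 2, sketched but not verified here: since mdadm won't import a member whose superblock checksum is bad, the superblock can be rewritten by re-creating the array in place with --assume-clean, using the same geometry shown by --examine above. This overwrites the old md metadata, so it is best tried against the ddrescue copy, and anything found afterwards should be mounted read-only.)

    Code:
    # re-create the v0.90 RAID1 superblock in place; the geometry must match the
    # original array, and "missing" stands in for the dead second member
    mdadm --create /dev/md2 --assume-clean --metadata=0.90 \
          --level=1 --raid-devices=2 /dev/sdb4 missing
    # the LVM layer should then come up as before
    vgscan
    vgchange -ay big_vg1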

  2. #2

    Re: After RAID disk crashes: mdadm saying -1 drives and 1 spare not enough to start

    I am wondering if I need to do something along the lines of what http://www.robmeerman.co.uk/unix/lvm_recovery describes, although I'm still at a loss as to how to get at the LVM. I can't find what it takes to bypass the RAID detection and activate the LVM directly (one possible route is sketched after the backup below).

    The partition I did retrieve data from was the root partition, so it contains the original configuration describing the partition I'm having trouble with:

    /etc/mtab:
    Code:
    /dev/mapper/big_vg1-big_lv1 /var xfs rw,noatime,nodiratime,logbufs=8,allocsize=512m 0 0
    And, of course, the contents of /etc/lvm/backup/big_vg1:
    Code:
    # Generated by LVM2 version 2.02.54(1) (2009-10-26): Sat Oct 23 13:21:52 2010
    
    contents = "Text Format Volume Group"
    version = 1
    
    description = "Created *after* executing 'vgcfgbackup'"
    
    creation_host = "media"	# Linux media 2.6.31-22-generic #65-Ubuntu SMP Thu Sep 16 15:48:58 UTC 2010 i686
    creation_time = 1287858112	# Sat Oct 23 13:21:52 2010
    
    big_vg1 {
    	id = "LRFlCk-JPig-tKi7-lFAn-jJA2-yfRg-6n0oMH"
    	seqno = 2
    	status = ["RESIZEABLE", "READ", "WRITE"]
    	flags = []
    	extent_size = 8192		# 4 Megabytes
    	max_lv = 0
    	max_pv = 0
    
    	physical_volumes {
    
    		pv0 {
    			id = "yBd7V3-xBd5-C8Io-a2sT-SVeF-lRsB-N47AWl"
    			device = "/dev/md2"	# Hint only
    
    			status = ["ALLOCATABLE"]
    			flags = []
    			dev_size = 976478720	# 465.621 Gigabytes
    			pe_start = 384
    			pe_count = 119199	# 465.621 Gigabytes
    		}
    	}
    
    	logical_volumes {
    
    		big_lv1 {
    			id = "X1Zf5I-1hWz-jGzF-JYPq-7NJk-a10u-rHrvOh"
    			status = ["READ", "WRITE", "VISIBLE"]
    			flags = []
    			segment_count = 1
    
    			segment1 {
    				start_extent = 0
    				extent_count = 119199	# 465.621 Gigabytes
    
    				type = "striped"
    				stripe_count = 1	# linear
    
    				stripes = [
    					"pv0", 0
    				]
    			}
    		}
    	}
    }
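
    (Sketching one possible route from here, using the IDs from the backup above; the exact device names are assumptions. Because the v0.90 md superblock sits at the end of the partition, the LVM physical volume label at the start of /dev/sdb4 may still be readable directly, bypassing md entirely. If the PV label itself is damaged, it can be rebuilt from this backup file.)

    Code:
    # see whether LVM can find the PV on the raw RAID member; if LVM skips it as
    # an md component, set md_component_detection = 0 in /etc/lvm/lvm.conf
    pvscan
    pvs /dev/sdb4
    # if the PV label is gone, recreate it with the UUID recorded in the backup,
    # then restore the VG metadata and activate the volume group
    pvcreate --uuid "yBd7V3-xBd5-C8Io-a2sT-SVeF-lRsB-N47AWl" \
             --restorefile /etc/lvm/backup/big_vg1 /dev/sdb4
    vgcfgrestore -f /etc/lvm/backup/big_vg1 big_vg1
    vgchange -ay big_vg1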

  3. #3

    Re: After RAID disk crashes: mdadm saying -1 drives and 1 spare not enough to start

    I did something similar to what the linked page describes and got the LVM active by using a loop device that accesses the drive at an offset:

    Code:
    losetup /dev/loop0 /dev/sdb -o250196567040
    where 250196567040 is the start sector of /dev/sdb4 from fdisk, multiplied by 512. pvscan and all the remaining LVM tools then saw "big_vg1". I couldn't mount it immediately due to superblock corruption, but after running xfs_repair (since I knew that is what the filesystem was), I mounted it successfully. Most everything is in lost+found, though, so I'll run text searches for the important stuff and move on.
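
    Putting the whole sequence together (reconstructed from the description above; 488665170 is simply 250196567040 / 512, and the LV path comes from the /etc/mtab entry quoted earlier):

    Code:
    # start of /dev/sdb4 per fdisk: sector 488665170
    #   488665170 * 512 = 250196567040 bytes
    losetup /dev/loop0 /dev/sdb -o 250196567040
    # LVM now sees the PV inside the loop device
    pvscan
    vgchange -ay big_vg1
    # the xfs superblock was damaged, so repair before mounting
    xfs_repair /dev/mapper/big_vg1-big_lv1
    mount -t xfs /dev/mapper/big_vg1-big_lv1 /mnt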

  4. #4

    Re: After RAID disk crashes: mdadm saying -1 drives and 1 spare not enough to start

    [remove duplicate post]
