
Thread: MD raid trouble.

  1. #1
    Join Date
    Mar 2009
    Beans
    19

    MD raid trouble.

    Hi,

    I am having a problem with my RAID array. After going on vacation for a couple of weeks I have come back unable to access any of the data. I would appreciate it if anybody can help, as I don't want to lose my data by rebuilding the array incorrectly.
    I am not sure why the problem happened, and help in understanding why would also be appreciated.

    I used to have a ZFS filesystem on 3 of the 5 drives.
    After setting up the mdraid I formatted the array with btrfs. (I realize this is my next problem if I can get as far as loading up the array.) But my hope is to be able to extract the media so I don't have to re-rip everything.

    The drives should be set up in a RAID-6 setup.

    Hope I included the most useful data.

    Thanks for reading this far.

    René

    Code:
    $ uname -a
    Linux Asgard 3.0.0-16-server #29-Ubuntu SMP Tue Feb 14 13:08:12 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

    Code:
    $ sudo mdadm --examine /dev/sd*
    mdadm: No md superblock detected on /dev/sda.
    mdadm: No md superblock detected on /dev/sda1.
    mdadm: No md superblock detected on /dev/sda2.
    mdadm: No md superblock detected on /dev/sda5.
    /dev/sdb:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 0db0c336:f56bd888:2f9e92e4:c1d64c09
               Name : Asgard:0  (local to host Asgard)
      Creation Time : Sat Jan 28 14:29:36 2012
         Raid Level : raid6
       Raid Devices : 5
    
     Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
         Array Size : 11721077760 (5589.05 GiB 6001.19 GB)
      Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 8e861931:6f7448ec:6f2c6cd2:74cb06a2
    
        Update Time : Tue Mar 27 18:25:33 2012
           Checksum : a4305a9b - correct
             Events : 315451
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 4
       Array State : ..A.A ('A' == active, '.' == missing)
    /dev/sdc:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 0db0c336:f56bd888:2f9e92e4:c1d64c09
               Name : Asgard:0  (local to host Asgard)
      Creation Time : Sat Jan 28 14:29:36 2012
         Raid Level : raid6
       Raid Devices : 5
    
     Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
         Array Size : 11721077760 (5589.05 GiB 6001.19 GB)
      Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : ef9d6881:b31e81b4:9403e07a:4280392d
    
        Update Time : Tue Mar 27 18:25:33 2012
           Checksum : f05ad7f8 - correct
             Events : 0
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : spare
       Array State : ..A.A ('A' == active, '.' == missing)
    /dev/sdd:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 0db0c336:f56bd888:2f9e92e4:c1d64c09
               Name : Asgard:0  (local to host Asgard)
      Creation Time : Sat Jan 28 14:29:36 2012
         Raid Level : raid6
       Raid Devices : 5
    
     Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
         Array Size : 11721077760 (5589.05 GiB 6001.19 GB)
      Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 387de195:c966497f:f7fad598:cfc7fb10
    
        Update Time : Tue Mar 27 18:25:33 2012
           Checksum : d1543e47 - correct
             Events : 0
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : spare
       Array State : ..A.A ('A' == active, '.' == missing)
    /dev/sde:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 0db0c336:f56bd888:2f9e92e4:c1d64c09
               Name : Asgard:0  (local to host Asgard)
      Creation Time : Sat Jan 28 14:29:36 2012
         Raid Level : raid6
       Raid Devices : 5
    
     Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
         Array Size : 11721077760 (5589.05 GiB 6001.19 GB)
      Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 8e1a4400:e00d30d2:faf68bb0:99e13cb1
    
        Update Time : Tue Mar 27 18:25:33 2012
           Checksum : 46949882 - correct
             Events : 0
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : spare
       Array State : ..A.A ('A' == active, '.' == missing)
    /dev/sdf:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 0db0c336:f56bd888:2f9e92e4:c1d64c09
               Name : Asgard:0  (local to host Asgard)
      Creation Time : Sat Jan 28 14:29:36 2012
         Raid Level : raid6
       Raid Devices : 5
    
     Avail Dev Size : 3907027120 (1863.02 GiB 2000.40 GB)
         Array Size : 11721077760 (5589.05 GiB 6001.19 GB)
      Used Dev Size : 3907025920 (1863.02 GiB 2000.40 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 418fe563:d3c6d16b:54762ab7:8b6aced2
    
        Update Time : Tue Mar 27 18:25:33 2012
           Checksum : 6c0b9ead - correct
             Events : 315451
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 2
       Array State : ..A.A ('A' == active, '.' == missing)

    Code:
    Output of parted:
    # parted -l /dev/sd[bcdef]
    Error: The primary GPT table is corrupt, but the backup appears OK, so that will
    be used.
    OK/Cancel? ok                                                             
    Model: ATA ST2000DL003-9VT1 (scsi)
    Disk /dev/sdb: 2000GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    
    Number  Start   End     Size    File system  Name  Flags
     1      1049kB  2000GB  2000GB               zfs
     9      2000GB  2000GB  8389kB
    
    
    Error: The primary GPT table is corrupt, but the backup appears OK, so that will be used.
    OK/Cancel? ok                                                             
    Model: ATA ST2000DL003-9VT1 (scsi)
    Disk /dev/sdc: 2000GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    
    Number  Start   End     Size    File system  Name  Flags
     1      1049kB  2000GB  2000GB               zfs
     9      2000GB  2000GB  8389kB
    
    
    Error: /dev/sdd: unrecognised disk label                                  
    
    Error: /dev/sde: unrecognised disk label                                  
    
    Error: The primary GPT table is corrupt, but the backup appears OK, so that will be used.
    OK/Cancel? ok                                                             
    Model: ATA ST2000DL003-9VT1 (scsi)
    Disk /dev/sdf: 2000GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    
    Number  Start   End     Size    File system  Name  Flags
     1      1049kB  2000GB  2000GB               zfs
     9      2000GB  2000GB  8389kB



    Build:
    Code:
    # Created a RAID5 array with 3 disks. I don't have the precise commands anymore.
    sudo mdadm --add /dev/md127 /dev/sdc /dev/sdd
    sudo mdadm --grow --level=6 --raid-devices=5 /dev/md127 --backup-file=/home/renec/raid.backup
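    From memory, the original 3-disk RAID5 create was something along these lines; the device names and options here are placeholders rather than the exact command I ran.
    Code:
    # Rough reconstruction only; device names are placeholders, not the real ones
    sudo mdadm --create /dev/md127 --level=5 --raid-devices=3 /dev/sdX /dev/sdY /dev/sdZ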

  2. #2
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,136
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: MD raid trouble.

    Boy, this is a mess, and a fairly convoluted setup. If you get this remounted, you'll want to back up your data, write zeroes over all of these disks to remove all traces of ZFS and mdadm, and then start over. Also, I wouldn't use btrfs at this point because it's still being developed.
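    When you do start over, clearing the old metadata would look roughly like this. This is only a sketch, and it's destructive, so run it only once your data is safely backed up elsewhere; mdadm --zero-superblock plus wipefs (from util-linux) is the usual lightweight alternative to zeroing the whole disks.
    Code:
    # DESTRUCTIVE: run only after the data is backed up elsewhere
    mdadm --zero-superblock /dev/sd[bcdef]                    # clear the md superblocks
    wipefs -a /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf    # clear remaining signatures (GPT, old ZFS labels)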

    What's in your /etc/mdadm/mdadm.conf file?
    Code:
    cat /etc/mdadm/mdadm.conf
    What do these show?

    Code:
    cat /proc/mdstat
    mdadm --detail --scan

  3. #3
    Join Date
    Mar 2009
    Beans
    19

    Re: MD raid trouble.

    Yeah, I plan on doing something like that. I wasn't aware ZFS would still be visible after the RAID initialization.

    I am aware of the status of btrfs, but I wanted something with checksumming, so I was willing to take that risk.

    The mdadm.conf is still in its original state (mostly commented out). Forgot to add the relevant info.

    Code:
    cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md0 : inactive sdf[3](S) sdb[4](S) sdd[5](S)
          5860540680 blocks super 1.2
    Code:
    sudo mdadm --detail --scan
    mdadm: md device /dev/md/0 does not appear to be active.
    Last edited by rcastberg; April 1st, 2012 at 06:13 PM.

  4. #4
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,136
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: MD raid trouble.

    It looks like you just need to try to force assemble the array and then put a proper mdadm.conf file back in place.

    Code:
    mdadm --assemble --force /dev/md0 /dev/sd[bcdef]
    If that assembles properly, then you'll want to recreate your mdadm.conf file.
    Code:
    echo "DEVICE partitions" > /etc/mdadm/mdadm.conf
    echo "HOMEHOST <system>" >> /etc/mdadm/mdadm.conf
    echo "MAILADDR root" >> /etc/mdadm/mdadm.conf
    mdadm --detail --scan >> /etc/mdadm/mdadm.conf
    I'm assuming that you didn't update your mdadm.conf after changing to RAID6, and that was the cause of this situation. Let me know how this assemble goes.
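    As an aside, mdadm can print the ARRAY line itself, which is the easy way to keep mdadm.conf in sync after a grow like yours (read-only, safe to run any time):
    Code:
    mdadm --examine --scan
    # prints something like: ARRAY /dev/md/0 metadata=1.2 UUID=0db0c336:f56bd888:2f9e92e4:c1d64c09 name=Asgard:0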

  5. #5
    Join Date
    Mar 2009
    Beans
    19

    Re: MD raid trouble.

    Cheers, tried that, no luck.

    Code:
    $ sudo mdadm --stop /dev/md0
    mdadm: stopped /dev/md0
    $ lsof /dev/sde 
    $ lsof /dev/sdd
    $sudo mdadm --assemble --force /dev/md0 /dev/sd[bcdef]
    mdadm: failed to add /dev/sdd to /dev/md0: Device or resource busy
    mdadm: failed to add /dev/sde to /dev/md0: Device or resource busy
    mdadm: /dev/md0 assembled from 2 drives and 1 spare - not enough to start the array.
    $mdadm --stop  /dev/md0
    mdadm: stopped /dev/md0
    Gives me:
    Code:
    $ sudo mdadm --detail /dev/md0 
    mdadm: md device /dev/md0 does not appear to be active.
    $ sudo cat /proc/mdstat 
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]                                                                        
    md0 : inactive sdf[3](S) sdc[5](S) sdb[4](S)
          5860540680 blocks super 1.2
           
    unused devices: <none>

    Not sure why I am getting the device or resource busy error, especially as those devices seem to be the clean ones in the mdadm --examine output. Do you have any idea why one of the drives comes up as a spare? Technically I only need 3 of the 5 disks to get at the data.
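    If it helps, I can also run read-only checks like these to see what is holding sdd and sde (just examples; dmsetup comes with the device-mapper tools):
    Code:
    ls /sys/block/sdd/holders /sys/block/sde/holders   # kernel devices stacked on top of these disks
    sudo dmsetup ls                                    # any device-mapper (dmraid/LVM) mappings
    sudo fuser -v /dev/sdd /dev/sde                    # processes holding the devices open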

    Thanks for your help so far.

    Rene

  6. #6
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,136
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: MD raid trouble.

    Those drives aren't the clean ones; they're just the ones that don't have any event counter value. That's not a good sign, as their event counters should be the same as the other disks', or very close.

    It appears that the dmraid driver is controlling these two drives, and that mdadm is confused and assembling incorrectly because of the lack of a proper mdadm.conf file (there's a quick dmraid check at the end of this post). Can you post what you have here, even if it's commented out?
    Code:
    cat /etc/mdadm/mdadm.conf
    Have you verified these disks are healthy via smartmontools?
    Code:
    smartctl -d ata -a /dev/sdd
    smartctl -d ata -a /dev/sde
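    Also, a quick dmraid check would show whether leftover fakeraid metadata is what's claiming those two disks. dmraid -r only lists what it detects and doesn't change anything:
    Code:
    dmraid -r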

  7. #7
    Join Date
    Mar 2009
    Beans
    19

    Re: MD raid trouble.

    Code:
    $ cat /etc/mdadm/mdadm.conf
    # mdadm.conf
    #
    # Please refer to mdadm.conf(5) for information about this file.
    #
    
    # by default, scan all partitions (/proc/partitions) for MD superblocks.
    # alternatively, specify devices to scan, using wildcards if desired.
    DEVICE partitions
    
    # auto-create devices with Debian standard permissions
    CREATE owner=root group=disk mode=0660 auto=yes
    
    # automatically tag new arrays as belonging to the local system
    HOMEHOST <system>
    
    # instruct the monitoring daemon where to send mail alerts
    MAILADDR rene-sysadm@castberg.org
    
    # definitions of existing MD arrays
    #ARRAY /dev/md0 level=raid1 num-devices=2 UUID=7c88ffa3:9ae1da97:856c4418:84634b4a
    For the smartctl self-tests I get
    SMART overall-health self-assessment test result: PASSED
    for both drives.
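    If it's useful I can also dig a bit deeper than the overall-health flag, something like this (smartctl's attribute table plus an extended self-test; the long test takes a few hours):
    Code:
    sudo smartctl -A /dev/sdd            # attribute table (reallocated/pending sectors, etc.)
    sudo smartctl -t long /dev/sdd       # start an extended self-test
    sudo smartctl -l selftest /dev/sdd   # read the result once it finishes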

  8. #8
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,136
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: MD raid trouble.

    Here's what I would do next. The first step is to get a proper mdadm.conf file in place so mdadm knows how to correctly assemble the array on bootup. Yours should be like this, based on what you've posted so far.

    Code:
    echo "DEVICE partitions" > /etc/mdadm/mdadm.conf
    echo "HOMEHOST <system>" >> /etc/mdadm/mdadm.conf
    echo "MAILADDR rene-sysadm@castberg.org" >> /etc/mdadm/mdadm.conf
    echo "ARRAY /dev/md0 metadata=1.2 name=Asgard:0 UUID=0db0c336:f56bd888:2f9e92e4:c1d64c09" >> /etc/mdadm/mdadm.conf
    Then I'd update initramfs.
    Code:
    update-initramfs -u
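    If you want to double-check that the new config actually made it into the initramfs, lsinitramfs (part of initramfs-tools) can list its contents:
    Code:
    lsinitramfs /boot/initrd.img-$(uname -r) | grep mdadm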
    Finally, reboot.
    Code:
    reboot
    Last edited by rubylaser; April 2nd, 2012 at 12:20 PM.

  9. #9
    Join Date
    Mar 2009
    Beans
    19

    Re: MD raid trouble.

    Right, I adjusted the mdadm.conf like you recommended and rebooted. I ended up in the initramfs boot console and tried a couple of the recommendations you made earlier, with no luck. I was still getting the error message about device or resource busy, so I rebooted with the nodmraid kernel option, and it still results in the same error message.

    Code:
    # mdadm.conf
    #
    # Please refer to mdadm.conf(5) for information about this file.
    #
    
    # by default, scan all partitions (/proc/partitions) for MD superblocks.
    # alternatively, specify devices to scan, using wildcards if desired.
    DEVICE partitions
    
    # auto-create devices with Debian standard permissions
    CREATE owner=root group=disk mode=0660 auto=yes
    
    # automatically tag new arrays as belonging to the local system
    HOMEHOST <system>
    
    # instruct the monitoring daemon where to send mail alerts
    MAILADDR rene-sysadm@castberg.org
    
    # definitions of existing MD arrays
    #ARRAY /dev/md0 level=raid1 num-devices=2 UUID=7c88ffa3:9ae1da97:856c4418:84634b4a
    ARRAY /dev/md0 metadata=1.2 name=Asgard:0 UUID=0db0c336:f56bd888:2f9e92e4:c1d64c09
    Code:
    [2012-04-02 21:44:54]  Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.0.0-16-server root=UUID=65041921-457e-4790-acde-79864823fe5f ro crashkernel=384M-2G:64M,2G-:128M quiet splash vt.handoff=7 nodmraid
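    (For anyone following along: making nodmraid permanent is just a matter of adding it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and running update-grub; the line below is an example based on the options shown above, not my exact file.)
    Code:
    # /etc/default/grub -- example line only
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nodmraid"
    # then regenerate the grub configuration
    sudo update-grub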

  10. #10
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,136
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: MD raid trouble.

    Have you tried to assemble this from the LiveCD to see if you can get it assembled and mounted? Just download and burn the LiveCD, and boot from it. Once it's running, just apt-get install mdadm, and then try to assemble the array.

    Code:
    mdadm --assemble /dev/md0 /dev/sd[bcdef]
    If you can get this assembled, then you should be able to mount the array and then back up your data.
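    If it does assemble, mounting it read-only for the backup would look something like this (the mount point is arbitrary, and -o ro keeps the btrfs volume untouched while you copy data off):
    Code:
    mkdir -p /mnt/md0
    mount -o ro /dev/md0 /mnt/md0
    ls /mnt/md0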
