
Thread: Changed drive sata port. RAID5 no longer working

  1. #1
    Join Date
    Aug 2021
    Beans
    19

    Changed drive sata port. RAID5 no longer working

    I have a pretty frustrating problem here. I have a server with 6 drives (sdc as the boot drive, sdd as a 2TB backup, and sda, sdb, sde, sdf as RAID5). Everything worked fine, except that lately a single drive (sde) would sometimes drop out of the array for no apparent reason. To fix this I had to reboot the server and re-add it via mdadm --manage /dev/md0 --re-add /dev/sde. I ran SMART tests on that drive and they never reported any problems. This drive (together with another drive, sdf) was connected to a PCI-E SATA adapter, so I thought that moving it to an onboard SATA port might fix the problem, and while I was at it I decided to move the other drive as well. So now everything is connected directly to the mobo.
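    For reference, the "fix" I had to apply every time it dropped out looked roughly like this (a sketch from memory, the exact invocations may have differed slightly):
    Code:
    # check the array state after the drive dropped out
    cat /proc/mdstat
    sudo mdadm --detail /dev/md0

    # re-add the dropped drive to the array
    sudo mdadm --manage /dev/md0 --re-add /dev/sde

    # run a short SMART self-test and review the results afterwards
    sudo smartctl -t short /dev/sde
    sudo smartctl -a /dev/sde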


    And here comes the problem. Once I turned the server back on, the RAID wouldn't start. For some unknown reason, moving the drives around somehow broke it.
    Running mdadm --detail on the RAID returns the following:
    Code:
    /dev/md0:
            Version : 1.2
       Creation Time : Thu May 6 10:19:43 2021
         Raid Level : raid5
       Used Dev Size : 18446744073709551615
        Raid Devices : 4
       Total Devices : 2
        Persistence : Superblock is persistent
    
        Update Time : Wed Aug 25 14:35:20 2021
           State : active, FAILED, Not Started
       Active Devices : 2
      Working Devices : 2
       Failed Devices : 0
       Spare Devices : 0
    
           Layout : left-symmetric
         Chunk Size : 512K
    
    Consistency Policy : unknown
    
            Name : HomeServer:0 (local to host HomeServer)
            UUID : 97298aaf:6c2e1a99:179c931c:db773d51
           Events : 30469
    
       Number Major Minor RaidDevice State
        -   0    0    0   removed
        -   0    0    1   removed
        -   0    0    2   removed
        -   0    0    3   removed
    
        -   8    0    0   sync /dev/sda
        -   8   16    1   sync /dev/sdb
    As you can see, only sda and sdb are recognized.


    So I stopped the RAID and tried to reassemble it using mdadm --assemble --scan -v, which gave me this:
    Code:
    mdadm: looking for devices for /dev/md0
    mdadm: Cannot assemble mbr metadata on /dev/sdf
    mdadm: No super block found on /dev/sde (Expected magic a92b4efc, got 00000000)
    mdadm: no RAID superblock on /dev/sde
    mdadm: No super block found on /dev/sdc5 (Expected magic a92b4efc, got db8ed616)
    mdadm: no RAID superblock on /dev/sdc5
    mdadm: /dev/sdc2 is too small for md: size is 2 sectors.
    mdadm: no RAID superblock on /dev/sdc2
    mdadm: No super block found on /dev/sdc1 (Expected magic a92b4efc, got 0000041b)
    mdadm: no RAID superblock on /dev/sdc1
    mdadm: No super block found on /dev/sdc (Expected magic a92b4efc, got b3453333)
    mdadm: no RAID superblock on /dev/sdc
    mdadm: No super block found on /dev/sdd1 (Expected magic a92b4efc, got 000004ea)
    mdadm: no RAID superblock on /dev/sdd1
    mdadm: No super block found on /dev/sdd (Expected magic a92b4efc, got 00000000)
    mdadm: no RAID superblock on /dev/sdd
    mdadm: /dev/sdb is identified as a member of /dev/md0, slot 1.
    mdadm: /dev/sda is identified as a member of /dev/md0, slot 0.
    mdadm: added /dev/sdb to /dev/md0 as 1
    mdadm: no uptodate device for slot 2 of /dev/md0
    mdadm: no uptodate device for slot 3 of /dev/md0
    mdadm: added /dev/sda to /dev/md0 as 0
    mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
    So the superblocks on the sde and sdf drives are suddenly gone. But how?
    Running mdadm --examine /dev/sd? seems to confirm this:
    Code:
    /dev/sda:
          Magic : a92b4efc
         Version : 1.2
       Feature Map : 0x1
       Array UUID : 97298aaf:6c2e1a99:179c931c:db773d51
          Name : HomeServer:0 (local to host HomeServer)
      Creation Time : Thu May 6 10:19:43 2021
       Raid Level : raid5
       Raid Devices : 4
     Avail Dev Size : 7813776048 (3725.90 GiB 4000.65 GB)
       Array Size : 11720658432 (11177.69 GiB 12001.95 GB)
      Used Dev Size : 7813772288 (3725.90 GiB 4000.65 GB)
       Data Offset : 261120 sectors
      Super Offset : 8 sectors
      Unused Space : before=261040 sectors, after=3760 sectors
          State : clean
        Device UUID : 0cc32dc9:f4d74ada:eb6b735d:aa646359
    Internal Bitmap : 8 sectors from superblock
       Update Time : Wed Aug 25 14:35:20 2021
      Bad Block Log : 512 entries available at offset 24 sectors
        Checksum : a0421fad - correct
             Events : 30469
         Layout : left-symmetric
         Chunk Size : 512K
      Device Role : Active device 0
      Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
    /dev/sdb:
          Magic : a92b4efc
         Version : 1.2
       Feature Map : 0x1
       Array UUID : 97298aaf:6c2e1a99:179c931c:db773d51
          Name : HomeServer:0 (local to host HomeServer)
      Creation Time : Thu May 6 10:19:43 2021
       Raid Level : raid5
       Raid Devices : 4
     Avail Dev Size : 7813776048 (3725.90 GiB 4000.65 GB)
       Array Size : 11720658432 (11177.69 GiB 12001.95 GB)
      Used Dev Size : 7813772288 (3725.90 GiB 4000.65 GB)
       Data Offset : 261120 sectors
      Super Offset : 8 sectors
      Unused Space : before=261040 sectors, after=3760 sectors
          State : clean
        Device UUID : 6cddf9b2:ff102af5:ad8a9440:5a34374a
    Internal Bitmap : 8 sectors from superblock
       Update Time : Wed Aug 25 14:35:20 2021
      Bad Block Log : 512 entries available at offset 24 sectors
        Checksum : 78e2618b - correct
             Events : 30469
         Layout : left-symmetric
         Chunk Size : 512K
      Device Role : Active device 1
      Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
    /dev/sdc:
      MBR Magic : aa55
    Partition[0] :  435445760 sectors at    2048 (type 83)
    Partition[1] :  33411074 sectors at  435449854 (type 05)
    /dev/sdd:
      MBR Magic : aa55
    Partition[0] : 3907029167 sectors at      1 (type ee)
    /dev/sde:
      MBR Magic : aa55
    Partition[0] : 4294967295 sectors at      1 (type ee)
    /dev/sdf:
      MBR Magic : aa55
    Partition[0] : 4294967295 sectors at      1 (type ee)

    So, do you guys have any idea what happened to my drives? And is this even fixable?
    I even tried putting everything back as it was (drives back on the PCI-E card as before), but nothing seems to fix it.
    Last edited by strat00s; August 25th, 2021 at 10:23 PM.

  2. #2
    Join Date
    Nov 2009
    Location
    Catalunya, Spain
    Beans
    14,504
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: Changed drive sata port. RAID5 no longer working

    First of all, I don't know what happened to the font of the post because it is very pale and difficult to read. Almost like inverted...

    Second, the problem might be that you are using the whole disks directly as RAID members. I have no definite proof of this, but especially when changing controllers, which you did, that may be the source of your problems.

    You should use partitions with mdadm, not the whole disk device itself. Create a partition (I usually leave the last 15-20MB unallocated) and use the partition as the mdadm member. This makes it easier to replace disks in the future, including when you buy another brand and the disk has a few sectors less.
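    Just as an illustration (a rough sketch, the device names here are only placeholders and the sizes are approximate), preparing a new member disk would look something like this:
    Code:
    # create a GPT and one partition, leaving roughly the last 20MB unallocated
    sudo parted /dev/sdX --script mklabel gpt mkpart primary 1MiB -20MiB
    sudo parted /dev/sdX --script set 1 raid on

    # then use the partition, not the whole disk, as the mdadm member,
    # e.g. when building a new array:
    # sudo mdadm --create /dev/mdN --level=5 --raid-devices=4 /dev/sd[wxyz]1
    That is for future arrays of course, not something to run on your current disks.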

    Going back to your problem. You already tried to reassemble and it didn't work.

    Option A
    Stop any failed array if it is currently assembled and try to force the assemble.
    Code:
    sudo mdadm --stop /dev/md0
    sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[abef]
    If that didn't work, try again but change the second command a little:
    Code:
    sudo mdadm --assemble --verbose --force --run /dev/md0 /dev/sd[abef]
    If option A didn't help, let us know and we will talk about option B.
    Darko.
    -----------------------------------------------------------------------
    Ubuntu 18.04 LTS 64bit

  3. #3
    Join Date
    Aug 2021
    Beans
    24

    Re: Changed drive sata port. RAID5 no longer working

    Did you switch it back to the SATA ports that worked, to see if it still works? How many SATA ports do you have, which are handled by the Intel chipset, and which by an optional add-on controller on the motherboard?

    Did a cable go bad when you switched?

    What level of SCSI? Does your RAID require the first disk to have the header? Does your SCSI RAID put headers on every disk? Did you read the manual?

    Switch it back to how it worked. Back it up if you don't have a backup. Then reload the backup onto your new configuration after switching the cables around. That's the safest thing to do.

  4. #4
    Join Date
    Aug 2021
    Beans
    19

    Re: Changed drive sata port. RAID5 no longer working

    Quote Originally Posted by darkod View Post
    First of all, I don't know what happened to the font of the post because it is very pale and difficult to read. Almost like inverted...

    Second, the problem might be that you are using the whole disks directly as RAID members. I have no definite proof of this, but especially when changing controllers, which you did, that may be the source of your problems.

    You should use partitions with mdadm, not the whole disk device itself. Create a partition (I usually leave the last 15-20MB unallocated) and use the partition as the mdadm member. This makes it easier to replace disks in the future, including when you buy another brand and the disk has a few sectors less.

    Going back to your problem. You already tried to reassemble and it didn't work.

    Option A
    Stop any failed array if it is currently assembled and try to force the assemble.
    Code:
    sudo mdadm --stop /dev/md0
    sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[abef]
    If that didn't work, try again but change the second command a little:
    Code:
    sudo mdadm --assemble --verbose --force --run /dev/md0 /dev/sd[abef]
    If option A didn't help let us know and we will talk option B.
    Sorry for the font. It should be ok now.
    I tried your option A and it returned this:
    Code:
    mdadm: looking for devices for /dev/md0
    mdadm: No super block found on /dev/sde (Expected magic a92b4efc, got 00000000)
    mdadm: no RAID superblock on /dev/sde
    mdadm: /dev/sde has no superblock - assembly aborted

    Which I think looks promising, as now it's only a single drive that seems to have a problem (and that is recoverable on RAID5).
    So maybe forcing it with just sd[abf] would reassemble it as a degraded RAID, and then I could wipe the sde drive and add it back once again?
    Please let me know if this makes any sense.
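    Something like this is what I have in mind (just a sketch, I have not run any of it yet):
    Code:
    # try assembling degraded from the three members that still look healthy
    sudo mdadm --stop /dev/md0
    sudo mdadm --assemble --verbose --force --run /dev/md0 /dev/sda /dev/sdb /dev/sdf

    # only if the array starts and the data checks out: wipe the old signatures
    # on sde and add it back so it rebuilds
    sudo wipefs -a /dev/sde
    sudo mdadm --manage /dev/md0 --add /dev/sde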
    Last edited by strat00s; August 25th, 2021 at 10:35 PM.

  5. #5
    Join Date
    Mar 2007
    Beans
    1,139

    Re: Changed drive sata port. RAID5 no longer working

    Have you checked the drives to see if one or more are malfunctioning?

  6. #6
    Join Date
    Aug 2021
    Beans
    19

    Re: Changed drive sata port. RAID5 no longer working

    Quote Originally Posted by not-published View Post
    Did you switch it back to the SATA ports that worked to see if that still works? How many sata, which are supported by Intel chipset which are supported by optional motherboard chipset?

    Did a cable go bad when you switched?

    What level of SCSI? Does your raid require the first disk have the header? Does your scsi raid put headers on every disk? Did you read the manual?

    Switch it back to when it worked. Back it up if you don't have a backup. Reload the backup onto your new configuration after switching cables around. Safest thing to do.
    Yes, they are now as they were before (connected to their old SATA ports).
    The cables are ok.
    Isn't SCSI different from SATA? And I have no idea about the headers. There isn't really a manual for unplugging and re-plugging SATA drives.

    Quote Originally Posted by rsteinmetz70112 View Post
    Have you checked the drives to see if one or more are malfunctioning?
    Yes. All the drives are working just fine. I ran a SMART test on all of them and none reported any problems.
    Last edited by strat00s; August 25th, 2021 at 11:44 PM.

  7. #7
    Join Date
    Nov 2009
    Location
    Catalunya, Spain
    Beans
    14,504
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: Changed drive sata port. RAID5 no longer working

    Yes, try the command with only abf and let us know. I'm not sure if it stopped right after detecting that sde doesn't have a superblock, but it's worth a try.

    Even if it fails again, there is still option B. But let's not get ahead of ourselves.

    PS. I would be trying this using the onboard ports, the way you want it to work. My first choice would be to try to repair it connected to the onboard ports, so that you can leave it like that, as you wanted. I know it's natural to try to put things back as they were, but that didn't assemble it either.
    Last edited by darkod; August 26th, 2021 at 04:45 PM.
    Darko.
    -----------------------------------------------------------------------
    Ubuntu 18.04 LTS 64bit

  8. #8
    Join Date
    Mar 2007
    Beans
    1,139

    Re: Changed drive sata port. RAID5 no longer working

    sdd, sde and sdf all report partition type ee. That is an "indication that this legacy MBR is followed by an EFI header".

    Are we sure the letters didn't change? It seems possible, with all the plugging and unplugging, that the current sde is the former sdd.
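    One way to check (just a suggestion) is to match the drives by serial number instead of by letter:
    Code:
    # stable identifiers show which letter currently maps to which physical disk
    ls -l /dev/disk/by-id/ | grep -v part

    # or query a drive's model and serial directly
    sudo smartctl -i /dev/sde | grep -E 'Model|Serial'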

  9. #9
    Join Date
    Aug 2021
    Beans
    19

    Re: Changed drive sata port. RAID5 no longer working

    Quote Originally Posted by darkod View Post
    Yes, try the command with only abf and let us know. Not sure if it stopped after detecting that sde doesn't have a superblock, but worth of try.

    Even if it fails again, there is still option B. But lets not get ahead of ourselves.

    PS. And I would be trying this using the onboard ports, the way you want it to work. My first option would be to try repair it connected to onboard so that you can leave it like that. As you wanted... I know it's natural to try put things back as they were but that didn't assemble it either.
    Sadly, it did just stop after the sde drive. And when I tried it with only abf, I got the same message about sdf.
    I re-plugged everything back as I wanted it (so all the drives are now connected to the onboard SATA ports). After checking the SMART status I determined that the former sde drive is now sdd and the former sdf is now sdc. So the RAID is now across a, b, c, d (or rather a, b, d, c, as that's the order I used when creating the RAID initially. At least I think...).
    So I guess next I am trying the second option.
    Also, while researching this, I found it might be possible to "fix" it by "recreating" the array in the exact same order as before, but with the --assume-clean flag.
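    From what I've read it would look roughly like this (NOT running it yet; as far as I understand, getting the device order, chunk size, metadata version or data offset wrong would destroy the data for good):
    Code:
    # last-resort re-creation over the existing members; it only preserves data
    # if every parameter matches the original array exactly
    sudo mdadm --create /dev/md0 --assume-clean --level=5 --chunk=512 \
        --metadata=1.2 --raid-devices=4 /dev/sda /dev/sdb /dev/sdd /dev/sdc
    # (the device order above is only a guess and would have to match the original creation order)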
    Last edited by strat00s; August 26th, 2021 at 05:42 PM.

  10. #10
    Join Date
    Aug 2021
    Beans
    19

    Re: Changed drive sata port. RAID5 no longer working

    Quote Originally Posted by rsteinmetz70112 View Post
    sdd sde and sdf all report ee partition type. That is "Indication that this legacy MBR is followed by an EFI header"

    Are we sure the letters didn't change? It seems possible with all the plugging and unplugging, possibly the current sde is the former sdd
    I am 100% certain that I plugged them back as they were. sde and sdf were the only drives from the RAID that I moved from the PCI-E controller onto the motherboard.
    And that's what I don't understand. Just by replugging the HDDs, they suddenly report as GPT with a protective MBR.
    Running gdisk -l on all my current drives gives me this:
    Code:
    stratos@HomeServer:~$ sudo gdisk -l /dev/sda
    GPT fdisk (gdisk) version 1.0.3
    
    Partition table scan:
      MBR: not present
      BSD: not present
      APM: not present
      GPT: not present
    
    Creating new GPT entries.
    Disk /dev/sda: 7814037168 sectors, 3.6 TiB
    Model: ST4000DM004-2CV1
    Sector size (logical/physical): 512/4096 bytes
    Disk identifier (GUID): 95DDD556-5AE4-4F70-9C39-2A8EF89732DE
    Partition table holds up to 128 entries
    Main partition table begins at sector 2 and ends at sector 33
    First usable sector is 34, last usable sector is 7814037134
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 7814037101 sectors (3.6 TiB)
    
    Number  Start (sector)    End (sector)  Size       Code  Name
    
    
    stratos@HomeServer:~$ sudo gdisk -l /dev/sdb
    GPT fdisk (gdisk) version 1.0.3
    
    Partition table scan:
      MBR: not present
      BSD: not present
      APM: not present
      GPT: not present
    
    Creating new GPT entries.
    Disk /dev/sdb: 7814037168 sectors, 3.6 TiB
    Model: ST4000DM004-2CV1
    Sector size (logical/physical): 512/4096 bytes
    Disk identifier (GUID): B3C11B98-898E-4CA8-B141-F79BBB63E669
    Partition table holds up to 128 entries
    Main partition table begins at sector 2 and ends at sector 33
    First usable sector is 34, last usable sector is 7814037134
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 7814037101 sectors (3.6 TiB)
    
    Number  Start (sector)    End (sector)  Size       Code  Name
    
    
    stratos@HomeServer:~$ sudo gdisk -l /dev/sdc
    GPT fdisk (gdisk) version 1.0.3
    
    Partition table scan:
      MBR: protective
      BSD: not present
      APM: not present
      GPT: present
    
    Found valid GPT with protective MBR; using GPT.
    Disk /dev/sdc: 7814037168 sectors, 3.6 TiB
    Model: WDC WD40EFAX-68J
    Sector size (logical/physical): 512/4096 bytes
    Disk identifier (GUID): 0CB38614-0CBB-4D78-8EB0-8F9FB189FF28
    Partition table holds up to 128 entries
    Main partition table begins at sector 2 and ends at sector 33
    First usable sector is 34, last usable sector is 7814037134
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 7814037101 sectors (3.6 TiB)
    
    Number  Start (sector)    End (sector)  Size       Code  Name
    
    
    stratos@HomeServer:~$ sudo gdisk -l /dev/sdd
    GPT fdisk (gdisk) version 1.0.3
    
    Partition table scan:
      MBR: protective
      BSD: not present
      APM: not present
      GPT: present
    
    Found valid GPT with protective MBR; using GPT.
    Disk /dev/sdd: 7814037168 sectors, 3.6 TiB
    Model: WDC WD40EFAX-68J
    Sector size (logical/physical): 512/4096 bytes
    Disk identifier (GUID): E06F92D5-8CDA-4938-81A7-AAD6E60EA8BA
    Partition table holds up to 128 entries
    Main partition table begins at sector 2 and ends at sector 33
    First usable sector is 34, last usable sector is 7814037134
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 7814037101 sectors (3.6 TiB)
    
    Number  Start (sector)    End (sector)  Size       Code  Name
    
    
    stratos@HomeServer:~$ sudo gdisk -l /dev/sde
    GPT fdisk (gdisk) version 1.0.3
    
    Partition table scan:
      MBR: MBR only
      BSD: not present
      APM: not present
      GPT: not present
    
    ***************************************************************
    Found invalid GPT and valid MBR; converting MBR to GPT format
    in memory. 
    ***************************************************************
    
    Disk /dev/sde: 468862128 sectors, 223.6 GiB
    Model: KINGSTON SA400S3
    Sector size (logical/physical): 512/512 bytes
    Disk identifier (GUID): 67FB853F-D130-4AA3-AC55-3B2A926EC13B
    Partition table holds up to 128 entries
    Main partition table begins at sector 2 and ends at sector 33
    First usable sector is 34, last usable sector is 468862094
    Partitions will be aligned on 2048-sector boundaries
    Total free space is 5229 sectors (2.6 MiB)
    
    Number  Start (sector)    End (sector)  Size       Code  Name
       1            2048       435447807   207.6 GiB   8300  Linux filesystem
       5       435449856       468860927   15.9 GiB    8200  Linux swap
    sde is clearly my boot drive (now), and sdc and sdd (they are now plugged into the mobo, which is why they are no longer sde and sdf) show a GPT partition table with a protective MBR.
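    To see exactly which signatures are sitting on the two former RAID members now, without touching anything, I can run something like this (read-only checks):
    Code:
    # list all filesystem / partition-table / raid signatures without modifying the disks
    sudo wipefs -n /dev/sdc /dev/sdd

    # dump the region where the 1.2 superblock should live (super offset 8 sectors = 4 KiB in)
    sudo dd if=/dev/sdc bs=512 skip=8 count=1 2>/dev/null | hexdump -C | head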
