
Thread: mdadm RAID5 array 2 drives removed

  1. #1
    Join Date
    Apr 2009
    Beans
    12

    mdadm RAID5 array 2 drives removed

    Hi.

    I have 12 disks in a RAID5 array with mdadm.
    Disks in the array: sd[bcdefghijklm]
    Array: md0

    Disk sdi failed and needed to be replaced.
    I replaced it with a new disk and started resyncing the array.
    During the resync I lost another disk: sde.
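
    For context, a typical mdadm replacement sequence looks roughly like this; this is only a sketch, assuming the new disk was partitioned to match and re-added as sdi1, not a record of the exact commands used:
    Code:
    sudo mdadm --manage /dev/md0 --remove /dev/sdi1   # drop the failed member
    sudo mdadm --manage /dev/md0 --add /dev/sdi1      # add the replacement; the resync starts
    cat /proc/mdstat                                  # watch the rebuild progress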

    Here is the output of cat /proc/mdstat:

    Code:
    # cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid5 sdc1[0] sdi1[14](S) sdj1[13] sdl1[12] sdg1[9] sde1[8](F) sdm1[7] sdk1[6] sdd1[11] sdf1[3] sdb1[2] sdh1[1]
          32232904704 blocks super 1.2 level 5, 512k chunk, algorithm 2 [12/10] [UUUUUUU_U_UU]
    
    unused devices: <none>
    Here is the output of mdadm -D /dev/md0:

    Code:
    # mdadm -D /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
      Used Dev Size : 2930264064 (2794.52 GiB 3000.59 GB)
       Raid Devices : 12
      Total Devices : 12
        Persistence : Superblock is persistent
    
        Update Time : Thu Jul 19 05:16:36 2018
              State : clean, FAILED
     Active Devices : 10
    Working Devices : 11
     Failed Devices : 1
      Spare Devices : 1
    
             Layout : left-symmetric
         Chunk Size : 512K
    
               Name : ingfil:0  (local to host ingfil)
               UUID : 9c5baecf:58212783:fe438251:3b70e113
             Events : 870676
    
        Number   Major   Minor   RaidDevice State
           0       8       33        0      active sync   /dev/sdc1
           1       8      113        1      active sync   /dev/sdh1
           2       8       17        2      active sync   /dev/sdb1
           3       8       81        3      active sync   /dev/sdf1
          11       8       49        4      active sync   /dev/sdd1
           6       8      161        5      active sync   /dev/sdk1
           7       8      193        6      active sync   /dev/sdm1
           7       0        0        7      removed
           9       8       97        8      active sync   /dev/sdg1
           9       0        0        9      removed
          12       8      177       10      active sync   /dev/sdl1
          13       8      145       11      active sync   /dev/sdj1
    
           8       8       65        -      faulty spare   /dev/sde1
          14       8      129        -      spare   /dev/sdi1
    Output of smartctl -x /dev/sde:
    Code:
    === START OF INFORMATION SECTION ===
    Model Family:     Western Digital Red (AF)
    Device Model:     WDC WD30EFRX-68EUZN0
    Serial Number:    WD-WMC4N1007575
    LU WWN Device Id: 5 0014ee 6594ee571
    Firmware Version: 80.00A80
    User Capacity:    3*000*592*982*016 bytes [3,00 TB]
    Sector Sizes:     512 bytes logical, 4096 bytes physical
    Rotation Rate:    5400 rpm
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   ACS-2 (minor revision not indicated)
    SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is:    Thu Jul 19 17:23:32 2018 CEST
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    AAM feature is:   Unavailable
    APM feature is:   Unavailable
    Rd look-ahead is: Enabled
    Write cache is:   Enabled
    ATA Security is:  Disabled, NOT FROZEN [SEC1]
    Wt Cache Reorder: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x00) Offline data collection activity
                                            was never started.
                                            Auto Offline Data Collection: Disabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                                            without error or no self-test has ever
                                            been run.
    Total time to complete Offline
    data collection:                (40320) seconds.
    Offline data collection
    capabilities:                    (0x7b) SMART execute Offline immediate.
                                            Auto Offline data collection on/off support.
                                            Suspend Offline collection upon new
                                            command.
                                            Offline surface scan supported.
                                            Self-test supported.
                                            Conveyance Self-test supported.
                                            Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                                            power-saving mode.
                                            Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                                            General Purpose Logging supported.
    Short self-test routine
    recommended polling time:        (   2) minutes.
    Extended self-test routine
    recommended polling time:        ( 404) minutes.
    Conveyance self-test routine
    recommended polling time:        (   5) minutes.
    SCT capabilities:              (0x703d) SCT Status supported.
                                            SCT Error Recovery Control supported.
                                            SCT Feature Control supported.
                                            SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    7
      3 Spin_Up_Time            POS--K   183   170   021    -    5841
      4 Start_Stop_Count        -O--CK   100   100   000    -    344
      5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
      7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
      9 Power_On_Hours          -O--CK   051   051   000    -    36167
     10 Spin_Retry_Count        -O--CK   100   100   000    -    0
     11 Calibration_Retry_Count -O--CK   100   100   000    -    0
     12 Power_Cycle_Count       -O--CK   100   100   000    -    185
    192 Power-Off_Retract_Count -O--CK   200   200   000    -    103
    193 Load_Cycle_Count        -O--CK   001   001   000    -    962848
    194 Temperature_Celsius     -O---K   118   097   000    -    32
    196 Reallocated_Event_Count -O--CK   200   200   000    -    0
    197 Current_Pending_Sector  -O--CK   200   200   000    -    1
    198 Offline_Uncorrectable   ----CK   100   253   000    -    0
    199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
    200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0
                                ||||||_ K auto-keep
                                |||||__ C event count
                                ||||___ R error rate
                                |||____ S speed/performance
                                ||_____ O updated online
                                |______ P prefailure warning
    
    General Purpose Log Directory Version 1
    SMART           Log Directory Version 1 [multi-sector log support]
    Address    Access  R/W   Size  Description
    0x00       GPL,SL  R/O      1  Log Directory
    0x01           SL  R/O      1  Summary SMART error log
    0x02           SL  R/O      5  Comprehensive SMART error log
    0x03       GPL     R/O      6  Ext. Comprehensive SMART error log
    0x06           SL  R/O      1  SMART self-test log
    0x07       GPL     R/O      1  Extended self-test log
    0x09           SL  R/W      1  Selective self-test log
    0x10       GPL     R/O      1  NCQ Command Error log
    0x11       GPL     R/O      1  SATA Phy Event Counters
    0x21       GPL     R/O      1  Write stream error log
    0x22       GPL     R/O      1  Read stream error log
    0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
    0xa0-0xa7  GPL,SL  VS      16  Device vendor specific log
    0xa8-0xb7  GPL,SL  VS       1  Device vendor specific log
    0xbd       GPL,SL  VS       1  Device vendor specific log
    0xc0       GPL,SL  VS       1  Device vendor specific log
    0xc1       GPL     VS      93  Device vendor specific log
    0xe0       GPL,SL  R/W      1  SCT Command/Status
    0xe1       GPL,SL  R/W      1  SCT Data Transfer
    
    SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
    Device Error Count: 4
            CR     = Command Register
            FEATR  = Features Register
            COUNT  = Count (was: Sector Count) Register
            LBA_48 = Upper bytes of LBA High/Mid/Low Registers ]  ATA-8
            LH     = LBA High (was: Cylinder High) Register    ]   LBA
            LM     = LBA Mid (was: Cylinder Low) Register      ] Register
            LL     = LBA Low (was: Sector Number) Register     ]
            DV     = Device (was: Device/Head) Register
            DC     = Device Control Register
            ER     = Error register
            ST     = Status register
    Powered_Up_Time is measured from power on, and printed as
    DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
    SS=sec, and sss=millisec. It "wraps" after 49.710 days.
    
    Error 4 [3] occurred at disk power-on lifetime: 36155 hours (1506 days + 11 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER -- ST COUNT  LBA_48  LH LM LL DV DC
      -- -- -- == -- == == == -- -- -- -- --
      40 -- 51 00 00 00 01 4e 8f fb 28 40 00  Error: UNC at LBA = 0x14e8ffb28 = 5613026088
    
      Commands leading to the command that caused the error were:
      CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
      -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
      60 01 40 00 80 00 01 4e 90 05 e8 40 08  1d+05:58:34.051  READ FPDMA QUEUED
      60 04 00 00 78 00 01 4e 90 01 e8 40 08  1d+05:58:34.051  READ FPDMA QUEUED
      60 00 40 00 70 00 01 4e 8f fd a8 40 08  1d+05:58:34.010  READ FPDMA QUEUED
      60 00 80 00 68 00 01 4e 8f fd 28 40 08  1d+05:58:34.010  READ FPDMA QUEUED
      60 00 80 00 60 00 01 4e 8f fc a8 40 08  1d+05:58:34.010  READ FPDMA QUEUED
    
    Error 3 [2] occurred at disk power-on lifetime: 36155 hours (1506 days + 11 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER -- ST COUNT  LBA_48  LH LM LL DV DC
      -- -- -- == -- == == == -- -- -- -- --
      40 -- 51 00 fe 00 01 4e 8f fb 28 40 00  Error: UNC at LBA = 0x14e8ffb28 = 5613026088
    
      Commands leading to the command that caused the error were:
      CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
      -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
      60 04 00 00 30 00 01 4e 8f f9 e8 40 08  1d+05:58:30.229  READ FPDMA QUEUED
      60 04 00 00 28 00 01 4e 8f f5 e8 40 08  1d+05:58:30.221  READ FPDMA QUEUED
      60 04 00 00 20 00 01 4e 8f f1 e8 40 08  1d+05:58:30.213  READ FPDMA QUEUED
      60 04 00 00 18 00 01 4e 8f ed e8 40 08  1d+05:58:30.209  READ FPDMA QUEUED
      60 04 00 00 10 00 01 4e 8f e9 e8 40 08  1d+05:58:30.201  READ FPDMA QUEUED
    
    Error 2 [1] occurred at disk power-on lifetime: 36132 hours (1505 days + 12 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER -- ST COUNT  LBA_48  LH LM LL DV DC
      -- -- -- == -- == == == -- -- -- -- --
      40 -- 51 00 00 00 01 4e 8f fb 28 40 00  Error: UNC at LBA = 0x14e8ffb28 = 5613026088
    
      Commands leading to the command that caused the error were:
      CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
      -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
      60 04 00 00 a8 00 01 4e 94 db 78 40 08     07:33:08.137  READ FPDMA QUEUED
      60 04 00 00 b0 00 01 4e 94 d7 78 40 08     07:33:08.132  READ FPDMA QUEUED
      60 04 00 00 c0 00 01 4e 94 d3 78 40 08     07:33:08.125  READ FPDMA QUEUED
      60 04 00 00 c8 00 01 4e 94 cf 78 40 08     07:33:08.120  READ FPDMA QUEUED
      60 04 00 00 d0 00 01 4e 94 cb 78 40 08     07:33:08.113  READ FPDMA QUEUED
    
    Error 1 [0] occurred at disk power-on lifetime: 36132 hours (1505 days + 12 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER -- ST COUNT  LBA_48  LH LM LL DV DC
      -- -- -- == -- == == == -- -- -- -- --
      40 -- 51 00 01 00 01 4e 8f fb 28 40 00  Error: UNC at LBA = 0x14e8ffb28 = 5613026088
    
      Commands leading to the command that caused the error were:
      CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
      -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
      60 04 00 00 88 00 01 4e 90 17 78 40 08     07:33:01.999  READ FPDMA QUEUED
      60 04 00 00 80 00 01 4e 90 13 78 40 08     07:33:01.999  READ FPDMA QUEUED
      60 04 00 00 78 00 01 4e 90 0f 78 40 08     07:33:01.999  READ FPDMA QUEUED
      60 04 00 00 70 00 01 4e 90 0b 78 40 08     07:33:01.999  READ FPDMA QUEUED
      60 04 00 00 68 00 01 4e 90 07 78 40 08     07:33:01.999  READ FPDMA QUEUED
    
    SMART Extended Self-test Log Version: 1 (1 sectors)
    No self-tests have been logged.  [To run self-tests, use: smartctl -t]
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
    
    SCT Status Version:                  3
    SCT Version (vendor specific):       258 (0x0102)
    SCT Support Level:                   1
    Device State:                        Active (0)
    Current Temperature:                    32 Celsius
    Power Cycle Min/Max Temperature:     30/38 Celsius
    Lifetime    Min/Max Temperature:     -2/54 Celsius
    Under/Over Temperature Limit Count:   0/0
    SCT Temperature History Version:     2
    Temperature Sampling Period:         1 minute
    Temperature Logging Interval:        1 minute
    Min/Max recommended Temperature:      0/60 Celsius
    Min/Max Temperature Limit:           -41/85 Celsius
    Temperature History Size (Index):    478 (367)
    
    Index    Estimated Time   Temperature Celsius
     368    2018-07-19 09:26    32  *************
     ...    ..(476 skipped).    ..  *************
     367    2018-07-19 17:23    32  *************
    
    SCT Error Recovery Control:
               Read:     70 (7,0 seconds)
              Write:     70 (7,0 seconds)
    
    Device Statistics (GP Log 0x04) not supported
    
    SATA Phy Event Counters (GP Log 0x11)
    ID      Size     Value  Description
    0x0001  2            0  Command failed due to ICRC error
    0x0002  2            0  R_ERR response for data FIS
    0x0003  2            0  R_ERR response for device-to-host data FIS
    0x0004  2            0  R_ERR response for host-to-device data FIS
    0x0005  2            0  R_ERR response for non-data FIS
    0x0006  2            0  R_ERR response for device-to-host non-data FIS
    0x0007  2            0  R_ERR response for host-to-device non-data FIS
    0x0008  2            0  Device-to-host non-data FIS retries
    0x0009  2            3  Transition from drive PhyRdy to drive PhyNRdy
    0x000a  2            4  Device-to-host register FISes sent due to a COMRESET
    0x000b  2            0  CRC errors within host-to-device FIS
    0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
    0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
    0x8000  4       151491  Vendor specific


    Is it possible to save the array?
    Last edited by vwlinux; July 19th, 2018 at 04:27 PM.

  2. #2
    Join Date
    Nov 2009
    Location
    Catalunya, Spain
    Beans
    14,560
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: mdadm RAID5 array 2 drives removed

    First, 12 disks is way too many for RAID5. At that size you need at least two-disk redundancy, and even more if possible...
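
    For reference only, and not something to attempt while the array is degraded: with one extra disk an md RAID5 can later be reshaped to RAID6. A rough sketch, where /dev/sdn1 is a hypothetical additional member and the backup file path is arbitrary:
    Code:
    sudo mdadm --manage /dev/md0 --add /dev/sdn1    # hypothetical 13th disk
    sudo mdadm --grow /dev/md0 --level=6 --raid-devices=13 --backup-file=/root/md0-reshape.backup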

    You should be able to bring the array online. First, take a deeper look at each member's superblock info:
    Code:
    sudo mdadm -E /dev/sd[bcdefghijklm]1
    That will show you the Events counters, which are important for figuring out how up to date each member is. After you post that output we can advise further...
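
    A quick way to line the counters up side by side, if it helps; this is just a filter over the same mdadm -E output:
    Code:
    sudo mdadm -E /dev/sd[bcdefghijklm]1 | grep -E '^/dev|Events'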

  3. #3
    Join Date
    Apr 2009
    Beans
    12

    Re: mdadm RAID5 array 2 drives removed

    I know.
    I was going to move all the data to a new server.
    Thank you for your reply.

    Here is the output of mdadm -E /dev/sd[bcdefghijklm]1:

    Code:
    # mdadm -E /dev/sd[bcdefghijklm]1
    /dev/sdb1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : bd70c6e3:6f0536d1:3cdc907d:cba8a589
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : 1c74c438 - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 2
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdc1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 571efc24:34a91143:075d21f8:ca040d3b
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : fb7df25d - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 0
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdd1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : f75ff05b:e15fbf7c:d712c3c0:aedf87d8
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : d23c7b6a - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 4
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sde1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : ff5db992:40dc1d0a:46c8b0be:df02baa5
    
        Update Time : Thu Jul 19 05:15:33 2018
           Checksum : 6181ce3e - correct
             Events : 870672
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 7
       Array State : AAAAAAAAAAAA ('A' == active, '.' == missing)
    /dev/sdf1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 6599e8ed:dddecd34:76a41633:fb5fbb58
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : eca45b8 - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 3
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdg1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : b050efc5:c7e92c1b:b700de78:043bafb7
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : 71eb3f3d - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 8
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdh1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : fd99f2f3:a9e46376:3ddfcc1b:f9baff23
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : a64e1df - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 1
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdi1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860529005 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
      Used Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
        Data Offset : 4096 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : a8da1419:de490f38:6e2519d8:35887114
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : 9df0a6a5 - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : spare
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdj1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860529005 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
      Used Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
        Data Offset : 4096 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : f4c9feba:470c018d:99a83fcc:5fc18c66
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : db0e14af - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 11
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdk1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 1c15eb5c:751ded1b:e5837644:4393b5af
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : cd4612c0 - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 5
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdl1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 15070871:74c810a5:5491dfaa:3794a736
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : 57e1be22 - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 10
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)
    /dev/sdm1:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 9c5baecf:58212783:fe438251:3b70e113
               Name : ingfil:0  (local to host ingfil)
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
       Raid Devices : 12
    
     Avail Dev Size : 5860528128 (2794.52 GiB 3000.59 GB)
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : a75f6f2a:bc3db33a:40bd7b6b:b1e75168
    
        Update Time : Thu Jul 19 05:16:36 2018
           Checksum : 99320b5c - correct
             Events : 870676
    
             Layout : left-symmetric
         Chunk Size : 512K
    
       Device Role : Active device 6
       Array State : AAAAAAA.A.AA ('A' == active, '.' == missing)

  4. #4
    Join Date
    Nov 2009
    Location
    Catalunya, Spain
    Beans
    14,560
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: mdadm RAID5 array 2 drives removed

    Hmmm, this should be easy... All of your members (including sdi1) report an Events counter of 870676, except sde1, which says 870672. The only oddity with sdi1 seems to be that it is marked as a spare right now, for whatever reason. But if its counter really is 870676, it should fit back into the array just fine.

    First stop the array and try to force-assemble it, leaving sde1 out for the moment:
    Code:
    sudo mdadm --stop /dev/md0
    sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcdfghijklm]1
    Follow the messages it gives you during the assembly and let's see how that goes...
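
    If the forced assembly does bring md0 up (degraded), it may be worth mounting it read-only and copying the most important data off before any rebuild puts extra stress on the remaining disks. A sketch, assuming the filesystem on md0 mounts directly and /mnt is a free mount point:
    Code:
    cat /proc/mdstat                 # confirm md0 is active, even if degraded
    sudo mount -o ro /dev/md0 /mnt   # mount read-only while copying data off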
    Darko.
    -----------------------------------------------------------------------
    Ubuntu 18.04 LTS 64bit

  5. #5
    Join Date
    Apr 2009
    Beans
    12

    Re: mdadm RAID5 array 2 drives removed

    Here is the output of the command:
    Code:
    # mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcdfghijklm]1
    mdadm: looking for devices for /dev/md0
    mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 2.
    mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 0.
    mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 4.
    mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 3.
    mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 8.
    mdadm: /dev/sdh1 is identified as a member of /dev/md0, slot 1.
    mdadm: /dev/sdi1 is identified as a member of /dev/md0, slot -1.
    mdadm: /dev/sdj1 is identified as a member of /dev/md0, slot 11.
    mdadm: /dev/sdk1 is identified as a member of /dev/md0, slot 5.
    mdadm: /dev/sdl1 is identified as a member of /dev/md0, slot 10.
    mdadm: /dev/sdm1 is identified as a member of /dev/md0, slot 6.
    mdadm: added /dev/sdh1 to /dev/md0 as 1
    mdadm: added /dev/sdb1 to /dev/md0 as 2
    mdadm: added /dev/sdf1 to /dev/md0 as 3
    mdadm: added /dev/sdd1 to /dev/md0 as 4
    mdadm: added /dev/sdk1 to /dev/md0 as 5
    mdadm: added /dev/sdm1 to /dev/md0 as 6
    mdadm: no uptodate device for slot 7 of /dev/md0
    mdadm: added /dev/sdg1 to /dev/md0 as 8
    mdadm: no uptodate device for slot 9 of /dev/md0
    mdadm: added /dev/sdl1 to /dev/md0 as 10
    mdadm: added /dev/sdj1 to /dev/md0 as 11
    mdadm: added /dev/sdi1 to /dev/md0 as -1
    mdadm: added /dev/sdc1 to /dev/md0 as 0
    mdadm: /dev/md0 assembled from 10 drives and 1 spare - not enough to start the array.
    Is it possible to get /dev/sdi1 included so that the array can start?

    /dev/sdi1 is the disk that I replaced.
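
    One way to see whether sdi1 ever recorded any rebuild progress is to re-read its superblock; a member that was partway through recovery normally shows a Recovery Offset line next to its role. This is just a filter over the mdadm -E output already posted above:
    Code:
    sudo mdadm -E /dev/sdi1 | grep -E 'Device Role|Recovery Offset|Events'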

  6. #6
    Join Date
    Apr 2009
    Beans
    12

    Re: mdadm RAID5 array 2 drives removed

    Here is the content of mdadm.conf:

    Code:
    # mdadm.conf
    #
    # Please refer to mdadm.conf(5) for information about this file.
    #
    
    # by default (built-in), scan all partitions (/proc/partitions) and all
    # containers for MD superblocks. alternatively, specify devices to scan, using
    # wildcards if desired.
    #DEVICE partitions containers
    
    # auto-create devices with Debian standard permissions
    CREATE owner=root group=disk mode=0660 auto=yes
    
    # automatically tag new arrays as belonging to the local system
    HOMEHOST <system>
    
    # instruct the monitoring daemon where to send mail alerts
    MAILADDR root
    
    # definitions of existing MD arrays
    #ARRAY /dev/md/0 metadata=1.2 UUID=9c5baecf:58212783:fe438251:3b70e113 name=myserver:0
    
    # This file was auto-generated on Wed, 21 May 2014 11:57:09 +0200
    # by mkconf $Id$
    ARRAY /dev/md/0 metadata=1.2 UUID=9c5baecf:58212783:fe438251:3b70e113 name=myserver:0 spares=1
    I ran this command:
    Code:
    mdadm --examine --scan >> /etc/mdadm/mdadm.conf
    Before that I had no spare in the array.
    I think sdi1 is listed as a spare because it had not finished rebuilding.
    Could this be the case?
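
    Note that appending the --examine --scan output leaves two ARRAY definitions for the same UUID in the file (the old commented-out line and the new one with spares=1). A sketch of tidying that up afterwards, assuming the usual Debian/Ubuntu layout:
    Code:
    sudoedit /etc/mdadm/mdadm.conf   # keep a single ARRAY line; the UUID alone is enough
    sudo update-initramfs -u         # make the initramfs pick up the change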
    Last edited by vwlinux; July 21st, 2018 at 10:53 AM.

  7. #7
    Join Date
    Nov 2009
    Location
    Catalunya, Spain
    Beans
    14,560
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: mdadm RAID5 array 2 drives removed

    I am not sure what happened during the rebuild. The counter for sdi1 looked OK, but I am not 100% sure what it should look like if the rebuild didn't complete.

    You can try assembling the array using sde1 instead of sdi1 in that same command. The counter for sde1 is very close, so it should start... Don't forget to stop the array first.

    That is what I would try next: run those same commands, only with sde1 instead of sdi1 in the member list.
    Darko.
    -----------------------------------------------------------------------
    Ubuntu 18.04 LTS 64bit

  8. #8
    Join Date
    Apr 2009
    Beans
    12

    Re: mdadm RAID5 array 2 drives removed

    Here is the output of the command with sde1 instead of sdi1:

    Code:
    # mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcdefghjklm]1
    mdadm: looking for devices for /dev/md0
    mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 2.
    mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 0.
    mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 4.
    mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 7.
    mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 3.
    mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 8.
    mdadm: /dev/sdh1 is identified as a member of /dev/md0, slot 1.
    mdadm: /dev/sdj1 is identified as a member of /dev/md0, slot 11.
    mdadm: /dev/sdk1 is identified as a member of /dev/md0, slot 5.
    mdadm: /dev/sdl1 is identified as a member of /dev/md0, slot 10.
    mdadm: /dev/sdm1 is identified as a member of /dev/md0, slot 6.
    mdadm: forcing event count in /dev/sde1(7) from 870672 upto 870676
    mdadm: clearing FAULTY flag for device 3 in /dev/md0 for /dev/sde1
    mdadm: Marking array /dev/md0 as 'clean'
    mdadm: added /dev/sdh1 to /dev/md0 as 1
    mdadm: added /dev/sdb1 to /dev/md0 as 2
    mdadm: added /dev/sdf1 to /dev/md0 as 3
    mdadm: added /dev/sdd1 to /dev/md0 as 4
    mdadm: added /dev/sdk1 to /dev/md0 as 5
    mdadm: added /dev/sdm1 to /dev/md0 as 6
    mdadm: added /dev/sde1 to /dev/md0 as 7
    mdadm: added /dev/sdg1 to /dev/md0 as 8
    mdadm: no uptodate device for slot 9 of /dev/md0
    mdadm: added /dev/sdl1 to /dev/md0 as 10
    mdadm: added /dev/sdj1 to /dev/md0 as 11
    mdadm: added /dev/sdc1 to /dev/md0 as 0
    mdadm: /dev/md0 assembled from 11 drives - not enough to start the array.
    Here is the output of cat /proc/mdstat
    Code:
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : inactive sdc1[0](S) sdj1[13](S) sdl1[12](S) sdg1[9](S) sde1[8](S) sdm1[7](S) sdk1[6](S) sdd1[11](S) sdf1[3](S) sdb1[2](S) sdh1[1](S)
          32232905142 blocks super 1.2
    
    unused devices: <none>
    Last edited by vwlinux; July 22nd, 2018 at 07:26 PM.

  9. #9
    Join Date
    Nov 2009
    Location
    Catalunya, Spain
    Beans
    14,560
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: mdadm RAID5 array 2 drives removed

    Strange, according to that last message it assembled 11 drives but still refused to start the array. You could try forcing it to run by adding the --run parameter to the command. That would be the last attempt with --assemble. If that doesn't work, there is only one more thing left to do...
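
    That would look something like this, reusing the same member list as before (sde1 in, sdi1 out) and stopping any partial assembly first:
    Code:
    sudo mdadm --stop /dev/md0
    sudo mdadm --assemble --verbose --force --run /dev/md0 /dev/sd[bcdefghjklm]1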
    Darko.
    -----------------------------------------------------------------------
    Ubuntu 18.04 LTS 64bit

  10. #10
    Join Date
    Apr 2009
    Beans
    12

    Re: mdadm RAID5 array 2 drives removed

    I ran the command with sde1, but nothing happened.

    Then I rebooted the server, and this is the output of cat /proc/mdstat:

    Code:
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid5 sdh1[1] sdf1[3] sdb1[2] sdj1[13] sdm1[7] sdk1[6] sdi1[14] sdg1[9] sdl1[12] sde1[8] sdc1[0] sdd1[11]
          32232904704 blocks super 1.2 level 5, 512k chunk, algorithm 2 [12/11] [UUUUUUUUU_UU]
          [>....................]  recovery =  4.3% (128174280/2930264064) finish=338.1min speed=138120K/sec
    
    unused devices: <none>
    This was the output of mdadm -D /dev/md0
    Code:
    # mdadm -D /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
      Used Dev Size : 2930264064 (2794.52 GiB 3000.59 GB)
       Raid Devices : 12
      Total Devices : 12
        Persistence : Superblock is persistent
    
        Update Time : Mon Jul 23 23:57:53 2018
              State : clean, degraded, recovering
     Active Devices : 11
    Working Devices : 12
     Failed Devices : 0
      Spare Devices : 1
    
             Layout : left-symmetric
         Chunk Size : 512K
    
     Rebuild Status : 4% complete
    
               Name : ingfil:0  (local to host ingfil)
               UUID : 9c5baecf:58212783:fe438251:3b70e113
             Events : 870682
    
        Number   Major   Minor   RaidDevice State
           0       8       33        0      active sync   /dev/sdc1
           1       8      113        1      active sync   /dev/sdh1
           2       8       17        2      active sync   /dev/sdb1
           3       8       81        3      active sync   /dev/sdf1
          11       8       49        4      active sync   /dev/sdd1
           6       8      161        5      active sync   /dev/sdk1
           7       8      193        6      active sync   /dev/sdm1
           8       8       65        7      active sync   /dev/sde1
           9       8       97        8      active sync   /dev/sdg1
          14       8      129        9      spare rebuilding   /dev/sdi1
          12       8      177       10      active sync   /dev/sdl1
          13       8      145       11      active sync   /dev/sdj1
    This is the output from the server now:

    Code:
    cat /proc/mdstat
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid5 sde1[8](F) sdd1[11] sdb1[2] sdc1[0] sdg1[9] sdf1[3] sdk1[6] sdl1[12] sdi1[14](S) sdh1[1] sdj1[13] sdm1[7]
          32232904704 blocks super 1.2 level 5, 512k chunk, algorithm 2 [12/10] [UUUUUUU_U_UU]
    
    unused devices: <none>
    Code:
    # mdadm -D /dev/md0
    /dev/md0:
            Version : 1.2
      Creation Time : Thu Jan 24 20:36:48 2013
         Raid Level : raid5
         Array Size : 32232904704 (30739.69 GiB 33006.49 GB)
      Used Dev Size : 2930264064 (2794.52 GiB 3000.59 GB)
       Raid Devices : 12
      Total Devices : 12
        Persistence : Superblock is persistent
    
        Update Time : Tue Jul 24 07:09:10 2018
              State : clean, FAILED
     Active Devices : 10
    Working Devices : 11
     Failed Devices : 1
      Spare Devices : 1
    
             Layout : left-symmetric
         Chunk Size : 512K
    
               Name : ingfil:0  (local to host ingfil)
               UUID : 9c5baecf:58212783:fe438251:3b70e113
             Events : 870772
    
        Number   Major   Minor   RaidDevice State
           0       8       33        0      active sync   /dev/sdc1
           1       8      113        1      active sync   /dev/sdh1
           2       8       17        2      active sync   /dev/sdb1
           3       8       81        3      active sync   /dev/sdf1
          11       8       49        4      active sync   /dev/sdd1
           6       8      161        5      active sync   /dev/sdk1
           7       8      193        6      active sync   /dev/sdm1
           7       0        0        7      removed
           9       8       97        8      active sync   /dev/sdg1
           9       0        0        9      removed
          12       8      177       10      active sync   /dev/sdl1
          13       8      145       11      active sync   /dev/sdj1
    
           8       8       65        -      faulty spare   /dev/sde1
          14       8      129        -      spare   /dev/sdi1
    Is it possible to replace the sde disk with the disk I just removed?
    It might have fewer errors.
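
    For what it's worth, when a member keeps throwing read errors during every rebuild attempt, one common approach (not something tried in this thread) is to clone the failing disk onto a healthy one with GNU ddrescue and assemble from the clone instead; /dev/sdX and the map file path below are placeholders:
    Code:
    sudo ddrescue -f -n /dev/sde /dev/sdX /root/sde.map    # first pass: copy everything readable, skip bad areas
    sudo ddrescue -f -r3 /dev/sde /dev/sdX /root/sde.map   # retry the bad areas a few times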
    Last edited by vwlinux; July 24th, 2018 at 07:55 AM.
