Results 11 to 20 of 24

Thread: When creating a RAID5 array, mdadm will automatically create a degraded array

  1. #11
    Join Date
    Oct 2009
    Beans
    Hidden!
    Distro
    Ubuntu 22.04 Jammy Jellyfish

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    Quote Originally Posted by rubylaser View Post
    All you would need to do is write a simple Bash script on your regular fileserver that uses etherwake to wake the backup-box by MAC address, pings it until it responds, tests SSH to make sure it's available, and then rsyncs from your fileserver to the backup-box (set up keyless SSH). Once the rsync is done, use ssh to shut the remote box down (ssh root@backup-box 'shutdown -h now'). This way you can schedule the backup via cron for when you are off at work, or not trying to sleep, etc.
    That sounds like a plan. I plan on using keys for ssh (duh), so sending a remote command via SSH shouldn't be that hard to script.
    Come to #ubuntuforums! We have cookies! | Basic Ubuntu Security Guide

    Tomorrow's an illusion and yesterday's a dream, today is a solution...
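For anyone following along, the wake / wait / rsync / shutdown sequence described above could look roughly like this. It is only a sketch: the MAC address, host name, paths, retry counts, and the function names `wait_for` and `run_backup` are all made-up placeholders, and it assumes key-based SSH as root is already set up and that `etherwake` is run with enough privileges to send the magic packet.

```shell
#!/bin/sh
# Sketch only: wake the backup box, wait for it, rsync, then power it off.
# Every value below is a placeholder (assumption), not taken from the thread.
BACKUP_MAC="${BACKUP_MAC:-00:11:22:33:44:55}"   # hypothetical MAC address
BACKUP_HOST="${BACKUP_HOST:-backup-box}"
SRC_DIR="${SRC_DIR:-/srv/data/}"                # hypothetical source path
DEST_DIR="${DEST_DIR:-/srv/backup/}"            # hypothetical destination path
MAX_TRIES="${MAX_TRIES:-30}"                    # attempts per wait step
RETRY_DELAY="${RETRY_DELAY:-10}"                # seconds between attempts

# Run the given command until it succeeds or MAX_TRIES attempts are used up.
wait_for() {
    wf_try=1
    while [ "$wf_try" -le "$MAX_TRIES" ]; do
        "$@" >/dev/null 2>&1 && return 0
        sleep "$RETRY_DELAY"
        wf_try=$((wf_try + 1))
    done
    return 1
}

# Each step returns a distinct code so the caller can tell which one failed.
run_backup() {
    etherwake "$BACKUP_MAC" || return 1
    wait_for ping -c 1 -W 2 "$BACKUP_HOST" || return 2
    wait_for ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$BACKUP_HOST" true || return 3
    rsync -a "$SRC_DIR" "root@$BACKUP_HOST:$DEST_DIR" || return 4
    # The connection may drop as the box goes down, so the exit status of
    # this last ssh is not treated as a failure.
    ssh -o BatchMode=yes "root@$BACKUP_HOST" 'shutdown -h now' || true
    return 0
}
```

Saved as a script with a final `run_backup` line, this could be scheduled from cron. Note the box is only shut down after rsync succeeds, so a failed backup leaves it running for inspection.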

  2. #12
    Join Date
    Dec 2005
    Location
    USA
    Beans
    134
    Distro
    Ubuntu Development Release

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    Quote Originally Posted by rubylaser View Post
    finally, what does the SMART info look like on that disk?
    Code:
    sudo apt-get install smartmontools
    smartctl -a /dev/sdh
    I think your guess that I have some bad drives is correct.

    This one looks like the main offender. Do you agree?
    Code:
    root@dalserver:/dev/disk/by-id# smartctl -a /dev/sdc
    smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF INFORMATION SECTION ===
    Model Family:     Western Digital Caviar Black
    Device Model:     WDC WD1002FAEX-00Z3A0
    Serial Number:    WD-WCATR2712007
    LU WWN Device Id: 5 0014ee 25a38c4f2
    Firmware Version: 05.01D05
    User Capacity:    1,000,204,886,016 bytes [1.00 TB]
    Sector Size:      512 bytes logical/physical
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   8
    ATA Standard is:  Exact ATA specification draft version not indicated
    Local Time is:    Wed Jul 10 23:11:34 2013 CDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x85)    Offline data collection activity
                        was aborted by an interrupting command from host.
                        Auto Offline Data Collection: Enabled.
    Self-test execution status:      ( 118)    The previous self-test completed having
                        the read element of the test failed.
    Total time to complete Offline
    data collection:         (16200) seconds.
    Offline data collection
    capabilities:              (0x7b) SMART execute Offline immediate.
                        Auto Offline data collection on/off support.
                        Suspend Offline collection upon new
                        command.
                        Offline surface scan supported.
                        Self-test supported.
                        Conveyance Self-test supported.
                        Selective Self-test supported.
    SMART capabilities:            (0x0003)    Saves SMART data before entering
                        power-saving mode.
                        Supports SMART auto save timer.
    Error logging capability:        (0x01)    Error logging supported.
                        General Purpose Logging supported.
    Short self-test routine
    recommended polling time:      (   2) minutes.
    Extended self-test routine
    recommended polling time:      ( 187) minutes.
    Conveyance self-test routine
    recommended polling time:      (   5) minutes.
    SCT capabilities:            (0x3037)    SCT Status supported.
                        SCT Feature Control supported.
                        SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x002f   189   189   051    Pre-fail  Always       -       69505
      3 Spin_Up_Time            0x0027   172   169   021    Pre-fail  Always       -       4383
      4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       190
      5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       9182
     10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
     11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       188
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       65
    193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       124
    194 Temperature_Celsius     0x0022   104   094   000    Old_age   Always       -       43
    196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
    198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
    200 Multi_Zone_Error_Rate   0x0008   195   195   000    Old_age   Offline      -       1099
    
    SMART Error Log Version: 1
    No Errors Logged
    
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed: read failure       60%      9182         807585
    # 2  Extended offline    Completed: read failure       90%      8324         20253333
    # 3  Short offline       Completed without error       00%        65         -
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
    Something weird may be going on with this one also. It may be nothing. Perhaps this drive, or the controller it is on, does not support as many SMART features.
    Code:
    root@dalserver:/dev/disk/by-id# smartctl -a /dev/sdg
    smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF INFORMATION SECTION ===
    Model Family:     Western Digital Caviar Black
    Device Model:     WDC WD1002FAEX-00Z3A0
    Serial Number:    WD-WCATR0300891
    Firmware Version: 0956
    User Capacity:    1,000,204,886,016 bytes [1.00 TB]
    Sector Size:      512 bytes logical/physical
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   7
    ATA Standard is:  Exact ATA specification draft version not indicated
    Local Time is:    Wed Jul 10 23:09:08 2013 CDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x00)    Offline data collection activity
                        was never started.
                        Auto Offline Data Collection: Disabled.
    Total time to complete Offline
    data collection:         (    0) seconds.
    Offline data collection
    capabilities:              (0x00)     Offline data collection not supported.
    SMART capabilities:            (0x0000)    Automatic saving of SMART data
                        is not implemented.
    Error logging capability:        (0x00)    Error logging NOT supported.
                        No General Purpose Logging support.
    
    SMART Error Log not supported
    SMART Self-test Log not supported
    Device does not support Selective Self Tests/Logging
    Anyway, I'll tear the system down in the next few days, note which controllers these two drives are on, and replace as needed. The second drive may still be good. I will try moving it to another controller and see if I can get more detailed SMART info.

    Thanks!

  3. #13
    Join Date
    Oct 2009
    Beans
    Hidden!
    Distro
    Ubuntu 22.04 Jammy Jellyfish

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    The first drive looks ok, even if it is running a bit hot. You might want to take a look at the thread I created asking about SMART data.

    The second disk looks.. odd. I have never had a drive not give me SMART data.

    EDIT: The first disk has a pending sector, so you should probably do a verify on it to see if the drive marks it as bad.
    Come to #ubuntuforums! We have cookies! | Basic Ubuntu Security Guide

    Tomorrow's an illusion and yesterday's a dream, today is a solution...
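A quick way to spot things like that pending sector without reading the whole table is to filter the attribute list. The helper below is a sketch, not part of smartmontools: `check_smart_attrs` is a made-up name, and the choice of attributes 5, 197 and 198 is just an assumption about which counters matter most here. It reads `smartctl -A` output on stdin and assumes the raw value is the tenth column, as in the output posted above.

```shell
# Print "ID NAME RAW" for attributes 5, 197 and 198 whose raw value is
# above zero. Exits 1 if any were found, 0 otherwise.
check_smart_attrs() {
    awk '
        $1 ~ /^(5|197|198)$/ && $10 + 0 > 0 {
            print $1, $2, $10
            found = 1
        }
        END { exit (found + 0) }
    '
}
```

Usage would be something like `sudo smartctl -A /dev/sdc | check_smart_attrs`. For the first disk in this thread it should print the Current_Pending_Sector line.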

  4. #14
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,136
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    Are there any other disks connected to the same controller as /dev/sdg (that would tell you if it's the disk or a controller limitation)? If so, do those provide SMART data or not? Also, how about posting the other info I asked for as well regarding mdadm?

  5. #15
    Join Date
    Dec 2005
    Location
    USA
    Beans
    134
    Distro
    Ubuntu Development Release

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    Quote Originally Posted by CharlesA View Post
    The first drive looks ok, even if it is running a bit hot. You might want to take a look at the thread I created asking about SMART data.

    The second disk looks.. odd. I have never had a drive not give me SMART data.

    EDIT: The first disk has a pending sector, so you should probably do a verify on it to see if the drive marks it as bad.
    Code:
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed: read failure       60%      9182         807585
    # 2  Extended offline    Completed: read failure       90%      8324         20253333
    # 3  Short offline       Completed without error       00%        65         -
    The two "Completed: read failure" lines are the ones that I am worried about. Doesn't that mean the drive is bad?

    How do I "verify it" as you said? Run the LONG test?
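On the long test: `smartctl -t long /dev/sdc` starts the extended self-test (the drive above estimates 187 minutes for it), and `smartctl -l selftest /dev/sdc` prints the self-test log once it finishes. To pull just the failing LBAs out of that log, something like the following sketch would do. The name `failed_selftest_lbas` is made up, and it assumes the log layout shown above, where the LBA is the last column.

```shell
# Read a SMART self-test log on stdin and print the LBA_of_first_error
# for every entry that ended in a read failure.
failed_selftest_lbas() {
    awk '/read failure/ { print $NF }'
}
```

Usage would be `sudo smartctl -l selftest /dev/sdc | failed_selftest_lbas`. Different LBAs on different runs, as in the log above, suggest more than one weak spot on the disk.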

  6. #16
    Join Date
    Oct 2009
    Beans
    Hidden!
    Distro
    Ubuntu 22.04 Jammy Jellyfish

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    Yikes! I did not see that part. Is that drive still under warranty? I'd run the WD Life Guard tools to verify it is bad and RMA that sucker.

    The only problem is that I could only find the Life Guard tools for Windows and DOS.
    Come to #ubuntuforums! We have cookies! | Basic Ubuntu Security Guide

    Tomorrow's an illusion and yesterday's a dream, today is a solution...

  7. #17
    Join Date
    Dec 2005
    Location
    USA
    Beans
    134
    Distro
    Ubuntu Development Release

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    Quote Originally Posted by rubylaser View Post
    Are there any other disks connected to the same controller as /dev/sdg (that would tell you if it's the disk or a controller limitation)? If so, do those provide SMART data or not? Also, how about posting the other info I asked for as well regarding mdadm?
    I was going to, but I have since destroyed the array and recreated it as RAID6 as you suggested. I ran the same command as before, just with RAID6:
    Code:
    mdadm --create /dev/md0 --level=6 --raid-devices=7 /dev/sd[b-h]1
    Surprisingly, the RAID was created without any issues, much like your example. All drives were active and it started its first sync, although the speed was only 150K/s. So I started looking for the bad drives as you suggested and found those two that stood out.
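The sync speed mentioned above is reported in /proc/mdstat while a resync is running. A small sketch for pulling out just that figure is below. `md_sync_speed` is a made-up name, and it assumes the usual `speed=<n>K/sec` field on the progress line. The kernel's own throttles live in /proc/sys/dev/raid/speed_limit_min and speed_limit_max, but a rate this low may well be caused by a struggling disk rather than by those limits.

```shell
# Read /proc/mdstat-style text on stdin and print the resync speed field,
# for example "150K/sec". Prints nothing if no resync is in progress.
md_sync_speed() {
    sed -n 's|.*speed=\([0-9][0-9]*K/sec\).*|\1|p'
}
```

Usage would be `md_sync_speed < /proc/mdstat`, or `watch -n 5 cat /proc/mdstat` to keep an eye on the whole progress line.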

  8. #18
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,136
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    Yeah, 150K/s is really slow; that will take forever to sync. If you really want to test those two disks, I'd suggest you break the array again and test them both with badblocks at a minimum, like this.

    Code:
    badblocks -wsv /dev/sdX
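Since `-w` is a destructive write test that wipes the disk, it may be worth guarding the command with a check that the target is not mounted or still part of an md array. The function below is a sketch under a few assumptions: `device_in_use` is a made-up name, it only handles `sdX`-style device names, and it simply greps /proc/mounts and /proc/mdstat (the two file arguments exist so it can be tried against sample files).

```shell
# Return 0 if the disk (e.g. /dev/sdc) or one of its partitions shows up
# in the mounts file or as a member in the mdstat file, 1 otherwise.
device_in_use() {
    dev_name=${1#/dev/}
    mounts_file=${2:-/proc/mounts}
    mdstat_file=${3:-/proc/mdstat}
    # Mounted filesystems are listed as "/dev/sdc1 /mountpoint ...".
    grep -q "^/dev/${dev_name}[0-9]* " "$mounts_file" 2>/dev/null && return 0
    # Array members are listed as "sdc1[2]" on the md device line.
    grep -Eq "(^|[ ])${dev_name}[0-9]*\[[0-9]+\]" "$mdstat_file" 2>/dev/null && return 0
    return 1
}
```

It could then be used as `device_in_use /dev/sdX || badblocks -wsv /dev/sdX`, so that badblocks only starts once the disk has been removed from the array and unmounted.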

  9. #19
    Join Date
    Oct 2009
    Beans
    Hidden!
    Distro
    Ubuntu 22.04 Jammy Jellyfish

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    Quote Originally Posted by rubylaser View Post
    Yeah, 150K/s is really slow; that will take forever to sync. If you really want to test those two disks, I'd suggest you break the array again and test them both with badblocks at a minimum, like this.

    Code:
    badblocks -wsv /dev/sdX
    Glad you mentioned it. I'm currently running badblocks on the 2TB drives in the old array. It has been running for almost 36 hours now. I am performing a destructive write test on all of the drives at the same time, so the I/O is probably insane.

    EDIT: I used the destructive write command found here:
    https://calomel.org/badblocks_wipe.html
    Last edited by CharlesA; July 11th, 2013 at 11:34 PM. Reason: added link
    Come to #ubuntuforums! We have cookies! | Basic Ubuntu Security Guide

    Tomorrow's an illusion and yesterday's a dream, today is a solution...

  10. #20
    Join Date
    Jul 2010
    Location
    Michigan, USA
    Beans
    2,136
    Distro
    Ubuntu 18.04 Bionic Beaver

    Re: When creating a RAID5 array, mdadm will automatically create a degraded array

    I'm in the process of modifying the UnRAID preclear_disk.sh script to test drives. I like it because it compares before, mid, and after SMART values and does a pretty good job of hammering a disk to weed out marginal disks. I'm almost done modifying it to remove all the UnRAID specific steps. It still needs a lot more cleanup, but it works as is right now.

    If a disk can make it through all of that without an issue, it should be safe to use. I've thought about adding a badblocks destructive write to the middle, but that would be unnecessary, because it's already reading the whole disk twice and writing zeroes to the whole disk.
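The before/after SMART comparison idea can be tried by hand without the script. The sketch below is not taken from preclear_disk.sh. `smart_diff` is a made-up helper that takes two saved `smartctl -A` outputs and prints every attribute whose raw value changed, assuming the raw value is the tenth column as in the tables earlier in the thread.

```shell
# Compare two saved "smartctl -A" outputs and print each attribute whose
# raw value differs, as "NAME: before -> after".
smart_diff() {
    awk '
        # First file: remember the raw value of every attribute row.
        NR == FNR { if ($1 ~ /^[0-9]+$/) before[$1] = $10; next }
        # Second file: report rows whose raw value changed.
        $1 ~ /^[0-9]+$/ && ($1 in before) && before[$1] != $10 {
            print $2 ": " before[$1] " -> " $10
        }
    ' "$1" "$2"
}
```

For example, save `smartctl -A /dev/sdX > before.txt` ahead of a stress run and `smartctl -A /dev/sdX > after.txt` afterwards, then run `smart_diff before.txt after.txt`. Counters such as Power_On_Hours will always move, so the ones to watch are the reallocated and pending sector counts.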
