[ubuntu] 9.10 upgrade says I have failing hard drive
thunderdan
October 24th, 2009, 07:15 AM
I just upgraded from 9.04 to 9.10 and when I restarted and got to the desktop, there was an icon in the top panel that when hovered over said, "One or more disks are failing." This was not happening before I upgraded.
When I clicked on the icon, a window opened titled "Palimpsest Disk Utility." It said that my hard disk "has many bad sectors." I clicked on a link that said "More information" and another window came up titled "SMART Data." This window also says the disk has many bad sectors, and I ran the Short self test. The window says "Last self-test completed OK." Near the bottom of the window is a list. In this list, the following items are printed in red: "Reallocated Sector Count, Normalized: 100, Worst: 100, Threshold: 10, Value: 1 sector." "Current Pending Sector Count, Normalized: 100, Worst: 100, Threshold: 0, Value: 983 sectors."
I wonder if it is just a coincidence that the failing hard drive appeared immediately after upgrading and in fact I do have a failing hard drive, or if this is some kind of bug and I have a normal hard drive. The computer is just over a year old.
Does anyone have any advice for me? Should I back up my data and replace the hard drive? Any comments would be greatly appreciated.
Zach1188
October 24th, 2009, 08:06 AM
I hate to tell you this, but you need to back up any data you need, and now. I personally would put in another drive as a slave (or external media of some form), boot into a Live CD, and try to copy everything that way. Whether or not it was a direct result of the update, I cannot tell you, but I would be skeptical. But, once everything is backed up, you have time to contemplate.
Keep in mind, hard drives can fail at any time.
Edit: And I say boot into a live CD, because you want to stress your hard drive as little as possible.
Findarato
October 24th, 2009, 08:17 AM
The exact same thing happened to me when I upgraded on my laptop. It actually seems to be a bug. When it happened to me, I decided to go back to 9.04, which then told me my disk was fine. Then I did a complete install of 9.10 and AGAIN it told me my disk was failing. This was a week ago; I have simply disabled the message and my system is running fine...
oboedad55
October 24th, 2009, 08:18 AM
Backing up is always a good idea. This has been a known issue with karmic, where the utility is very sensitive. I haven't had the problem since like alpha 4 so I think it's been addressed.
hogrod
October 24th, 2009, 10:41 AM
I am getting the same warning with 9.10 RC. I can test the drive with the manufacturer's .iso utility and it tests fine. I have found other posts from people having similar issues.
My guess is that this hard drive SMART check is a bit too sensitive.
thunderdan
October 24th, 2009, 03:08 PM
Thank you all for your input. I think I'm going to let the hard disk keep running for now. I'll back up my sensitive data daily and see if either this bug gets fixed or if my hard drive actually fails. Let's hope it really is a bug. Is there perhaps a bug report I should file on launchpad, or is it already filed?
mr_steve
October 24th, 2009, 03:14 PM
This could be a bug, as I've seen a lot of bizarre issues with the new disk utility, BUT it is important to note that previous versions of Ubuntu did not come with the disk monitoring utility. Thus, you could indeed have a failing drive that wasn't being reported until now.
recluce
October 24th, 2009, 03:55 PM
This could be a bug, as I've seen a lot of bizarre issues with the new disk utility, BUT it is important to note that previous versions of Ubuntu did not come with the disk monitoring utility. Thus, you could indeed have a failing drive that wasn't being reported until now.
This cannot be emphasized enough!
Ubuntu up to 9.04 did not monitor the SMART status unless you installed a monitoring tool! It makes total sense that many people will only find out about their failing hard drive once they do the upgrade.
I would not always trust manufacturers' SMART tools. They seem to be built more to avoid warranty claims than anything else.
To the OP: you posted two lines from your SMART report:
"Reallocated Sector Count, Normalized: 100, Worst: 100, Threshold: 10, Value: 1 sector."
This seems perfectly fine to me. Most SMART values start at 100 and count down to the threshold level. The raw value of 1 also does not alarm me.
"Current Pending Sector Count, Normalized: 100, Worst: 100, Threshold: 0, Value: 983 sectors."
I am baffled that the value/worst pair are at 100 with a raw value of 983 sectors. This indicates that your drive has 983 "weak" sectors that the drive has trouble reading from and which will eventually be remapped if the reading problems continue. That value IS alarming and indicates trouble.
If in doubt: install smartmontools and run smartctl to execute some tests on the drive.
"sudo smartctl -a /dev/sda" will report all current information on the drive.
"sudo smartctl --test=short /dev/sda" will run a short self test on the drive.
Replace sda with the actual device designation. Read the man page (man smartctl) for more information.
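The difference between the normalized check and the raw counts described above can be sketched in a few lines of Python. This is only an illustration, not part of smartmontools: the names `parse_attributes`, `below_threshold`, and `suspicious_raw` are made up here, the input is assumed to be the attribute table text that "smartctl -a" prints, and the choice of watching attributes 5, 197, and 198 is a rule-of-thumb assumption rather than anything the tool defines.

```python
import re
from typing import List, NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalized value (VALUE column)
    worst: int
    thresh: int
    raw: int     # leading integer of the RAW_VALUE column


# Columns: ID, name, flag, VALUE, WORST, THRESH, type, updated, when_failed, raw
_ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

# Assumption: sector-related attributes worth a second look when non-zero.
WATCHED_IDS = {5, 197, 198}


def parse_attributes(text: str) -> List[SmartAttr]:
    """Extract attribute rows from smartctl-style table text."""
    attrs = []
    for line in text.splitlines():
        m = _ROW.match(line)
        if m:
            attr_id, name, value, worst, thresh, raw = m.groups()
            attrs.append(SmartAttr(int(attr_id), name, int(value),
                                   int(worst), int(thresh), int(raw)))
    return attrs


def below_threshold(attr: SmartAttr) -> bool:
    """Normalized check: the value has counted down to its threshold."""
    return attr.thresh > 0 and attr.value <= attr.thresh


def suspicious_raw(attrs: List[SmartAttr]) -> List[SmartAttr]:
    """Raw check: watched sector counters that are not zero."""
    return [a for a in attrs if a.attr_id in WATCHED_IDS and a.raw > 0]


# Three rows copied from the report posted in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 1
194 Temperature_Celsius 0x0022 118 094 000 Old_age Always - 40 (Lifetime Min/Max 10/48)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 983
"""

if __name__ == "__main__":
    parsed = parse_attributes(SAMPLE)
    print("below threshold:", [a.name for a in parsed if below_threshold(a)])
    print("non-zero raw:", [(a.name, a.raw) for a in suspicious_raw(parsed)])
```

With these rows, no attribute trips the normalized check, while the raw check still flags both sector counters. That is the same mismatch noted above for the pending sector count.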
thunderdan
October 24th, 2009, 04:07 PM
=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG HM121HI
Serial Number: S14PJD0Q814689
Firmware Version: LZ100-11
User Capacity: 120,034,123,776 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
Local Time is: Sat Oct 24 10:03:36 2009 CDT
==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 48) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 48) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   252   252   025    Pre-fail  Always       -       2125
  4 Start_Stop_Count        0x0032   071   071   000    Old_age   Always       -       296724
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       1
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       6256
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       562
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       210
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       30
194 Temperature_Celsius     0x0022   118   094   000    Old_age   Always       -       40 (Lifetime Min/Max 10/48)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       304201
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       983
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1927
199 UDMA_CRC_Error_Count    0x0036   252   252   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   252   252   000    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 6254 -
# 2 Short offline Completed without error 00% 3428 -
# 3 Short offline Completed without error 00% 1 -
# 4 Short offline Completed without error 00% 0 -
SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
SMART Selective self-test log data structure revision number 0
Warning: ATA Specification requires selective self-test log data structure revision number = 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Completed [00% left] (0-65535)
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
thunderdan
October 24th, 2009, 07:00 PM
After thinking about it for a little while, I decided I should go ahead and replace the hard drive. So I was thinking about ways to back up my data. One thing I thought about was to make a shared folder on another computer and copy the entire contents of the bad hard drive to the other computer. Then I would replace the hard drive and copy the contents of the folder back to the new hard drive.
I've never done anything like this before, so I'm looking for advice on how to go about this. I figure the copying of files to the other computer should be easy enough. When I install the new hard drive, I suspect I will have to install Ubuntu onto it. Is this correct? After I install Ubuntu, will I also have to reinstall all the applications I had previously, or will copying the contents of the old hard drive put everything back into place as it should be? Any advice would be greatly appreciated.
Thanks,
mr_steve
October 24th, 2009, 07:11 PM
Is this a laptop or desktop? If you have a way to put the old and new hard drives into the same computer, you could then boot with a GParted (http://gparted.sourceforge.net) LiveCD and copy the partitions to the new drive. There seems to also be a project called Clonezilla (http://clonezilla.sourceforge.net) which I believe automates much of the process, although I haven't used it myself.
After cloning in such a way, the new drive should boot and function just like the old one did.
thunderdan
October 24th, 2009, 07:15 PM
Is this a laptop or desktop? If you have a way to put the old and new hard drives into the same computer, you could then boot with a GParted (http://gparted.sourceforge.net) LiveCD and copy the partitions to the new drive. There seems to also be a project called Clonezilla (http://clonezilla.sourceforge.net) which I believe automates much of the process, although I haven't used it myself.
After cloning in such a way, the new drive should boot and function just like the old one did.
It is a laptop. I don't know for sure, but I think only one hard drive can be in the machine at a time. Also, the new hard drive I would be installing would have more space. If I were to use something like GParted or Clonezilla, would the partitions on the new hard drive be the same as the old one or would it make the main partition bigger?
mr_steve
October 24th, 2009, 07:19 PM
It is a laptop. I don't know for sure, but I think only one hard drive can be in the machine at a time. Also, the new hard drive I would be installing would have more space. If I were to use something like GParted or Clonezilla, would the partitions on the new hard drive be the same as the old one or would it make the main partition bigger?
With a laptop you're probably stuck with only one hard drive at a time. If they're SATA drives you could connect them to a desktop machine with SATA ports pretty easily, but otherwise you might want to just copy everything off by some means and re-install Ubuntu on the new drive. There are some tools similar to Norton Ghost that would allow you to create an image of the drive and send it over the network to another machine, and then pull that image down onto the new drive. I'm not familiar with any specific examples, though.
If you can clone the drive directly somehow, I believe you can resize the partitions at the same time. Otherwise you can definitely use GParted after the fact to stretch your partitions to fill a bigger drive.
recluce
October 24th, 2009, 08:53 PM
"==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details."
Did you see the warning above and try the switch?
Other than that, the posted data did not tell me anything new. I still find it odd to see a raw "Currently Pending Sector Count" of 983 and the SMART value at 100.
Xalor
October 25th, 2009, 02:36 AM
Same problem here, haven't had it in Jaunty or in Windows, I think this is an actual bug. Any chance your HDs are Samsung ATA? Maybe that is the problem. Thank God other people are having the same problem, I thought my laptop was dead after two months.
thunderdan
October 25th, 2009, 09:13 AM
"==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details."
Did you see the warning above and try the switch?
Other than that, the posted data did not tell me anything new. I still find it odd to see a raw "Currently Pending Sector Count" of 983 and the SMART value at 100.
daniel@daniel-laptop:~$ sudo smartctl -t short -F samsung -d ata /dev/sda
[sudo] password for daniel:
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Sun Oct 25 03:12:23 2009
Use smartctl -X to abort test.
daniel@daniel-laptop:~$
And then it does nothing else, just goes back to the command line.
mr_steve
October 25th, 2009, 05:01 PM
And then it does nothing else, just goes back to the command line.
After running that command, you have to wait at least two minutes for the tests to complete, possibly much longer, since the test will be interrupted any time the drive is accessed. Give it a while, and then running smartctl again will tell you if the test had any errors.
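For anyone who would rather not read the self-test log by eye after the wait, here is a rough Python sketch of that check. It assumes the log lines look like the "# 1 Short offline Completed without error 00% ..." rows that smartctl prints, and it treats only the exact phrase "Completed without error" as a pass. The helper names `parse_selftest_log` and `latest_test_ok` are invented for this example.

```python
import re
from typing import Dict, List

# "# <num> <description and status> <remaining>% <lifetime hours> <first error LBA>"
_ENTRY = re.compile(r"^#\s*(\d+)\s+(.*?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$")


def parse_selftest_log(text: str) -> List[Dict]:
    """Parse self-test log rows; the newest test is listed first."""
    entries = []
    for line in text.splitlines():
        m = _ENTRY.match(line.strip())
        if not m:
            continue
        num, desc_status, remaining, hours, lba = m.groups()
        entries.append({
            "num": int(num),
            "text": desc_status,
            # Assumption: anything other than this phrase is not a clean pass.
            "ok": "Completed without error" in desc_status,
            "remaining_pct": int(remaining),
            "lifetime_hours": int(hours),
            "first_error_lba": None if lba == "-" else lba,
        })
    return entries


def latest_test_ok(entries: List[Dict]) -> bool:
    """True only if there is at least one entry and the newest one passed."""
    return bool(entries) and entries[0]["ok"]


# First row is made up to show a failure; the others follow the thread's format.
SAMPLE_LOG = """\
# 1 Short offline Completed: read failure 90% 6263 123456
# 2 Short offline Completed without error 00% 6254 -
# 3 Short offline Completed without error 00% 3428 -
"""

if __name__ == "__main__":
    log = parse_selftest_log(SAMPLE_LOG)
    print("latest test passed:", latest_test_ok(log))
```

Note that a test still in progress is reported as not passed by this sketch, so it only gives a meaningful answer once the drive has finished the test.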
jmore9
October 25th, 2009, 05:14 PM
I got the same warning when I installed 9.04. The drive has worked flawlessly under XP, and I had just reformatted it with the manufacturer's CD, no problems.
I just ignored it and kept on going. It's been a week now with no further warnings and no problems.
thunderdan
October 25th, 2009, 05:38 PM
After running that command, you have to wait at least two minutes for the tests to complete, possibly much longer, since the test will be interrupted any time the drive is accessed. Give it a while, and then running smartctl again will tell you if the test had any errors.
OK I ran that command, then this a few minutes later:
daniel@daniel-laptop:~$ sudo smartctl -a /dev/sda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG HM121HI
Serial Number: S14PJD0Q814689
Firmware Version: LZ100-11
User Capacity: 120,034,123,776 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
Local Time is: Sun Oct 25 11:36:59 2009 CDT
==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 48) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 48) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   252   252   025    Pre-fail  Always       -       2125
  4 Start_Stop_Count        0x0032   071   071   000    Old_age   Always       -       296724
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       1
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       6263
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       564
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       210
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       30
194 Temperature_Celsius     0x0022   118   094   000    Old_age   Always       -       40 (Lifetime Min/Max 10/48)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       304610
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       985
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1945
199 UDMA_CRC_Error_Count    0x0036   252   252   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   252   252   000    Old_age   Always       -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 6263 -
# 2 Short offline Completed without error 00% 6262 -
# 3 Short offline Completed without error 00% 6261 -
# 4 Short offline Completed without error 00% 6261 -
# 5 Short offline Completed without error 00% 6261 -
# 6 Short offline Completed without error 00% 6254 -
# 7 Short offline Completed without error 00% 3428 -
# 8 Short offline Completed without error 00% 1 -
# 9 Short offline Completed without error 00% 0 -
SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
SMART Selective self-test log data structure revision number 0
Warning: ATA Specification requires selective self-test log data structure revision number = 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Completed [00% left] (0-65535)
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
daniel@daniel-laptop:~$
recluce
October 25th, 2009, 07:44 PM
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       304610
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       985
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1945
I do not like these three values. While I still don't understand why the normalized/worst values stay at 100, this is what the three values actually mean. I quote from Ariolic Software's explanations:
Attribute Name: Reallocation Event Count
Attribute ID: 196
Description: Count of remap operations (transferring data from a bad sector to a special reserved disk area - spare area).
The raw value of this attribute shows the total number of attempts to transfer data from reallocated sectors to a spare area. Unsuccessful attempts are counted as well as successful.
Your total is 304610 events!
Attribute Name: Current Pending Sector Count
Attribute ID: 197
Description: Current count of unstable sectors (waiting for remapping). The raw value of this attribute indicates the total number of sectors waiting for remapping. Later, when some of these sectors are read successfully, the value is decreased. If errors still occur when reading some sector, the hard drive will try to restore the data, transfer it to the reserved disk area (spare area) and mark this sector as remapped.
Your total is 985, up from 983 yesterday.
Attribute Name: Uncorrectable Sector Count
Attribute ID: 198
Attribute meaning: Quantity of uncorrectable errors. The raw value of this attribute indicates the total number of uncorrectable errors when reading/writing a sector. A rise in the value of this attribute indicates that there are evident defects of the disk surface and/or there are problems in the hard disk drive mechanical subsystem.
Your total is 1945.
I personally would not trust a drive with these numbers with any kind of valuable data.
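To put numbers on how much these counters moved between the two reports posted in this thread, here is a small Python sketch. The raw values are copied from the two smartctl dumps (taken at 6256 and 6263 power-on hours). The helper names `raw_deltas` and `growing` are hypothetical, and the sketch only does the arithmetic. It does not decide whether the drive or the reporting is at fault.

```python
from typing import Dict, List

# Raw values from the first report in the thread (Oct 24).
FIRST = {9: 6256, 196: 304201, 197: 983, 198: 1927}
# Raw values from the second report (Oct 25).
SECOND = {9: 6263, 196: 304610, 197: 985, 198: 1945}

NAMES = {
    9: "Power_On_Hours",
    196: "Reallocated_Event_Count",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def raw_deltas(before: Dict[int, int], after: Dict[int, int]) -> Dict[int, int]:
    """Change in raw value for every attribute present in both snapshots."""
    return {attr_id: after[attr_id] - before[attr_id]
            for attr_id in before if attr_id in after}


def growing(deltas: Dict[int, int], ignore=(9,)) -> List[int]:
    """IDs whose raw value increased, skipping plain usage counters."""
    return sorted(attr_id for attr_id, delta in deltas.items()
                  if delta > 0 and attr_id not in ignore)


if __name__ == "__main__":
    deltas = raw_deltas(FIRST, SECOND)
    hours = deltas[9]
    for attr_id in growing(deltas):
        print(f"{NAMES[attr_id]}: +{deltas[attr_id]} in {hours} power-on hours")
```

Comparing snapshots like this is only useful if both were taken with the same tool and options, since a different SMART reader may decode the raw fields differently.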
recluce
October 25th, 2009, 07:56 PM
I got the same warning when I installed 9.04. The drive has worked flawlessly under XP, and I had just reformatted it with the manufacturer's CD, no problems.
I just ignored it and kept on going. It's been a week now with no further warnings and no problems.
Are you sure you got the warning when installing 9.04? Or did you mean 9.10?
Of course the drive works flawlessly under XP - SMART is a diagnostic tool to predict likely drive failure before the fact. Your drive will "work flawlessly" for a long time with all the correction mechanisms that are built into today's drives. It will do so until catastrophic failure actually happens. At that point, your data is history.
But even SMART can only tell you that your drive is operating outside its designed specifications; it cannot tell you if your drive will fail tomorrow or in a year.
What do you believe a format utility should do to detect SMART errors? All the tools that I know of do exactly nothing; they would only detect hard errors like bad sectors (which only happen on today's drives if the drive runs out of spare sectors - very rare and extremely bad).
So to summarize:
"The drive runs flawlessly under XP" - tells you nothing, since XP does not monitor SMART - unless you install additional tools.
"The drive runs flawlessly under 9.04" - tells you nothing, since 9.04 does not install a SMART monitor automatically.
"The drive formatted without a hitch" - tells you nothing, since partitioning tools usually don't monitor SMART and only detect hard errors.
"The drive has been working for a week without problems" - tells you nothing. It will likely continue to do so until catastrophic failure.
SMART is an early warning tool. You can of course choose to ignore these warnings, but you do so at your own peril.
Dennis N
October 26th, 2009, 12:20 AM
If you are concerned about your drive, you might use Clonezilla (as mentioned in post 11) as an insurance policy. It boots and runs from a CD to create an image (as a set of files) of your hard drive on an external USB drive. That image then can be used to restore your system later to the same drive or to a new drive. A new replacement drive must be at least as large (gb) as the old one. The image made is a folder of compressed files which is considerably smaller than the original drive. Only the sectors used are saved.
I have used it to back up my computers periodically, and doing that part is fairly easy. Never have needed to do the restore part though.
www.clonezilla.org
thunderdan
October 26th, 2009, 06:28 AM
OK, I will definitely be replacing this HD ASAP.
thunderdan
October 26th, 2009, 08:56 PM
If you are concerned about your drive, you might use Clonezilla (as mentioned in post 11) as an insurance policy. It boots and runs from a CD to create an image (as a set of files) of your hard drive on an external USB drive. That image then can be used to restore your system later to the same drive or to a new drive. A new replacement drive must be at least as large (gb) as the old one. The image made is a folder of compressed files which is considerably smaller than the original drive. Only the sectors used are saved.
I have used it to back up my computers periodically, and doing that part is fairly easy. Never have needed to do the restore part though.
www.clonezilla.org
Now I need help using clonezilla. I have made a bootable CD with it and here is what I want to do:
I have a Windows XP machine with enough space on its hard drive to hold the image of my laptop hard drive, which is the one that needs to be replaced. I have both of these machines connected to a router. I made a shared folder on my XP machine. I would like to use clonezilla to make the image of the laptop hard drive, and save it to the folder on my XP machine. Then I'll replace the hard drive, boot with clonezilla again and copy the image back from the XP machine to the new hard drive. However, I am having trouble setting up clonezilla to find that folder on my XP machine. Does anyone have any ideas on how to do this? Or do you have an alternative method I should try?
All comments are appreciated.
Rich_B_uk
October 27th, 2009, 07:39 PM
In the words of Nelly the elephant - "Hold up"
I have just received a brand new netbook - a Dell Mini 10. I load up Karmic on a flash drive and receive this very same notification about bad sectors on my drive. Oddness. Especially as a friend of mine has just taken delivery of a Mini 12 and had the very same message.
Naturally I boot back into XP and run a chkdsk /r (which is currently chugging along at 15%)
I have an inkling that it won't have any luck finding bad sectors somehow. When I get back to the desktop I will run HD Tune to grab the SMART data and see what's going on. I very much doubt the drive is failing but yes, that is a strange (and coincidental) possibility.
More likely is that we're seeing some kind of Linux SATA controller driver or vendor specific SMART reporting issue. My disk is a Samsung as well, and although I'm not sure what my SATA controller is, I'm pretty sure it's not native to the board because Poulsbo didn't actually include SATA support in-chipset.
If you have a problem like this, I recommend you post your drive make/model, SATA controller make/model & SMART data dump so we can start to build up a picture over what's happening here.
thunderdan
October 27th, 2009, 07:44 PM
In the words of Nelly the elephant - "Hold up"
I have just received a brand new netbook - a Dell Mini 10. I load up Karmic on a flash drive and receive this very same notification about bad sectors on my drive. Oddness. Especially as a friend of mine has just taken delivery of a Mini 12 and had the very same message.
Naturally I boot back into XP and run a chkdsk /r (which is currently chugging along at 15%)
I have an inkling that it won't have any luck finding bad sectors somehow. When I get back to the desktop I will run HD Tune to grab the SMART data and see what's going on. I very much doubt the drive is failing but yes, that is a strange (and coincidental) possibility.
More likely is that we're seeing some kind of Linux SATA controller driver or vendor specific SMART reporting issue. My disk is a Samsung as well, and although I'm not sure what my SATA controller is, I'm pretty sure it's not native to the board because Poulsbo didn't actually include SATA support in-chipset.
If you have a problem like this, I recommend you post your drive make/model, SATA controller make/model & SMART data dump so we can start to build up a picture over what's happening here.
I think the data you're looking for is in post 19 of this thread. If you need something else, let me know. I still haven't been able to figure out Clonezilla yet, so my new hard drive is still in the box.
Rich_B_uk
October 27th, 2009, 07:45 PM
Ah, should have searched more in Launchpad.
THIS IS A KNOWN ISSUE
https://bugs.launchpad.net/ubuntu/+source/gnome-disk-utility/+bug/438136
Yes, you may have issues that 9.10 detects because it is apparently now actively monitoring S.M.A.R.T data, but the chances are that your hardware is fine and the value is being misreported.
Check in another OS/Software Package.
;)
thunderdan
October 27th, 2009, 07:48 PM
Ah, should have searched more in Launchpad.
THIS IS A KNOWN ISSUE
https://bugs.launchpad.net/ubuntu/+source/gnome-disk-utility/+bug/438136
Yes, you may have issues that 9.10 detects because it is apparently now actively monitoring S.M.A.R.T data, but the chances are that your hardware is fine and the value is being misreported.
Check in another OS/Software Package.
;)
Well that's a relief. I'll be taking this new hard drive back to Best Buy now.
Rich_B_uk
October 27th, 2009, 07:57 PM
Well that's a relief. I'll be taking this new hard drive back to Best Buy now.
Recommend you give the old drive a quick once over in a different machine/os/software package. The last thing you want to do is lose some data!
Will be posting more data on Launchpad soon. Doesn't even look like a vendor/controller-dependent issue though, more that Palimpsest is too twitchy. (i.e. I have 128 currently pending sectors, which is over the threshold of 100 - presumably set by Palimpsest itself. However, 128 is really nothing to worry about.)
inigmatus
October 30th, 2009, 07:07 AM
I got this too after upgrading to 9.10. I'm running a Maxtor 40GB 4D040H2, and never had any problems before, and to be totally clean, I even zero-filled the drive with the Maxtor disk before installing 9.10 fresh, and still get "Disk has many bad sectors" warning.
Reallocated Sector Count:
Normalized: 231
Worst: 231
Threshold: 63
Value: 57 sectors
All other tests are clean and good in Palimpsest. Not sure what other info I can provide.
hero1900
October 30th, 2009, 06:36 PM
The same thing happened to me, same issue, when I updated to the new Ubuntu.
orb242
October 31st, 2009, 12:50 AM
Same thing happened to me... with all of the others having the same problem, I am not too worried!
DaveHi
October 31st, 2009, 01:58 AM
Add me to the list.
My old but underused ATA Fujitsu is throwing up the same error message on my laptop. Three months ago it was running absolutely clean under Windows. No apparent problem while running 8.10.
Think I'll stick with it for now, but stay well backed up!
cyrex
October 31st, 2009, 02:33 AM
First problem
Sorry to say, but I have 2 new HDDs.
I got 2 SAMSUNG HD103UJ (1 TB each).
One says "DISK IS BEING USED OUTSIDE DESIGN PARAMETERS", and when I check where the problem is, it says:
184 - Attribute 184 (Failing)
The other disk says "Disk has a few bad sectors"; the error it reports is:
197 - Current Pending Sector Count (Warning)
These are new hard drives (I bought the second one to check whether the first one was damaged, but no, it was not).
Then I got 2 more new HDDs, one 250 GB and another 320 GB:
1 Western Digital 320GB WD3200AAKS
1 Samsung 250GB HD252HJ
It says they are also bad. But all of these disks are new.
I am getting A LOT SCARED NOW!!
My motherboard is the Intel DP35DP, which I updated a week ago to the latest BIOS version. I have checked and changed the SATA cables for all the HDDs (all SATA).
What more info can I give here?
swatsbiz
October 31st, 2009, 01:05 PM
I did an upgrade with one of the beta releases of 9.10 and got the error, so I stripped the old drive out and bought a new 500 GB drive around 3 weeks ago. I did a clean install, but just this morning I have a report that this hard drive is now failing!
It's brand new!
What I have noticed is that, with several HDDs in my PC, it's always the one where Ubuntu is installed ... so my guess is that it is a bug. But then I also noticed my PC took its time booting through POST. I am worried, and I'm concerned that 9.10 is causing issues ... ?
David
ibbill
October 31st, 2009, 01:08 PM
Same problem here. I did a fresh install, and no disk failure is showing after the fresh install.
Bill
Milesio
October 31st, 2009, 02:43 PM
Aaand exactly the same here. Samsung ATA disk which apparently is full of errors, out of the blue, after upgrading from 9.04 to 9.10 on a Dell 1530 XPS.
I hope this really is a bug and gets fixed soon, cos it's annoying. Well, I'm pretty sure it's a bug cos we're all getting the same messages and most of us are pretty sure our HDs are fine, right?
I'm running other scans and stuff just in case and so far so good.
bhuvi
October 31st, 2009, 07:00 PM
same here http://ubuntuforums.org/showthread.php?t=1307972
viper250
October 31st, 2009, 08:00 PM
Back up your data ASAP. You do not need to copy the entire drive, only the files you saved (photos, documents, songs, and programs you have written).
Install your OS on a new drive and copy the files back onto the new drive.
If you would like to try to save the old drive, first run d-ban on it. If there are any errors on it, you will get an error message stating that d-ban finished early, possibly due to bad sectors.
d-ban runs brute-force tests on the drive, so if you get errors the drive is failing and any further use risks losing data. If no errors are found, the drive has been completely wiped and can be used.
matthewboh
October 31st, 2009, 08:55 PM
I'm having the same problem with a dual boot
yumrukcu
October 31st, 2009, 11:35 PM
I have the same problem too, and I guess it's a bug, because my HDD is brand new and Ubuntu gives the error as soon as I start it, with no test longer than 30 seconds. On the other hand, chkdsk under Vista checked the disk for 5 hours and found nothing. However, to be sure, I will check with HD Tune and other HDD programs.
I'm having the same problem with a dual boot
oboedad55
October 31st, 2009, 11:38 PM
I had this problem with the alphas but not since the beta came out. It's scary at first to see those dire warnings. I think if you use the manufacturer's utilities and they say everything's okay then not to worry. Of course it's always a good idea to stay backed up anyway.
CRIMPS
October 31st, 2009, 11:53 PM
Same problem here, haven't had it in Jaunty or in Windows, I think this is an actual bug. Any chance your HDs are Samsung ATA? Maybe that is the problem. Thank God other people are having the same problem, I thought my laptop was dead after two months.
Similar issue here. Upgrade to Karmic and I am now being prompted to replace my "ATA SAMSUNG HM320JI" hard disk in my laptop. I think I will look for another utility to test this against just to rule out the possibility of the hard disk actually failing.
zaferaktan
November 1st, 2009, 12:48 AM
Same thing happened to me as well :(
I am surprised that they released the final 9.10 version with this freaking bug.
I have ATA SAMSUNG HM320JI disk - Dell 1330 notebook
potrzebie
November 1st, 2009, 01:35 AM
Glad to find this thread; I upgraded my desktop to 9.10 today and that was the first thing to greet me upon the first boot. It's my external drive which is said to be failing, a Seagate FreeAgent 500 GB. Testing under Windows showed it to be ok and I found the launchpad bug, but it's nice to hear from others having the same scare. I'm trusting that it's good, but redundant backups are a great thing
loconet
November 1st, 2009, 01:41 AM
I got this too after upgrading to 9.10. I'm running a Maxtor 40GB 4D040H2, and never had any problems before, and to be totally clean, I even zero-filled the drive with the Maxtor disk before installing 9.10 fresh, and still get "Disk has many bad sectors" warning.
Reallocated Sector Count:
Normalized: 231
Worst: 231
Threshold: 63
Value: 57 sectors
All other tests are clean and good in Palimpsest. Not sure what other info I can provide.
I have the exact same Hard drive as you. I'm getting the same warning after a fresh 9.10 install.
Add to this the reports by many people regarding the whole HDs disappearing issue due to old RAID settings, and this Ubuntu release is turning out somewhat messy IMO.
cariboo
November 1st, 2009, 04:16 AM
I got this too after upgrading to 9.10. I'm running a Maxtor 40GB 4D040H2, and never had any problems before, and to be totally clean, I even zero-filled the drive with the Maxtor disk before installing 9.10 fresh, and still get "Disk has many bad sectors" warning.
Reallocated Sector Count:
Normalized: 231
Worst: 231
Threshold: 63
Value: 57 sectors
All other tests are clean and good in Palimpsest. Not sure what other info I can provide.
You definitely have a failing disk. You have a value of 57 bad sectors and the threshold is 63; that means 6 more bad sectors and the drive is toast. It may run for years yet, or it could die tomorrow. I would suggest backing up your data ASAP and starting to look for a new drive.
defcomexperiment
November 1st, 2009, 05:54 AM
You definitely have a failing disk. You have a value of 57 bad sectors and the threshold is 63; that means 6 more bad sectors and the drive is toast. It may run for years yet, or it could die tomorrow. I would suggest backing up your data ASAP and starting to look for a new drive.
I understand that it's a good idea to back up your data if you think something is going wrong, but did you miss the point that this is actually a known bug?
I get the same warning and there is nothing wrong with my HDD.
EuropaCar
November 1st, 2009, 09:14 AM
i'm getting the same results of 'many bad sectors'
only mine seems a LOT worse..
Normalized 100
Worst 100
Threshold 5
Value......196618 sectors..!!!!
is my hard drive really this sick..?
darkmdmn
November 1st, 2009, 01:47 PM
Same problem here.
Normalized: 100
Worst: 100
Threshold: 24
Value: 65527 sectors
My value is somewhat higher, so I am wondering if I do indeed have a failing hard drive, or if it is a bug as suggested above.
My harddrive is about 4 to 5 years old, so I could imagine that some hardware is indeed failing... No critical data though, just experimenting with Ubuntu a bit.
Can someone tell me how I can check my harddrive with another tool than the standard Disk Utility? I am a Linux noob, so please keep it simple ;)
Thanks in advance,
Mark
msbealo
November 1st, 2009, 05:40 PM
I have the same problem with a Seagate SATA 500GB drive. Just upgraded to 9.10, drive only 3 months old.
What tests are recommended to determine if this is real or not?
Please be explicit.
Mark
Kokopelli
November 1st, 2009, 07:23 PM
OK I will take a crack at this... The disk utility seems to be causing some panic and may or may not be incorrect depending on your specific hard drive. In reality the SMART implementation of the hard drive matters.
BASIC TEST:
1) In disk utility note what the drive path is: as an example /dev/sda
2) install the smartmontools if they are not already installed
sudo apt-get install smartmontools
The smartmontools package has a dependency on postfix, so it will prompt you on how you want to configure mail. Unless you really want to configure mail properly I would choose none or local. (I have the smart tools email me notifications, but that is not something most people will need.)
3) open a command prompt and type
sudo smartctl -A <path to drive>|grep Reallocated_Sector
so in the example of the first drive (/dev/sda)
sudo smartctl -A /dev/sda |grep Reallocated_Sector
Note: smartctl -A puts out a lot more information and you can look at it in its entirety by removing the grep. Since this is the particular attribute that is causing a stir I wanted to narrow the focus.
Now you should get output something like this
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 6
The only numbers to concern yourself with at the moment are the last one, the raw value, and the one immediately before "Pre-fail", the threshold. The threshold will vary depending on the type of disk and manufacturer. Generally 36 seems to be a common number for desktop hard drives. For 2.5" (laptop) hard drives it varies more. An unscientific survey of the drives in my laptops showed thresholds between 24 and 140.
- if your raw value is a very high number (over 500) then it is likely not reporting SMART data correctly, get the disk utilities from the manufacturer (see below)
- If your raw value is over your threshold then you very likely have a problem and the drive should be replaced. Run "sudo smartctl -A /dev/sda" and look at the output in its entirety. Additionally you should run the manufacturer's disk utilities (see below). This will speed up the RMA process if it is still under warranty.
- If your raw value is over half of your threshold I would suggest keeping an eye on the drive at least. You might consider trying to RMA it, but depending on the manufacturer it may not be "bad" yet.
- If your raw value is under half of your threshold then the drive is very likely fine.
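The rule of thumb in the bullets above can be sketched in a few lines of Python. This is only an illustration of that personal heuristic, not a SMART specification: the function names `parse_attribute_line` and `triage` are made up for this example, and the parser assumes the ten-column layout that `smartctl -A` prints for each attribute.

```python
def parse_attribute_line(line):
    """Split one attribute row of `smartctl -A` output into (name, threshold, raw)."""
    fields = line.split()
    # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    return fields[1], int(fields[5]), int(fields[9])


def triage(raw, threshold):
    """Apply the rule-of-thumb bullets in the order they are listed."""
    if raw > 500:
        return "suspect SMART reporting; use the manufacturer's utility"
    if raw > threshold:
        return "likely a problem; consider replacing the drive"
    if raw > threshold / 2:
        return "keep an eye on the drive"
    return "very likely fine"


line = "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 6"
name, threshold, raw = parse_attribute_line(line)
print(name, "->", triage(raw, threshold))
```

The "over 500" check comes first on purpose: a huge raw value is treated as a reporting quirk before it is ever compared with the threshold.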
Again this is personal experience, not concrete fact, but I have noticed a tendency for Seagate and Samsung Drives to report a few bad sectors on a regular basis. Western Digital however rarely reports bad sectors until the drive is starting to have problems. This may be just the luck of the draw for me or it may be a variation in the way the different manufacturers report SMART data.
MANUFACTURER SPECIFIC TESTS:
If you want additional assurance on the status of your drive I would suggest going to the manufacturer's site and downloading their disk testing utility. Most will require you to boot into Windows but some will have a bootable CD. In reality, for the most part these disk utilities will basically be reading the SMART data, but since it is a program from the manufacturer I would argue that they are better equipped to determine the appropriate values for the data.
Here is a link to the disk diagnostic tool for Seagate:
http://www.seagate.com/www/en-us/support/downloads/seatools
This site (once you pick your specific drive) is for Western Digital
http://support.wdc.com/product/download.asp?lang=en
To put this in perspective, I have multiple fairly large RAID5 arrays. Out of five 1.5 TB Seagate drives I have 3 with bad sectors (1, 3, and 6 bad sectors). Out of five 1.5 TB Western Digitals I have none currently reporting bad sectors. I do not attribute the difference to WD being more reliable than Seagate but to a difference in the way the 2 manufacturers report SMART. Out of 12 other drives from various manufacturers and sizes, only 2 are currently reporting bad sectors (14 and 9). None of these numbers are high enough to cause me alarm at this stage, but I do have mail notification set to monitor the health of the drives, so if/when one does approach PRE-FAIL I can replace it before it becomes a problem. Disk Utility is a great tool in that it puts SMART data in a visible element for the end user. But it has the unfortunate side effect of causing panic when the user is not used to seeing errors on their hard drives.
EDIT: I wanted to add a small editorial on Windows "check disk". chkdsk and other (non-manufacturer) Windows disk utilities are pretty much worthless for determining the health of the hardware. What the Microsoft tools check is whether the file system is intact, not whether the hardware on which the file system resides is intact. In most cases, by the time chkdsk would detect a problem caused by hardware failure it is too late. It is for this reason that SMART was created in the first place. It is not foolproof, not even close (check the paper written by Google on the subject), but it does help.
inigmatus
November 2nd, 2009, 06:35 AM
Here is my report:
Keep in mind the drive only has Ubuntu installed. I have two other disk drives, one for win and the other for my data. If this drive dies, I lose nothing, except a separate disk for Ubuntu, and the MBR, which is easy to fix.
Anyways, have a look and tell me if disaster is as close as Palimpsest and smartmontools say it is. Again, this is for an ATA Maxtor 4D040H2 40GB drive after a zero-fill and clean 9.10 install:
sudo smartctl -A /dev/sdc
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
3 Spin_Up_Time 0x0027 234 233 063 Pre-fail Always - 7169
4 Start_Stop_Count 0x0032 251 251 000 Old_age Always - 4790
5 Reallocated_Sector_Ct 0x0033 231 231 063 Pre-fail Always - 57
6 Read_Channel_Margin 0x0001 253 253 100 Pre-fail Offline - 0
7 Seek_Error_Rate 0x000a 253 252 000 Old_age Always - 0
8 Seek_Time_Performance 0x0027 251 234 187 Pre-fail Always - 36519
9 Power_On_Minutes 0x0032 227 227 000 Old_age Always - 305h+40m
10 Spin_Retry_Count 0x002b 253 252 223 Pre-fail Always - 0
11 Calibration_Retry_Count 0x002b 253 252 223 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 241 241 000 Old_age Always - 4790
192 Power-Off_Retract_Count 0x0032 253 253 000 Old_age Always - 0
193 Load_Cycle_Count 0x0032 253 253 000 Old_age Always - 0
194 Unknown_Attribute 0x0032 253 253 000 Old_age Always - 0
195 Hardware_ECC_Recovered 0x000a 253 252 000 Old_age Always - 34
196 Reallocated_Event_Count 0x0008 253 253 000 Old_age Offline - 0
197 Current_Pending_Sector 0x0008 253 244 000 Old_age Offline - 0
198 Offline_Uncorrectable 0x0008 253 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0008 199 199 000 Old_age Offline - 0
200 Multi_Zone_Error_Rate 0x000a 253 252 000 Old_age Always - 0
201 Soft_Read_Error_Rate 0x000a 253 168 000 Old_age Always - 2
202 TA_Increase_Count 0x000a 253 252 000 Old_age Always - 0
203 Run_Out_Cancel 0x000b 253 252 180 Pre-fail Always - 0
204 Shock_Count_Write_Opern 0x000a 253 252 000 Old_age Always - 0
205 Shock_Rate_Write_Opern 0x000a 253 252 000 Old_age Always - 0
207 Spin_High_Current 0x002a 253 252 000 Old_age Always - 0
208 Spin_Buzz 0x002a 253 252 000 Old_age Always - 0
209 Offline_Seek_Performnce 0x0024 253 253 000 Old_age Offline - 0
99 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0
100 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0
101 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0
warmaster777
November 2nd, 2009, 09:46 AM
I just upgraded my HP Laptop to 9.10 today and i'm getting the same Notification.
it might be a bug, but its better to back up.
jimbog
November 2nd, 2009, 10:23 AM
Same problem here, with a Compaq Presario R3000. It's got a 60GB Hitachi Travelstar HD, showing:
Normalized: 100
Worst: 100
Threshold: 5
Value: 65551 (!)
Everything's backed up just in case, but I'm going to ignore this warning.
infyniti
November 2nd, 2009, 06:38 PM
Same problem on a Dell XPS 1330 using a Samsung HM250JI. In fact, I upgraded two different machines which are way older without any errors. Strange that most of those who reported this problem are using a Samsung HD. Is this something to do with this brand of HD??
Anant
v0idE
November 2nd, 2009, 11:26 PM
Same problems here. Palimpsest Disk Util says I have 445 bad sectors (it was 444 yesterday) on a 12 m/o 500GB Seagate Barracuda that gets little use. This is the third of three disks in my box.
I have SMART software on my Windows partition and it says the disk is fine. I have never had read/write problems to/from the disk. With this conflicting information, which do I believe? My data is always backed up, so it appears to just be a waiting game now: either the disk fails soon, or it doesn't.
Doughy
November 3rd, 2009, 04:13 AM
Ubuntu is reporting that my drive is failing, but I'm suspicious it's a program bug. The tool is reporting that my drive is 18.6 years old. So, either my drive is reporting false information, or the diagnostic program is buggy.
zesc
November 3rd, 2009, 11:09 AM
Same here! hd samsung hm251jj on a dell studio.
I did a bunch of diagnostics (they took a few hours) on hd and everything seems ok. I am ignoring the message (disc failing - disc has many bad sectors) at my own risk.
I am not removing the warning, I hope it will disappear soon after an update ;).
darkmdmn
November 3rd, 2009, 03:26 PM
- if your raw value is a very high number (over 500) then it is likely not reporting SMART data correctly, get the disk utilities from the manufacturer (see below)
I ran your code in the terminal, and my raw value is in the billions... I have a Fujitsu ATA drive, and I can't find a diagnostic tool for Linux on the manufacturer's website. Any advice on what to do now?
kingofpain
November 3rd, 2009, 05:49 PM
Doing a long test (including repairing) with "Seagate Seatools for DOS" solved my problem!
Actually... it reduced the number of sectors "waiting to be reallocated", so, no more warnings for my Seagate drive :D
Aenari
November 7th, 2009, 04:11 PM
I've got the same message, but in that my hard drive is one day old, I'm thinking Bug.
Isakill
November 7th, 2009, 08:49 PM
I upgraded to 9.10 on both my desktop and laptop.
The desktop reports that both my harddrives are failing and the laptop as well.
at first I was worried about my laptop until I upgraded my desktop. Now I see this as a minor annoyance.
BTW both computers are dualboot WinXP/ubuntu.
and Windows reports no such problems.
swatsbiz
November 8th, 2009, 03:01 PM
My new hard drive which was less than a month old died Yesterday :-( I guess I should have paid attention to the warning.
Is there anyway that 9.10 could cause a Hard Drive failure? My original thoughts were that it couldn't?
Or I just bought a dodgy hdd
venomheir
November 11th, 2009, 02:02 PM
Hey team, I don't know if it is a bug or not, but I personally believe it may be true. My friend has a Dell Inspiron 8600 with a 40 GB hard drive running Windows XP SP3. It has boot-up problems and blue screens of death, and when it does boot it is extremely slow. I did a live test using Ubuntu 9.10 i386 and it reports multiple bad sectors.
My brother's HP Pavilion laptop with a 250 GB hard drive has been running Ubuntu 9.10 64-bit since release with no errors.
On the Dell Inspiron I will update the BIOS, as it is behind by 14 revisions; perhaps that may help? Oh yeah, my cousin's laptop was running Ubuntu 8.04 32-bit, and often upon boot Ubuntu would show an orange scan disk in the left corner. I performed the upgrade to 9.10 32-bit and it reports failure and bad sectors; that hard drive is also a 40 GB ????
infine0n
November 16th, 2009, 03:54 AM
Same problem here, with a Compaq Presario R3000. It's got a 60GB Hitachi Travelstar HD, showing:
Normalized: 100
Worst: 100
Threshold: 5
Value: 65551 (!)
Everything's backed up just in case, but I'm going to ignore this warning.
I've got a Compaq Presario R4000 w/ 80GB. Now that I've found this thread I am no longer worried about this.
Normalized: 100
Worst: 100
Threshold: 5
Value: 4194597 sectors
I also ran chkdsk /R under Windows 7 and it found no bad sectors.
MasterLenus
November 16th, 2009, 07:59 AM
I installed Ubuntu a few days ago and every time I booted it showed me the warning that my HDD is failing. I have a Samsung HM329JI HDD
2190 sectors are bad.
SeePU
November 16th, 2009, 08:31 AM
Yeah, just ignore it. It's a bug and the utility is crap. It's just more of the same of Ubuntu .... not caring what programs/utility is part of it and whether things are working or not. :rolleyes:
xieqiao
November 16th, 2009, 09:14 AM
You are right. I have the same problem after upgrading to 9.10. Just ignore it, and wait for 10.04 to resolve the bug.
Actually, I added the Alpha upgrade source for 10.04 and upgraded to 10.04 Lucid yesterday. The problem disappeared the first time I booted into Lucid, although the bug reappeared today.
keta
November 16th, 2009, 05:11 PM
Just to note that the same happens to me on my laptop with an ATA Samsung HM160HI. I chose to ignore it and everything works fine. If I had any sensitive data, I'd probably have made a backup, though. ;)
ScottinSoCal
November 17th, 2009, 05:49 PM
I'm thinking bug.
Normalized: 97
Worst: 97
Threshold: 24
Value: 6029404 sectors
Running on a Dell Latitude D810, ATA Fujitsu MHT2080AH HDD
I still ordered a new HD for it. Upgrade from 80GB to 160GB for $100. Not bad.
Devi 710
November 22nd, 2009, 04:34 AM
Doing a long test (including repairing) with "Seagate Seatools for DOS" solved my problem!
Actually... it reduced the number of sectors "waiting to be reallocated", so, no more warnings for my Seagate drive :D
Any idea how to get the Seagate Seatools for DOS booting from a USB? I am using a netbook so I don't have a CD Drive and it won't boot from a USB.
My threshold is "0" and the value is "153" (up from 60 the other day)
deamon_knight
November 23rd, 2009, 02:22 AM
This should be considered a bug. Palimpsest may be accurately reporting how many reallocated sectors your drive is reporting, but it forms the wrong conclusion from that data. You only have a failure when you run out of reallocatable sectors. Even with some sectors that are bad and cannot be reallocated, a drive may function without observable difficulty or data loss. What to take away from this is that all drives will fail, and with little real warning, so you should make sure you have any crucial data backed up on some other disk/media so you can survive a failure.
In my case, my Western Digital HDD passes all of the manufacturer's tests and reports normalized SMART values of 200 with a threshold of 140; values under 140 are reported as errors. (It does not report the raw data as Palimpsest does.)
Palimpsest warns that I have 183 pending (and that there are 1982 bad sectors, an amazing number). These are not reallocated or bad sectors, but sectors that need to be checked the next time data is written to them. If they write correctly they are flagged good; if they fail to write, they are reallocated.
Ultimately this means nothing more than that I might be having a failure soon, maybe, so it's not diagnostically meaningful. The bug is that Palimpsest is reporting sectors flagged as "Possibly Bad" to be "Definitely Bad", when this is not the case.
To satisfy my own curiosity I'm going to zero out this drive and see what happens.
deamon_knight
November 23rd, 2009, 06:26 AM
Yea, I completed a full wipe with Western Digital's software utility, which wrote zeros to the entire disk. Afterwards I reinstalled 9.10. Palimpsest no longer reports imminent disk failure.
The category I had was "Current Pending Sector Count", and it no longer shows 1900+ bad sectors. I have no idea what this value is supposed to record. I gather it means the HDD firmware has identified some sectors as "for review" and wants to check for correct behavior the next time data is written to those sectors. Since most users aren't writing over large sections of their disks on a regular basis, these sectors never get written to and thus were never written/checked until I wrote zeros to the entire drive. Palimpsest's bug is reporting these as "bad" (failed sectors), when they are not.
phillw
November 25th, 2009, 12:11 AM
Palimpsest is the subject of a bug report for false positives https://bugs.launchpad.net/ubuntu/+source/libatasmart/+bug/438136
For those affected, do take the time to get a login name & report what it is reporting - it is an 'active' bug and you can follow its progress there.
Regards,
Phill.
Sentience
November 25th, 2009, 06:46 PM
Has anyone gotten their hard drive fixed after having this problem?
Cr125f150ba
November 28th, 2009, 09:49 PM
I'm also thinking this is a bug, because I ran a memory test on my hard drive and it said it was fine. Then, just to make sure, I installed Windows XP and checked the status of my hard drive, and it also said my hard drive is fine. But when I installed Ubuntu 9.10 it always says I have many bad sectors. So I'm pretty sure it's a bug, because I'm sure Windows XP would have caught it and the memory check I ran would have found some errors if my hard drive really had bad sectors.
ScottinSoCal
December 6th, 2009, 06:33 PM
It was not a false alarm, although I still don't believe the numbers it was reporting. TigerDirect sent me a 320GB hard drive, for the same price as the 160GB - I guess they were out of the one I ordered. Yesterday my computer started locking up randomly, and the HDD was on the way out. I replaced it, reinstalled Ubuntu, and managed to get my key files off the old hard drive. It's gone to Hardware Heaven now.
FEBE got my bookmarks, user IDs and passwords off my WinXP installation and transferred them to the new Ubuntu installation. While I was at it, I researched WINE and successfully transferred my Forte Agent to Ubuntu, and this allows me to put my WinXP laptop into storage. The only things I still need it for are Spore and Bryce. So, a failed hard drive turned out to be a good thing.
Jakxx
January 6th, 2010, 02:59 AM
I had the same problem when I upgraded. I bought a replacement hard drive and received the exact same error again, so I can only assume that it is a bug.
BeanBagKing
February 5th, 2010, 11:10 PM
Just found this thread, thought I'd leave a comment on my experiences.
Was using XP on this laptop, which was running horribly slowly. So I installed Ubuntu and got the "DISK HAS MANY BAD SECTORS" warning. Well, I figured I found the problem and used Ultimate Boot CD to run SeaTools (Seagate Hard Drive) and run an extended test. Sure enough, got a bunch of failures and write errors.
Then I came across this thread looking for more info where a lot of people are saying that the "Disk has bad sectors" message in 9.10 is just a bug.
Moral of the story here is that for some this may be a bug with 9.10, but in my case it was correct. Don't read this thread and simply shrug it off, run some other tools (especially the manufacturers) and see what you get. Even if all this is good, backups are always smart.
For now my disk is still working fine, I never would have known had it not been for 9.10 until a catastrophic failure and everything was lost. Just because your disk "acts" fine doesn't mean it is.
If you're getting this warning, don't ignore it! Even if it turns out to be nothing, it's hard to be sure with a failing drive.
phillw
February 28th, 2010, 04:32 AM
I am 100% with you there.
But my manufacturer's diagnostics say it is okay.
Now, once again I am getting errors reported & the Ubuntu 'test your disk' is not working ... (Tried about 10 minutes -- the short / quick one) ... 30 minutes later, it is 'hung' so I have to cancel it.
But, as I'm also running 10.04 alpha, I have plenty of backups -- You can never have too many backups.
Phill.
Rick Deckard
March 8th, 2010, 03:16 PM
I recently upgraded from Intrepid -> Jaunty -> Karmic - as soon as Karmic booted, I got the warning that my drive was bad and that I should replace it.
I have an Hitachi Deskstar 7K160 / HDS721616PLA380 / 160GB drive.
Palimpsest overall assessment is 'DISK HAS MANY BAD SECTORS - back up data and replace disk'.
I wasn't convinced, as my drive has been running fine for years, so I spent several hours investigating this problem.
I've learnt quite a bit about SMART in the last 24 hours, so I'll post what I know here, so people can make an informed decision before replacing drives - I fear that some people have already replaced perfectly healthy drives because of this false error.
The palimpsest (very stupid name) utility reports that my disk drive has 196,619 bad sectors - THIS IS NOT CORRECT, my drive ACTUALLY has *3* reallocated sectors, which is perfectly fine for a modern disk drive.
In my case, SMART attribute 5 (Reallocated Sector Count) has a raw value** of 0x0B0003000000 - Palimpsest assumes that this is a single 48-bit integer value and converts it to 196,619 (0x00000003000B - byte-sequence is reversed low-to-high). The format and meaning of the raw value is entirely up to the manufacturers. They can put what they like in here and don't have to release the meaning of the value - some treat it as a 'trade secret'.
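To make the arithmetic concrete, here is a small Python sketch that reads the six raw bytes both ways. Treating them as one little-endian 48-bit integer reproduces the 196,619 figure; splitting them into three 16-bit words is only an assumption made for illustration (the function name `decode_raw` is made up, and the real layout is vendor-specific and undocumented), but it shows how a small count such as 3 can sit inside the larger number.

```python
def decode_raw(hex_bytes):
    """Interpret a 6-byte SMART raw value in two different ways."""
    raw = bytes.fromhex(hex_bytes)
    # Reading 1: one 48-bit little-endian integer (the reading described above).
    as_one_integer = int.from_bytes(raw, "little")
    # Reading 2 (assumed for illustration): three 16-bit little-endian words.
    words = tuple(int.from_bytes(raw[i:i + 2], "little") for i in range(0, 6, 2))
    return as_one_integer, words


print(decode_raw("0B0003000000"))  # (196619, (11, 3, 0))
```

Under the second reading the middle word is 3, which matches the Reallocated_Event_Count of 3 in the smartctl output; what the other words mean is not something this sketch can tell you.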
So, as far as SMART monitoring is concerned, what is important are the normalised VALUE and THRESHOLD values.
Here are my drive SMART stats:
> sudo smartctl -A /dev/sda
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 095 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 85
3 Spin_Up_Time 0x0007 120 100 024 Pre-fail Always - 168 (Average 164)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 1768
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 196619
7 Seek_Error_Rate 0x000b 100 099 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 136 100 020 Pre-fail Offline - 31
9 Power_On_Hours 0x0012 098 098 000 Old_age Always - 16416
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 1767
192 Power-Off_Retract_Count 0x0032 099 099 000 Old_age Always - 1887
193 Load_Cycle_Count 0x0012 099 099 000 Old_age Always - 1887
194 Temperature_Celsius 0x0002 166 130 000 Old_age Always - 36 (Lifetime Min/Max 13/47)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 3
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 253 000 Old_age Always - 294
For most attributes the VALUE will start at either 200 or 100 when your drive is new - over time, some of these values will decrease towards the THRESHOLD value. If a VALUE reaches or drops below the THRESHOLD value, the attribute is flagged as FAILED and the health status of your drive may change.
Earlier in this thread, someone made the assumption that the threshold value had a direct relation to the attribute count - this isn't necessarily true. The VALUE and THRESHOLD are calculated and updated by the hard-drive's firmware - only the manufacturer knows what these normalised values really mean (they're more likely to be percentage values than actual counts).
The WHEN_FAILED column will show the point in the lifetime of the drive, that the attribute VALUE reached the THRESHOLD - the drive keeps track of how many hours it has been in use (powered-on accumulated hours) and will put the current value in the WHEN_FAILED column.
As you can see from the smartctl output, NONE of my drive's attributes values have reached their threshold and therefore all the WHEN_FAILED values are blank.
I have a healthy drive.
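For anyone who wants to run the same check on their own output, here is a minimal Python sketch of the rule described above: an attribute counts as failed when its normalised VALUE is at or below THRESH, or when WHEN_FAILED is anything other than "-". The function name `failed_attributes` is made up for this example, the parser assumes the column order shown in the smartctl header, and the last sample row is a hypothetical failing attribute invented for illustration, not output from any real drive.

```python
def failed_attributes(smartctl_rows):
    """Return names of attributes whose VALUE has reached THRESH or that list a failure."""
    failing = []
    for row in smartctl_rows.strip().splitlines():
        fields = row.split()
        # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        name, value, thresh, when_failed = fields[1], int(fields[3]), int(fields[5]), fields[8]
        if value <= thresh or when_failed != "-":
            failing.append(name)
    return failing


healthy = """
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 196619
9 Power_On_Hours 0x0012 098 098 000 Old_age Always - 16416
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
"""
# Hypothetical row, invented to show what a failed attribute would look like.
hypothetical = "5 Reallocated_Sector_Ct 0x0033 004 004 005 Pre-fail Always FAILING_NOW 900"

print(failed_attributes(healthy))
print(failed_attributes(hypothetical))
```

Note that the raw value (196619 in the first row) plays no part in the decision; only the normalised columns do.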
Palimpsest should not be interpreting the raw values of some attributes and then making assumptions about them - it certainly should not be suggesting I change my drive based on a value that has NO AGREED FORMAT. The hard drive firmware is designed to indicate problems through the SMART attributes table - the important indicators are VALUE, THRESHOLD and WHEN_FAILED and that is what I'll be paying careful attention to, from now on.
** (to get a raw value from palimpsest, just hover the mouse pointer over the attribute)
kramerr
April 13th, 2010, 02:15 PM
My 2 cents:
Assuming the "Bad sector" probability subspace doesn't naturally take up near 100% of all failure probability space, the scenario we are in is highly improbable. You'd think one of us would find our drives were failing for a different reason. I'm concluding it's a bug.
Rick Deckard
April 13th, 2010, 05:08 PM
Prompted by the last post, I decided to check on my drives again out of curiosity. Bizarrely, palimpsest is now reporting 'SMART unavailable' for my two installed hard-drives...
...oh well, I really wish I hadn't 'upgraded' from Jaunty to Karmic. Roll-on 10.04...
googeek
April 14th, 2010, 03:42 AM
!!!WARNING!!! THIS REALLY MAY NOT BE A BUG. I thought it was for the longest time. I'm a techie when it comes to hardware though, and after testing it on over 5 drives all in the same box, different boxes, different jumpers, configs, you name it.... the report only occurred for me on the drives that eventually went bad. All the drives that it said were failing have failed, either a week later, or 3 months later, etc. If you have this error, make sure to back up pretty much all the time. One minute your drive will work, the next it won't. Period. Until then though, to my knowledge it will work perfectly, which is what makes it seem like a simple bug. I discounted this error over and over, and thank goodness I back up fairly religiously or I would have lost more data. So, please, back your important stuff up and stop thinking of this as a bug before you lose some data like I did. If it so happens your hard drive doesn't fail for the next year and a half you have permission to call me a horse's ****.