Use the drive manufacturer's diagnostic tools.
The only thing I can think of is that your partitions aren't aligned on 4K boundaries; that can make life hard for HDDs. Also, SMART isn't 100% accurate. I watch my drives by running short tests weekly and long tests monthly, then collect the reports and look for changes over time. That's really the only way I know to predict a failure. A single test run once a year isn't enough to see the trend.
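If you want to check alignment yourself, here's a rough sketch. The device name is an example, and is_4k_aligned is a helper I'm making up for illustration; the smartctl commands at the bottom are the real ones I use.

```shell
# Hypothetical device name; adjust to your drive.
DEV=/dev/sda

# With 512-byte logical sectors, a partition is 4K-aligned when its
# start sector is divisible by 8 (8 * 512 = 4096).
is_4k_aligned() {
  if [ $(( $1 % 8 )) -eq 0 ]; then echo aligned; else echo misaligned; fi
}

is_4k_aligned 2048   # the common modern default start: aligned
is_4k_aligned 63     # the old DOS-era default: misaligned

# Real partition start sectors: sudo fdisk -l "$DEV"
# Or let parted check for you:  sudo parted "$DEV" align-check optimal 1
#
# The tests themselves:
#   sudo smartctl -t short "$DEV"                      # weekly
#   sudo smartctl -t long  "$DEV"                      # monthly
#   sudo smartctl -a "$DEV" > "smart.$(date +%F).sda"  # save each report
```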
For example, the usual red flags in the SMART reports (pending and reallocated sectors) were all fine on the last HDD that failed here. But here's the final report I used to get the vendor to approve an RMA:
Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   001   001   051    Pre-fail  Always   FAILING_NOW 87106
  3 Spin_Up_Time            0x0027   161   115   021    Pre-fail  Always   -           10933
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always   -           20
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always   -           0
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always   -           6133
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always   -           0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           20
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always   -           14
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always   -           7
194 Temperature_Celsius     0x0022   105   091   000    Old_age   Always   -           47
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always   -           0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0008   198   198   000    Old_age   Offline  -           1819
Nothing in those numbers to make me worry about data corruption, at least not when this initially started. Over time, though:
Code:
$ egrep Raw_Read_Error_Rate smart.202*sda
smart.2023-10-10.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2023-10-17.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2023-10-24.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2023-10-31.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2023-11-07.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 1
smart.2023-11-14.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 1
smart.2023-11-21.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 1
smart.2023-11-28.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 1
smart.2023-12-05.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2023-12-12.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2023-12-19.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2023-12-26.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2024-01-02.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 5
smart.2024-01-09.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2024-01-16.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 3
smart.2024-01-23.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2024-01-30.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
smart.2024-02-06.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 5
smart.2024-02-13.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 2
smart.2024-02-20.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 2
smart.2024-02-27.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 2
smart.2024-03-05.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 7
smart.2024-03-12.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 10
smart.2024-03-19.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 20
smart.2024-03-26.sda: 1 Raw_Read_Error_Rate 0x002f 199 199 051 Pre-fail Always - 61
smart.2024-04-02.sda: 1 Raw_Read_Error_Rate 0x002f 117 117 051 Pre-fail Always - 3184
smart.2024-04-07.sda: 1 Raw_Read_Error_Rate 0x002f 001 001 051 Pre-fail Always FAILING_NOW 87106
smart.2024-04-09.sda: 1 Raw_Read_Error_Rate 0x002f 001 001 051 Pre-fail Always FAILING_NOW 87095
smart.2024-04-16.sda: 1 Raw_Read_Error_Rate 0x002f 200 001 051 Pre-fail Always In_the_past 5
smart.2024-04-23.sda: 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
That last line is from a new HDD.
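Saving the weekly reports makes trend-spotting scriptable, too. A sketch (check_trend is just a name I'm making up here; the smart.YYYY-MM-DD.sda files are the saved weekly reports):

```shell
# Flag week-over-week increases in the raw Raw_Read_Error_Rate value
# across saved reports, passed in chronological order.
check_trend() {
  grep -h Raw_Read_Error_Rate "$@" | awk '
    { raw = $NF + 0
      if (NR > 1 && raw > prev) print "increase: " prev " -> " raw
      prev = raw }'
}
# Usage: check_trend smart.202*sda   (the glob sorts chronologically)
```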
The error rate kept getting worse and worse, until the drive failed outright. By the time it did, it had already been replaced and was sitting in a USB 2.0 dock being wiped with random data. It has since been shipped off for RMA.
Initially, writes became slow, very slow. Then reads from the files that had been slow to write were also REALLY slow, while other files were fine. That made me look at the SMART data more carefully, since the HDD was only 8 months old and came new with a 5-year warranty. I stopped buying HDDs with less than 5-year warranties about 3-4 years ago; dealing with data issues more than about once a decade is just too much hassle for me.

Anyway, since there weren't any reallocated or pending sectors, I made sure all the data was backed up to other disks, reformatted the drive with a fresh ext4, then moved everything back. Getting the data off in the first place was the hard part: a simple copy kept failing, so I used ddrescue on a file-by-file basis. Out of every 100 files, over 99 moved quickly, but that last 1% ran overnight.
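The file-by-file ddrescue pass can be scripted roughly like this. rescue_tree and the example paths are invented for the sketch; the real run was messier.

```shell
# Copy a directory tree file by file with ddrescue so one bad file
# can't stall the whole transfer. Each file gets its own map file,
# so a later rerun can retry just the bad spots.
rescue_tree() {  # usage: rescue_tree SRC DST
  local src=$1 dst=$2 f
  ( cd "$src" && find . -type f -print0 ) |
  while IFS= read -r -d '' f; do
    mkdir -p "$dst/$(dirname "$f")"
    # -n: skip the slow scraping phase on the first pass.
    ddrescue -n "$src/$f" "$dst/$f" "$dst/$f.map"
  done
}
# Usage: rescue_tree /mnt/failing /mnt/backup
```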
I also was monitoring the drive temperature. It was warm, but not hot.
BTW, I really do run those SMART tests weekly:
Code:
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       6013        -
# 2  Short offline       Completed without error       00%       5832        -
# 3  Short offline       Completed without error       00%       5664        -
# 4  Short offline       Completed without error       00%       5497        -
# 5  Extended offline    Completed without error       00%       5342        -
# 6  Short offline       Completed without error       00%       5162        -
# 7  Short offline       Completed without error       00%       4995        -
# 8  Short offline       Completed without error       00%       4827        -
# 9  Extended offline    Completed without error       00%       4670        -
#10  Short offline       Completed without error       00%       4491        -
#11  Short offline       Completed without error       00%       4323        -
#12  Short offline       Completed without error       00%       4155        -
#13  Short offline       Completed without error       00%       3988        -
#14  Extended offline    Completed without error       00%       3832        -
#15  Short offline       Completed without error       00%       3652        -
#16  Short offline       Completed without error       00%       3484        -
#17  Short offline       Completed without error       00%       3316        -
#18  Extended offline    Completed without error       00%       3160        -
#19  Short offline       Completed without error       00%       2981        -
#20  Short offline       Completed without error       00%       2813        -
#21  Short offline       Completed without error       00%       2645        -
That's a long test on the first Monday of every month and a short test on each of the remaining Mondays.
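Since plain cron can't express "first Monday of the month" directly, I use the day-of-month trick: every Monday, check whether the day is 7 or less. A sketch of a cron entry for that schedule (time of day and device name are examples):

```shell
# /etc/cron.d sketch: every Monday at 03:00 run a long SMART test if this
# is the first Monday of the month (day-of-month <= 7), else a short test.
# Note: cron treats a bare % specially, hence the backslash escape.
0 3 * * 1  root  [ "$(date +\%d)" -le 7 ] && smartctl -t long /dev/sda || smartctl -t short /dev/sda
```

smartd can also drive a similar schedule from smartd.conf with its `-s` regex option, if you'd rather not touch cron.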
See how looking at the data over time let me be proactive? In the end, I didn't lose any data, even though one new file became inaccessible when the problem first began.
I should also mention that the disk was for scratch use, not archival storage, so it didn't have the solid daily backups all my other data gets. Most of the data was being migrated from an old RAID setup to this drive, and I got bogged down. I hadn't deleted the RAID copy, which is why almost nothing outside the "scratch" area was lost.
A drive making noise is never a good sign. Start looking more closely at the SMART reports and test weekly. You won't know what the problem is until it's an emergency, but you need to be prepared. If it were me, I'd demote a noisy disk that still works to backup duty and put in a quiet new disk as the primary.