Thread: Help need to fix server disk errors.

  1. #1
    Join Date
    Sep 2020
    Beans
    48

    Help need to fix server disk errors.

    My Ubuntu server is throwing errors and it looks like the drive is failing.


    I searched for how to fix the drive errors and tried the following, but I can't get it to work.

    Here is what I tried:

    > df -h

    returns, among other devices (note: /dev/sdb does not show up but is assigned as md127, or should be if I set the RAID up correctly):

    /dev/sda
    /dev/md127 (was a RAID drive, but I removed one of the drives and disabled the RAID)

    > umount /dev/md127 ( states drive is unmounted)

    > fsck -A /dev/md127 ( aborts with msg drive is mounted)


    Not sure what I'm doing wrong.
    Here are a few of the error messages.
    --
    ata3.01 exception fail read DMA
    BLK-update dev sdb io error sector 11.......
    --
    Also, I have a mapped Samba drive from the Ubuntu server to my Windows PC, and when I try to copy some files from the server to the PC a whole lot of error messages appear on the server's monitor.
    --
    I also get errors about MySQL (not sure whether MySQL is installed on the sda or the sdb drive), but I was able to back up the database.


    Would using Clonezilla to clone the drive as-is be recommended? That way I could still get some of the data off the drive if I can't fix it. I think Clonezilla attempts to read the bad sectors.
    Last edited by tross9; November 17th, 2024 at 02:07 AM.

  2. #2
    Join Date
    Mar 2010
    Location
    Been there, meh.
    Beans
    Hidden!
    Distro
    Ubuntu

    Re: Help need to fix server disk errors.

    There are 2 types of errors. Logical errors and physical errors.

    Lots can be done to address logical errors.

    Not much can be done to address physical errors - besides RAID (not RAID0) and good backups.

    Sometimes the best solution for RAID issues is to destroy the array, recreate it and restore the data from backups. RAID never replaces backups. RAID solves 2 problems. Backups solve 999 problems, and I'm not exaggerating.

    So, first, let's get an overview of your storage. Run these 2 commands and post the commands + output back here. Use forum code tags or it will be too hard to read (i.e. I won't read it).

    Code:
    df -hT -x squashfs -x tmpfs -x devtmpfs
    lsblk -e 7 -o name,type,fstype,size,FSAVAIL,FSUSE%,label,mountpoint
    If the setup uses LVM, you should also run and post these commands:
    Code:
    sudo pvs
    sudo vgs
    sudo lvs
    If BTRFS or ZFS are involved, I can't help, but you need to be clear about that.

    After we see that output, next steps can happen.
    I don't need descriptions of the output.

    For logical errors, the first step is to run an fsck on the unmounted file system device. The device for the file system will be in the output requested above. Many issues can be solved by that.
    BTW, you have looked at the system logs already, right?
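
    A rough sketch of that first step, assuming an ext4 file system and using /dev/md127 from the first post purely as a placeholder device: confirm that nothing has the device mounted, then do a read-only pass before any repair. Note that fsck -A walks the entries in /etc/fstab rather than only the named device, which may be why the earlier attempt complained about a mounted drive. The is_mounted helper is a made-up name for this example, and it only looks for the device itself as a mount source, so a file system living on an LVM volume on top of it would need that volume's device name instead.

```shell
#!/bin/sh
# Placeholder device taken from the first post; substitute the device that
# df/lsblk actually report for the file system you want to check.
DEV="${DEV:-/dev/md127}"

# Succeeds when findmnt lists the device as the source of a mounted file system.
is_mounted() {
    findmnt --source "$1" >/dev/null 2>&1
}

if is_mounted "$DEV"; then
    echo "$DEV is still mounted; unmount it first: sudo umount $DEV"
else
    # For ext2/3/4, -n opens the file system read-only and answers "no"
    # to every repair prompt, so nothing on disk is changed.
    echo "$DEV is not mounted; a read-only check would be: sudo fsck -n $DEV"
fi
```

    The script only prints the suggested command. If the read-only pass reports problems, the repair run is the same fsck command without -n, and only once the backups are safe.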

  3. #3
    Join Date
    Sep 2020
    Beans
    48

    Re: Help need to fix server disk errors.

    Thanks for replying. I do not have direct access to the server from any remote PC, so I need to write down and retype the output (typos may occur).

    I've attached a text doc of the output:

    ServerOutput.txt

    As I was trying to write down the df output, the server kept throwing the ata3 error messages. After about 5 to 10 minutes the errors stopped showing and the df command stopped showing any info on drive sdb.
    I needed to run shutdown -h and then power it back on to get the info on sdb to show again; shutdown -r did not seem to restart sdb.
    Last edited by tross9; November 17th, 2024 at 04:52 PM.

  4. #4
    Join Date
    Mar 2010
    Location
    Been there, meh.
    Beans
    Hidden!
    Distro
    Ubuntu

    Re: Help need to fix server disk errors.

    Originally Posted by tross9:
    Thanks for replying. I do not have direct access to the server from any remote PC, so I need to write down and retype the output (typos may occur).
    You should use redirection to store the output in a file (see https://github.com/jlevy/the-art-of-command-line), then copy the files using a USB drive. Typos will lead to mistakes - by me and by you. That link has multiple tips about using redirection. Learn them. Know them. Love them.
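
    In its simplest form that is just df -hT > report.txt 2>&1. A slightly fuller sketch, with the caveat that the report helper and the server-report.txt file name are invented for this example, records each command line together with its output so the file can be pasted into code tags as-is:

```shell
#!/bin/sh
# Append a command line and its output (stdout and stderr) to a report file.
# "report" is a hypothetical helper name, not a standard tool.
report() {
    out_file=$1
    shift
    {
        printf '$ %s\n' "$*"
        "$@" 2>&1 || printf '(exit status %s)\n' "$?"
        printf '\n'
    } >> "$out_file"
}

# Assumed output location; point OUT at a mounted USB drive if preferred.
OUT="${OUT:-./server-report.txt}"
: > "$OUT"    # start with an empty file

report "$OUT" df -hT -x squashfs -x tmpfs -x devtmpfs
report "$OUT" lsblk -e 7 -o name,type,fstype,size,FSAVAIL,FSUSE%,label,mountpoint
```

    Copy the resulting file to the USB drive and post its contents; no retyping needed.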

    If a HDD is disappearing, it is likely failing. Hope you have backups and a replacement storage device ready.

    Looking at attachments feels like work.

  5. #5
    Join Date
    Mar 2010
    Location
    Been there, meh.
    Beans
    Hidden!
    Distro
    Ubuntu

    Re: Help need to fix server disk errors.

    Ok, first, if you typed all that, ouch. I wouldn't have. Definitely learn to use redirection.

    There are 2 storage devices in the system. I don't see any connections between sda and sdb, so whatever you did related to RAID didn't work. From the output, it was added complexity and a waste of your time.

    You used LVM, but only for the OS and swap. BTW, the root LV shouldn't be over 35GB in size. You should have made other LVs for other needs. The output shows it is 1.8TB in size. That's just crazy.

    Any chance you setup a server without really knowing what you were doing? We all start out that way.
    What you've done on sdb is very odd. You have both LVM and RAID on the same area? I can't really tell. Perhaps the indentation is wrong? I have never seen anything like what has been posted before. The attempted LVM setup on sdb1 could easily have corrupted the attempt at RAID1.

    Was there ever a 3rd disk for the RAID1 previously?

    As it is now, I think you should backup everything you can ASAP, run some SMART tests to check whether the disks are failing or not, and then start over with a fresh install with more consideration about storage layouts and methods. That's easy for me to say from here.
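
    A minimal sketch of those SMART checks, assuming the smartmontools package is installed and that /dev/sdb (the device named in the kernel errors from the first post) is the suspect disk. The smart_report function name is made up for this example; the smartctl options are standard ones.

```shell
#!/bin/sh
# Disk to examine; override with DISK=/dev/sda to check the other drive.
DISK="${DISK:-/dev/sdb}"

smart_report() {
    disk=$1
    if ! command -v smartctl >/dev/null 2>&1; then
        echo "smartctl not found; on Ubuntu it ships in the smartmontools package"
        return 1
    fi
    if [ "$(id -u)" -ne 0 ]; then
        echo "smartctl needs root; rerun this script with sudo"
        return 1
    fi
    smartctl -H "$disk"            # overall health self-assessment
    smartctl -A "$disk"            # vendor attributes such as reallocated sectors
    smartctl -l selftest "$disk"   # results of earlier self-tests
}

smart_report "$DISK" || echo "SMART report for $DISK is incomplete"
```

    A short self-test can be started with sudo smartctl -t short /dev/sdb; the result shows up in the self-test log a few minutes later.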

    In general, RAID1 should only be used for HA needs. It never replaces the need for backups. RAID solves 2 issues. Availability and, perhaps, sometimes, performance. But RAID on a single device does nothing. Nothing at all.

    LVM provides flexibility, but only when used correctly. Much of that flexibility comes by NOT allocating all the storage. Only allocate what you actually need to the different "parts" of the system that need it. I've posted and written in these forums many times about how to setup LVM. Go find one of those posts, read it and ask questions if it isn't clear.
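
    A hedged sketch of what leaving space unallocated looks like in practice. The volume group name vg0, the LV names and the sizes are all assumptions for illustration, not taken from the attached output. The run helper only prints each command unless APPLY=1 is set, so nothing is changed by default.

```shell
#!/bin/sh
# Hypothetical volume group name; use the name that "sudo vgs" reports.
VG="${VG:-vg0}"

# Print the command by default; execute it only when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Small, purpose-specific LVs; the rest of the VG stays unallocated.
run lvcreate -L 30G -n root "$VG"   # OS only, kept under the 35GB mentioned above
run lvcreate -L 4G  -n swap "$VG"
run lvcreate -L 50G -n srv  "$VG"   # example data area, e.g. for the Samba share
run vgs "$VG"                       # VFree shows how much is still unallocated
```

    Whatever stays unallocated in the VG can be handed out later with lvextend -r -L +10G vg0/srv, which grows the LV and its file system together.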

    When using SW-RAID, it is common to use RAID1 and then to place LVM onto the full RAID1 device for better storage management. LVM can create RAID storage too, but it is ugly and not as easy to restore after a failure. That's an opinion. OTOH, I don't know how to convince Ubuntu to setup mdadm RAID for the OS during installation, but I do know how to convert an LVM install into an LVM-RAID1 setup post install. Again, LVM-RAID is ugly.

    I stopped using RAID when I switched to better SSDs. SSD failure rates are very small and most of them should last 10 yrs, so backups are all I use. NVMe SSD storage is already so fast that any RAID1 performance gains wouldn't matter at all.

    Anyway, ask specific questions if you have any. I think regaining access to your RAID1 storage beyond what is already possible is unlikely, since it appears to be failing. Backup everything ASAP.

  6. #6
    Join Date
    Sep 2020
    Beans
    48

    Re: Help need to fix server disk errors.

    thanks again for replying.



    "Definitely learn to use redirection." I'll look into that.
    "If a HDD is disappearing, it is likely failing. " That's what I was afraid of. Looking for a new drive was the first thing I did before posting this, but I hoped that it was just bad sectors.
    So, any suggestions on where to look for how to correctly rebuild the server from scratch, the right way, not the way I built it?
    I'll be using it for the following:
    1) LAMP (mainly for the PHP website and database)
    2) Samba share

    Let's start with the history.

    1) "Any chance you setup a server without really knowing what you were doing? " Yes, this server was set up back around 2016 or earlier. I had no knowledge of Ubuntu or Linux back then; I followed a setup tutorial that used LVM for the drives, and I keep using it whenever I need to rebuild the server.
    2) "You have both LVM and RAID on the same area? " sdb was set up as LVM without RAID; later I got a second 1TB drive and created the RAID for sdb (still as an LVM partition).
    3) I then ran the RAID for about 4-5 years. In 2019 I needed the second 1TB drive for my main PC (its 1TB drive was failing; it was built in 2008 with that 1TB drive), so I broke/disabled the mirror and left it as you see it.
    4) Yes, I do back up the two most important systems and their data (from the server I back up the database to my main PC and clone it to an external SSD).

    As to the data lost from the server, I'll probably lose the folder where I keep my software patches (mostly game patches, and most likely I can get them off the internet again if I need them).


    My expertise is in SQL databases and programming for them (VB, C#, MS VS 2007, 09, etc., and PowerBuilder, an old Sybase program similar to VS).


    In a nutshell, using Microsoft course levels to describe my knowledge (where 100 is beginner):

    Servers
    Ubuntu = 100 -- so yes, I can set up the server but can't completely understand what I'm doing.
    MS Servers = 300

    Databases
    Informix = 150
    SQL = 300 (MS SQL, MySQL, etc.)


    Again, thanks for the help.
    Last edited by tross9; November 17th, 2024 at 11:19 PM.

  7. #7
    Join Date
    Mar 2010
    Location
    Been there, meh.
    Beans
    Hidden!
    Distro
    Ubuntu

    Re: Help need to fix server disk errors.

    I know next to nothing about recent MSFT stuff. Haven't programmed on MS-Windows since the 1990s and I've avoided it as much as possible since about 2008. In short, 100, 300 means nothing to me, but I don't think it is relevant.

    Leaving a RAID broken for years is a poor administrative choice. Either have RAID and maintain it or don't.

    There are step by step guides on the internet for how to setup storage for a server. Some links are posted by others in these forums over the last 10 yrs. It isn't something that interests me. I'm more about "why", not "what to type". The what to type is spelled out in the manpages on every Linux system already.

    I'll spend 2 minutes searching for a server-setup guide .... if nothing is added to this post, then I didn't find one in that 2 minutes, but I'm positive they are in these forums. BTW, if you read the header, all those posts for guides are about to disappear in a few weeks, so don't wait.

  8. #8
    Join Date
    Sep 2020
    Beans
    48

    Re: Help need to fix server disk errors.

    Thanks for looking for setup guides and thanks for helping me out.

  9. #9
    Join Date
    Mar 2010
    Location
    Been there, meh.
    Beans
    Hidden!
    Distro
    Ubuntu

    Re: Help need to fix server disk errors.

    Originally Posted by tross9:
    Thanks for looking for setup guides and thanks for helping me out.
    I fear the guide links I was thinking of have been removed from these forums - lots of us older guys have been dying off, so our personal websites get taken down after death by our families. I think the username was "Hammond" who used to provide links to his base server setup with LVM. I looked for his personal website. Couldn't find it either.

    More and more, older posts have been disappearing. I've noticed that some of my diagrams have been removed. Don't know why and I didn't do it. OTOH, these forums are going away in January, so nobody will be posting here. Haven't decided about Discourse.

  10. #10
    Join Date
    Jul 2021
    Beans
    149

    Re: Help need to fix server disk errors.

    TheFu,

    I think you're thinking of "LHammonds".
    "Just because you can do it doesn't mean you should do it."

    "If it ain't broke don't fix it."

