[ubuntu] How can I get my data back, LVM2+Soft RAID1 with One HDD failed


Hooman
January 4th, 2009, 09:03 AM
Hi,
My Ubuntu server box is configured like this:
sda: 250 GB HD with LVM2 and the Linux OS
sdb: 500 GB HD, software RAID1 for data
sdc: 500 GB HD, software RAID1 for data

All my important data is on the pair of mirrored 500 GB SATA disks. (I built the RAID and LVM in Feb 2008, following the guide at http://www.linuxdevcenter.com/pub/a/linux/2006/04/27/managing-disk-space-with-lvm.html?page=2)

I reinstalled Ubuntu 8.04 LTS Server x64 edition in Sep 2008, and the RAID1 disks worked fine with the new OS for months.

But after a normal shutdown, my server could no longer start up automatically; I have to press Ctrl-D to exit BusyBox each time. The saddest thing is that the volume built on the RAID1 device is gone.

I noticed that one of the hard disks (sdc) was damaged, so I pulled it out, but that didn't fix the problem. How can I get my data back?
I tried some tips I found through Google, without success:
# mdadm --assemble /dev/md0
mdadm: no devices found for /dev/md0

# mdadm --assemble -scan /dev/md0
mdadm: /dev/md0 not identified in config file.

# mdadm --assemble -scan
mdadm: No arrays found in config file
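
A possible next step, sketched here under the assumption that /dev/sdb1 is the surviving RAID member: inspect the disk with read-only mdadm commands before trying anything that writes. Note that mdadm spells the long option `--scan` with two dashes; the single-dash `-scan` typed above is probably parsed as a bundle of short options, which may be why the replies mention the config file. The helper name `inspect_cmds` is made up for illustration, and it only prints the commands rather than running them.

```shell
#!/bin/sh
# Print (do not run) read-only commands for inspecting a suspected RAID member.
# inspect_cmds is a hypothetical helper; /dev/sdb1 is the partition from the post.
inspect_cmds() {
    dev=$1
    echo "mdadm --examine $dev"      # dump any md superblock found on the device
    echo "mdadm --examine --scan"    # list arrays found by scanning all partitions
    echo "cat /proc/mdstat"          # show arrays the kernel has already assembled
}

inspect_cmds /dev/sdb1
```

None of these three commands writes to the disk, so they should be safe to run with sudo on the real system.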


I checked the partition table of the remaining disk:

Device Boot Start End Blocks Id System
/dev/sdb1 1 60801 488384001 8e Linux LVM


My mdadm.conf is as follows:
# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=0e468419:3148659a:cda2274c:931f8ce2

# This file was auto-generated on Sat, 09 Aug 2008 13:30:20 +0000
# by mkconf $Id$


My backup configuration file for the VG
# cat /etc/lvm/backup/krmbVG2
# Generated by LVM2: Sat Aug 9 13:30:23 2008

contents = "Text Format Volume Group"
version = 1

description = "Created *after* executing '/sbin/vgcfgbackup'"

creation_host = "kurumba" # Linux kurumba 2.6.24-19-generic #1 SMP Wed Jun 18 14:15:37 UTC 2008 x86_64
creation_time = 1218288623 # Sat Aug 9 13:30:23 2008

krmbVG2 {
    id = "qT0kDo-WqMx-Jdwd-41gS-u2o7-e8Be-iI31U7"
    seqno = 12
    status = ["RESIZEABLE", "READ", "WRITE"]
    extent_size = 8192 # 4 Megabytes
    max_lv = 0
    max_pv = 0

    physical_volumes {

        pv0 {
            id = "UDby3C-CdIU-eOBO-trXa-ksBD-6RFu-ZF83PY"
            device = "/dev/md0" # Hint only

            status = ["ALLOCATABLE"]
            dev_size = 976767872 # 465.759 Gigabytes
            pe_start = 384
            pe_count = 119234 # 465.758 Gigabytes
        }
    }

    logical_volumes {

        krmbDataBackup {
            id = "HK2RX3-eiKa-Je91-7h8M-n7Tj-Bf2F-hQVcc5"
            status = ["READ", "WRITE", "VISIBLE"]
            segment_count = 1

            segment1 {
                start_extent = 0
                extent_count = 119232 # 465.75 Gigabytes

                type = "striped"
                stripe_count = 1 # linear

                stripes = [
                    "pv0", 0
                ]
            }
        }
    }
}
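
The sizes in this backup file are given in 512-byte sectors, and they can be cross-checked with a little shell arithmetic. The sketch below only uses numbers copied from the file above; the variable names are chosen for illustration.

```shell
#!/bin/sh
# Values copied from the krmbVG2 backup file (all in 512-byte sectors).
extent_size=8192
dev_size=976767872
pe_start=384
pe_count=119234

# Extent size in MiB: sectors * 512 bytes / 2^20.
extent_mib=$(( extent_size * 512 / 1048576 ))

# Whole extents that fit after the metadata area at the start of the PV.
fit_extents=$(( (dev_size - pe_start) / extent_size ))

# Size of the allocatable area in MiB.
pv_mib=$(( pe_count * extent_mib ))

echo "extent size: ${extent_mib} MiB"
echo "extents that fit: ${fit_extents} (backup says ${pe_count})"
echo "allocatable: ${pv_mib} MiB"
```

The computed extent count equals pe_count, and 476936 MiB is the 465.758 Gigabytes noted in the file's own comment, so the backup is internally consistent.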



# lvm pvdisplay
--- Physical volume ---
PV Name /dev/sda5
VG Name krmbVG1
PV Size 232.65 GB / not usable 1.49 MB
Allocatable yes
PE Size (KByte) 4096
Total PE 59557
Free PE 1189
Allocated PE 58368
PV UUID JC7h5P-jTg0-YcLm-XK52-humb-LVHL-V6zH5S

--- NEW Physical volume ---
PV Name /dev/sdb1
VG Name
PV Size 465.76 GB
Allocatable NO
PE Size (KByte) 0
Total PE 0
Free PE 0
Allocated PE 0
PV UUID vMoP99-cpsM-7HVV-Mu8u-2Qrr-dDfx-v6wMpg

Hooman
January 4th, 2009, 11:52 PM
Does the partition type "8e" mean that all the data on my RAID disks has been damaged?

Hooman
January 6th, 2009, 09:33 PM
Can anybody help me? I am afraid of killing any hope of recovery, so I have stopped doing anything to the disk.

But I don't know where to ask for help other than this forum.

fjgaude
January 8th, 2009, 11:38 AM
I don't use LVM with RAID, but 8e means that the partition is marked as Linux LVM. I don't know if that is what you intended, but it would seem your data may be okay so far.

Coreigh
January 8th, 2009, 01:18 PM
You need to "fail" the bad disk out of the array and run the array on one disk. You can do this because it is a mirror; it would not work on a striped (RAID0) array. (A RAID5 array can also run degraded, but only with one disk missing.)

Please do some research, as I am not 100% sure of this command, but it is something like this:
sudo mdadm --manage /dev/md0 --fail /dev/sdX
where sdX is the bad disk. For example, if it is disk 2, it would be /dev/sdb (use the member partition, such as /dev/sdb1, if the array was built on partitions).

Then, to check the state of md0:
sudo mdadm --detail /dev/md0

References here:
http://joshua.hoblitt.com/blog/2008/02/replacing-failed-mdadm-mirror-disk.html

and here:
http://nst.sourceforge.net/nst/docs/user/ch14.html
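
One caveat worth sketching: `mdadm --manage` acts on an array that the kernel has already assembled, so it can help to check /proc/mdstat first. The function below is a hypothetical helper (the name `md_is_active` is not part of mdadm) that reads mdstat-format text on stdin; the sample text is made up to show the format.

```shell
#!/bin/sh
# Succeeds if the named array is listed as active in mdstat-format text on stdin.
# md_is_active is an illustrative helper, not an mdadm command.
md_is_active() {
    grep -q "^$1 : active"
}

# Sample mdstat text standing in for /proc/mdstat on a machine with a running mirror.
sample='Personalities : [raid1]
md0 : active raid1 sdb1[0]
      488383936 blocks [2/1] [U_]
unused devices: <none>'

if printf '%s\n' "$sample" | md_is_active md0; then
    echo "md0 is active; --manage commands can be used on it"
else
    echo "md0 is not active; it has to be assembled first"
fi
```

On a real system the check would be `md_is_active md0 < /proc/mdstat`. After `--fail`, the usual follow-up is `mdadm --manage /dev/md0 --remove` with the same device name.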

Hooman
January 12th, 2009, 01:39 AM
I tried this command:
$ sudo mdadm --manage /dev/md0 --fail /dev/sdc1
mdadm: cannot get array info for /dev/md0

It seems that mdadm can't find any RAID information on the remaining disk (/dev/sdb1).

I am afraid the RAID superblock has been changed or something.
What I found is below, but I don't know how to view the contents of /dev/sdb1 without modifying or destroying anything.

$ sudo fdisk -l

Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xd779d779

Device Boot Start End Blocks Id System
/dev/sda1 * 1 31 248976 83 Linux
/dev/sda2 32 30401 243947025 5 Extended
/dev/sda5 32 30401 243946993+ 8e Linux LVM

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdb1 1 60801 488384001 8e Linux LVM

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>

$ sudo mdadm --detail /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
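
For looking at the disk without changing it, here is a rough sketch of where an md superblock would be expected on /dev/sdb1. It assumes the array used the version 0.90 superblock format (the default for mdadm of that era), which lives in the last 64 KiB-aligned 64 KiB block of the member device. That assumption is not confirmed anywhere in the thread.

```shell
#!/bin/sh
# Assumption: version 0.90 md superblock, stored in the last 64 KiB-aligned
# 64 KiB block of the member device.
part_kib=488384001          # "Blocks" column for /dev/sdb1 in fdisk -l

# Round down to a multiple of 64 KiB, then step back one 64 KiB block.
sb_offset_kib=$(( part_kib / 64 * 64 - 64 ))

echo "expected superblock offset: ${sb_offset_kib} KiB"
echo "same value in 512-byte sectors: $(( sb_offset_kib * 2 ))"

# Read-only command to dump that block (printed here, not executed):
echo "dd if=/dev/sdb1 bs=1024 skip=${sb_offset_kib} count=64 | hexdump -C | head"
```

The sector value comes out as 976767872, which is exactly the dev_size recorded for pv0 in the LVM backup file, so the numbers are consistent with a 0.90 array on this partition. If a superblock is still there, the dump should begin with the md magic number a92b4efc, which would appear as the bytes fc 4e 2b a9 on a little-endian machine.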

Hooman
January 12th, 2009, 01:45 AM
fjgaude wrote: I don't use LVM with raid but the 8E means that the volume is Linux LVM. I don't know if that is what you wish but it would seem your data may be okay so far.

Good news for me. Thanks a lot!

When I created the md device, I changed this partition type to "fd". Unfortunately, I didn't record the partition type after I finished setting up LVM2 and software RAID1; otherwise I would know whether "8e" is normal.

Hooman
January 19th, 2009, 10:58 AM
I bought a new disk drive and installed it as /dev/sdb.
Now I'm running

dd if=/dev/sdc of=/dev/sdb bs=10485760 &

When it finishes, I can start trying to recover the data on /dev/sdb.
Does anyone have a better suggestion?
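
One possible refinement, sketched on scratch files rather than real disks: when the source disk may have bad sectors, dd's `conv=noerror,sync` keeps the copy going after a read error and pads the unreadable block with zeros so that later data stays at the right offset. The file names and the 4 MiB size below are arbitrary choices for the demonstration.

```shell
#!/bin/sh
# Demonstrate dd's error-tolerant copy flags on scratch files instead of real disks.
work=$(mktemp -d)
src="$work/source.img"
dst="$work/clone.img"

# 4 MiB of random data stands in for the source disk.
head -c 4194304 /dev/urandom > "$src"

# conv=noerror continues after read errors; conv=sync pads short or failed
# reads with zeros up to the block size.
dd if="$src" of="$dst" bs=1048576 conv=noerror,sync 2>/dev/null

if cmp -s "$src" "$dst"; then
    result="clone matches source"
else
    result="clone differs from source"
fi
rm -rf "$work"
echo "$result"
```

With these flags a failed read zeroes the whole block, so a smaller bs than 10485760 loses less data around each bad spot, at the cost of a slower copy. It is also worth confirming with fdisk -l which device is the source before starting, since swapping if= and of= would overwrite the only good copy.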

hyper_ch
January 19th, 2009, 12:06 PM
Here's a howto that covers two failing disks (and shrinking the LVM). It should cover enough to recover your data:

Don't follow it 1:1, but try to understand what was done for the recovery.

http://www.howtoforge.com/how-to-resize-lvm-software-raid1-partitions-shrink-and-grow

Hooman
January 20th, 2009, 05:00 AM
Thank you very much! It looks like I have got my data back!!!

What I did:
1. Changed the system ID of the partition /dev/sdb1 to "fd", so it looks like this:
# sfdisk -l /dev/sdb

Disk /dev/sdb: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/sdb1 0+ 60800 60801- 488384001 fd Linux raid autodetect
/dev/sdb2 0 - 0 0 0 Empty
/dev/sdb3 0 - 0 0 0 Empty
/dev/sdb4 0 - 0 0 0 Empty

2. Issued an mdadm --create command on the disk I had just cloned. (I forgot to try mdadm --assemble --scan /dev/md0 first after modifying the partition table. Anyway, "create" worked.)
$ sudo mdadm --create /dev/md0 -a yes -l 1 -n 2 /dev/sdb1 missing

Then I checked the status; /dev/md0 was active.
# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[0]
488383936 blocks [2/1] [U_]

unused devices: <none>


Strangely, /etc/mdadm/mdadm.conf had not been changed.

3. Restored the VG.
I ran
# lvm pvscan
# lvm vgscan

It found a new PV on /dev/md0 but couldn't find my VG.
So I tried
# dd if=/dev/md0 bs=512 count=255 skip=1 of=/tmp/md0-raw-start

to get the VG metadata. Then I found that I had a backup VG file, which was much more readable.

I ran "vgcfgrestore krmbVG2", but it failed. So I updated the backup VG file, changing only the PV ID to the one I got from
# file /tmp/md0-raw-start
/tmp/md0-raw-start: LVM2 (Linux Logical Volume Manager) , UUID: vMoP99cpsM7HVVMu8u2QrrdDfxv6wMp


Finally, I could activate all my VGs and LVs on the disk.

# vgcfgrestore krmbVG2
# vgscan
# vgchange krmbVG2 -a y
# lvscan


4. Mounted the LV.
I mounted the logical volume. Cool!!! All my photos and movies are back!

Thank you very much!
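
A small sketch for the PV ID edit in step 3: the `file` output prints the UUID without dashes (and shows only 31 characters), while the backup file and pvdisplay use LVM's dashed 6-4-4-4-4-4-6 grouping. The helper below, with the made-up name `lvm_uuid_dashed`, converts a full 32-character UUID to the dashed form; the input is the /dev/sdb1 PV UUID from the earlier pvdisplay output with its dashes removed.

```shell
#!/bin/sh
# Insert dashes into a 32-character LVM UUID using the 6-4-4-4-4-4-6 grouping.
# lvm_uuid_dashed is an illustrative helper, not an LVM command.
lvm_uuid_dashed() {
    printf '%s\n' "$1" |
        sed -E 's/^(.{6})(.{4})(.{4})(.{4})(.{4})(.{4})(.{6})$/\1-\2-\3-\4-\5-\6-\7/'
}

# PV UUID of /dev/sdb1 as shown by pvdisplay, with the dashes stripped.
lvm_uuid_dashed vMoP99cpsM7HVVMu8u2QrrdDfxv6wMpg
```

The result is the string that goes into the id field of pv0 in the backup file. Input that is not exactly 32 characters is printed back unchanged, which makes a truncated UUID easy to spot.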

Now....
I checked the partition table for /dev/sdb; the system ID is still "fd". I don't know why the system ID for /dev/sdc was changed in the event.

I still need to add /dev/sdc back to the RAID group. But /dev/sdc is not new, and it is almost the same as /dev/sdb except for the partition system ID.
# sfdisk -l /dev/sdc

Disk /dev/sdc: 60801 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/sdc1 0+ 60800 60801- 488384001 8e Linux LVM
/dev/sdc2 0 - 0 0 0 Empty
/dev/sdc3 0 - 0 0 0 Empty
/dev/sdc4 0 - 0 0 0 Empty


Can I just change the partition system ID and use the following command to add it directly?
$ sudo mdadm --manage /dev/md0 --add /dev/sdc1
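
Whatever the answer, adding a disk to a degraded mirror starts a resync that overwrites the added partition with the contents of the running member, so progress is worth watching in /proc/mdstat. The sketch below uses a hypothetical helper, `md_member_state`, that pulls the member-status field out of mdstat-format text; the sample text is copied from the mdstat output earlier in this post.

```shell
#!/bin/sh
# Print the member-status field (e.g. [U_] or [UU]) for one array from
# mdstat-format text on stdin. An underscore marks a missing member.
# md_member_state is an illustrative helper, not an mdadm command.
md_member_state() {
    awk -v name="$1" '
        $1 == name && $2 == ":" { found = 1; next }  # array header line
        found { print $NF; exit }                    # status is the last field of the following line
    '
}

# Sample text copied from the mdstat output above.
sample='md0 : active raid1 sdb1[0]
      488383936 blocks [2/1] [U_]'

printf '%s\n' "$sample" | md_member_state md0
```

On the real system this would be `md_member_state md0 < /proc/mdstat`; the mirror is complete again once the field reads [UU].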

hyper_ch
January 20th, 2009, 06:28 AM
I really don't know what your original setup was. If you're unsure of what you're doing, get another hard disk, make a 1:1 clone using dd, and then work on the clone to get it all working again :)