
[ubuntu] [12.04.1 - raid - lvm] Freeze during boot



Commifreak
September 16th, 2012, 07:21 PM
Hi :)

I am using Zentyal v3 which uses Ubuntu 12.04.1.

I've already posted on the Zentyal forums [1], but I think it's an Ubuntu issue, so I'm trying to get a hint on this site :)

I have an existing LVM and RAID configuration from the previous Zentyal version (Ubuntu 10.04).
After starting the 12.04 installer (a fresh install, not an upgrade), the partitioner detected the RAID and my LVM config (with curious names, like md125, md126 and md127).

The installation was successful; I formatted all partitions except the /home partition.

Detailed config:

4 x 2 TB drives
2 x RAID 1 (/boot and swap)
1 x RAID 5 (LVM)

The LVM provides 2 partitions: / and /home

After the installation, Ubuntu hangs on every boot at "The disk drive for x is not yet ready or not present".

I've figured out that the installer misconfigured the fstab: it had set the swap partition device to "/dev/md125", which does not exist. The right name was /dev/md1.
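For reference, /proc/mdstat shows the names the kernel actually assembled, whatever the installer wrote into fstab (sudo mdadm --detail --scan gives the full picture). A minimal sketch of reading the name out of an array line - the sample line below is made up; on the live system just run cat /proc/mdstat:

```shell
# The first field of each array line in /proc/mdstat is the kernel's
# name for that array (hypothetical sample line shown here):
line="md1 : active raid1 sdc2[1] sdd2[0]"
echo "${line%% *}"   # -> md1
```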

After I changed this and upgraded 5 packages (linux kernel, linux-firmware and a few libs), the message went away, but the boot still freezes at this point:


fsck from util-linux 2.20.1
fsck from util-linux 2.20.1
fsck from util-linux 2.20.1
/dev/mapper/VG1-System: clean, 88435/3055616 files, 558454/12206080 blocks
/dev/md0: clean, 230/488640 files, 108855/975860 blocks
/dev/mapper/VG1-Data: clean, 27599/181342208 files, 645157851/1450730496 blocks

After 2-3 minutes, the boot continues without any error. All drives are mounted.

Where should I begin to search? Does anyone have a hint?

Should I wipe all drives and recreate both the RAID and the LVM under 12.04?

Thanks for any hint!

mdadm.conf:

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md/0 metadata=1.2 UUID=fa658e2d:9cf50829:dc877c7d:68e5af77 name=server:0
ARRAY /dev/md/1 metadata=1.2 UUID=127f876f:d06c6640:10e117cb:9eefbbf5 name=server:1
ARRAY /dev/md/2 metadata=1.2 UUID=2e48847e:e825e289:b0463acd:d643e598 name=server:2

# This file was auto-generated on Fri, 14 Sep 2012 21:00:00 +0200
# by mkconf $Id$



fstab:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc nodev,noexec,nosuid 0 0
/dev/mapper/VG1-System / ext4 errors=remount-ro 0 1
# /boot was on /dev/md127 during installation
UUID=6bd29ed7-06ec-422c-a39e-dac8a3b97d9e /boot ext2 defaults 0 2
/dev/mapper/VG1-Data /home ext4 defaults 0 2
# MODIFIED BY ME (original: /dev/md125)
/dev/md1 none swap sw 0 0
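An alternative that avoids the device-name problem entirely: reference the swap array by UUID, the way the installer already did for /boot, since md numbering can change between boots. A sketch with a made-up UUID (the real one comes from sudo blkid /dev/md1):

```shell
# Hypothetical UUID standing in for blkid's output for /dev/md1:
SWAP_UUID="0f7e3c1a-5b2d-4e8f-9a61-3c4d5e6f7a8b"
# fstab line built from the UUID instead of the volatile device name:
echo "UUID=${SWAP_UUID} none swap sw 0 0"
```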

[1] - http://forum.zentyal.org/index.php/topic,12030.0.html

MikeNash1977
September 27th, 2012, 08:10 PM
Nudge... Commifreak (http://ubuntuforums.org/member.php?u=787124), have you had any success with this, or does anyone else out there have any ideas? My situation is slightly different, but the symptoms are the same.

I am running Ubuntu 12.04 LTS 64-bit server, and everything was working fine, but around a week ago my server started hanging during boot. Using the recovery option from grub, I managed to get it to boot without mounting my logical volume. I updated fstab not to mount the logical volume, and that gets the server booting.

If I then mount the logical volume manually from the command prompt, mount sits there for about two or three minutes before eventually finishing, at which point the logical volume is mounted without error.

My logical volume is on a soft raid 5 array. I have checked the array, no errors. I have run fsck.ext4 on the logical volume, no errors.

I am wondering based on the timing if this may have been due to an update? Is that possible?

Any help you can give would be greatly appreciated!

Thanks


-- Mike --

P.S. I am a Windows convert, so I'm fairly new to the whole Linux thing. If I have missed something simple, please don't hesitate to say.

Commifreak
September 27th, 2012, 08:46 PM
Hi Mike,

no, nothing new on this, I'm afraid.

I've made a few tests, but I gave up :-/

I've posted a few more things here (http://forum.zentyal.org/index.php/topic,12030.msg50077.html#msg50077).

Please have a special look at post #5, too :)

MikeNash1977
September 27th, 2012, 09:12 PM
Hi Commifreak

Thanks for the update and the link! I don't know if this will help you, but after banging my head against this thing for the last week, this evening I have finally got somewhere (sort of)!

The update theory (for my system at least) is right. There seems to be a bug in kernel 3.2.0-31-generic; fortunately, I still have the 3.2.0-29-generic kernel in my grub "previous versions" menu. If I select that to boot, everything starts up perfectly with the logical volume in fstab, and a further manual test shows a normal mount time (under two seconds).

For now, I've updated grub to boot into the older kernel by default, which has fixed things for me. I'll keep an eye on further kernel updates to see if the problem disappears (all I need to do is work out how to make sure that kernel doesn't disappear off my machine without my permission).
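In case it helps anyone: holding the package and pointing GRUB_DEFAULT at the old entry should cover both halves of that. A sketch, not a tested recipe - the root-only commands are shown as comments, and the menuentry title is the usual 12.04 format, but check your own /boot/grub/grub.cfg (if the entry sits inside a "Previous Linux versions" submenu, GRUB_DEFAULT needs "Submenu title>Entry title"):

```shell
# Keep apt from auto-removing the known-good kernel (run as root):
#   sudo apt-mark hold linux-image-3.2.0-29-generic
# Then set GRUB_DEFAULT in /etc/default/grub and run: sudo update-grub
# GRUB_DEFAULT wants the quoted menuentry title; extracting it from a
# grub.cfg line (hypothetical line shown):
entry='menuentry "Ubuntu, with Linux 3.2.0-29-generic" --class ubuntu {'
title=$(printf '%s\n' "$entry" | sed 's/^menuentry "\([^"]*\)".*/\1/')
echo "GRUB_DEFAULT=\"$title\""   # -> GRUB_DEFAULT="Ubuntu, with Linux 3.2.0-29-generic"
```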

I'll also see if I can work out how the heck you file an Ubuntu bug report, and do that as well.

I hope that helps you out, and again thanks for the help!

Cheers


-- Mike --

Commifreak
September 27th, 2012, 09:32 PM
Dammit - you're right.

I noticed that this hang appeared together with the kernel update - I'd already forgotten that.

So, it seems that this issue is kernel-related :D

Thanks for this hint!

elsalvador
September 29th, 2012, 09:39 AM
Just to confirm that this fix worked for me too.

LVM with 9 TB, as well as the usual / and /boot partitions etc.
It ran fsck on every boot, and this took many minutes.

tune2fs -l /dev/mapper/Storage-filestore

This told me the volume was CLEAN, but it still ran fsck on every boot.

I took the above advice and set grub to default to 3.2.0-29, and all is good. Nice one guys, thanks.:KS

OOH - PS - don't try to manage Grub2 from webmin; use the Grub2 Customizer, easy peasy.

Commifreak
September 29th, 2012, 12:49 PM
Hi,

I've set up the production system again today.

Now, finally - it's a problem with the kernel :D 3.2.0-29 is working well.

I hope this issue will be gone in future kernel releases.