Update:
I made sure to back up my /home before proceeding.
From the 12.04.1 installation, which I completed only with great difficulty, I ran the mount command. It shows, among other things:
Code:
/dev/md2p1 on / type ext4 (rw, errors=remount-ro)
/dev/md3 on /home type ext4 (rw)
I have no idea how I ended up with /dev/md2p1 instead of plain old /dev/md2. A partition inside a partition? And that's where the OS got installed? That isn't what I wanted. If I were sure that it wouldn't take me three hours to reinstall 12.04.1, I might reformat both of my hard drives and start again.
Onward. From the live CD, with swap disabled and all hard drive partitions unmounted:
badblocks /dev/sda took ~2.5 hours, during which time the hard disk access light was on steadily, and returned... nothing. No errors, apparently.
badblocks /dev/sdb took TEN hours, even though it's the same size as /dev/sda, during which time the hard disk access light was on INTERMITTENTLY, and returned...
Code:
625130700
625130701
625130702
625130703
625130704
625130705
625130706
625130707
OK, I have bad blocks. I'm not sure exactly where on /dev/sdb they reside, so I tried fdisk -l:
Code:
Disk /dev/sda: 640.1 GB, 640135028736 bytes
255 heads, 63 sectors/track, 77825 cylinders, total 1250263728 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000970f
Device Boot Start End Blocks Id System
/dev/sda1 2048 15624191 7811072 fd Linux raid autodetect
/dev/sda2 15624192 54685695 19530752 fd Linux raid autodetect
/dev/sda3 54685696 93747199 19530752 fd Linux raid autodetect
/dev/sda4 93747200 1250263039 578257920 fd Linux raid autodetect
Disk /dev/sdb: 640.1 GB, 640133946880 bytes
255 heads, 63 sectors/track, 77825 cylinders, total 1250261615 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0009028a
Device Boot Start End Blocks Id System
/dev/sdb1 * 2048 15624191 7811072 fd Linux raid autodetect
/dev/sdb2 15624192 54685695 19530752 fd Linux raid autodetect
/dev/sdb3 54685696 93747199 19530752 fd Linux raid autodetect
/dev/sdb4 93747200 1250260991 578256896 fd Linux raid autodetect
Disk /dev/md0: 7997 MB, 7997476864 bytes
2 heads, 4 sectors/track, 1952509 cylinders, total 15620072 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/md0 doesn't contain a valid partition table
Disk /dev/md1: 20.0 GB, 19998367744 bytes
2 heads, 4 sectors/track, 4882414 cylinders, total 39059312 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/md1 doesn't contain a valid partition table
Disk /dev/md2: 20.0 GB, 19998367744 bytes
255 heads, 63 sectors/track, 2431 cylinders, total 39059312 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000c8cb
Device Boot Start End Blocks Id System
/dev/md2p1 2048 39057407 19527680 83 Linux
Disk /dev/md3: 592.1 GB, 592133873664 bytes
2 heads, 4 sectors/track, 144563934 cylinders, total 1156511472 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
Disk /dev/md3 doesn't contain a valid partition table
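If I total up the blocks listed for the four /dev/sdb partitions, I get 625,129,472. But my first bad block is numbered 625,130,700. (Both the Blocks column in fdisk and the default badblocks output count 1024-byte blocks, so the two numbers should be comparable.) That high number suggests that the bad blocks aren't inside any of the partitions!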
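To double-check that reasoning, the badblocks numbers can be converted to the 512-byte sectors that fdisk reports. The following is only a rough sketch: it assumes badblocks was run with its default 1024-byte block size (no -b option), and the function names are made up for illustration. The partition boundaries are copied from the fdisk output above. Note that the partition total skips the first 2048 sectors of the disk, so the more exact comparison is against the last sector of /dev/sdb4 (1250260991).

```python
# Map badblocks output (1024-byte blocks by default) onto fdisk's 512-byte sectors.
BADBLOCKS_BLOCK_SIZE = 1024  # badblocks default; assumes no -b option was given
SECTOR_SIZE = 512            # from the fdisk output

# (name, first sector, last sector) for /dev/sdb, copied from fdisk -l
SDB_PARTITIONS = [
    ("/dev/sdb1", 2048, 15624191),
    ("/dev/sdb2", 15624192, 54685695),
    ("/dev/sdb3", 54685696, 93747199),
    ("/dev/sdb4", 93747200, 1250260991),
]
SDB_TOTAL_SECTORS = 1250261615


def block_to_sectors(block):
    """Return the (first, last) 512-byte sector covered by one badblocks block."""
    per_block = BADBLOCKS_BLOCK_SIZE // SECTOR_SIZE
    first = block * per_block
    return first, first + per_block - 1


def locate(block):
    """Name the partition containing the block, or describe where it falls."""
    first, last = block_to_sectors(block)
    for name, start, end in SDB_PARTITIONS:
        if first <= end and last >= start:
            return name
    if last >= SDB_TOTAL_SECTORS:
        return "past end of disk"
    return "unpartitioned space"


if __name__ == "__main__":
    for block in range(625130700, 625130708):
        print(block, block_to_sectors(block), locate(block))
```

Under those assumptions, all eight reported blocks land in sectors 1250261400 to 1250261415, which is after the end of /dev/sdb4 but still before the end of the disk, i.e. in the unpartitioned tail.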
I don't know whether this is a problem. In any case, the HUGE difference in time between the badblocks scan of /dev/sda and /dev/sdb concerns me. It suggests that /dev/sdb is running at a quarter the speed that it should. Is it dying on me?
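To put a number on the speed difference, here is a small back-of-the-envelope calculation. It is only an estimate: the disk sizes come from the fdisk output, the scan times are the approximate ones quoted above, and the helper name is made up for illustration.

```python
def throughput_mb_per_s(size_bytes, hours):
    """Average read rate in MB/s (10**6 bytes) for one full-disk scan."""
    return size_bytes / (hours * 3600) / 1e6


# Sizes from fdisk -l; durations are the rough wall-clock times of each scan.
sda = throughput_mb_per_s(640135028736, 2.5)
sdb = throughput_mb_per_s(640133946880, 10)

print(f"sda: {sda:.1f} MB/s")
print(f"sdb: {sdb:.1f} MB/s")
print(f"sda is about {sda / sdb:.1f}x faster")
```

That works out to roughly 71 MB/s for /dev/sda against roughly 18 MB/s for /dev/sdb. The figures are averages over the whole scan, so they don't show whether /dev/sdb was slow throughout or stalled in a few places.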
Finally, I ran sudo apt-get install mdadm, then sudo mdadm --assemble --scan, and finally fsck on each RAID device that I could. I got several error messages when I tried various forms of the fsck command, and the fsck choices seem to be limited on the live CD. The live CD does not include fsck.swap, if it even exists, so I could not scan my swap partition, /dev/md0. The correct syntax to check an ext4 partition, per this post, is apparently fsck -fyv <device name>.
sudo fsck -fyv /dev/md1 returns:
Code:
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +4242389 +4242397 +(4243248--4243272) +(4243275--4243295) +(4244844--4244847) +(4246075--4246079) +(4246100--4246111) +(4246132--4246143) +(4246180--4246200) +(4246202--4246203) +(4246231--4246239) +4246257 +(4246260--4246271) +4246305 +(4246308--4246319) -(4246328--4246335) +4246369 +(4246372--4246391) -4246399 +(4246417--4246431) +(4246461--4246463) +(4246481--4246495) +(4246513--4246527) +(4246595--4246607) -(4246608--4246623) +(4246647--4246655) +(4246727--4246735) -(4246736--4246751) +(4246775--4246783) -(4246864--4246879) +(4246903--4246911) -(4246992--4247007) +(4247029--4247039) +(4247111--4247119) -(4247120--4247135) +(4247157--4247167) +(4247225--4247231) +(4247287--4247295) +(4247331--4247336) +(4247340--4247343) -(4247344--4247359) +(4247418--4247423) +4247471 -(4247472--4247487) +(4247537--4247543) -(4247984--4247999) -4248029 +(4248030--4248031) -(4248052--4248063) -(4248807--4248824) -(4248826--4248827) -4248849 +(4248850--4248851) -(4248852--4248863) -4248881 +(4248882--4248883) -(4248884--4248895) +(4248958--4248959) -4249025 +(4249026--4249027) -(4249028--4249039) +(4249082--4249083) -(4249084--4249087) -4254141 +(4254142--4254143) -(4254168--4254175) -(4254195--4254207) -(4254440--4254451) -4254454 +4254455 -(4254531--4254543) -(4254584--4254591) -(4254660--4254671) -4254709 +(4254710--4254711) -(4254712--4254719) -(4254788--4254799) -4254837 +(4254838--4254839) -(4254840--4254847) +(4254914--4254915) -(4254916--4254927) -(4254963--4254975) -(4255035--4255039) +(4255078--4255079) -(4255080--4255087) +(4255174--4255175) +(4255222--4255223) -(4255268--4255271) +(4255342--4255343) +(4255422--4255423) +(4255478--4255479) +(4255530--4255531) -(4255532--4255535) +(4255598--4255599) -(4255676--4255679) +(4255734--4255735) +(4255982--4255983) +(4256038--4256039) +(4256166--4256167) +(4256234--4256235) -(4256902--4256903) -(4256946--4256947) -(4257018--4257019) -(4257166--4257167) -(4257542--4257543) -(4257662--4257663) 
-(4257726--4257727) +(4260066--4260092) -(4261208--4261234)
Fix? yes
/dev/md1: ***** FILE SYSTEM WAS MODIFIED *****
153673 inodes used (12.59%)
99 non-contiguous files (0.1%)
168 non-contiguous directories (0.1%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 113719/17
720026 blocks used (14.75%)
0 bad blocks
1 large file
86743 regular files
13115 directories
55 character device files
25 block device files
0 fifos
33 links
53725 symbolic links (39848 fast symbolic links)
1 socket
--------
153697 files
Whatever those block bitmap differences are, they just got fixed.
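For anyone who wants to see how much was actually affected, the long "Block bitmap differences" line can be summarized with a few lines of Python. This is just a sketch with a made-up function name, and the sample string is a shortened excerpt of the output above. As I understand the e2fsck notation, "+" marks blocks that are in use but were not marked in the bitmap, and "-" marks blocks that were marked in the bitmap but are not in use; treat that reading as an assumption.

```python
import re

# Matches "+4242389", "+(4243248--4243272)" and "-(4246328--4246335)".
TOKEN = re.compile(r"([+-])\(?(\d+)(?:--(\d+))?\)?")


def summarize(diff_text):
    """Count the blocks listed with '+' and with '-' in a bitmap-difference line."""
    totals = {"+": 0, "-": 0}
    for sign, start, end in TOKEN.findall(diff_text):
        first = int(start)
        last = int(end) if end else first  # single block when no range is given
        totals[sign] += last - first + 1
    return totals


# Shortened excerpt of the e2fsck output for /dev/md1
sample = "+4242389 +4242397 +(4243248--4243272) -(4246328--4246335) -4246399"
print(summarize(sample))
```

Pasting the full line from the fsck output into sample gives the totals for the whole repair.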
Next, sudo fsck -fyv /dev/md2
Code:
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/md2
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
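I'm not sure whether this is serious. The superblock could not be read, but this is my mystery device: according to fdisk, /dev/md2 holds a partition table with a child partition (/dev/md2p1) inside it, so the file system may simply not start at /dev/md2 itself. I did not try rerunning with an alternate superblock.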
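One way to see what a device really starts with is to look at its first couple of kilobytes. The sketch below relies on two well-known on-disk markers: an MBR partition table ends its first sector with the bytes 55 AA at offset 510, and an ext2/3/4 superblock starts at byte 1024 with the magic number 0xEF53 at offset 56 inside it. The function names are made up for illustration, and the check is deliberately crude: a device can carry both markers, or neither.

```python
import struct

EXT_SUPERBLOCK_OFFSET = 1024   # ext2/3/4 primary superblock starts at byte 1024
EXT_MAGIC_OFFSET = 56          # position of the s_magic field within the superblock
EXT_MAGIC = 0xEF53
MBR_SIGNATURE_OFFSET = 510     # last two bytes of the first 512-byte sector
MBR_SIGNATURE = b"\x55\xaa"


def classify(head):
    """Guess what the first 2048 bytes of a block device contain."""
    found = []
    if head[MBR_SIGNATURE_OFFSET:MBR_SIGNATURE_OFFSET + 2] == MBR_SIGNATURE:
        found.append("mbr-partition-table")
    pos = EXT_SUPERBLOCK_OFFSET + EXT_MAGIC_OFFSET
    if len(head) >= pos + 2 and struct.unpack_from("<H", head, pos)[0] == EXT_MAGIC:
        found.append("ext-filesystem")
    return found or ["unknown"]


def read_head(path, size=2048):
    """Read the start of a device; needs read permission (e.g. run with sudo)."""
    with open(path, "rb") as dev:
        return dev.read(size)
```

Running something like classify(read_head("/dev/md2")) and classify(read_head("/dev/md2p1")) with sudo would show which of the two actually carries the ext superblock. It only reads from the device and never writes to it.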
sudo fsck -fyv /dev/md2p1 yields:
Code:
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
/dev/md2p1: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (3891192, counted=3873488).
Fix? yes
Free inodes count wrong (1038877, counted=1037276).
Fix? yes
/dev/md2p1: ***** FILE SYSTEM WAS MODIFIED *****
183332 inodes used (15.02%)
125 non-contiguous files (0.1%)
236 non-contiguous directories (0.1%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 141781/29
1008432 blocks used (20.66%)
0 bad blocks
1 large file
109752 regular files
18007 directories
55 character device files
25 block device files
0 fifos
34 links
55483 symbolic links (41433 fast symbolic links)
1 socket
--------
183357 files
Another minor(?) fix.
And finally, sudo fsck -fyv /dev/md3 returns:
Code:
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found. Fix? yes
Inode 17580946 was part of the orphaned inode list. FIXED.
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 17580946
Connect to /lost+found? yes
Inode 17580946 ref count is 65535, should be 1. Fix? yes
Inode 17581027 ref count is 1, should be 2. Fix? yes
Unattached inode 17581256
Connect to /lost+found? yes
Inode 17581256 ref count is 2, should be 1. Fix? yes
Pass 5: Checking group summary information
/dev/md3: ***** FILE SYSTEM WAS MODIFIED *****
68877 inodes used (0.19%)
1319 non-contiguous files (1.9%)
15 non-contiguous directories (0.0%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 68718/127/2
38257487 blocks used (26.46%)
0 bad blocks
2 large files
63001 regular files
5841 directories
0 character device files
0 block device files
0 fifos
1 link
25 symbolic links (19 fast symbolic links)
1 socket
--------
68867 files
I'm in a bit over my head. I don't know whether there are any serious issues here or not. I'm rebooting from my hard drive now, to see whether anything has changed. I would especially like to see whether the system shuts down quickly now, instead of taking over an hour (and maybe never shutting down at all). The fsck repairs I just did here took only a few seconds per partition. Therefore I doubt that fsck was the reason that my shutdown was delayed the last time.