I upgraded my Ubuntu Hardy server to Intrepid the night before last.
When I woke up yesterday morning, the main server I use as my ssh
gateway into the others was totally messed up.
My NFS server is FreeBSD. I haven't had to touch it since I set it up.
My clients are 6 Ubuntu servers. All are identical in packages,
configs (I diff'ed /etc), kernel versions, network setup, etc.
Only *one* of the machines is suffering (and of course, one of the
most important ones) - and it isn't even one of the busiest.
I've tried downgrading the kernel on the box suffering the issue from
2.6.27-10 to 2.6.27-7. My next attempt will be a kernel .deb from the
previous Ubuntu release...
What is odd is that NFS works great for a couple of hours after a
reboot, then stops working. I can umount -l /home and then try to
remount it (see below), but it never gets anywhere and eventually dies
with a generic message. I tried to strace -f the mount, and it gave me
nothing to work with. The FreeBSD server doesn't give me anything in
its logs to go off of either. I can still ping and ssh between the two
with no problem at this point. It's just NFS that is odd. Also, while
restarting services manually to debug them, I noticed that portmap
seemed to throw a kernel error in my logs once in a while. But I don't
see a connection to portmap when I run the mount command, and I would
assume that if portmap is required for mounting NFS shares, mount
would need to contact it. That could be totally irrelevant though.
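One way to test the portmap assumption is to look at what the server has registered with its portmapper. The following is only a sketch, assuming rpcinfo is installed on the client and that raid01 answers on the portmapper port. check_rpc_services is a hypothetical helper name, not part of any package. It reads `rpcinfo -p` output on stdin and checks, by RPC program number, whether the portmapper (100000), mountd (100005) and nfs (100003) are registered over TCP.

```shell
#!/bin/sh
# Hypothetical helper: reads `rpcinfo -p <server>` output on stdin and
# reports which RPC programs an NFS-over-TCP mount usually relies on.
#   100000 = portmapper/rpcbind, 100005 = mountd, 100003 = nfs
# Returns non-zero if any of them is missing.
check_rpc_services() {
    awk '
        # Remember every program number that is registered for tcp.
        $3 == "tcp" { seen[$1] = 1 }
        END {
            n = split("100000 100005 100003", want, " ")
            missing = 0
            for (i = 1; i <= n; i++) {
                if (want[i] in seen) {
                    print want[i] " registered"
                } else {
                    print want[i] " MISSING"
                    missing = 1
                }
            }
            exit missing
        }
    '
}

# Intended use on the affected client (not run here):
#   rpcinfo -p raid01 | check_rpc_services
```

If rpcinfo itself hangs or times out against raid01 from the broken client but works from the other five, that would point at the RPC path rather than at mount.nfs.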
Any help or insight or request for additional information is
appreciated. On-list or off-list is fine. I will pay someone via
Paypal who can help me resolve this quickly...
[root@lvs01 ~]# mount -vvvv /home
mount: fstab path: "/etc/fstab"
mount: mtab path: "/etc/mtab"
mount: lock path: "/etc/mtab~"
mount: temp path: "/etc/mtab.tmp"
mount: spec: "raid01:/home"
mount: node: "/home"
mount: types: "nfs"
mount: opts: "rsize=8192,rsize=8192,tcp,rw,acregmin=30"
mount: external mount: argv[0] = "/sbin/mount.nfs"
mount: external mount: argv[1] = "raid01:/home"
mount: external mount: argv[2] = "/home"
mount: external mount: argv[3] = "-v"
mount: external mount: argv[4] = "-o"
mount: external mount: argv[5] = "rw,rsize=8192,rsize=8192,tcp,acregmin=30"
mount.nfs: timeout set for Sun Dec 7 06:36:39 2008
mount.nfs: text-based options: 'rsize=8192,rsize=8192,tcp,acregmin=30,addr=10.13.220.94'
(It just stalls here; normally a connection is near instant. Eventually
it will die with a generic error message. I can Control-C to quit it
too, so it's not frozen completely.)
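The option string in the mount output lists rsize=8192 twice. Whether the second one was meant to be wsize is only a guess, and it may have nothing to do with the hang, but it is cheap to check the options on every client. A minimal sketch follows; dup_mount_opts is a hypothetical helper name that prints any option key appearing more than once in a comma-separated option string.

```shell
#!/bin/sh
# Hypothetical helper: print each option key that occurs more than once
# in a comma-separated mount option string, one key per line.
dup_mount_opts() {
    printf '%s\n' "$1" |
        tr ',' '\n' |      # one option per line
        sed 's/=.*//' |    # keep only the key, drop "=value"
        sort |
        uniq -d            # print keys that are repeated
}

dup_mount_opts "rsize=8192,rsize=8192,tcp,rw,acregmin=30"
# prints: rsize
```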
thanks...