I need help diagnosing/fixing a freezing/halting issue.
Running Ubuntu 14.04 Server. Completely fresh install + apt-get upgrade + ssh server. Very stock/normal install.
If I leave the server on, it freezes after 24-48 hours (usually overnight). By unresponsive I mean completely unresponsive: the monitor shows the login prompt on tty1 but does not respond to keyboard input (there is no blinking cursor). On a previous install I had a serial console (getty) set up, and that was unresponsive too. Connecting via ssh fails, and the server does not respond to pings.
After a hard reset:
/var/log/syslog shows nothing abnormal. cron/anacron runs every hour, and then the entries simply stop. That's the end of the log.
dmesg shows no messages past boot messages.
I have a server-grade motherboard Supermicro A1SAi-2750F with logging enabled. There is nothing in the logs.
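Since the only clue in syslog is that the hourly cron entries stop, one way to pin down when each freeze happened is to scan the log for gaps between consecutive timestamps. Below is a minimal Python sketch of that idea. It assumes the traditional syslog timestamp format ("Mar  3 03:17:01") at the start of each line; the function name `find_gaps`, the two-hour threshold, the year, and the sample lines are all made up for illustration and are not taken from my actual logs.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year, threshold=timedelta(hours=2)):
    """Return (last_before, first_after) pairs for gaps longer than threshold.

    Traditional syslog timestamps carry no year, so the caller supplies one
    (an assumption; a log spanning New Year would need extra handling).
    """
    gaps = []
    previous = None
    for line in lines:
        try:
            # The first 15 characters hold the timestamp, e.g. "Mar  3 03:17:01".
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None and stamp - previous > threshold:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps


# Illustrative sample lines only, not real log output.
sample = [
    "Mar  3 02:17:01 server CRON[2101]: (root) CMD (run-parts /etc/cron.hourly)",
    "Mar  3 03:17:01 server CRON[2188]: (root) CMD (run-parts /etc/cron.hourly)",
    "Mar  3 09:42:10 server kernel: first message after the hard reset",
]

for before, after in find_gaps(sample, year=2014):
    print("no log entries between", before, "and", after)
```

Feeding it the real file would look like `find_gaps(open("/var/log/syslog"), year=2014)`. With hourly cron entries this only narrows each freeze to roughly a one-hour window, but it at least shows whether the freezes cluster around a particular time of night.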
To diagnose hardware issues:
I have ECC RAM. Using memcheck tools shows nothing abnormal.
I have replaced the PSU with one from a working machine. Motherboard also shows normal voltages on the PSU, pre and post freeze.
I have replaced the hard drive and SATA cables (from a working machine).
I believe the only parts I have not replaced are the motherboard and the RAM. They are both expensive parts.
Part of what makes this a difficult problem to diagnose is that I must wait days after changing something to see if the freeze will happen again. I have no idea how to trigger the freeze on demand.
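Because each test takes days and the normal logs just stop, a finer-grained record of when the machine was last alive might help. Here is a rough Python sketch of a heartbeat logger: it appends a timestamp to a file every few seconds and forces it to disk with `os.fsync`, so the last line should survive a hard reset. The function names, the 10-second interval, and the file path in the usage note are my own placeholders, and this only records when the freeze happened, not why.

```python
import os
import time


def write_heartbeat(path):
    """Append the current time to path and force it onto the disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as handle:
        handle.write(stamp + "\n")
        handle.flush()             # push Python's buffer to the OS
        os.fsync(handle.fileno())  # ask the OS to write it to the disk
    return stamp


def last_heartbeat(path):
    """Return the last recorded timestamp, or None if nothing was written."""
    try:
        with open(path) as handle:
            lines = handle.read().splitlines()
    except FileNotFoundError:
        return None
    return lines[-1] if lines else None


def run(path, interval_seconds=10):
    """Write a heartbeat forever; intended to be left running until the freeze."""
    while True:
        write_heartbeat(path)
        time.sleep(interval_seconds)
```

To use it, call `run("/var/tmp/heartbeat.log")` (a hypothetical path) in a background session and, after the next hard reset, check `last_heartbeat("/var/tmp/heartbeat.log")`. That gives the freeze time to within the chosen interval rather than to within an hour.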
Thanks in advance for any help.
EDIT: Based on this thread http://ubuntuforums.org/showthread.php?t=2187009 I have uninstalled pm-utils and powermgmt-base (although I did not see any power logs?). I won't be sure if that did anything until a few days have passed.