itsadok
January 25th, 2009, 06:19 AM
I have several Ubuntu 8.04 servers in a remote location. The servers are heavily loaded, with CPU usage in the upper 90s.
Occasionally (every 4-5 weeks), one of the servers stops responding when I try to ssh into it. Since I don't have physical access to the servers, I have to ask the remote site's support staff to power cycle it, which usually means a lot of cleanup work when it comes back up.
In desperation, I tried installing telnetd on the servers so that I'd have an alternative way of logging in during emergencies. However, this didn't help, since telnet hangs in a similar fashion.
Note that the server still works - I have HTTP access and the log files show no noticeable problem.
When I run ssh -vvv, I see that authentication succeeds and the connection hangs after "Entering interactive session.". When I telnet, I get the welcome message, enter my username and password, and then it hangs.
I'm guessing something is happening in .bashrc or .profile, but I haven't touched either file other than to add a few aliases.
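One way that guess could be tested, as a rough sketch only: ask ssh for something other than the normal interactive login, so that pty allocation and the startup files are taken out of the picture one at a time. The host name below is a placeholder and the function name ssh_probes is made up for illustration; the script only prints the commands rather than running them. It assumes bash is the login shell and that ~/.bashrc returns early for non-interactive shells (the Ubuntu default), since bash may still read it when started by sshd.

```shell
#!/bin/sh
# Placeholder target; replace with the real user and server.
HOST="${HOST:-user@server.example.com}"

# Print (rather than run) ssh invocations that avoid the usual login path.
ssh_probes() {
    # 1. No pty and no interactive shell: run a single command and exit.
    echo "ssh -T $1 uptime"
    # 2. Interactive bash that skips /etc/profile, ~/.profile and ~/.bashrc.
    echo "ssh -t $1 bash --noprofile --norc"
    # 3. A different shell entirely, in case bash itself is what hangs.
    echo "ssh -t $1 /bin/sh"
}

ssh_probes "$HOST"
```

If the first command returns but the others hang, the problem is more likely in pty allocation or the interactive session than in the aliases; if the second works where a normal login hangs, the startup files become the main suspect.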
So:
1. Anything I can do to access my server right now without a hard reset?
2. Anything I can do to prevent this from happening again?
Right now my best idea is to add a web interface to shut down my processes, so that a hard reset would cause less damage, but I don't love that idea for several reasons.