I've got an Ubuntu 12.04 server (a VM running on a VMware ESXi 4 host, if that matters) running a pretty bland LAMP stack which hosts my website, a WordPress blog. I keep it up to date - both with Ubuntu patches and WordPress patches. It uses only a few plugins and a stock theme. Beyond that, I write on it and post photos and link to YouTube videos. Pretty mundane, boring stuff.
This morning when I tried to access my site, the browser just timed out. When I checked via SSH, which was a frustratingly slow process, the server was under very heavy load (a load average above 8.0 according to top). But I couldn't find ANYTHING hogging resources. top didn't show a single process using more than 4% of CPU time - MySQL and Apache kept trading places at the top of the list, but that's still not enough to cause the load and sluggishness I was seeing. Combined CPU use was under 6 or 7%. The machine sits idle most of the time (very light traffic: 6k visitors in a 24-hour period, more than half of them crawlers).
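For anyone wanting to compare numbers, here is a rough sketch of what could be captured the next time this happens. On Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on I/O) as well as runnable ones, so a high load with low CPU use may show up as a pile of D-state processes. The helper names (parse_loadavg, parse_state, count_states) are made up for illustration, and the script assumes a standard /proc filesystem; it is not a definitive diagnostic.

```python
import os
from collections import Counter


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_state(stat_text):
    """Return the one-letter process state from /proc/<pid>/stat text.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the last ')'.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(proc_root="/proc"):
    """Count processes by state (R running, S sleeping, D uninterruptible, ...)."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                counts[parse_state(fh.read())] += 1
        except (OSError, IndexError):
            # The process exited between listdir() and open(); skip it.
            continue
    return counts


# Assumption: only meaningful on Linux, where /proc/loadavg exists.
if __name__ == "__main__" and os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as fh:
        print("load averages:", parse_loadavg(fh.read()))
    print("process states:", dict(count_states()))
```

Run from cron every minute or so and appended to a log, this would at least show whether the load builds up gradually over the two weeks or jumps all at once, and whether the D count climbs along with it.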
I checked the other two virtual machines on that host and they were using less CPU and fewer resources than my server. Suffice it to say the host was handling the load just fine. Yet my server was unresponsive to the point where I rebooted it to get the site back online.
I have fail2ban on the machine. I also run ufw, blocking all but the needed ports (Apache, DNS, SSH). Yesterday, fail2ban blocked 6 addresses for SSH attempts, and ufw is blocking some things, but that's its job. Beyond that, I'm not seeing any real concerns.
I'd be willing to call this a fluke except it's happened before - several times now. Not like clockwork, but about 14 days of uptime is what it takes to get the machine into this state. I can find nothing wrong, nor can I see an active attack. It just goes bonkers.
So I'm wondering if anyone else has seen this and what I can do to stop it.