Brief description of the problem
I and many other people (just search this forum for "overheating"!) have found that their laptop CPUs get insanely hot when running Linux because the CPU fan does not turn on. Or, to be more specific, the fan turns on when the laptop first turns on, remains on during BIOS POST, grub and the first few seconds of Linux's boot process but then turns off and stays off until the CPU gets near its critical temperature.
In my case, the CPU fan doesn't turn on until the CPU gets to 105 degrees C (that's not a typo: I do mean 5 degrees above the boiling point of water! That temp is reported by the temp sensor on the CPU using the coretemp module, and I've double-checked using an IR thermometer pointed at the CPU heatsink).
I've read many, many forum threads on this issue. I'm pretty confident there isn't a simple fix. So what I need to know is: where do I report this bug?
Lots more details of this issue:
This isn't a hardware issue
The failure of the fan to start isn't a hardware issue: the fan is free from dust and the fan runs perfectly in Windows 7 / grub / the laptop BIOS screen.
What's so bad about 105 degrees C?
The processor's specification sheet states that the processor (an i5) can handle a max temp of 105 degrees C. But the plastic keyboard immediately above the CPU almost certainly can't! And my lap can't either. And the motherboard probably doesn't enjoy the thermal stresses either. There are reports on this forum of motherboards needing to be replaced and of forced shutdowns due to overheating. So, however you look at it, a laptop CPU should never be allowed to get anywhere near 105 degrees C. Windows 7 keeps the CPU below 70 degrees C, even under heavy load.
The fact that Linux allows the CPU to get so hot is a very serious bug IMHO. As far as I'm concerned, bugs which can potentially wreck expensive hardware are a high priority.
The desired fix
The problem is not to do with throttling the CPU. The problem is really very simple: the fan should just be on all the time. (I'm seriously considering just permanently connecting power to the fan.)
Things I've tried
Alternative Linux distros
I'm running Ubuntu 12.04 64-bit so I thought I'd try some alternatives. They all exhibited exactly the same behaviour of the fan:
- Fedora 17 64-bit live CD (kernel 3.3.4)
- Ubuntu 12.10 64-bit live CD (kernel 3.5)
- Ubuntu 12.04 32-bit live CD (kernel 3.2)
- Ubuntu 11.10 64-bit live CD
- Ubuntu 11.04 64-bit live CD
- Ubuntu 10.10 64-bit live CD
lm-sensors and fancontrol
I ran sensors-detect. It told me to add the coretemp module to /etc/modules, which I did. sensors reports the temperature of both my physical CPU cores. pwmconfig complains that "There are no pwm-capable sensor modules installed" so I can't use fancontrol.
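For anyone who wants to check this on their own machine: pwmconfig can only work if some hwmon driver exposes a pwmN control file, and coretemp only provides temperature readings. Here is a rough sketch that looks for such files, assuming the usual hwmon sysfs layout. The function name list_pwm_controls and its optional root-directory argument are my own inventions, not part of lm-sensors.

```shell
#!/bin/bash
# List every PWM control file exposed by hwmon drivers.
# pwmconfig needs at least one of these to do anything.
list_pwm_controls() {
    # Hypothetical optional argument; defaults to the real sysfs path.
    local root="${1:-/sys/class/hwmon}"
    local found=0 f
    # Older kernels keep attributes under hwmonN/device/, newer ones under hwmonN/.
    for f in "$root"/hwmon*/pwm[0-9] "$root"/hwmon*/device/pwm[0-9]; do
        [ -e "$f" ] || continue   # an unmatched glob stays literal, so skip it
        echo "$f"
        found=1
    done
    [ "$found" -eq 1 ] || echo "no pwm controls found"
}
```

Calling list_pwm_controls with no argument scans the live system. If it prints "no pwm controls found", that would be consistent with the pwmconfig complaint above.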
i8kmon: it just reports that the fans are set to "-1" and refuses to change them. (Yes, i8kmon is loaded.)
Tinkering with the /proc/ or /sys/ file system
I don't have a /proc/acpi/fan directory. The directories /sys/class/thermal/thermal_zone?/ appear to only describe a few heat sensors on the motherboard (not the CPU temperatures reported by coretemp) and I can't seem to modify the trip_point_0_temp files.
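To make it easier to compare notes, this is roughly how the thermal zones and their trip points can be dumped in one go. It is only a sketch, assuming the standard /sys/class/thermal layout where temperatures are given in millidegrees C. The function name show_thermal_zones and its optional root argument are hypothetical.

```shell
#!/bin/bash
# Print type, current temperature and trip points for each thermal zone.
show_thermal_zones() {
    # Hypothetical optional argument; defaults to the real sysfs path.
    local root="${1:-/sys/class/thermal}"
    local zone trip milli
    for zone in "$root"/thermal_zone*; do
        [ -d "$zone" ] || continue
        # Some zones refuse to be read; treat an unreadable value as 0.
        milli=$(cat "$zone/temp" 2>/dev/null)
        # sysfs reports millidegrees C; integer division drops the fraction.
        echo "${zone##*/}: type=$(cat "$zone/type") temp=$(( ${milli:-0} / 1000 ))C"
        for trip in "$zone"/trip_point_*_temp; do
            [ -e "$trip" ] || continue
            milli=$(cat "$trip" 2>/dev/null)
            # ${trip%_temp}_type is the matching trip_point_N_type file.
            echo "  $(cat "${trip%_temp}_type"): $(( ${milli:-0} / 1000 ))C"
        done
    done
}
```

Trip types are usually things like "critical", "passive" or "active". A zone with no "active" trip point would suggest the firmware is not asking the OS to start a fan at any temperature.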
The CPU temps seem to live in /sys/class/hwmon/hwmon1/device. If I attempt to change the max from 95 degrees C to 50 degrees C by running sudo bash -c "echo 50000 > /sys/class/hwmon/hwmon1/device/temp2_max" I get the following error: bash: /sys/class/hwmon/hwmon1/device/temp2_max: Permission denied
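My best guess about the "Permission denied" despite sudo: sysfs refuses writes to attributes that the driver marks as read-only, even for root, and as far as I can tell coretemp only reports limits that it reads from the CPU. A quick way to check is to look at the file mode. This sketch assumes GNU stat (for the -c option); the function name attr_mode is made up for illustration.

```shell
#!/bin/bash
# Report whether a sysfs attribute has any write bit set.
attr_mode() {
    local file="$1" mode
    # GNU stat prints the mode in ls style, e.g. -r--r--r--
    mode=$(stat -c '%A' "$file") || return 1
    case "$mode" in
        *w*) echo "$file: $mode (writable)" ;;
        *)   echo "$file: $mode (read-only)" ;;
    esac
}
```

For example, attr_mode /sys/class/hwmon/hwmon1/device/temp2_max. If it says read-only, no amount of sudo will change the value through that file.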
When my CPU fan does run (i.e. when the CPU temp gets near to 105 degrees C) nothing changes in /sys/class/thermal/cooling_device?/cur_state. i.e. Linux doesn't appear to acknowledge that the fan has started to run.
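For reference, this is the kind of check I mean. It is a small sketch that prints the type and state of every cooling device, assuming the standard /sys/class/thermal layout; the function name show_cooling_devices and its optional root argument are hypothetical.

```shell
#!/bin/bash
# Print type and current/maximum state of every registered cooling device.
show_cooling_devices() {
    # Hypothetical optional argument; defaults to the real sysfs path.
    local root="${1:-/sys/class/thermal}"
    local dev
    for dev in "$root"/cooling_device*; do
        [ -d "$dev" ] || continue
        echo "${dev##*/}: type=$(cat "$dev/type") state=$(cat "$dev/cur_state")/$(cat "$dev/max_state")"
    done
}
```

Running it once while the fan is idle and once while it is spinning, then comparing the two outputs, shows whether the kernel notices the fan at all. On machines where the fan is handled through ACPI I would expect to see a device whose type is "Fan", though that is an assumption on my part.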
Incidentally, Hardware Monitor (running under Windows) can't find my CPU fan either (but the fan just magically works under Windows). So evidently my fan isn't exposing itself very clearly to the OS.
The following all had no effect:
The only thing I tried which had any effect on my CPU fan was starting with the boot option acpi=off. This allowed the fan to continue spinning even while Ubuntu was running. But completely disabling ACPI isn't a workable option for a production system.
Talking directly to the fan controller over I2C
I guessed that my fan controller might be connected to the southbridge via I2C. So I tinkered with i2c-tools but i2cdetect always complained that the /dev/i2c-? files were not valid I2C devices. I tried loading a number of i2c-related modules but to no avail.
What's controlling my CPU fan? The ACPI subsystem?
The evidence is somewhat contradictory. On the one hand, setting the boot option acpi=off successfully enables the fan, which suggests that the ACPI subsystem is switching off my fan during a normal boot. On the other hand, there's a bunch of evidence suggesting the fan isn't controlled by ACPI on this laptop:
I disassembled my DSDT and found that PNP0C0B (which apparently is the ID for a fan controller) is not present. Also, the evidence above in the "Tinkering with the /proc/ or /sys/ file system" section would suggest that my fan's controller isn't exposed over ACPI (or, alternatively, it is exposed over ACPI but Linux doesn't know how to handle it).
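In case someone wants to repeat the DSDT check on their own laptop, here is a rough sketch. It assumes the table has already been dumped and disassembled with iasl (the commands in the comment are the usual way, but the file names are just examples), and the function name scan_dsdt_for_fan is made up. Besides the fan device ID it also looks for _ACx objects, which are the active-cooling trip points that ACPI thermal zones use to tell the OS when to start a fan.

```shell
#!/bin/bash
# Typical way to obtain the input file (needs root and the iasl tool):
#   sudo cat /sys/firmware/acpi/tables/DSDT > dsdt.dat && iasl -d dsdt.dat
scan_dsdt_for_fan() {
    local dsl="$1"   # path to the disassembled table, e.g. dsdt.dsl
    # PNP0C0B is the ACPI hardware ID for a fan device; match any letter case.
    if grep -qi 'PNP0C0B' "$dsl"; then
        echo "fan device (PNP0C0B): present"
    else
        echo "fan device (PNP0C0B): absent"
    fi
    # _AC0 to _AC9 are active-cooling trip points inside thermal zones.
    if grep -q '_AC[0-9]' "$dsl"; then
        echo "active cooling trip points (_ACx): present"
    else
        echo "active cooling trip points (_ACx): absent"
    fi
}
```

If both lines say "absent", the firmware probably drives the fan by itself (for example from the embedded controller) without describing it to the OS, which would fit what I see here.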
My guess at what's happening
This is total guesswork but I think this is how it's working at the moment: coretemp queries the CPU and finds that it can handle a max temp of 105 degrees C. coretemp is not responsible for controlling the fan (apparently) so some other subsystem (hwmon? acpi? the BIOS?) must be taking the max temperatures stated by coretemp and using these as the set points for the fan. Without allowing me, the user, any input. Not cool. If I wanted to use an OS which bans the user from modifying interesting system parameters I would've bought myself a Mac.
My system config
HP ProBook 6450b with 4GB of RAM and a magnetic hard disk. I'm running the latest HP BIOS.
From CPU-Z (running on Windows):
Name = Intel Core i5 450M. Two cores, each with 2 threads.
Codename = Arrandale
Specification = Intel(R) Core(TM) i5 CPU M 450 @ 2.40GHz
Package (platform ID) = Socket 989 rPGA (0x4)
CPUID = 6.5.5
Extended CPUID = 6.25
Core Stepping = K0
Northbridge = Intel Havendale/Clarkdale Host Bridge rev. 02
Southbridge = Intel HM57 rev. 05
Memory Type = DDR3
Mainboard Model = 146D (0x000000DF - 0x00002220)
LPCIO Vendor = SMSC
LPCIO Vendor ID = 0x55
LPCIO Chip ID = 0x4C
So, in summary: I'm 99% sure there's a bug in there somewhere. The fan really, really should turn on long before the CPU gets anywhere near 105 degrees C. I guess I have two questions: firstly, does anyone know a fix?! If not, can anyone suggest to which bug reporting system I should submit this bug? It appears to be associated with the Linux kernel rather than Ubuntu. But to which Linux subsystem dev team should I submit the bug?