
[ubuntu] 10.04.1 sometimes ksoftirqd gets to 100% on one core



nr1c0re
October 1st, 2010, 01:52 PM
Hello everyone!

I've got a router running Ubuntu 9.10 that routes 100 Mbit/s+ of traffic. It has no NAT or firewall enabled, only ipsec-tools and racoon, with another peer over a leased 1 Gbit/s line. I've tested with iperf: this router can encrypt about 950 Mbit/s over that link.
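
For reference, the test was just the usual iperf pair, something along these lines (the peer address and duration are placeholders):

# on the peer at the far end of the IPsec link
iperf -s
# on the router, a 30-second TCP test towards the peer
iperf -c 10.0.0.2 -t 30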

Yesterday I decided to move to the LTS version, 10.04.
I ran:
apt-get dist-upgrade
reboot
do-release-upgrade
reboot

Everything is fine when there is less than ~10-20 Mbit/s of traffic. But when it gets to 50+ Mbit/s, ping increases from 1 ms to 300-400 ms and I get about 50% packet loss. The more traffic goes through the router, the higher the packet loss.

In top I see that one ksoftirqd process drives one CPU core to 100% load.
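
The per-CPU softirq activity can also be watched directly, for example (mpstat comes from the sysstat package):

# per-CPU software interrupt counters, refreshed every second
watch -n 1 cat /proc/softirqs
# per-core CPU breakdown, including the %soft column
mpstat -P ALL 1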

I tried to check what could be the reason with powertop:

In 9.10, and in 10.04 under normal operation, I see the following picture:
_________________________________________

Cn Avg residency P-states (frequencies)
C0 (cpu running) (21.0%)
polling 0.0ms ( 0.0%)
C1 halt 0.0ms ( 0.0%)
C2 0.0ms (10.7%)
C3 0.2ms (68.3%)

Wakeups-from-idle per second : 6445.3 interval: 0.5s
no ACPI power usage estimate available

Top causes for wakeups:
56.0% ( inf) <interrupt> : eth1
41.5% ( inf) <interrupt> : eth0
1.1% ( inf) <interrupt> : extra timer interrupt
0.7% ( inf) <kernel core> : add_timer (smi_timeout)
0.4% ( inf) kipmi0 : schedule_timeout_interruptible (process_timeout)
0.0% ( inf) <interrupt> : ipmi_si
0.0% ( inf) <kernel core> : neigh_add_timer (neigh_timer_handler)
0.0% ( inf) <kernel core> : ipmi_timeout (ipmi_timeout)
0.0% ( inf) <kernel core> : neigh_periodic_timer (neigh_periodic_timer)
0.0% ( inf) <kernel IPI> : Rescheduling interrupts
_________________________________________

And when the traffic rate increases in 10.04, I see the following:

_________________________________________

48.8% (847.6) [eth0] <interrupt>
23.0% (400.4) [kernel scheduler] Load balancing tick
11.5% (200.4) [kernel core] add_timer (smi_timeout)
7.0% (121.2) [extra timer interrupt]
5.8% (100.0) kipmi0
2.2% ( 39.0) [Rescheduling interrupts] <kernel IPI>
0.6% ( 10.0) [kernel core] ipmi_timeout (ipmi_timeout)
0.4% ( 7.0) [ipmi_si] <interrupt>
0.2% ( 4.0) [kernel core] usb_hcd_poll_rh_status (rh_timer_func)
_________________________________________

As you can see, eth1 disappears from the powertop output in 10.04 under heavy load, and "[kernel scheduler] Load balancing tick" and the other entries take up more of the work.

Why is that? What changed in networking between 9.10 and 10.04.1?
Any ideas?
Thanks in advance!!!

a9k3d
October 2nd, 2010, 07:12 AM
The most likely explanation for the symptoms is interrupt overruns. Possibly the eth0 driver is polling instead of using DMA. It is taking too long to handle each packet, leaving no time for much else.

I'd check the driver for eth0.

Check that the new driver has the same name as the old driver. It could be that 10.04 auto-detected and installed the wrong driver; see the sketch below for a quick check.
Check the kernel change notes for that driver. Maybe a parameter was added that has to be configured for better speed.
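
A quick check could look something like this (eth0 is just an example interface name):

# driver name, version and firmware bound to the interface
ethtool -i eth0
# module details and parameters for whatever driver ethtool reported
modinfo "$(ethtool -i eth0 | awk '/^driver:/ {print $2}')"
# what the kernel logged when it bound the driver
dmesg | grep -i eth0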

Maybe this article can help with the problem: https://help.ubuntu.com/community/ReschedulingInterrupts
Also, there have been some kernel changes in load balancing: https://bugzilla.redhat.com/show_bug.cgi?id=635813

Good luck hunting this down. I am sure you don't want to be adding printk's to the kernel to debug this.

nr1c0re
October 3rd, 2010, 10:01 PM
Thanks for your reply!

It's an HP server with an onboard Broadcom NIC, using the bnx2 driver.

On 9.10 everything works very well.

More detail:
The problem appears on 2 different servers, but both are HP with onboard Broadcom NICs. One server has 1 CPU with 4 cores, the second has 1 CPU with 2 cores. Both servers have 2 Broadcom NICs.
I set CPU affinity for each NIC to pin its interrupts to a dedicated core; that gave me additional speed, up to 1 Gbit/s of IPsec'ed traffic.
But setting the CPU affinity was a bit tricky. Those Broadcom NICs don't handle MSI/MSI-X interrupts well: they cannot migrate from one CPU to another when I change their affinity in /proc/irq/*id_num*/smp_affinity.
I had to add
pre-up echo 'CPU_CORE_NUMBER' > /proc/irq/*id_num*/smp_affinity
post-up echo 'ff' > /proc/irq/*id_num*/smp_affinity

for each NIC to pin its IRQ to a core.
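
In /etc/network/interfaces the full stanza looks roughly like this (addresses, IRQ number and masks are placeholders; smp_affinity takes a hex CPU bitmask, so 1 = core 0, 2 = core 1, 4 = core 2, ff = all cores):

auto eth0
iface eth0 inet static
    address 10.0.0.1
    netmask 255.255.255.0
    # write the desired core mask before the NIC comes up
    pre-up echo 1 > /proc/irq/59/smp_affinity
    # widen the mask back to all cores once the interface is up, as above
    post-up echo ff > /proc/irq/59/smp_affinity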

Maybe those "kernel scheduler/rescheduling interrupts" have some problem with this...

Are there any utilities other than powertop to see what's happening inside the ksoftirqd process in the kernel?

psillithid
October 25th, 2010, 10:51 PM
I'm seeing a similar problem on two of my Lucid boxes.

One was a clean install and I assumed it must have just been bad hardware in that server, but I've upgraded a Jaunty server to Lucid and it now exhibits the same problem.

On both servers the NICs are Intel Pro/1000's, running the e1000e driver.

It only seems to take 1 Mbit/s over IPsec to send either server into a soft death. Load continues to grow, the system becomes practically unresponsive, and one core is at 100% dealing with soft interrupts. Killing the IPsec tunnel stops the problem and the server recovers (after a while).
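
To be clear, "killing the tunnel" here just means tearing down the IPsec state; assuming an ipsec-tools/racoon setup like the original poster's, roughly:

# flush security associations and policies
setkey -F
setkey -FP
# stop the IKE daemon
/etc/init.d/racoon stop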

I was wondering if you found a solid workaround for the problem, or got to the bottom of the cause?

nr1c0re
October 26th, 2010, 09:07 AM
Still no resolution found.
I want to check out the 10.10 server release in a week or so, but it's not LTS :(
Will report back soon.