[ubuntu] 8.04 rare intermittent TCP I/O WARNING


alecm3
June 20th, 2009, 02:24 AM
We have some quad-core AMD64 servers running 8.04 with Apache, serving only static files (no web scripts). That's all these servers do.
Traffic is high: each server handles about 2700 requests per second at about 70 Mbps.

I get two types of intermittent TCP warnings in dmesg:

1)
Jun 19 21:38:12 web18 kernel: [15963621.891196] WARNING: at /build/buildd/linux-2.6.24/net/ipv4/tcp_input.c:2413 tcp_fastretrans_alert()
Jun 19 21:38:12 web18 kernel: [15963621.891205] Pid: 0, comm: swapper Not tainted 2.6.24-19-server #1
Jun 19 21:38:12 web18 kernel: [15963621.891207]
Jun 19 21:38:12 web18 kernel: [15963621.891208] Call Trace:
Jun 19 21:38:12 web18 kernel: [15963621.891209] <IRQ> [tcp_ack+0x1d2d/0x1d60] tcp_ack+0x1d2d/0x1d60
Jun 19 21:38:12 web18 kernel: [15963621.891242] [ipv6:tcp_rcv_established+0x44e/0xc60] tcp_rcv_established+0x44e/0x9f0
Jun 19 21:38:12 web18 kernel: [15963621.891247] [ip_tables:ipt_do_table+0x240/0x520] :ip_tables:ipt_do_table+0x240/0x520
Jun 19 21:38:12 web18 kernel: [15963621.891253] [ipv6:tcp_v4_do_rcv+0x36b/0x700] tcp_v4_do_rcv+0x36b/0x700
Jun 19 21:38:12 web18 kernel: [15963621.891269] [nf_conntrack:nf_ct_deliver_cached_events+0x98/0xa0] :nf_conntrack:nf_ct_deliver_cached_events+0x98/0xa0
Jun 19 21:38:12 web18 kernel: [15963621.891275] [nf_conntrack_ipv4:ipv4_confirm+0x33/0x60] :nf_conntrack_ipv4:ipv4_confirm+0x33/0x60
Jun 19 21:38:12 web18 kernel: [15963621.891279] [tcp_v4_rcv+0x895/0xaf0] tcp_v4_rcv+0x895/0xaf0
Jun 19 21:38:12 web18 kernel: [15963621.891286] [ip_local_deliver_finish+0xc3/0x250] ip_local_deliver_finish+0xc3/0x250
Jun 19 21:38:12 web18 kernel: [15963621.891289] [ip_rcv_finish+0x114/0x3b0] ip_rcv_finish+0x114/0x3b0
Jun 19 21:38:12 web18 kernel: [15963621.891292] [ip_rcv+0x212/0x300] ip_rcv+0x212/0x300
Jun 19 21:38:12 web18 kernel: [15963621.891297] [tg3:netif_receive_skb+0x3ac/0x7b0] netif_receive_skb+0x3ac/0x490
Jun 19 21:38:12 web18 kernel: [15963621.891308] [tg3:tg3_poll+0x873/0xa60] :tg3:tg3_poll+0x873/0xa60
Jun 19 21:38:12 web18 kernel: [15963621.891317] [net_rx_action+0x128/0x230] net_rx_action+0x128/0x230
Jun 19 21:38:12 web18 kernel: [15963621.891324] [__do_softirq+0x75/0xe0] __do_softirq+0x75/0xe0
Jun 19 21:38:12 web18 kernel: [15963621.891330] [call_softirq+0x1c/0x30] call_softirq+0x1c/0x30
Jun 19 21:38:12 web18 kernel: [15963621.891333] [do_softirq+0x35/0x90] do_softirq+0x35/0x90
Jun 19 21:38:12 web18 kernel: [15963621.891336] [irq_exit+0x88/0x90] irq_exit+0x88/0x90
Jun 19 21:38:12 web18 kernel: [15963621.891339] [do_IRQ+0x80/0x100] do_IRQ+0x80/0x100
Jun 19 21:38:12 web18 kernel: [15963621.891341] [default_idle+0x0/0x40] default_idle+0x0/0x40
Jun 19 21:38:12 web18 kernel: [15963621.891343] [default_idle+0x0/0x40] default_idle+0x0/0x40
Jun 19 21:38:12 web18 kernel: [15963621.891345] [ret_from_intr+0x0/0x0a] ret_from_intr+0x0/0xa
Jun 19 21:38:12 web18 kernel: [15963621.891347] <EOI> [default_idle+0x29/0x40] default_idle+0x29/0x40
Jun 19 21:38:12 web18 kernel: [15963621.891354] [cpu_idle+0x6f/0xc0] cpu_idle+0x6f/0xc0

2)
Jun 19 21:38:12 web18 kernel: [15963621.851246] WARNING: at /build/buildd/linux-2.6.24/net/ipv4/tcp_output.c:1799 tcp_simple_retransmit()
Jun 19 21:38:12 web18 kernel: [15963621.851254] Pid: 0, comm: swapper Not tainted 2.6.24-19-server #1
Jun 19 21:38:12 web18 kernel: [15963621.851256]
Jun 19 21:38:12 web18 kernel: [15963621.851256] Call Trace:
Jun 19 21:38:12 web18 kernel: [15963621.851258] <IRQ> [ipv6:tcp_simple_retransmit+0x1f7/0x200] tcp_simple_retransmit+0x1f7/0x200
Jun 19 21:38:12 web18 kernel: [15963621.851285] [tcp_v4_err+0x571/0x630] tcp_v4_err+0x571/0x630
Jun 19 21:38:12 web18 kernel: [15963621.851291] [ipv6:nf_hook_slow+0x9e/0x230] nf_hook_slow+0x9e/0xf0
Jun 19 21:38:12 web18 kernel: [15963621.851294] [ip_local_deliver_finish+0x0/0x250] ip_local_deliver_finish+0x0/0x250
Jun 19 21:38:12 web18 kernel: [15963621.851299] [icmp_rcv+0x116/0x190] icmp_rcv+0x116/0x190
Jun 19 21:38:12 web18 kernel: [15963621.851302] [ip_local_deliver_finish+0xc3/0x250] ip_local_deliver_finish+0xc3/0x250
Jun 19 21:38:12 web18 kernel: [15963621.851305] [ip_rcv_finish+0x114/0x3b0] ip_rcv_finish+0x114/0x3b0
Jun 19 21:38:12 web18 kernel: [15963621.851308] [ip_rcv+0x212/0x300] ip_rcv+0x212/0x300
Jun 19 21:38:12 web18 kernel: [15963621.851313] [tg3:netif_receive_skb+0x3ac/0x7b0] netif_receive_skb+0x3ac/0x490
Jun 19 21:38:12 web18 kernel: [15963621.851325] [tg3:tg3_poll+0x873/0xa60] :tg3:tg3_poll+0x873/0xa60
Jun 19 21:38:12 web18 kernel: [15963621.851334] [net_rx_action+0x128/0x230] net_rx_action+0x128/0x230
Jun 19 21:38:12 web18 kernel: [15963621.851341] [__do_softirq+0x75/0xe0] __do_softirq+0x75/0xe0
Jun 19 21:38:12 web18 kernel: [15963621.851347] [call_softirq+0x1c/0x30] call_softirq+0x1c/0x30
Jun 19 21:38:12 web18 kernel: [15963621.851350] [do_softirq+0x35/0x90] do_softirq+0x35/0x90
Jun 19 21:38:12 web18 kernel: [15963621.851353] [irq_exit+0x88/0x90] irq_exit+0x88/0x90
Jun 19 21:38:12 web18 kernel: [15963621.851355] [do_IRQ+0x80/0x100] do_IRQ+0x80/0x100
Jun 19 21:38:12 web18 kernel: [15963621.851357] [default_idle+0x0/0x40] default_idle+0x0/0x40
Jun 19 21:38:12 web18 kernel: [15963621.851360] [default_idle+0x0/0x40] default_idle+0x0/0x40
Jun 19 21:38:12 web18 kernel: [15963621.851362] [ret_from_intr+0x0/0x0a] ret_from_intr+0x0/0xa
Jun 19 21:38:12 web18 kernel: [15963621.851364] <EOI> [default_idle+0x29/0x40] default_idle+0x29/0x40
Jun 19 21:38:12 web18 kernel: [15963621.851371] [cpu_idle+0x6f/0xc0] cpu_idle+0x6f/0xc0


Could someone explain the meaning and significance of these warnings? I am not looking forward to reading /build/buildd/linux-2.6.24/net/ipv4/tcp_output.c myself.

root@web18:~# uname -a
Linux web18 2.6.24-19-server #1 SMP Wed Jun 18 14:44:47 UTC 2008 x86_64 GNU/Linux
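
For anyone who wants to see how often each warning fires before digging into the kernel source, here is a minimal sketch that tallies WARNING header lines by source file, line number and function. It is only an illustration: the names `WARNING_RE`, `count_warnings` and `SAMPLE` are made up for this example, the sample lines are the two headers from the traces above with the terminal line wraps joined, and the regex assumes the 2.6.24-style "WARNING: at file:line func()" format shown there.

```python
import re
from collections import Counter

# Header line of a kernel WARNING in the format seen in the traces above
# (assumption: other kernel versions may print this line differently).
WARNING_RE = re.compile(
    r"WARNING: at (?P<path>\S+):(?P<line>\d+) (?P<func>\w+)\(\)"
)


def count_warnings(lines):
    """Return a Counter keyed by (path, line, function) for each WARNING header."""
    counts = Counter()
    for text in lines:
        match = WARNING_RE.search(text)
        if match:
            key = (match.group("path"), int(match.group("line")), match.group("func"))
            counts[key] += 1
    return counts


# Sample input: header lines copied from the traces above, wraps rejoined.
SAMPLE = [
    "Jun 19 21:38:12 web18 kernel: [15963621.891196] WARNING: at "
    "/build/buildd/linux-2.6.24/net/ipv4/tcp_input.c:2413 tcp_fastretrans_alert()",
    "Jun 19 21:38:12 web18 kernel: [15963621.891205] Pid: 0, comm: swapper "
    "Not tainted 2.6.24-19-server #1",
    "Jun 19 21:38:12 web18 kernel: [15963621.851246] WARNING: at "
    "/build/buildd/linux-2.6.24/net/ipv4/tcp_output.c:1799 tcp_simple_retransmit()",
]

if __name__ == "__main__":
    # Print one summary row per distinct warning site, most frequent first.
    for (path, line, func), n in count_warnings(SAMPLE).most_common():
        print(f"{n:5d}  {func}  {path}:{line}")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `count_warnings(open("/var/log/kern.log"))`, assuming that is where your syslog daemon writes kernel messages.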

Tolaris
August 19th, 2009, 04:28 AM
I have the same problem on a number of my 8.04 servers, both amd64 and i686, all running the server kernel. It doesn't seem to cause an outage, but I'm guessing it may mean at least one dropped TCP packet or session.

alecm3
August 19th, 2009, 04:30 AM
Interestingly, these errors do not happen on 9.04

Tolaris
August 19th, 2009, 05:04 AM
It is likely specific to the 2.6.24/hardy kernel. I am a corporate user and we are not willing to run a non-LTS release on our servers.

I do not remember this happening until recently; as far as I recall, it did not occur two or more kernel releases ago, say on 2.6.24-22 or earlier. For future reference, the current kernel in hardy is 2.6.24-24.

alecm3
August 19th, 2009, 02:49 PM
I am a corporate user too; however, we were forced onto 9.04, since 8.04 LTS simply does not install on IBM x3550M2 Xeon 5500-based servers. 9.04 ships 2.6.28-11-server.

Does this happen if you upgrade hardy from 2.6.24-19 to 2.6.24-24?

Tolaris
August 19th, 2009, 04:00 PM
We've run hardy since at least 2.6.24-19, but we tend to reboot regularly (many of our machines are part of clusters that can be rebooted without loss of service). I'm sure we've run every kernel between -19 and -24, but I don't remember seeing this message until the last few months.

Doesn't install? Yikes! amd64 or i386 kernel? alternate or server install? We have some Via chipset motherboards that cannot run any -server kernel, but are fine with the i386 -generic kernel. If you've written about this issue anywhere, I'd appreciate reading it.

alecm3
August 21st, 2009, 01:59 AM
Yes, here is the link: http://ubuntuforums.org/showthread.php?t=1219514

It's a DVD-ROM problem. You can do a PXE install over Ethernet instead, but I did not bother.