I'm stuck. Can't figure out why 1 VM can't access the internet when all other VMs on the same VM host, using the same bridge, can.
I patch weekly, Saturday mornings. Have a few home servers running on KVM with manually created Linux bridges.
Have about 10 virtual machines running and only 1 is affected by this issue. All the others, and the KVM host OS (Ubuntu 14.04.5 Server), aren't having any network issues. LAN and WAN all work on the other VMs.
The impacted VM is running "Ubuntu 16.04.5 LTS" according to LSB info. The problem system runs Nextcloud, but can only be reached from the internal LAN. No inbound WAN firewall ports are open for it. It has outbound access like any typical home network. The router config hasn't changed in months. It doesn't seem hacked. I consider it a low risk system.
This is a wired GigE network. All static IPs configured for the servers. The WAN IPs, a /29, are static too. Don't think that matters.
The firewall is minimal.
Code:
$ sudo ufw status
Status: active
To                         Action      From
--                         ------      ----
22                         ALLOW       172.22.22.0/24
80                         ALLOW       172.22.22.0/24
443                        ALLOW       172.22.22.0/24
No IPv6 enabled. I only use IPv4.
Nothing network-wise has changed in the last year. It has been working fine all this time; the weekly patches are the only changes.
No external IPs can be pinged.
Code:
$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 6047ms
But any LAN IP can be pinged.
Code:
$ ping 172.22.22.1
PING 172.22.22.1 (172.22.22.1) 56(84) bytes of data.
64 bytes from 172.22.22.1: icmp_seq=1 ttl=64 time=1.10 ms
64 bytes from 172.22.22.1: icmp_seq=2 ttl=64 time=1.28 ms
64 bytes from 172.22.22.1: icmp_seq=3 ttl=64 time=0.986 ms
64 bytes from 172.22.22.1: icmp_seq=4 ttl=64 time=0.879 ms
^C
--- 172.22.22.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 0.879/1.064/1.289/0.151 ms
So, that would mean networking is fine on the VM and all the wiring is fine. Perhaps a routing issue?
Code:
$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.22.22.1     0.0.0.0         UG    0      0        0 ens3
172.22.22.0     0.0.0.0         255.255.255.0   U     0      0        0 ens3
$ ifconfig
ens3 Link encap:Ethernet HWaddr 52:54:00:63:f8:45
inet addr:172.22.22.34 Bcast:172.22.22.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:62747 errors:0 dropped:1 overruns:0 frame:0
TX packets:25420 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8128758 (8.1 MB) TX bytes:22392145 (22.3 MB)
I don't see anything wrong there. It matches what I expect and what other, working, VMs have configured.
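To compare the default route across VMs without eyeballing the whole table, something like the following could work. It is only a rough sketch, assuming a shell with awk available; `default_route` is a name I made up for illustration, and it expects the `route -n` column layout shown above.

```shell
# default_route: hypothetical helper (not part of any package).
# Reads `route -n` output on stdin and prints "<gateway> <iface>" for each
# default route, i.e. rows where both Destination and Genmask are 0.0.0.0.
default_route() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" { print $2, $8 }'
}

# Usage on a live system:
#   route -n | default_route
```

Running the same one-liner on the broken VM and on a working VM should give identical output if the routing really does match.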
Looks more like a router issue. We did have a storm that knocked out the internet for 30 minutes early Saturday morning. Power was not impacted. All servers and networking gear are on UPS power.
Here's a ping from another VM running on the same host:
Code:
$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=28.5 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=26.5 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=119 time=15.6 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 15.644/23.606/28.592/5.692 ms
They all work. Every other VM isn't showing any network issues. Just the Nextcloud server can't ping the outside.
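For repeating the LAN vs. WAN comparison on each VM, a small loop like this might save some typing. It is a sketch only, assuming the iputils `ping` shipped with Ubuntu (`-c` for count, `-W` for the reply timeout in seconds); `check_targets` is a made-up helper name.

```shell
# check_targets: hypothetical helper.
# Sends a single ping with a 2 second timeout to each address given as an
# argument and prints "ok" or "FAIL" next to it.
check_targets() {
    local addr
    for addr in "$@"; do
        if ping -c 1 -W 2 "$addr" >/dev/null 2>&1; then
            echo "ok   $addr"
        else
            echo "FAIL $addr"
        fi
    done
}

# Example, using the addresses from this post:
#   check_targets 172.22.22.1 8.8.8.8 1.1.1.1 1.0.0.1
```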
I claim that name resolution is working perfectly. From the nextcloud system:
Code:
$ dig google.com
; <<>> DiG 9.10.3-P4-Ubuntu <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52040
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 235 IN A 64.233.177.101
google.com. 235 IN A 64.233.177.138
google.com. 235 IN A 64.233.177.113
google.com. 235 IN A 64.233.177.100
google.com. 235 IN A 64.233.177.139
google.com. 235 IN A 64.233.177.102
It works because I'm using a DNS resolver on the LAN. It isn't slow either. Fast as usual. The /etc/resolv.conf:
Code:
$ more /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 172.22.22.1
nameserver 1.1.1.1
nameserver 1.0.0.1
DNS runs on the router and is working.
The VM cannot ping 1.1.1.1 or 1.0.0.1:
Code:
$ ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
^C
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2015ms
jp@nextcloud:~$ ping 1.0.0.1
PING 1.0.0.1 (1.0.0.1) 56(84) bytes of data.
^C
--- 1.0.0.1 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3024ms
But other VMs don't have any issues with that:
Code:
$ ping 1.0.0.1
PING 1.0.0.1 (1.0.0.1) 56(84) bytes of data.
64 bytes from 1.0.0.1: icmp_seq=1 ttl=56 time=30.4 ms
64 bytes from 1.0.0.1: icmp_seq=2 ttl=56 time=19.2 ms
^C
--- 1.0.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 19.203/24.833/30.463/5.630 ms
jp@lubuntu:~$ ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=56 time=17.7 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=56 time=28.3 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=56 time=26.6 ms
^C
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 17.730/24.231/28.301/4.649 ms
On the VM host, the bridges are:
Code:
$ brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.0004e2d6a6e4       no              eth1
                                                        vnet0
                                                        vnet1
                                                        vnet2
                                                        vnet3
                                                        vnet4
                                                        vnet5
                                                        vnet6
                                                        vnet7
br1             8000.000000000000       no
$ ifconfig
br0 Link encap:Ethernet HWaddr 00:04:e2:d6:a6:e4
inet addr:172.22.22.4 Bcast:172.22.22.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:47031793 errors:0 dropped:0 overruns:0 frame:0
TX packets:38997947 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:43391145533 (43.3 GB) TX bytes:16119365345 (16.1 GB)
No errors. Plenty of xfers.
Anyways, I get RSS feeds through Nextcloud News, so I'm really missing those. I have a few things to run down still - checking the router settings now. Router seems fine. This was the first login in months.
At this point, just looking for ideas for things to check. Lacking that, I can migrate the VM to another VM host. I'm betting the issue follows the VM.
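One check still on my list is to watch ICMP on the VM host while pinging from the Nextcloud VM, to see whether the echo requests actually leave through the bridge and whether any replies come back. A rough sketch, assuming tcpdump is installed on the host; `watch_icmp` is a made-up wrapper name, and the interface names come from the brctl output above. I haven't confirmed which vnetN belongs to this VM.

```shell
# watch_icmp: hypothetical wrapper around tcpdump.
#   $1 = interface to capture on (e.g. br0, eth1, or the VM's vnetN)
#   $2 = IP address of the VM under test
# -n turns off name lookups so only raw addresses are printed.
watch_icmp() {
    sudo tcpdump -n -i "$1" "icmp and host $2"
}

# Example, run on the VM host while the VM pings 8.8.8.8:
#   watch_icmp br0 172.22.22.34
# `virsh domiflist <domain>` should show which vnetN a given guest uses.
```

If requests show up on the vnet interface but not on eth1, the problem would be on the host side; if they go out on eth1 and nothing returns, that points back at the router.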