Troubleshooting an LXC Private Network That Cannot Reach the Outside

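Today an LXC container's private network could not reach the external network. A quick look at the L3 agent binding showed nothing wrong, and the private networks of the other KVM virtual machines in the same tenant all worked, which made it more likely that the problem lay with the LXC container itself.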

In fact, the port's State had been BUILD from the very beginning, but I did not pay attention to it at first.

I pinged the gateway from inside the LXC container while capturing in the router namespace, and no packets arrived at all:

~$ sudo ip netns exec qrouter-8e8268bf-0202-4401-8a20-70d118791451 ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
48: tun0:  mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100
    link/none
    inet 10.180.66.1/23 scope global tun0
       valid_lft forever preferred_lft forever
44450: ha-fdba3c47-12:  mtu 1400 qdisc htb state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:8d:06:9b brd ff:ff:ff:ff:ff:ff
    inet 10.180.64.10/23 brd 10.180.65.255 scope global ha-fdba3c47-12
       valid_lft forever preferred_lft forever
    inet 10.180.64.1/23 scope global secondary ha-fdba3c47-12
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe8d:69b/64 scope link
       valid_lft forever preferred_lft forever
44451: qg-8e8268bf-02:  mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 00:16:3e:fe:08:5f brd ff:ff:ff:ff:ff:ff
    inet 169.254.8.95/18 brd 169.254.63.255 scope global qg-8e8268bf-02
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fefe:85f/64 scope link
       valid_lft forever preferred_lft forever
~$ sudo ip netns exec qrouter-8e8268bf-0202-4401-8a20-70d118791451 tcpdump -i ha-fdba3c47-12 icmp -en
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ha-fdba3c47-12, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

Pinging the gateway from inside the namespace works fine:

~$ sudo ip netns exec qrouter-8e8268bf-0202-4401-8a20-70d118791451 ping 10.180.64.1
PING 10.180.64.1 (10.180.64.1) 56(84) bytes of data.
64 bytes from 10.180.64.1: icmp_req=1 ttl=64 time=0.058 ms
^C
--- 10.180.64.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.058/0.058/0.058/0.000 ms

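This made it even more certain that the problem was on the path from inside the LXC container to the gateway. A capture on the TAP device showed no packets either: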

~$ sudo tcpdump -i tap0aaa1d9c-99 icmp -en
^C
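The tap device name used above can be worked out from the Neutron port ID. In this case tap0aaa1d9c-99 is "tap" followed by the first 11 characters of the port ID 0aaa1d9c-99da-4c72-88d0-c81c85a9af30 that shows up in the agent log. A minimal sketch of that mapping follows; the function name is hypothetical, and the naming convention is assumed from what is observed in this deployment:

```shell
# Hypothetical helper: derive the tap device name from a Neutron port UUID.
# Assumption: the device is named "tap" + the first 11 characters of the
# port ID, as observed for the port in this post.
tap_name_for_port() {
    port_id="$1"
    printf 'tap%s\n' "$(printf '%s' "$port_id" | cut -c1-11)"
}

# Example usage (not executed here):
#   sudo tcpdump -i "$(tap_name_for_port 0aaa1d9c-99da-4c72-88d0-c81c85a9af30)" icmp -en
```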

Next, check the state of the port on OVS:

sudo ovs-vsctl show | less

Surprisingly, the tap device in question had no tag at all:

Port "tap0aaa1d9c-99"
            Interface "tap0aaa1d9c-99"

The only thing left was to check the neutron-openvswitch-agent log on the node, which showed an RPC timeout:

~$ grep 0aaa1d9c-99da-4c72-88d0-c81c85a9af30 /data//log/neutron/neutron-openvswitch-agent.log --color
(output omitted...)
2016-09-18 14:57:53.674 32685 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent DeviceListRetrievalError: Unable to retrieve port details for devices: [u'169c8bbe-fc88-49a2-82d6-111ef018daf4', u'3c1e1c4a-3889-49be-8a62-8acfcb6e2f01', u'0aaa1d9c-99da-4c72-88d0-c81c85a9af30', u'c214fc37-3b5b-4ac4-8eba-782d32d446e2', u'e2272d3f-d95f-422e-b9c2-d0f37229f3cd'] because of error: Timeout while waiting on RPC response - topic: "q-plugin", RPC method: "get_devices_details_list" info: "unknown"

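In other words, because the RPC call timed out, the agent never got the port details and the VLAN tag was never applied to the port.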
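To see whether the timeout was a one-off or keeps recurring, it can help to count how often the message appears in the agent log. A minimal sketch follows; the function name is hypothetical, and the match string is copied from the log line above:

```shell
# Hypothetical helper: count log lines that report an RPC timeout.
# The pattern is the literal message text seen in the agent log above.
rpc_timeout_count() {
    grep -c 'Timeout while waiting on RPC response' "$1"
}

# Example usage (not executed here; pass the path of the agent log on the node):
#   rpc_timeout_count /path/to/neutron-openvswitch-agent.log
```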

The fix at this point is to restart the Neutron Open vSwitch agent on that node:

sudo service neutron-plugin-openvswitch-agent restart

Once the agent has resynced, query the port again and the tag is there:

~$ sudo ovs-vsctl list port tap0aaa1d9c-99
_uuid               : 16ad8807-c9c0-4718-8bc5-aacc2b954c9b
bond_downdelay      : 0
bond_fake_iface     : false
bond_mode           : []
bond_updelay        : 0
external_ids        : {}
fake_bridge         : false
interfaces          : [d80d3d57-6cfc-46e9-9b57-5fe66648ee7b]
lacp                : []
mac                 : []
name                : "tap0aaa1d9c-99"
other_config        : {}
qos                 : []
statistics          : {}
status              : {}
tag                 : 1
trunks              : []
vlan_mode           : []

The ovs-vsctl show output is now correct as well:

Port "tap0aaa1d9c-99"
            tag: 1
            Interface "tap0aaa1d9c-99"

The private network now works:

/# ping www.baidu.com
PING www.a.shifen.com (115.239.210.27) 56(84) bytes of data.
64 bytes from 115.239.210.27: icmp_req=1 ttl=56 time=1.64 ms
64 bytes from 115.239.210.27: icmp_req=2 ttl=56 time=1.37 ms
64 bytes from 115.239.210.27: icmp_req=3 ttl=56 time=1.31 ms
64 bytes from 115.239.210.27: icmp_req=4 ttl=56 time=1.33 ms
64 bytes from 115.239.210.27: icmp_req=5 ttl=56 time=1.37 ms
64 bytes from 115.239.210.27: icmp_req=6 ttl=56 time=1.52 ms

As a side note, this is how to enter the LXC container:

sudo virsh -c lxc:/// lxc-enter-namespace 2902f3c8-4d13-490d-97c4-9121cfb28008 --noseclabel /bin/bash
