A Bizarre Bug in Neutron's Open vSwitch Security Groups

While testing OVS security groups on our cloud network, I ran into several odd problems. The strangest of them is the one described below.

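First, a quick word on Open vSwitch-based security groups. In essence they are a firewall, just one implemented in OVS, which means the rules apply per port. Rules are still split into ingress and egress, which roughly correspond to the INPUT and OUTPUT chains in iptables. The key thing to remember is that, in the environment I was testing, ingress is a whitelist: adding a rule allows packets matching that rule to come in. Egress is a blacklist: outbound traffic is unrestricted by default, and an egress rule stops matching packets from leaving. (Upstream Neutron treats rules in both directions as allow rules, so the egress blacklist is the behaviour of this deployment, not of stock Neutron.) The matched fields also differ. For the IP address, ingress rules match the source address and egress rules match the destination address; for the port, both ingress and egress always match the destination port. It follows that if the security groups on the source and destination sides are left untouched, every inbound packet is dropped, so a test first has to allow everything on ingress and only then verify the security group rules one by one.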
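The matching fields described above can be written down as a small sketch. This is only an illustration of which packet fields a rule is compared against; the `Rule` and `Packet` classes and the `rule_matches` function are hypothetical names, not Neutron code, and what happens on a match (allow or drop) is deliberately left out.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    direction: str          # "ingress" or "egress"
    remote_ip_prefix: str   # e.g. "115.236.127.223/32"
    port_range_min: int
    port_range_max: int


@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int


def rule_matches(rule: Rule, pkt: Packet) -> bool:
    """Return True if the packet falls under the rule (illustrative only)."""
    # Ingress rules compare the prefix with the source address,
    # egress rules compare it with the destination address.
    addr = pkt.src_ip if rule.direction == "ingress" else pkt.dst_ip
    network = ipaddress.ip_network(rule.remote_ip_prefix)
    in_prefix = ipaddress.ip_address(addr) in network
    # In both directions the port range is compared with the destination port.
    in_ports = rule.port_range_min <= pkt.dst_port <= rule.port_range_max
    return in_prefix and in_ports
```

For example, an egress rule for `115.236.127.223/32` on port 5002 should not match iperf traffic sent to port 5001 on that host, which is exactly the expectation the tests below rely on.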

That covers the basic principles. Now for the bizarre bug.

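The test case is blocking a port in the egress direction. As noted above, egress is fully open by default and the port named in a rule is the one that gets blocked. I used iperf to generate traffic; the only thing you need to know is that the iperf server listens on port 5001 by default.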

A: egress rule, protocol tcp, port 5001. In testing, the traffic was indeed filtered.

B: egress rule, protocol tcp, port 5002. In testing, traffic was sent and received normally and was not filtered. It looked roughly like this:

$ neutron security-group-show 61e43f12-12ec-48a1-b2d5-c7392290978c
+----------------------+--------------------------------------------------------------------+
| Field                | Value                                                              |
+----------------------+--------------------------------------------------------------------+
| description          |                                                                    |
| id                   | 61e43f12-12ec-48a1-b2d5-c7392290978c                               |
| name                 | group-egress-tcp-115.236.127.223-5002                              |
| security_group_rules | {                                                                  |
|                      |      "remote_group_id": null,                                      |
|                      |      "direction": "egress",                                        |
|                      |      "remote_ip_prefix": "115.236.127.223/32",                     |
|                      |      "protocol": "tcp",                                            |
|                      |      "tenant_id": "10e5051a1cee4f8ebb0e8b5d877de581",              |
|                      |      "port_range_max": 5002,                                       |
|                      |      "security_group_id": "61e43f12-12ec-48a1-b2d5-c7392290978c",  |
|                      |      "port_range_min": 5002,                                       |
|                      |      "ethertype": "IPv4",                                          |
|                      |      "id": "f2994944-958b-4f06-aed4-4f3dccf903fc"                  |
|                      | }                                                                  |
| tenant_id            | 10e5051a1cee4f8ebb0e8b5d877de581                                   |
+----------------------+--------------------------------------------------------------------+

~# iperf -c 115.236.127.223 -t 10000 -i 1
------------------------------------------------------------
Client connecting to 115.236.127.223, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 115.236.127.222 port 56948 connected with 115.236.127.223 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  57.4 MBytes   481 Mbits/sec
[  3]  1.0- 2.0 sec  57.2 MBytes   480 Mbits/sec
[  3]  2.0- 3.0 sec  56.9 MBytes   477 Mbits/sec
[  3]  3.0- 4.0 sec  57.1 MBytes   479 Mbits/sec
[  3]  4.0- 5.0 sec  57.2 MBytes   480 Mbits/sec
[  3]  5.0- 6.0 sec  56.9 MBytes   477 Mbits/sec
[  3]  6.0- 7.0 sec  57.4 MBytes   481 Mbits/sec
[  3]  7.0- 8.0 sec  57.6 MBytes   483 Mbits/sec

~# iperf -s -i 1
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 115.236.127.223 port 5001 connected with 115.236.127.222 port 56948
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 1.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  1.0- 2.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  2.0- 3.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  3.0- 4.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  4.0- 5.0 sec  57.0 MBytes   479 Mbits/sec
[  4]  5.0- 6.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  6.0- 7.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  7.0- 8.0 sec  57.0 MBytes   479 Mbits/sec

C: egress rule, protocol tcp, port 5000. In testing, the traffic was filtered. It was actually dropped!

$ neutron security-group-show a762e5d7-d3c1-490b-92e2-ffe844a80650
+----------------------+--------------------------------------------------------------------+
| Field                | Value                                                              |
+----------------------+--------------------------------------------------------------------+
| description          |                                                                    |
| id                   | a762e5d7-d3c1-490b-92e2-ffe844a80650                               |
| name                 | group-egress-tcp-115.236.127.223-5000                              |
| security_group_rules | {                                                                  |
|                      |      "remote_group_id": null,                                      |
|                      |      "direction": "egress",                                        |
|                      |      "remote_ip_prefix": "115.236.127.223/32",                     |
|                      |      "protocol": "tcp",                                            |
|                      |      "tenant_id": "10e5051a1cee4f8ebb0e8b5d877de581",              |
|                      |      "port_range_max": 5000,                                       |
|                      |      "security_group_id": "a762e5d7-d3c1-490b-92e2-ffe844a80650",  |
|                      |      "port_range_min": 5000,                                       |
|                      |      "ethertype": "IPv4",                                          |
|                      |      "id": "5edefe65-48da-4b52-b257-dd5b147ef1a6"                  |
|                      | }                                                                  |
| tenant_id            | 10e5051a1cee4f8ebb0e8b5d877de581                                   |
+----------------------+--------------------------------------------------------------------+


~# iperf -c 115.236.127.223 -t 10000 -i 1


^C

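In the runs above, neither the iperf client nor the server was changed in any way; the only thing modified was the security group rule on the neutron port. In theory only a rule for port 5001 should filter this traffic, and rules for 5000 and 5002 should not drop anything, so losing traffic with the 5000 rule is strange. I suspected some other rule was interfering and repeated the test a full three times before I was finally convinced that this is a bug.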

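As for why I picked 5000 and 5002: they are simply the nearest neighbours. A random port would have worked too, but boundary values are where bugs tend to show up, so the values on either side of 5001 are fairly likely to cause trouble.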

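The bug can also be found in the community tracker. Sadly we adopted this feature too late; otherwise the person who reported it upstream might well have been me.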

https://bugs.launchpad.net/neutron/+bug/1611991

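The reporter's test case looks the same as mine, except that he chose ports 22 and 23. He does not say whether the rule is ingress or egress, so it is presumably the default, ingress. He only configured port 22, yet the rule also took effect for port 23.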

Seen on master devstack, ubuntu xenial.

Steps to reproduce:

1. Enable ovs firewall in /etc/neutron/plugins/ml2/ml2.conf

[securitygroup]
firewall_driver = openvswitch

2. Create a security group with icmp, tcp to 22.

3. Boot a VM, assign a floating ip.

4. Check that port 23 can be accessed via tcp (telnet, nc, etc).
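Step 4 of the report can also be scripted instead of using telnet or nc. The following is a minimal sketch using only the Python standard library; the helper names `tcp_port_open` and `check_ports` are my own, and it assumes something is actually listening on each port being probed, otherwise a closed port and a filtered port both just show up as `False`.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host: str, ports) -> dict:
    """Probe several ports on one host and return {port: reachable}."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Calling `check_ports(floating_ip, [22, 23])` against the VM's floating IP would then show whether port 23 is reachable even though only port 22 was configured.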

Clearly the problem affects far more than one or two ports; some algorithm or other is producing faulty rules.

One commenter further down says:

The bug is in port masking, 22 is masked by tp_src=0x16/0xfffe which matches number 23 as well. Good catch!

Changed in neutron:
importance:	Undecided → High
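The comment is easy to verify by hand. An OpenFlow-style masked match compares only the bits that are set in the mask, so a mask of 0xfffe ignores the lowest bit. The sketch below shows this with plain integer arithmetic; `matches` is a hypothetical helper, and applying the same 0xfffe mask to ports 5000 and 5002 is my assumption about what happened in my own test, since I did not dump the flows.

```python
def matches(port: int, value: int, mask: int) -> bool:
    """Masked match: compare only the bits that are set in mask."""
    return (port & mask) == (value & mask)


# 22 is 0x16. With mask 0xfffe the lowest bit is ignored,
# so both 22 and 23 match, while 21 and 24 do not.
for port in (21, 22, 23, 24):
    print(port, matches(port, 0x16, 0xfffe))

# Assumed analogue for my test: a rule for 5000 masked with 0xfffe
# also catches 5001 (the iperf port) ...
print(5001, matches(5001, 5000, 0xfffe))   # True
# ... while a rule for 5002 masked the same way covers 5002 and 5003 only.
print(5001, matches(5001, 5002, 0xfffe))   # False
```

Under that assumption the results line up with cases B and C above: the 5002 rule leaves iperf traffic alone, while the 5000 rule swallows port 5001 as well.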

That is the same observation. Reading on, after a long back-and-forth among a bunch of people, a fix arrived:

Reviewed: https://review.openstack.org/353782
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0494f212aa625a03587af3d75e823008f1198012
Submitter: Jenkins
Branch: master

commit 0494f212aa625a03587af3d75e823008f1198012
Author: Inessa Vasilevskaya
Date: Thu Aug 11 02:21:29 2016 +0300

    ovsfw: fix troublesome port_rule_masking

    In several cases port masking algorithm borrowed
    from networking_ovs_dpdk didn't behave correctly.
    This caused non-restricted ports to be open due to
    wrong tp_src field value in resulting ovs rules.

    This was fixed by alternative port masking
    implementation.

    Functional and unit tests to cover the bug added as well.

    Co-Authored-By: Jakub Libosvar 
    Co-Authored-By: IWAMOTO Toshihiro 

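The commit message says that the OVS firewall's port_rule_masking was fixed: in some cases the port masking algorithm was wrong. Because of the wrong tp_src value in the resulting OVS rules, a rule also applied to ports it was never meant to cover. In the upstream report that left non-restricted ports open; in my egress test it blocked a port I had not listed.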
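To make clear what a correct masking has to guarantee, here is a minimal sketch that splits a port range into value/mask pairs covering exactly that range and nothing more. It is not the code from the Neutron patch; the function name `port_range_to_masks` and the greedy power-of-two approach are an illustration only.

```python
def port_range_to_masks(port_min: int, port_max: int, bits: int = 16):
    """Split [port_min, port_max] into (value, mask) pairs covering exactly it."""
    if not 0 <= port_min <= port_max < (1 << bits):
        raise ValueError("invalid port range")
    full = (1 << bits) - 1
    pairs = []
    start = port_min
    while start <= port_max:
        # Largest power-of-two block that is aligned at `start`
        # (the lowest set bit of start; the whole space if start is 0).
        size = (start & -start) if start else (1 << bits)
        # Shrink the block until it no longer runs past port_max.
        while start + size - 1 > port_max:
            size >>= 1
        # The mask fixes all bits above the block size.
        pairs.append((start, full & ~(size - 1)))
        start += size
    return pairs


print(port_range_to_masks(22, 22))      # [(22, 65535)]
print(port_range_to_masks(22, 23))      # [(22, 65534)]
print(port_range_to_masks(5000, 5000))  # [(5000, 65535)]
```

The important case is the single port: 22 on its own must come out with the full mask 0xffff (65535), and only the real range 22-23 may use 0xfffe (65534).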

Quite a lot was changed; see the commit:

https://git.openstack.org/cgit/openstack/neutron/commit/?id=dd75f7e96afc713b57ad4ab21f01175be7b571fe

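Reading this masking algorithm is honestly painful. I had planned to do a proper walkthrough of the bug, but given the root cause I will skip it. In short: the algorithm was wrong.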

The solution is to merge the reviewed change above, which has already been committed to master:

Reviewed: https://review.openstack.org/353782
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0494f212aa625a03587af3d75e823008f1198012

After the update, behaviour is back to normal:

$ neutron security-group-show a762e5d7-d3c1-490b-92e2-ffe844a80650
+----------------------+--------------------------------------------------------------------+
| Field                | Value                                                              |
+----------------------+--------------------------------------------------------------------+
| description          |                                                                    |
| id                   | a762e5d7-d3c1-490b-92e2-ffe844a80650                               |
| name                 | group-egress-tcp-115.236.127.223-5000                              |
| security_group_rules | {                                                                  |
|                      |      "remote_group_id": null,                                      |
|                      |      "direction": "egress",                                        |
|                      |      "remote_ip_prefix": "115.236.127.223/32",                     |
|                      |      "protocol": "tcp",                                            |
|                      |      "tenant_id": "10e5051a1cee4f8ebb0e8b5d877de581",              |
|                      |      "port_range_max": 5000,                                       |
|                      |      "security_group_id": "a762e5d7-d3c1-490b-92e2-ffe844a80650",  |
|                      |      "port_range_min": 5000,                                       |
|                      |      "ethertype": "IPv4",                                          |
|                      |      "id": "5edefe65-48da-4b52-b257-dd5b147ef1a6"                  |
|                      | }                                                                  |
| tenant_id            | 10e5051a1cee4f8ebb0e8b5d877de581                                   |
+----------------------+--------------------------------------------------------------------+

~# iperf -s -i 1
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 115.236.127.223 port 5001 connected with 115.236.127.222 port 39420
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 1.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  1.0- 2.0 sec  57.0 MBytes   479 Mbits/sec
[  4]  2.0- 3.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  3.0- 4.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  4.0- 5.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  5.0- 6.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  6.0- 7.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  7.0- 8.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  8.0- 9.0 sec  57.0 MBytes   478 Mbits/sec
[  4]  9.0-10.0 sec  57.0 MBytes   478 Mbits/sec
[  4] 10.0-11.0 sec  57.0 MBytes   478 Mbits/sec
[  4] 11.0-12.0 sec  57.0 MBytes   478 Mbits/sec

 
