An anomaly caused by upgrading QEMU to a new version

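Normally, updating the qemu version on a node means every VM has to be shut down and started again, because the qemu process must restart. In real life, though, VMs may be running fairly important services: an interruption costs revenue, and if the service takes a long time to come back the impact is severe. So a new qemu version was developed that can be rolled out without shutting down any VM.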

After the update a small anomaly showed up. Since this was a test of the rollout process anyway, I am writing it down.

First, a colleague reported that his VM was unusable. I looked at the VM info, and its status was ERROR.

+--------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property                                         | Value                                                                                                                                                                                                                                     |
+--------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig                                | MANUAL                                                                                                                                                                                                                                    |
| OS-EXT-AZ:availability_zone                      | lihui.openstack-1                                                                                                                                                                                                                          |
| OS-EXT-SRV-ATTR:host                             | 10-10-10-140                                                                                                                                                                                                                              |
| OS-EXT-SRV-ATTR:hypervisor_hostname              | 10-10-10-140                                                                                                                                                                                                                              |
| OS-EXT-SRV-ATTR:instance_name                    | instance-00125333                                                                                                                                                                                                                         |
| OS-EXT-STS:power_state                           | 4                                                                                                                                                                                                                                         |
| OS-EXT-STS:task_state                            | -                                                                                                                                                                                                                                         |
| OS-EXT-STS:vm_state                              | error                                                                                                                                                                                                                                     |
| OS-SRV-USG:launched_at                           | 2016-10-28T07:49:44.000000                                                                                                                                                                                                                |
| OS-SRV-USG:terminated_at                         | -                                                                                                                                                                                                                                         |
| accessIPv4                                       |                                                                                                                                                                                                                                           |
| accessIPv6                                       |                                                                                                                                                                                                                                           |
| availability_zone                                | lihui.openstack-1                                                                                                                                                                                                                          |
| config_drive                                     | 1                                                                                                                                                                                                                                         |
| created                                          | 2016-10-28T07:40:36Z                                                                                                                                                                                                                      |
| fault                                            | {"message": "Cannot open log file: '/var/log/libvirt/qemu/instance-00125333.log': Device or resource busy", "code": 500, "details": "  File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 318, in decorated_function |
|                                                  |     return function(self, context, *args, **kwargs)                                                                                                                                                                                       |
|                                                  |   File \"/usr/lib/python2.7/dist-packages/nova/compute/manager.py\", line 2639, in reboot_instance                                                                                                                                        |
|                                                  |     bad_volumes_callback=bad_volumes_callback)                                                                                                                                                                                            |
|                                                  |   File \"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py\", line 2662, in reboot                                                                                                                                             |
|                                                  |     block_device_info)                                                                                                                                                                                                                    |
|                                                  |   File \"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py\", line 2810, in _hard_reboot                                                                                                                                       |
|                                                  |     reboot=True)                                                                                                                                                                                                                          |
|                                                  |   File \"/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py\", line 4448, in _create_domain_and_network                                                                                                                         |
|                                                  |     raise ex                                                                                                                                                                                                                              |
|                                                  | ", "created": "2016-12-14T06:01:27Z"}                                                                                                                                                                                                     |
| flavor                                           | flavor_17 (17)                                                                                                                                                                                                                            |
| hostId                                           | bba6ff4f8d52bb969cbf510a3fb1aa9f2ebdbdf973868c65c04e2f89                                                                                                                                                                                  |
| hypervisor_type                                  | qemu                                                                                                                                                                                                                                      |
| id                                               | 70a1da60-1cd9-4fd9-afa8-e8498e8a4d63                                                                                                                                                                                                      |
| image                                            | realserver (6a4b01ca-c1cf-4571-8c9a-66c65c35876f)                                                                                                                                                                                         |
| key_name                                         | ggg,hhlonglonglo_pub7,tengine                                                                                                                                                                                                             |
| metadata                                         | {}                                                                                                                                                                                                                                        |
| name                                             | lxcRealServer-70a1da60-1cd9-4fd9-afa8-e8498e8a4d63                                                                                                                                                                                        |
| os-extended-volumes:volumes_attached             | []                                                                                                                                                                                                                                        |
| os-netease-extended-volumes:volumes_attached     | []                                                                                                                                                                                                                                        |
| os-server-status                                 | down                                                                                                                                                                                                                                      |
| os_type                                          | linux                                                                                                                                                                                                                                     |
| private_93610904ad8e4da2b98fc58dd196bb17 network | 10.177.194.116                                                                                                                                                                                                                            |
| security_groups                                  | default                                                                                                                                                                                                                                   |
| status                                           | ERROR                                                                                                                                                                                                                                     |
| tenant_id                                        | 93610904ad8e4da2b98fc58dd196bb17                                                                                                                                                                                                          |
| updated                                          | 2016-12-14T06:01:27Z                                                                                                                                                                                                                      |
| use_ceph                                         | no                                                                                                                                                                                                                                        |
| user_id                                          | e1923a0ad6cf43ba8e29142d8ce09a71                                                                                                                                                                                                          |
| vncPass                                          | 123321                                                                                                                                                                                                                                    |
+--------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I looked at the TRACE output, which was rather odd.

2016-12-14 14:01:27.859 111095 DEBUG nova.statsd [req-9030dfa6-b2ef-4f64-b2cc-ed110ceb1146 e1923a0ad6cf43ba8e29142d8ce09a71 93610904ad8e4da2b98fc58dd196bb17] Statsd: openstack.rpc_api@nova-netease-compute-manager-ComputeManager-reboot_instance.10-10-10-140.rpc_error:1|c _send /usr/lib/python2.7/dist-packages/nova/statsd.py:39
2016-12-14 14:01:27.860 111095 ERROR nova.openstack.common.rpc.amqp [req-9030dfa6-b2ef-4f64-b2cc-ed110ceb1146 e1923a0ad6cf43ba8e29142d8ce09a71 93610904ad8e4da2b98fc58dd196bb17] Exception during message handling
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 465, in _process_data
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     **args)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/dispatcher.py", line 179, in dispatch
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     result = getattr(proxyobj, method)(ctxt, **kwargs)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 413, in decorated_function
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 90, in wrapped
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     payload)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 73, in wrapped
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     return f(self, context, *args, **kw)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 303, in decorated_function
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     pass
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 289, in decorated_function
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 354, in decorated_function
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     function(self, context, *args, **kwargs)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 331, in decorated_function
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     e, sys.exc_info())
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 318, in decorated_function
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2639, in reboot_instance
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     bad_volumes_callback=bad_volumes_callback)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2662, in reboot
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     block_device_info)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2810, in _hard_reboot
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     reboot=True)
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 4448, in _create_domain_and_network
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp     raise ex
2016-12-14 14:01:27.860 111095 TRACE nova.openstack.common.rpc.amqp libvirtError: Cannot open log file: '/var/log/libvirt/qemu/instance-00125333.log': Device or resource busy

Looking for what operation had run just before, I found a reboot.

2016-12-14 14:01:26.445 111095 AUDIT nova.compute.manager [req-9030dfa6-b2ef-4f64-b2cc-ed110ceb1146 e1923a0ad6cf43ba8e29142d8ce09a71 93610904ad8e4da2b98fc58dd196bb17] [instance: 70a1da60-1cd9-4fd9-afa8-e8498e8a4d63] Rebooting instance

So a plain reboot was enough to put the VM into ERROR. Looking around that point in the log, there were still plenty of related error messages.

2016-12-14 14:01:27.549 111095 WARNING nova.virt.libvirt.driver [req-9030dfa6-b2ef-4f64-b2cc-ed110ceb1146 e1923a0ad6cf43ba8e29142d8ce09a71 93610904ad8e4da2b98fc58dd196bb17] [instance: 70a1da60-1cd9-4fd9-afa8-e8498e8a4d63] libvirt error: error code 38, error message: Cannot open log file: '/var/log/libvirt/qemu/instance-00125333.log': Device or resource busy
2016-12-14 14:01:27.553 111095 DEBUG nova.compute.manager [req-9030dfa6-b2ef-4f64-b2cc-ed110ceb1146 e1923a0ad6cf43ba8e29142d8ce09a71 93610904ad8e4da2b98fc58dd196bb17] [instance: 70a1da60-1cd9-4fd9-afa8-e8498e8a4d63] Checking state _get_power_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:967
2016-12-14 14:01:27.555 111095 ERROR nova.compute.manager [req-9030dfa6-b2ef-4f64-b2cc-ed110ceb1146 e1923a0ad6cf43ba8e29142d8ce09a71 93610904ad8e4da2b98fc58dd196bb17] [instance: 70a1da60-1cd9-4fd9-afa8-e8498e8a4d63] Cannot reboot instance: Cannot open log file: '/var/log/libvirt/qemu/instance-00125333.log': Device or resource busy
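
Pulling every line for one request out of the log can be sketched as a small helper. This is only an illustration: the function name req_lines is made up here, and the log file path in the usage line is an assumption about where nova-compute writes its log on this node.

```shell
# Print every log line that carries the given request ID, in file order.
# Usage: req_lines REQUEST_ID LOGFILE
# Hypothetical helper; not part of nova or libvirt.
req_lines() {
    grep -F -- "$1" "$2"
}
```

For example: req_lines req-9030dfa6-b2ef-4f64-b2cc-ed110ceb1146 /var/log/nova/nova-compute.log (path assumed). Note that the TRACE lines shown earlier do not contain the request ID, so they have to be located by timestamp and PID instead.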

At this point it was time to check the libvirt log, where I unexpectedly found this.

2016-12-14 04:59:12.951+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.952+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.953+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.955+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.956+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.957+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.959+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.960+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.961+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.963+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.964+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.966+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.967+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.968+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.970+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.971+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.972+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.974+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.975+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.977+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.978+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory
2016-12-14 04:59:12.979+0000: 17159: error : virQEMUCapsNewForBinaryInternal:3737 : Cannot check QEMU binary /usr/bin/kvm: No such file or directory

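These timestamps are UTC; adding 8 hours gives the time around noon when qemu was updated. So the problem had been there ever since the update, and, embarrassingly, the cause was a binary that could not be found.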
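
The +8 conversion can be checked with a one-liner. This is a sketch that assumes GNU date; the function name utc_to_local is made up for illustration, and the POSIX zone string CST-8 is just one way to express UTC+8 without relying on a timezone database.

```shell
# Convert a UTC timestamp (as printed in the libvirtd log, without the
# millisecond part) to UTC+8 local time. Assumes GNU date.
utc_to_local() {
    TZ=CST-8 date -d "$1 UTC" '+%Y-%m-%d %H:%M:%S'
}

utc_to_local '2016-12-14 04:59:12'   # prints 2016-12-14 12:59:12
```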

I checked without expecting much, and was stunned.

~$ which kvm
~$

Here is the dumpxml output of an arbitrary VM.

~$ sudo virsh dumpxml instance-00135533 | grep emulator
    <emulator>/usr/bin/qemu-system-x86_64</emulator>

It had become /usr/bin/qemu-system-x86_64.

~$ which qemu-system-x86_64
/usr/bin/qemu-system-x86_64

So the qemu update removed the original kvm binary, and the XML of new VMs uses qemu-system-x86_64.

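A newly created VM therefore gets qemu-system-x86_64 as its emulator in dumpxml, which is fine, but the old VMs all still reference kvm and can no longer be used. In other words, the upgrade is not compatible with existing VMs, so something has to be changed here.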
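
To see which VMs on a node are affected, one could loop over all defined domains and test whether each emulator path still exists. The following is a rough sketch, not something from the original rollout: the function names emulator_of and check_emulators are made up, and it assumes virsh can be run with enough privileges on the compute node.

```shell
# Read libvirt domain XML on stdin and print the <emulator> path.
emulator_of() {
    sed -n 's:.*<emulator>\(.*\)</emulator>.*:\1:p'
}

# List every defined domain whose emulator binary is missing or not
# executable. Calls virsh, so run it as a user allowed to talk to libvirtd.
check_emulators() {
    for dom in $(virsh list --all --name); do
        emu=$(virsh dumpxml "$dom" | emulator_of)
        [ -x "$emu" ] || echo "$dom: emulator not found: $emu"
    done
}
```

Running check_emulators before and after an upgrade gives a quick list of domains that would fail to start.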

The do-it-yourself fix is simple: create a symlink.

ln -s /usr/bin/qemu-system-x86_64 /usr/bin/kvm

Check it:

~$ ls -l /usr/bin/kvm
lrwxrwxrwx 1 root root 27 Dec 14 16:07 /usr/bin/kvm -> /usr/bin/qemu-system-x86_64
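
If the link has to be created on many nodes, a guarded version avoids failing where it already exists. This is a minimal sketch; ensure_link is a made-up helper name, and it needs root to write under /usr/bin.

```shell
# Create LINK -> TARGET only if TARGET is executable and LINK is absent.
# Usage: ensure_link TARGET LINK
ensure_link() {
    target=$1
    link=$2
    [ -x "$target" ] || { echo "missing target: $target" >&2; return 1; }
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "already present: $link"
        return 0
    fi
    ln -s "$target" "$link"
}
```

On the node in this post the call would be: ensure_link /usr/bin/qemu-system-x86_64 /usr/bin/kvm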

Finally, the libvirtd service has to be restarted; otherwise the old KVM VMs cannot be managed.
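
The exact restart command depends on the distribution, and the post does not say which service name this node uses. As a hedged sketch, the helper below (libvirt_service, a made-up name) looks for an init script under either of two commonly seen names, libvirtd or libvirt-bin; both names and the /etc/init.d location are assumptions.

```shell
# Print the first libvirt init script name found in DIR (default /etc/init.d).
# The candidate names are assumptions; the source does not name the service.
libvirt_service() {
    dir=${1:-/etc/init.d}
    for name in libvirtd libvirt-bin; do
        if [ -e "$dir/$name" ]; then
            echo "$name"
            return 0
        fi
    done
    return 1
}
```

The restart would then be something like: sudo service "$(libvirt_service)" restart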

Look at the dumpxml output again, and now it is OK.

~$ sudo virsh dumpxml instance-00135533 | grep emulator
    <emulator>/usr/bin/kvm</emulator>

This is only a temporary workaround; in the end the link should be set up as part of the packaging.
