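Deleting a virtual machine often runs into problems: the nova-compute service hangs, libvirtd gets interrupted, and so on. A dependent service can also fail, so that a VM cannot be deleted even though nothing is wrong with the services on the compute host itself. The case below was triggered by a failure in the cloud disk (cinder volume) service.
First, a delete is issued for the virtual machine, and nova show reports the task state as deleting: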
lihui@MacBook ~/server/source_txt nova show 83e35fec-6da8-44c0-9211-e72e8ab95c61
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
| Property | Value |
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | test1.badceph2 |
| OS-EXT-SRV-ATTR:host | nova30.openstack.org |
| OS-EXT-SRV-ATTR:hypervisor_hostname | nova30.openstack.org |
| OS-EXT-SRV-ATTR:instance_name | instance-0001242d |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | deleting |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2016-11-21T02:25:48.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| availability_zone | test1.badceph2 |
| config_drive | 1 |
| created | 2016-11-21T02:24:06Z |
| flavor | flavor_2 (2) |
| hostId | c58c33d798ee65479cfb84695845d537f40079d4796e9313ba6da758 |
| hypervisor_type | qemu |
| id | 83e35fec-6da8-44c0-9211-e72e8ab95c61 |
| image | debian_7_x86_64_pub_static_36840.raw (0d26f602-7c69-43f4-aa70-1371bd05b1e1) |
| key_name | lihui_yq_test |
| metadata | {} |
| name | lihui-test1.badceph2:nova30.openstack.org-7 |
| os-extended-volumes:volumes_attached | [{"id": "b3a7925e-e2de-4b4a-94e9-380dc10b0e36"}] |
| os-netease-extended-volumes:volumes_attached | [{"delete_on_terminate": false, "id": "b3a7925e-e2de-4b4a-94e9-380dc10b0e36", "device_name": "/dev/vdd"}] |
| os-server-status | down |
| os_type | linux |
| private_9bcf446410594faf884aaab076f43cbf network | 10.18.194.182 |
| progress | 0 |
| security_groups | default |
| status | ACTIVE |
| tenant_id | 9bcf446410594faf884aaab076f43cbf |
| updated | 2016-12-06T06:30:29Z |
| use_ceph | yes |
| user_id | fea367d530534801bd5332ea6e06fac6 |
| vncPass | cVMfx6 |
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
But after a long while, the VM dropped back out of the deleting state, and it still existed:
lihui@MacBook ~/server/source_txt nova show 83e35fec-6da8-44c0-9211-e72e8ab95c61
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
| Property | Value |
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | test1.badceph2 |
| OS-EXT-SRV-ATTR:host | nova30.openstack.org |
| OS-EXT-SRV-ATTR:hypervisor_hostname | nova30.openstack.org |
| OS-EXT-SRV-ATTR:instance_name | instance-0001242d |
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2016-11-21T02:25:48.000000 |
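To compare two nova show outputs without reading them row by row, the Property/Value rows can be loaded into dictionaries and diffed. The following is only a minimal sketch: parse_property_table is a name made up for this post, and it assumes no value contains a "|" character. The sample rows are copied from the two outputs above.

```python
def parse_property_table(text):
    """Turn '| Property | Value |' rows into a dict (hypothetical helper)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip '+-----+' borders and anything that is not a table row.
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the header row and rows that do not split into two cells.
        if len(cells) != 2 or cells[0] == "Property":
            continue
        props[cells[0]] = cells[1]
    return props


first_output = """
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | deleting |
| OS-EXT-STS:vm_state | active |
"""
second_output = """
| OS-EXT-STS:power_state | 0 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | active |
"""

before = parse_property_table(first_output)
after = parse_property_table(second_output)
# Keep only the properties whose value differs between the two snapshots.
changed = {k: (before[k], after.get(k)) for k in before if before[k] != after.get(k)}
print(changed)
```

Only the task state differs: the delete was abandoned while vm_state stayed active.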
The os-netease-extended-volumes:volumes_attached field shows that the VM has one cloud disk attached.
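The value of that field is JSON, so it can be read with the standard json module instead of by eye. A small sketch, using the exact value from the first nova show output (the variable names are my own):

```python
import json

# Value of os-netease-extended-volumes:volumes_attached, copied from the output above.
volumes_attached = (
    '[{"delete_on_terminate": false, '
    '"id": "b3a7925e-e2de-4b4a-94e9-380dc10b0e36", '
    '"device_name": "/dev/vdd"}]'
)

attached = json.loads(volumes_attached)
for vol in attached:
    # Print the volume ID and the device name it is attached as.
    print(vol["id"], vol["device_name"])
```

The ID printed here is the one to look for later in the cinder logs and in cinder show.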
To track down the problem, start with the nova-compute log on the compute node:
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 465, in _process_data
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp **args)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/dispatcher.py", line 179, in dispatch
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 413, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp return function(self, context, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 90, in wrapped
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp payload)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 73, in wrapped
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp return f(self, context, *args, **kw)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 303, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp pass
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 289, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp return function(self, context, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 354, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp function(self, context, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 331, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp e, sys.exc_info())
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 318, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp return function(self, context, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2219, in terminate_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp do_terminate_instance(instance, bdms, clean_shutdown)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/openstack/common/lockutils.py", line 248, in inner
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp return f(*args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2211, in do_terminate_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp reservations=reservations)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/hooks.py", line 105, in inner
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp rv = f(*args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2182, in _delete_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp user_id=user_id)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2154, in _delete_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp clean_shutdown=clean_shutdown)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2097, in _shutdown_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp connector)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/volume/cinder.py", line 185, in wrapper
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp res = method(self, ctx, volume_id, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/nova/volume/cinder.py", line 293, in terminate_connection
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp connector)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/cinderclient/v1/volumes.py", line 368, in terminate_connection
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp {'connector': connector})
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/cinderclient/v1/volumes.py", line 287, in _action
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp return self.api.client.post(url, body=body)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/cinderclient/client.py", line 210, in post
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp return self._cs_request(url, 'POST', **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.7/dist-packages/cinderclient/client.py", line 199, in _cs_request
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp raise exceptions.ConnectionError(msg)
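A traceback this long is easier to follow once it is reduced to its call chain. The sketch below pulls the file, line number and function out of each File line with a regular expression. FRAME_RE and call_chain are names invented for this post, and the three sample lines are the frames copied from the end of the log above.

```python
import re

# Matches the 'File "<path>", line <n>, in <func>' part of a traceback line.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def call_chain(log_text):
    """Return (path below dist-packages, line number, function) per frame."""
    chain = []
    for m in FRAME_RE.finditer(log_text):
        short_path = m.group("path").split("dist-packages/")[-1]
        chain.append((short_path, int(m.group("line")), m.group("func")))
    return chain


PREFIX = "2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp "
sample = "\n".join(PREFIX + frame for frame in [
    'File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2097, in _shutdown_instance',
    'File "/usr/lib/python2.7/dist-packages/nova/volume/cinder.py", line 293, in terminate_connection',
    'File "/usr/lib/python2.7/dist-packages/cinderclient/v1/volumes.py", line 368, in terminate_connection',
])

chain = call_chain(sample)
for path, line, func in chain:
    print("%s:%d %s" % (path, line, func))
```

Read top to bottom, the chain goes from nova's _shutdown_instance into nova/volume/cinder.py and then into cinderclient, which is where the exception is raised.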
The error log shows that the failure happened when nova called cinderclient, so the next thing to check is the cinder API log:
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault Traceback (most recent call last):
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/api/middleware/fault.py", line 77, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return req.get_response(self.application)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/webob/request.py", line 1296, in send
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault application, catch_exc_info=False)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/webob/request.py", line 1260, in call_application
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault app_iter = application(self.environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return resp(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/keystoneclient/middleware/auth_token.py", line 598, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return self.app(env, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return resp(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return resp(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/routes/middleware.py", line 131, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault response = self.app(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return resp(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 130, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault resp = self.call_func(req, *args, **self.kwargs)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 195, in call_func
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return self.func(req, *args, **kwargs)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/api/openstack/wsgi.py", line 898, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault content_type, body, accept)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/api/openstack/wsgi.py", line 946, in _process_stack
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault action_result = self.dispatch(meth, request, action_args)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/api/openstack/wsgi.py", line 1022, in dispatch
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return method(req=request, **action_args)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/api/contrib/volume_actions.py", line 172, in _terminate_connection
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault self.volume_api.terminate_connection(context, volume, connector)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/volume/api.py", line 78, in wrapped
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault return func(self, context, target_obj, *args, **kwargs)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/volume/api.py", line 473, in terminate_connection
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault force)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/volume/rpcapi.py", line 147, in terminate_connection
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault volume['host']))
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault File "/usr/lib/python2.7/dist-packages/cinder/openstack/common/rpc/proxy.py", line 129, in call
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault exc.info, real_topic, msg.get('method'))
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault Timeout: Timeout while waiting on RPC response - topic: "cinder-volume:nova43.openstack.org@ceph", RPC method: "terminate_connection" info: ""
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault
2016-12-06 14:38:50.568 129759 INFO cinder.api.middleware.fault [req-7a37b38e-1195-4cf2-874d-f96536296587 d8385714a8374c2395a899a6450ef22b 73ba41a9f57e4af8b6a362295ab92b4a] http://10.185.0.253:8776/v1/73ba41a9f57e4af8b6a362295ab92b4a/volumes/b3a7925e-e2de-4b4a-94e9-380dc10b0e36/action returned with HTTP 500
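The Timeout line in the log above already names the service that did not answer. As a rough sketch, the topic and RPC method can be extracted with a regular expression. TIMEOUT_RE is my own name, and splitting the topic at the first ":" into a service name and a host is an assumption based on how this one topic string looks.

```python
import re

# Pattern for the Timeout message as it appears in the cinder API log above.
TIMEOUT_RE = re.compile(
    r'Timeout while waiting on RPC response - '
    r'topic: "(?P<topic>[^"]+)", RPC method: "(?P<method>[^"]+)"'
)

line = (
    'Timeout: Timeout while waiting on RPC response - topic: '
    '"cinder-volume:nova43.openstack.org@ceph", RPC method: '
    '"terminate_connection" info: ""'
)

match = TIMEOUT_RE.search(line)
topic = match.group("topic")
method = match.group("method")
# Assumption: the topic has the form "<service>:<host>".
service, host = topic.split(":", 1)
print(service, host, method)
```

The host printed here is the cinder-volume service to look up in cinder service-list.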
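The cause is now clear: the cinder API returned HTTP 500 because its terminate_connection RPC call to the cinder-volume service that hosts the cloud disk (nova43.openstack.org@ceph) timed out. So the next step is to check the state of the service on that node.
As shown below, the service is indeed DOWN: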
lihui@MacBook ~/server/source_txt cinder service-list | grep nova43.openstack.org
| 98 | cinder-volume | nova43.openstack.org@ceph | test1 | enabled | down | 2016-11-23T09:12:32.000000 |
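The same check can be scripted. The sketch below splits the grep'ed row into named fields. Because grep dropped the table header, the column names in COLUMNS are an assumption inferred from the values in the row, not taken from the CLI.

```python
from datetime import datetime

# Row copied from the cinder service-list output above.
row = ("| 98 | cinder-volume | nova43.openstack.org@ceph | test1 "
       "| enabled | down | 2016-11-23T09:12:32.000000 |")

# Assumed column names; the header is not visible in the grep output.
COLUMNS = ("id", "binary", "host", "zone", "status", "state", "updated_at")

cells = [c.strip() for c in row.strip().strip("|").split("|")]
service = dict(zip(COLUMNS, cells))

# Timestamp of the service's last status update.
last_update = datetime.strptime(service["updated_at"], "%Y-%m-%dT%H:%M:%S.%f")
is_down = service["state"] == "down"
print(service["host"], service["state"], last_update.date())
```

The last update is dated 2016-11-23, about two weeks before the failed delete on 2016-12-06.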
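So the root cause is this: on receiving the request to delete the VM, nova first has to detach the cloud disk attached to it, but the cinder-volume service for that disk was DOWN at that moment. The detach failed and returned an error, so the VM could not be deleted either.
To clean up the VM and its cloud disk, the cinder-volume service that owns the disk has to be changed, that is, the disk has to be migrated to another service (this is a non-standard interface):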
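The chain of events can be illustrated with a toy model. This is not nova's code: every name below (VolumeServiceDown, terminate_connection, delete_instance, the dict layout) is hypothetical, and the sketch only mirrors the behaviour seen above, where a failed detach aborts the delete and the task state falls back from deleting.

```python
class VolumeServiceDown(Exception):
    """Stands in for the ConnectionError / HTTP 500 seen in the logs."""


def terminate_connection(volume_id, volume_service_up):
    # Hypothetical stand-in for the call into the volume service.
    if not volume_service_up:
        raise VolumeServiceDown("no RPC response for volume %s" % volume_id)


def delete_instance(instance, volume_service_up):
    """Toy delete: detach every volume first, then mark the VM deleted."""
    instance["task_state"] = "deleting"
    try:
        for volume_id in instance["volumes"]:
            terminate_connection(volume_id, volume_service_up)
    except VolumeServiceDown:
        # The delete is abandoned and the task state is rolled back,
        # which matches the second nova show output.
        instance["task_state"] = None
        return False
    instance["vm_state"] = "deleted"
    instance["task_state"] = None
    return True


vm = {
    "vm_state": "active",
    "task_state": None,
    "volumes": ["b3a7925e-e2de-4b4a-94e9-380dc10b0e36"],
}
first_try = delete_instance(vm, volume_service_up=False)
state_after_first_try = vm["vm_state"]
second_try = delete_instance(vm, volume_service_up=True)
print(first_try, state_after_first_try, second_try, vm["vm_state"])
```

In this model the only way to make the delete succeed is to have a working volume service answer the detach.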
lihui@MacBook ~/server/source_txt cinder host-volumes-migrate --force-volumes-migrate=True nova43.openstack.org@ceph nova34.openstack.org@ceph
After the migration, the disk's cinder-volume service has changed to the target node:
lihui@MacBook ~/server/source_txt cinder show b3a7925e-e2de-4b4a-94e9-380dc10b0e36
+---------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value |
+---------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| attachments | [{u'device': u'/dev/nbs/xdjo', u'server_id': u'83e35fec-6da8-44c0-9211-e72e8ab95c61', u'id': u'b3a7925e-e2de-4b4a-94e9-380dc10b0e36', u'host_name': None, u'volume_id': u'b3a7925e-e2de-4b4a-94e9-380dc10b0e36'}] |
| availability_zone | test1 |
| bootable | false |
| created_at | 2016-11-21T06:49:16.000000 |
| display_description | None |
| display_name | lihui-ceph-100 |
| id | b3a7925e-e2de-4b4a-94e9-380dc10b0e36 |
| metadata | {u'readonly': u'False', u'attached_mode': u'rw'} |
| os-vol-host-attr:host | nova34.openstack.org@ceph |
| os-vol-mig-status-attr:migstat | None |
| os-vol-mig-status-attr:name_id | None |
| os-vol-provider-attr:provider_auth | None |
| os-vol-provider-attr:provider_geometry | None |
| os-vol-provider-attr:provider_location | None |
| os-vol-provider-attr:provider_pool_location | None |
| os-vol-tenant-attr:tenant_id | 9bcf446410594faf884aaab076f43cbf |
| size | 100 |
| snapshot_id | None |
| source_volid | None |
| status | in-use |
| volume_qos | {u'read_bps': u'86558041', u'write_bps': u'86558041', u'read_iops': u'122', u'write_iops': u'204'} |
| volume_type | ceph_bad |
+---------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
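Two things are worth verifying in the output above: the volume now belongs to the target service, and it is still attached to the VM that is about to be deleted. A minimal sketch follows, with the values copied from the table. The attachments cell is printed as a Python 2 style literal, which ast.literal_eval can also read on Python 3; the variable names are my own.

```python
import ast

# Values copied from the cinder show output above.
attachments_cell = (
    "[{u'device': u'/dev/nbs/xdjo', "
    "u'server_id': u'83e35fec-6da8-44c0-9211-e72e8ab95c61', "
    "u'id': u'b3a7925e-e2de-4b4a-94e9-380dc10b0e36', "
    "u'host_name': None, "
    "u'volume_id': u'b3a7925e-e2de-4b4a-94e9-380dc10b0e36'}]"
)
volume_host = "nova34.openstack.org@ceph"   # os-vol-host-attr:host
old_host = "nova43.openstack.org@ceph"      # the service that was down
vm_id = "83e35fec-6da8-44c0-9211-e72e8ab95c61"

# Safely evaluate the literal list of dicts without running arbitrary code.
attachments = ast.literal_eval(attachments_cell)
server_ids = [a["server_id"] for a in attachments]

moved = volume_host != old_host
still_attached = vm_id in server_ids
print(moved, still_attached)
```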
Finally, delete the VM again, and this time it goes through:
lihui@MacBook ~/server/source_txt nova delete 83e35fec-6da8-44c0-9211-e72e8ab95c61
✘ lihui@MacBook ~/server/source_txt nova show 83e35fec-6da8-44c0-9211-e72e8ab95c61
ERROR: No server with a name or ID of '83e35fec-6da8-44c0-9211-e72e8ab95c61' exists.
