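Deleting a virtual machine often runs into failures, for example when the nova-compute service hangs or libvirtd is interrupted. Other services that the delete depends on can also fail, so a VM can become impossible to delete even though nothing is wrong with the services on the compute host itself. The case below was triggered by a fault in the cloud disk (Cinder volume) service.
First, a delete is issued for the VM, and nova show reports task_state as deleting: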
lihui@MacBook ~/server/source_txt nova show 83e35fec-6da8-44c0-9211-e72e8ab95c61
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
| Property                                         | Value |
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig                                | MANUAL |
| OS-EXT-AZ:availability_zone                      | test1.badceph2 |
| OS-EXT-SRV-ATTR:host                             | nova30.openstack.org |
| OS-EXT-SRV-ATTR:hypervisor_hostname              | nova30.openstack.org |
| OS-EXT-SRV-ATTR:instance_name                    | instance-0001242d |
| OS-EXT-STS:power_state                           | 0 |
| OS-EXT-STS:task_state                            | deleting |
| OS-EXT-STS:vm_state                              | active |
| OS-SRV-USG:launched_at                           | 2016-11-21T02:25:48.000000 |
| OS-SRV-USG:terminated_at                         | - |
| accessIPv4                                       | |
| accessIPv6                                       | |
| availability_zone                                | test1.badceph2 |
| config_drive                                     | 1 |
| created                                          | 2016-11-21T02:24:06Z |
| flavor                                           | flavor_2 (2) |
| hostId                                           | c58c33d798ee65479cfb84695845d537f40079d4796e9313ba6da758 |
| hypervisor_type                                  | qemu |
| id                                               | 83e35fec-6da8-44c0-9211-e72e8ab95c61 |
| image                                            | debian_7_x86_64_pub_static_36840.raw (0d26f602-7c69-43f4-aa70-1371bd05b1e1) |
| key_name                                         | lihui_yq_test |
| metadata                                         | {} |
| name                                             | lihui-test1.badceph2:nova30.openstack.org-7 |
| os-extended-volumes:volumes_attached             | [{"id": "b3a7925e-e2de-4b4a-94e9-380dc10b0e36"}] |
| os-netease-extended-volumes:volumes_attached     | [{"delete_on_terminate": false, "id": "b3a7925e-e2de-4b4a-94e9-380dc10b0e36", "device_name": "/dev/vdd"}] |
| os-server-status                                 | down |
| os_type                                          | linux |
| private_9bcf446410594faf884aaab076f43cbf network | 10.18.194.182 |
| progress                                         | 0 |
| security_groups                                  | default |
| status                                           | ACTIVE |
| tenant_id                                        | 9bcf446410594faf884aaab076f43cbf |
| updated                                          | 2016-12-06T06:30:29Z |
| use_ceph                                         | yes |
| user_id                                          | fea367d530534801bd5332ea6e06fac6 |
| vncPass                                          | cVMfx6 |
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
After a long wait, however, the VM dropped back out of the deleting state, and it still existed:
lihui@MacBook ~/server/source_txt nova show 83e35fec-6da8-44c0-9211-e72e8ab95c61
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
| Property                                         | Value |
+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig                                | MANUAL |
| OS-EXT-AZ:availability_zone                      | test1.badceph2 |
| OS-EXT-SRV-ATTR:host                             | nova30.openstack.org |
| OS-EXT-SRV-ATTR:hypervisor_hostname              | nova30.openstack.org |
| OS-EXT-SRV-ATTR:instance_name                    | instance-0001242d |
| OS-EXT-STS:power_state                           | 0 |
| OS-EXT-STS:task_state                            | - |
| OS-EXT-STS:vm_state                              | active |
| OS-SRV-USG:launched_at                           | 2016-11-21T02:25:48.000000 |
The os-netease-extended-volumes:volumes_attached field shows that the VM has one cloud disk attached.
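Reading these fields by eye gets tedious when many VMs are involved. As a minimal sketch, assuming the nova show output has been captured as text with one table row per line, the two fields that matter here can be pulled out as follows. The helper name parse_property_table and the shortened SAMPLE are illustrative only and are not part of any OpenStack client.

```python
import json


def parse_property_table(text):
    """Parse '| Property | Value |' rows from CLI table output into a dict."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # skip '+---+' borders and anything that is not a row
        # Split on the first inner '|' only, so a value containing '|' survives.
        key, _, value = line.strip("|").partition("|")
        key, value = key.strip(), value.strip()
        if key and key != "Property":
            props[key] = value
    return props


# A few rows copied from the first nova show output (table shortened).
SAMPLE = """
+-----------------------+----------+
| Property              | Value    |
+-----------------------+----------+
| OS-EXT-STS:task_state | deleting |
| OS-EXT-STS:vm_state   | active   |
| os-netease-extended-volumes:volumes_attached | [{"delete_on_terminate": false, "id": "b3a7925e-e2de-4b4a-94e9-380dc10b0e36", "device_name": "/dev/vdd"}] |
+-----------------------+----------+
"""

props = parse_property_table(SAMPLE)
# The attached-volumes value happens to be valid JSON in this output.
volumes = json.loads(props["os-netease-extended-volumes:volumes_attached"])
print(props["OS-EXT-STS:task_state"], [v["id"] for v in volumes])
```

Comparing task_state between two captures, together with the list of attached volume IDs, gives the same picture as the two outputs above: the delete was abandoned and a volume is still attached.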
To investigate, start with the nova-compute log on the compute node:
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 465, in _process_data
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     **args)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/dispatcher.py", line 179, in dispatch
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     result = getattr(proxyobj, method)(ctxt, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 413, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 90, in wrapped
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     payload)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 73, in wrapped
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     return f(self, context, *args, **kw)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 303, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     pass
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 289, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 354, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     function(self, context, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 331, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     e, sys.exc_info())
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 318, in decorated_function
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     return function(self, context, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2219, in terminate_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     do_terminate_instance(instance, bdms, clean_shutdown)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/lockutils.py", line 248, in inner
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     return f(*args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2211, in do_terminate_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     reservations=reservations)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/hooks.py", line 105, in inner
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     rv = f(*args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2182, in _delete_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     user_id=user_id)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2154, in _delete_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     clean_shutdown=clean_shutdown)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2097, in _shutdown_instance
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     connector)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/volume/cinder.py", line 185, in wrapper
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     res = method(self, ctx, volume_id, *args, **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/volume/cinder.py", line 293, in terminate_connection
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     connector)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/cinderclient/v1/volumes.py", line 368, in terminate_connection
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     {'connector': connector})
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/cinderclient/v1/volumes.py", line 287, in _action
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     return self.api.client.post(url, body=body)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/cinderclient/client.py", line 210, in post
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     return self._cs_request(url, 'POST', **kwargs)
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/cinderclient/client.py", line 199, in _cs_request
2016-12-06 14:37:11.453 148945 TRACE nova.openstack.common.rpc.amqp     raise exceptions.ConnectionError(msg)
The traceback shows that the error was raised while nova was calling cinderclient (terminate_connection), so the next place to look is the cinder API log:
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault Traceback (most recent call last):
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/api/middleware/fault.py", line 77, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return req.get_response(self.application)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/webob/request.py", line 1296, in send
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     application, catch_exc_info=False)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/webob/request.py", line 1260, in call_application
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     app_iter = application(self.environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return resp(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/keystoneclient/middleware/auth_token.py", line 598, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return self.app(env, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return resp(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return resp(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/routes/middleware.py", line 131, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     response = self.app(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 144, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return resp(environ, start_response)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 130, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     resp = self.call_func(req, *args, **self.kwargs)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/webob/dec.py", line 195, in call_func
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return self.func(req, *args, **kwargs)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/api/openstack/wsgi.py", line 898, in __call__
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     content_type, body, accept)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/api/openstack/wsgi.py", line 946, in _process_stack
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     action_result = self.dispatch(meth, request, action_args)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/api/openstack/wsgi.py", line 1022, in dispatch
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return method(req=request, **action_args)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/api/contrib/volume_actions.py", line 172, in _terminate_connection
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     self.volume_api.terminate_connection(context, volume, connector)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/volume/api.py", line 78, in wrapped
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     return func(self, context, target_obj, *args, **kwargs)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/volume/api.py", line 473, in terminate_connection
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     force)
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/volume/rpcapi.py", line 147, in terminate_connection
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     volume['host']))
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault   File "/usr/lib/python2.7/dist-packages/cinder/openstack/common/rpc/proxy.py", line 129, in call
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault     exc.info, real_topic, msg.get('method'))
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault Timeout: Timeout while waiting on RPC response - topic: "cinder-volume:nova43.openstack.org@ceph", RPC method: "terminate_connection" info: ""
2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault
2016-12-06 14:38:50.568 129759 INFO cinder.api.middleware.fault [req-7a37b38e-1195-4cf2-874d-f96536296587 d8385714a8374c2395a899a6450ef22b 73ba41a9f57e4af8b6a362295ab92b4a] http://10.185.0.253:8776/v1/73ba41a9f57e4af8b6a362295ab92b4a/volumes/b3a7925e-e2de-4b4a-94e9-380dc10b0e36/action returned with HTTP 500
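The line that matters in this log is the Timeout entry, because its topic names the service that never answered. The following is a minimal sketch, assuming the message keeps the exact wording seen in this log and that the topic has the form service:host. The regular expression and the helper name parse_rpc_timeout are illustrative and are not part of cinder.

```python
import re

# Pattern written against the message text in the log above (an assumption:
# other releases may word the timeout differently).
TIMEOUT_RE = re.compile(
    r'Timeout while waiting on RPC response - topic: "(?P<topic>[^"]+)", '
    r'RPC method: "(?P<method>[^"]+)"'
)


def parse_rpc_timeout(line):
    """Return the service, host and RPC method named in a timeout line, or None."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    # The topic in this log reads '<service>:<host>'.
    service, _, host = match.group("topic").partition(":")
    return {"service": service, "host": host, "method": match.group("method")}


LINE = (
    '2016-12-06 14:38:50.567 129759 TRACE cinder.api.middleware.fault Timeout: '
    'Timeout while waiting on RPC response - topic: '
    '"cinder-volume:nova43.openstack.org@ceph", '
    'RPC method: "terminate_connection" info: ""'
)

info = parse_rpc_timeout(LINE)
print(info)
```

For the line above this yields the cinder-volume service on nova43.openstack.org@ceph and the terminate_connection method, which is the host to inspect next.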
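The cause is now clear: the RPC call to the cinder-volume service that hosts the volume (nova43.openstack.org@ceph) timed out, so the cinder API returned HTTP 500. The next step is to check the state of the service on that node.
The service is indeed down: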
lihui@MacBook ~/server/source_txt cinder service-list | grep nova43.openstack.org
| 98 | cinder-volume | nova43.openstack.org@ceph | test1 | enabled | down | 2016-11-23T09:12:32.000000 |
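The same check can be scripted before deleting a VM with attached volumes. This is a sketch only: it assumes the seven-column row layout seen in the output above (id, binary, host, zone, status, state, updated_at), which is inferred from that single row and may differ in other deployments. The helper name down_services and the second sample row are made up for illustration.

```python
def down_services(service_list_output, binary="cinder-volume"):
    """Return the hosts whose rows report state 'down' for the given binary."""
    down = []
    for raw in service_list_output.splitlines():
        cells = [c.strip() for c in raw.strip().strip("|").split("|")]
        # Assumed layout, inferred from the row shown above:
        # id | binary | host | zone | status | state | updated_at
        if len(cells) != 7 or not cells[0].isdigit():
            continue  # borders, header rows, shell prompts
        _id, row_binary, host, _zone, _status, state, _updated = cells
        if row_binary == binary and state == "down":
            down.append(host)
    return down


SAMPLE = """
| 98 | cinder-volume | nova43.openstack.org@ceph | test1 | enabled | down | 2016-11-23T09:12:32.000000 |
| 1 | cinder-volume | example-host@ceph | test1 | enabled | up | 2016-12-06T06:30:00.000000 |
"""
# The second row is hypothetical and only shows that healthy services are ignored.

print(down_services(SAMPLE))
```

If the host of an attached volume appears in the returned list, the delete can be expected to fail in the way shown above.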
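So the root cause is this: when nova receives the request to delete the VM, it first has to detach the cloud disk attached to it. The cinder-volume service that owns the disk was down at that moment, so the detach failed and returned an error, and the VM could not be deleted.
To clean up the VM and its cloud disk, the disk has to be reassigned to a different cinder-volume service, that is, migrated. Note that the command used here is not a standard interface: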
lihui@MacBook ~/server/source_txt cinder host-volumes-migrate --force-volumes-migrate=True nova43.openstack.org@ceph nova34.openstack.org@ceph
After the migration, the volume's cinder-volume host (os-vol-host-attr:host) has changed to the target node:
lihui@MacBook ~/server/source_txt cinder show b3a7925e-e2de-4b4a-94e9-380dc10b0e36
+---------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property                                    | Value |
+---------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| attachments                                 | [{u'device': u'/dev/nbs/xdjo', u'server_id': u'83e35fec-6da8-44c0-9211-e72e8ab95c61', u'id': u'b3a7925e-e2de-4b4a-94e9-380dc10b0e36', u'host_name': None, u'volume_id': u'b3a7925e-e2de-4b4a-94e9-380dc10b0e36'}] |
| availability_zone                           | test1 |
| bootable                                    | false |
| created_at                                  | 2016-11-21T06:49:16.000000 |
| display_description                         | None |
| display_name                                | lihui-ceph-100 |
| id                                          | b3a7925e-e2de-4b4a-94e9-380dc10b0e36 |
| metadata                                    | {u'readonly': u'False', u'attached_mode': u'rw'} |
| os-vol-host-attr:host                       | nova34.openstack.org@ceph |
| os-vol-mig-status-attr:migstat              | None |
| os-vol-mig-status-attr:name_id              | None |
| os-vol-provider-attr:provider_auth          | None |
| os-vol-provider-attr:provider_geometry      | None |
| os-vol-provider-attr:provider_location      | None |
| os-vol-provider-attr:provider_pool_location | None |
| os-vol-tenant-attr:tenant_id                | 9bcf446410594faf884aaab076f43cbf |
| size                                        | 100 |
| snapshot_id                                 | None |
| source_volid                                | None |
| status                                      | in-use |
| volume_qos                                  | {u'read_bps': u'86558041', u'write_bps': u'86558041', u'read_iops': u'122', u'write_iops': u'204'} |
| volume_type                                 | ceph_bad |
+---------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Finally, delete the VM again, and this time it goes through:
lihui@MacBook ~/server/source_txt nova delete 83e35fec-6da8-44c0-9211-e72e8ab95c61
✘ lihui@MacBook ~/server/source_txt nova show 83e35fec-6da8-44c0-9211-e72e8ab95c61
ERROR: No server with a name or ID of '83e35fec-6da8-44c0-9211-e72e8ab95c61' exists.
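The whole sequence can be summarised with a toy model. This is not nova's code: the function names, the exception class and the dictionary layout are all hypothetical. They only illustrate why a failed volume detach leaves the VM active with task_state cleared, and why the delete succeeds once the volume belongs to a cinder-volume service that is running.

```python
class VolumeServiceDown(Exception):
    """Hypothetical error: the volume's cinder-volume service is unreachable."""


def terminate_connection(volume, live_services):
    # Stand-in for the detach call that timed out in the logs above.
    if volume["host"] not in live_services:
        raise VolumeServiceDown(volume["host"])


def delete_instance(instance, live_services):
    """Toy delete flow: detach every volume first, then remove the VM."""
    instance["task_state"] = "deleting"
    try:
        for volume in instance["volumes"]:
            terminate_connection(volume, live_services)
    except VolumeServiceDown:
        instance["task_state"] = None  # delete abandoned, VM left as it was
        return False
    instance["vm_state"] = "deleted"
    instance["task_state"] = None
    return True


# Only nova34's service is reachable; nova43's is down.
live = {"nova34.openstack.org@ceph"}
volume = {"id": "b3a7925e-e2de-4b4a-94e9-380dc10b0e36",
          "host": "nova43.openstack.org@ceph"}
vm = {"vm_state": "active", "task_state": None, "volumes": [volume]}

first_try = delete_instance(vm, live)
state_after_failure = (vm["vm_state"], vm["task_state"])

volume["host"] = "nova34.openstack.org@ceph"  # what the migration changed
second_try = delete_instance(vm, live)
print(first_try, state_after_failure, second_try, vm["vm_state"])
```

The first attempt fails and leaves the VM active, matching the second nova show output. The second attempt succeeds after the host change, matching the final delete.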