A Problem Caused by Python's multiprocessing

While using Python's multiprocessing module recently, I ran into a problem.

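Boiled down to a small example: suppose several processes concurrently create virtual machines in different AZs (availability zones).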

#!/usr/bin/env python

import multiprocessing

class Nvs(object):
    def __init__(self):
        self.az_list = ['xiaoshan', 'binjiang', 'beijing']
    def create_vm(self, az):
        print('Create vm in %s !!!' %az)
    def run(self):
        pool = multiprocessing.Pool(multiprocessing.cpu_count())
        pool.map(self.create_vm, self.az_list)
        pool.close()
        pool.join()

if __name__ == '__main__':
    nvs = Nvs()
    nvs.run()

map passes every AZ to the method, but the result is a strange error:

Traceback (most recent call last):
  File "lihui.py", line 18, in <module>
    nvs.run()
  File "lihui.py", line 12, in run
    pool.map(self.create_vm, self.az_list)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

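The error says that pickling an instancemethod failed, and the method in question must be create_vm. A quick search turned up the cause: multiprocessing pickles the function it is asked to call so that it can hand it to the worker processes, but in Python 2 a bound method (the instancemethod in the message) cannot be pickled, hence the error. For background on serialization, see Liao Xuefeng's site: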

http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/00138683221577998e407bb309542d9b6a68d9276bc3dbe000
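To make the idea concrete, here is a minimal sketch of what "pickling the call" amounts to. The (callable, args) tuple is a simplified model I am assuming for illustration, not multiprocessing's real internal message format, and the sketch is written for Python 3.

```python
import pickle

# Model a task as (callable, args). This is an assumption made for
# illustration only; the pool's real messages carry more bookkeeping.
task = (pow, (2, 10))

# The parent process turns the task into bytes ...
payload = pickle.dumps(task)

# ... and a worker would rebuild the callable and its arguments from them.
func, args = pickle.loads(payload)
result = func(*args)

print(type(payload).__name__, result)
```

A built-in such as pow survives the round trip because pickle stores it by name and looks it up again on load. Anything that cannot be stored this way stops the task before it ever reaches a worker.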

Pickling a bound method by hand:

>>> import pickle
>>> class My(object):
...     def foo(self):
...             pass
...
>>> my = My()
>>> pickle.dumps(my.foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle instancemethod objects

It really cannot be pickled. The Python 2 docs list what can be pickled:

The following types can be pickled:

None, True, and False
integers, long integers, floating point numbers, complex numbers
normal and Unicode strings
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section The pickle protocol for details).

Reference:

https://docs.python.org/2/library/pickle.html#what-can-be-pickled-and-unpickled
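The list above can be probed directly. The following is a small sketch, assuming Python 3; is_picklable, greet and make_nested are hypothetical helper names introduced only for this illustration.

```python
import pickle


def greet(name):
    """A function defined at the top level of the module."""
    return "hello " + name


def make_nested():
    """Return a function defined inside another function."""
    def nested(name):
        return "hi " + name
    return nested


def is_picklable(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# greet is pickled by reference, so it is only picklable when this file
# runs as a script or as an importable module.
candidates = [
    ("built-in function", len),
    ("nested containers", [1, (2.5, "text"), {"key": None}]),
    ("top-level function", greet),
    ("lambda", lambda name: name),
    ("nested function", make_nested()),
]

for label, obj in candidates:
    print(label, is_picklable(obj))
```

The lambda and the nested function are rejected because pickle stores functions by name and neither can be found again through a plain module-level lookup.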

To get it running, turning the method into a module-level function certainly works:

#!/usr/bin/env python

import multiprocessing

def create_vm(az):
    print('Create vm in %s !!!' %az)

class Nvs(object):
    def __init__(self):
        self.az_list = ['xiaoshan', 'binjiang', 'beijing']
    def run(self):
        pool = multiprocessing.Pool(multiprocessing.cpu_count())
        pool.map(create_vm, self.az_list)
        pool.close()
        pool.join()

if __name__ == '__main__':
    nvs = Nvs()
    nvs.run()

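If the method has to stay inside the Nvs class, one approach that circulates online is to define a module-level proxy function, pass it the instance together with the real argument, and submit it with pool.apply_async. What gets pickled is then the top-level function and the instance, both of which are picklable, so the bound method itself never has to be pickled: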

#!/usr/bin/env python

import multiprocessing

def f(obj, az):
    # obj is an Nvs instance; f is a module-level function, so it can be pickled
    return obj.create_vm(az)

class Nvs(object):
    def __init__(self):
        self.az_list = ['xiaoshan', 'binjiang', 'beijing']
    def create_vm(self, az):
        print('Create vm in %s !!!' %az)
    def run(self):
        processes = len(self.az_list)
        pool = multiprocessing.Pool(processes)
        for i in range(processes):
            pool.apply_async(f, args = (self, self.az_list[i], ))
        pool.close()
        pool.join()

if __name__ == '__main__':
    nvs = Nvs()
    nvs.run()

In fact, the original map version runs fine under Python 3.4:

$ python3.4 lihui.py
Create vm in xiaoshan !!!
Create vm in binjiang !!!
Create vm in beijing !!!

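The reason is a change in Python 3: functions and classes gained a __qualname__ attribute, and the pickle work described in PEP 3154 builds on it so that more objects, bound methods among them, can be pickled. Details: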

https://www.python.org/dev/peps/pep-3154/
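As a quick illustration of what __qualname__ holds, here is a small sketch assuming Python 3.3 or later and a file run as a script. The Inner class and its bar method are hypothetical additions made only to show nesting.

```python
class My(object):
    def foo(self):
        pass

    class Inner(object):
        def bar(self):
            pass


# __name__ is only the bare name; __qualname__ is the dotted path from
# the module's top level down to the object.
print(My.foo.__name__)            # foo
print(My.foo.__qualname__)        # My.foo
print(My.Inner.bar.__qualname__)  # My.Inner.bar
```

With only __name__, pickle has no way to tell that foo lives inside My; the dotted path is what makes such objects findable again.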

Trying to pickle a bound method confirms that it does succeed on 3.4:

$ python3.4
Python 3.4.3 (default, May 25 2015, 18:48:21)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> class My(object):
...     def foo(self):
...         pass
...
>>> my = My()
>>> pickle.dumps(my.foo)
b'\x80\x03cbuiltins\ngetattr\nq\x00c__main__\nMy\nq\x01)\x81q\x02X\x03\x00\x00\x00fooq\x03\x86q\x04Rq\x05.'
>>> 
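The bytes above mention builtins, getattr, My and foo, which suggests the bound method is stored as "call getattr on an instance of My with the name foo". A small sketch, assuming Python 3.4 or later, makes that visible by asking the method how it reduces itself. The return value of foo is changed here only so that the rebuilt method has something to show.

```python
class My(object):
    def foo(self):
        return "called foo"


my = My()

# A bound method reduces to a callable plus the arguments needed to
# rebuild it: getattr and (instance, method name).
func, args = my.foo.__reduce__()
print(func is getattr)  # True
print(args[1])          # foo

# Calling getattr(instance, "foo") yields an equivalent bound method.
rebuilt = func(*args)
print(rebuilt())        # called foo
```

So the instance travels as ordinary pickled data and the method is simply looked up again on the other side, which is why the instance itself still has to be picklable.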
