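I recently ran into a problem while using Python's multiprocessing module.
Reduced to a small example: suppose we want to create virtual machines in several availability zones (AZs) concurrently, one process per task.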
#!/usr/bin/env python
import multiprocessing


class Nvs(object):
    def __init__(self):
        self.az_list = ['xiaoshan', 'binjiang', 'beijing']

    def create_vm(self, az):
        print('Create vm in %s !!!' % az)

    def run(self):
        pool = multiprocessing.Pool(multiprocessing.cpu_count())
        pool.map(self.create_vm, self.az_list)
        pool.close()
        pool.join()


if __name__ == '__main__':
    nvs = Nvs()
    nvs.run()
All the AZ arguments are passed in through map, but running the script produces a strange error:
Traceback (most recent call last):
  File "lihui.py", line 18, in <module>
    nvs.run()
  File "lihui.py", line 12, in run
    pool.map(self.create_vm, self.az_list)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
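The error says that pickling an instancemethod failed, and the method in question is create_vm. After some searching, the cause turns out to be this: multiprocessing pickles (serializes) the function it is asked to call so that it can send it to the worker processes, but in Python 2 a bound instance method cannot be pickled. For background on serialization, see Liao Xuefeng's tutorial: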
http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/00138683221577998e407bb309542d9b6a68d9276bc3dbe000
Pickling a bound method by hand (Python 2.7):
>>> import pickle
>>> class My(object):
...     def foo(self):
...         pass
...
>>> my = My()
>>> pickle.dumps(my.foo)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle instancemethod objects
So it really cannot be pickled. The Python 2 documentation lists what can be:
The following types can be pickled:
- None, True, and False
- integers, long integers, floating point numbers, complex numbers
- normal and Unicode strings
- tuples, lists, sets, and dictionaries containing only picklable objects
- functions defined at the top level of a module
- built-in functions defined at the top level of a module
- classes that are defined at the top level of a module
- instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section The pickle protocol for details).
Reference:
https://docs.python.org/2/library/pickle.html#what-can-be-pickled-and-unpickled
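As a rough way to try these rules on your own objects, the sketch below calls pickle.dumps on a few values and reports whether it succeeds. The helper name is_picklable and the sample objects are made up for illustration, and since the exception raised for an unpicklable object differs between Python versions, the helper catches several types.

```python
import pickle


def create_vm(az):
    # Top-level function: pickle stores it by reference (module + name).
    return 'Create vm in %s !!!' % az


def is_picklable(obj):
    # Hypothetical helper: True if pickle.dumps() accepts obj.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


samples = [
    ('built-in function', len),
    ('list of plain values', [1, 'a', (2.0, None)]),
    ('top-level function', create_vm),
    ('lambda', lambda az: az),
]

for label, obj in samples:
    print('%-22s %s' % (label, is_picklable(obj)))
```

The lambda is rejected because pickle cannot find it by name at the top level of a module, which is the same kind of lookup failure that the bound method hits in Python 2.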
To get the script running, turning the method into a module-level function certainly works:
#!/usr/bin/env python
import multiprocessing


def create_vm(az):
    print('Create vm in %s !!!' % az)


class Nvs(object):
    def __init__(self):
        self.az_list = ['xiaoshan', 'binjiang', 'beijing']

    def run(self):
        pool = multiprocessing.Pool(multiprocessing.cpu_count())
        pool.map(create_vm, self.az_list)
        pool.close()
        pool.join()


if __name__ == '__main__':
    nvs = Nvs()
    nvs.run()
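If the method has to stay inside the Nvs class, one approach that circulates online is to define a module-level proxy function, pass it the instance (the parameter named cls in the code) together with the real argument, and submit it through pool.apply_async. This avoids pickling the bound method: only the top-level function f and the Nvs instance get pickled, and both are picklable.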
#!/usr/bin/env python
import multiprocessing


def f(cls, az):
    return cls.create_vm(az)


class Nvs(object):
    def __init__(self):
        self.az_list = ['xiaoshan', 'binjiang', 'beijing']

    def create_vm(self, az):
        print('Create vm in %s !!!' % az)

    def run(self):
        processes = len(self.az_list)
        pool = multiprocessing.Pool(processes)
        for i in range(processes):
            pool.apply_async(f, args=(self, self.az_list[i]))
        pool.close()
        pool.join()


if __name__ == '__main__':
    nvs = Nvs()
    nvs.run()
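A variation on the same proxy idea, sketched here rather than taken from the original sources: bind the instance to the proxy with functools.partial, which keeps the pool.map call from the first version. This assumes functools.partial objects are picklable, which as far as I know holds for Python 2.7 and Python 3. The name call_create_vm is made up, and create_vm returns its message instead of printing it so that the results come back through pool.map.

```python
#!/usr/bin/env python
import functools
import multiprocessing


def call_create_vm(nvs, az):
    # Hypothetical top-level proxy: picklable because it lives at module level.
    return nvs.create_vm(az)


class Nvs(object):
    def __init__(self):
        self.az_list = ['xiaoshan', 'binjiang', 'beijing']

    def create_vm(self, az):
        # Returns the message (instead of printing) so map() can collect it.
        return 'Create vm in %s !!!' % az

    def run(self):
        pool = multiprocessing.Pool(len(self.az_list))
        try:
            # The partial pickles as "call_create_vm plus this instance",
            # so the bound method itself is never pickled.
            proxy = functools.partial(call_create_vm, self)
            return pool.map(proxy, self.az_list)
        finally:
            pool.close()
            pool.join()


if __name__ == '__main__':
    for line in Nvs().run():
        print(line)
```

Unlike the apply_async loop, pool.map waits for the results and re-raises an exception from a worker in the parent, so a failing create_vm does not go unnoticed.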
As it turns out, the original map version runs without any change under Python 3.4:
$ python3.4 lihui.py
Create vm in xiaoshan !!!
Create vm in binjiang !!!
Create vm in beijing !!!
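The reason is a change in Python 3: the pickle improvements described in PEP 3154 build on the __qualname__ attribute and make methods picklable. Details: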
https://www.python.org/dev/peps/pep-3154/
Trying to pickle a bound method confirms that it works in 3.4:
$ python3.4
Python 3.4.3 (default, May 25 2015, 18:48:21)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle
>>> class My(object):
...     def foo(self):
...         pass
...
>>> my = My()
>>> pickle.dumps(my.foo)
b'\x80\x03cbuiltins\ngetattr\nq\x00c__main__\nMy\nq\x01)\x81q\x02X\x03\x00\x00\x00fooq\x03\x86q\x04Rq\x05.'
>>>
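The pickled bytes mention getattr, My, and foo, which suggests the bound method is stored as "look up foo on a pickled My instance". The sketch below, assuming Python 3.4 or later and run as a script, prints the method's __qualname__ and round-trips a bound method through pickle. The return value 'foo called' is added only so the round trip can be checked.

```python
import pickle


class My(object):
    def foo(self):
        # Return value added for illustration only.
        return 'foo called'


my = My()

# __qualname__ records the dotted path to the function inside its class.
print(My.foo.__qualname__)

# Round trip: the method comes back bound to a new My instance.
restored = pickle.loads(pickle.dumps(my.foo))
print(restored())
print(isinstance(restored.__self__, My))
```

Note that the restored method is bound to a copy of the instance, not to my itself. The same holds in the multiprocessing case: each worker operates on its own unpickled copy of the Nvs object, so attribute changes made inside create_vm do not propagate back to the parent process.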