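Problem
Today I wanted to try out the PyTorch environment I had set up earlier on Windows 10. While preparing the dataset for model training, I got the following error as soon as I tried to iterate over the DataLoader.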
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "D:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "D:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'MyDataset' on <module '__main__' (built-in)>
Traceback (most recent call last):
File "D:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-5-e37105fe54f7>", line 1, in <module>
for i, j in train_iter:
File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
return _MultiProcessingDataLoaderIter(self)
File "D:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
w.start()
File "D:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "D:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "D:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "D:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
reduction.dump(process_obj, to_child)
File "D:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
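This looks like a multiprocessing problem. I remember from when I was learning Python that Windows has no fork() system call, so multiprocessing needs some special handling there. The part of the code involved is shown below.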
train_iter = DataLoader(dataset=train_set, batch_size=batch_size, shuffle=True, num_workers=10)
test_iter = DataLoader(dataset=test_set, batch_size=batch_size, shuffle=True, num_workers=10)
# %%
for i, j in train_iter:
    print(i)
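DataLoader's num_workers is about reading data in parallel. Because of the GIL (global interpreter lock), Python threads cannot make use of multiple cores, so DataLoader uses worker processes rather than threads to get multi-core parallelism. That is roughly where the problem comes from: without fork(), Windows starts each worker as a fresh interpreter (the spawn.py frames in the traceback) and sends it the pickled dataset, and the new process could not find MyDataset in its own __main__.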
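The AttributeError above is consistent with how pickle handles classes: it stores only the module and class name, and the receiving side has to look the class up again. Here is a minimal sketch using only the standard library, with no PyTorch involved. The MyDataset here is a made-up stand-in (the real one is not shown in this post), and deleting the class is just my way of simulating a freshly spawned process that never defined it. It assumes the snippet is run as a script.

```python
import pickle


class MyDataset:
    """Hypothetical stand-in; the real MyDataset is not shown in this post."""

    def __init__(self, items):
        self.items = items


payload = pickle.dumps(MyDataset([1, 2, 3]))

# The pickled bytes carry the class name, not the class code.
name_in_payload = b"MyDataset" in payload

# Simulate a fresh child process in which the class was never defined.
del MyDataset

try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc

print(name_in_payload, type(error).__name__, error)
```

If that is what happens in the worker, any fix has to either avoid starting workers at all or make sure the class can be found by the new process.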
Solution
Option 1
Set num_workers to 0. The data is then loaded in the main process, so no worker processes are started.
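A small sketch of Option 1, assuming PyTorch is installed. The MyDataset below is a toy placeholder I made up for illustration, and the sizes are arbitrary; only the num_workers=0 argument is the actual fix.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class MyDataset(Dataset):
    """Toy placeholder; the real MyDataset is not shown in this post."""

    def __init__(self, n=8):
        self.x = torch.arange(n, dtype=torch.float32).unsqueeze(1)
        self.y = torch.arange(n) % 2

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


train_set = MyDataset()
# num_workers=0: batches are built in the main process, so no worker
# process is started and the dataset never has to be pickled.
train_iter = DataLoader(dataset=train_set, batch_size=4, shuffle=True, num_workers=0)

batch_shapes = [tuple(i.shape) for i, j in train_iter]
print(batch_shapes)
```

The trade-off is that loading is no longer parallel, which may slow training down if preprocessing each sample is expensive.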
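Option 2
Wrap the rest of your code in the following guard, so that it only runs in the main process and not again when a worker process starts: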
if __name__ == '__main__':
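A sketch of how a whole script might be laid out for Option 2, assuming PyTorch is installed and the file is run with python rather than pasted into an interactive session. MyDataset is again a toy placeholder, and make_loader and main are names I introduced for illustration. The dataset class sits at module top level so that a spawned worker can find it by name.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class MyDataset(Dataset):
    """Toy placeholder, defined at module top level on purpose."""

    def __init__(self, n=8):
        self.x = torch.arange(n, dtype=torch.float32).unsqueeze(1)
        self.y = torch.arange(n) % 2

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


def make_loader(num_workers):
    # Hypothetical helper: builds the loader with a chosen worker count.
    return DataLoader(MyDataset(), batch_size=4, shuffle=True, num_workers=num_workers)


def main():
    train_iter = make_loader(num_workers=2)
    for i, j in train_iter:
        print(i.shape, j.shape)


if __name__ == '__main__':
    # Only the main process gets here; a spawned worker that re-imports
    # this file skips the block, so it does not start workers of its own.
    main()
```

Note that the traceback above came from an IPython session. There the guard alone may not be enough, since a class defined interactively probably still cannot be found by the workers; moving the dataset class into its own .py file and importing it is likely needed in that case.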