Optional: Data Parallelism (complete code download at the end) Authors: Sung Kim and Jenny Kang

In this tutorial, we will learn how to use multiple GPUs with DataParallel. Using multiple GPUs with PyTorch is very easy. You can put the model on a GPU:

  device = torch.device("cuda:0")
  model.to(device)
Then, you can copy all of your tensors to the GPU:

  mytensor = my_tensor.to(device)
Please note that just calling my_tensor.to(device) returns a new copy of my_tensor on the GPU instead of rewriting my_tensor. You need to assign it to a new tensor and use that tensor on the GPU.
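As a quick illustration (a minimal sketch assuming a CUDA device is available; cpu_tensor and gpu_tensor are hypothetical names, not from the original tutorial):

  # .to(device) returns a new copy on the target device;
  # the original tensor is left untouched on the CPU.
  cpu_tensor = torch.randn(3)
  gpu_tensor = cpu_tensor.to(device)
  print(cpu_tensor.device)  # cpu
  print(gpu_tensor.device)  # cuda:0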

Running forward and backward propagation on multiple GPUs is natural. However, PyTorch will only use one GPU by default. You can easily run your operations on multiple GPUs by making your model run in parallel with DataParallel:

  model = nn.DataParallel(model)
This is the core of the whole tutorial; we will walk through it in detail below.

Imports and parameters

Import PyTorch modules and define the parameters.

  import torch
  import torch.nn as nn
  from torch.utils.data import Dataset, DataLoader

Parameters

  input_size = 5
  output_size = 2

  batch_size = 30
  data_size = 100

Device

  device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
Dummy (toy) dataset

Generate a dummy dataset. You only need to implement __getitem__.

  class RandomDataset(Dataset):

      def __init__(self, size, length):
          self.len = length
          self.data = torch.randn(length, size)

      def __getitem__(self, index):
          return self.data[index]

      def __len__(self):
          return self.len

  rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                           batch_size=batch_size, shuffle=True)
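As a quick sanity check (a small sketch, not part of the original tutorial), you can peek at one batch to confirm its shape:

  # Each batch from rand_loader is a [batch_size, input_size] tensor
  batch = next(iter(rand_loader))
  print(batch.size())  # torch.Size([30, 5])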

Simple Model

For the demo, our model just takes an input, performs a linear operation, and gives an output. However, you can use DataParallel on any model (CNN, RNN, Capsule Net, etc.).
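For instance (a hedged sketch assuming torchvision is installed; resnet18 simply stands in for "any model"):

  import torchvision.models as models

  # The same one-line wrap used for the toy Model below works for a real CNN
  resnet = nn.DataParallel(models.resnet18())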

We have placed a print statement inside the model to monitor the size of the input and output tensors. Please pay attention to what is printed at batch rank 0.

  class Model(nn.Module):
      # Our model

      def __init__(self, input_size, output_size):
          super(Model, self).__init__()
          self.fc = nn.Linear(input_size, output_size)

      def forward(self, input):
          output = self.fc(input)
          print("\tIn Model: input size", input.size(),
                "output size", output.size())
          return output

Create Model and DataParallel

This is the core part of the tutorial. First, we need to make a model instance and check if we have multiple GPUs. If we have multiple GPUs, we can wrap our model with nn.DataParallel. Then we can put our model on the GPUs with model.to(device).


  model = Model(input_size, output_size)
  if torch.cuda.device_count() > 1:
      print("Let's use", torch.cuda.device_count(), "GPUs!")
      # dim = 0 [30, xxx] -> [10, …], [10, …], [10, …] on 3 GPUs
      model = nn.DataParallel(model)

  model.to(device)

Output:

  Let's use 2 GPUs!
Run the Model:

Now we can see the sizes of the input and output tensors.

  for data in rand_loader:
      input = data.to(device)
      output = model(input)
      print("Outside: input size", input.size(),
            "output_size", output.size())
Output:

      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
      In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
  Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])
Results:

If you have no GPU or only one GPU, then when we batch 30 inputs and 30 outputs, the model gets 30 inputs and produces 30 outputs, as expected. But if you have multiple GPUs, you will get results like the following.
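To make the splitting arithmetic concrete (a minimal sketch; DataParallel's scatter step splits the batch along dim 0 into roughly equal chunks, much like torch.chunk):

  # A batch of 30 across 3 GPUs becomes 10 + 10 + 10;
  # the last batch of 10 (100 % 30) becomes 4 + 4 + 2.
  full_batch = torch.randn(30, input_size)
  print([c.size(0) for c in torch.chunk(full_batch, 3, dim=0)])  # [10, 10, 10]
  last_batch = torch.randn(10, input_size)
  print([c.size(0) for c in torch.chunk(last_batch, 3, dim=0)])  # [4, 4, 2]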

If you have 2 GPUs, you will see:

  # on 2 GPUs
  Let's use 2 GPUs!
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
      In Model: input size torch.Size([15, 5]) output size torch.Size([15, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
      In Model: input size torch.Size([5, 5]) output size torch.Size([5, 2])
  Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])

If you have 3 GPUs, you will see:

  Let's use 3 GPUs!
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
      In Model: input size torch.Size([10, 5]) output size torch.Size([10, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
  Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])
If you have 8 GPUs, you will see:

  Let's use 8 GPUs!
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([4, 5]) output size torch.Size([4, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
  Outside: input size torch.Size([30, 5]) output_size torch.Size([30, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
      In Model: input size torch.Size([2, 5]) output size torch.Size([2, 2])
  Outside: input size torch.Size([10, 5]) output_size torch.Size([10, 2])

Summary

DataParallel splits your data automatically and sends job orders to multiple models on multiple GPUs. After each model finishes its job, DataParallel collects and merges the results before returning them to you.
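Conceptually, the forward pass follows the replicate/scatter/apply/gather pattern shown in the parallelism tutorial linked below (a simplified sketch, not DataParallel's actual implementation):

  def data_parallel(module, input, device_ids, output_device=None):
      if output_device is None:
          output_device = device_ids[0]
      replicas = nn.parallel.replicate(module, device_ids)    # copy the model to each GPU
      inputs = nn.parallel.scatter(input, device_ids)         # split the batch along dim 0
      replicas = replicas[:len(inputs)]
      outputs = nn.parallel.parallel_apply(replicas, inputs)  # run the forwards in parallel
      return nn.parallel.gather(outputs, output_device)       # merge results on one device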

For more information, please visit:

https://pytorch.org/tutorials/beginner/former_torchies/parallelism_tutorial.html

Download the complete code as a Python script:

data_parallel_tutorial.py

Download the complete code as a Jupyter notebook:

data_parallel_tutorial.ipynb
