Author: MeWTwo  Link: https://www.zhihu.com/question/345418003/answer/926213163

    First, check whether CUDA is available on your device:

        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
        # Assuming that we are on a CUDA machine, this should print a CUDA device:
        print(device)

    If it is available, this should print:

        cuda:0
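
    If you want to double-check which GPU index 0 resolves to, here is a minimal sketch; it only assumes that at least one CUDA device is visible and uses the standard torch.cuda helpers:

        import torch

        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
        if device.type == "cuda":
            # Report how many GPUs PyTorch can see and the name of the one we picked.
            print("visible GPUs:", torch.cuda.device_count())
            print("using:", torch.cuda.get_device_name(0))
        else:
            print("No CUDA device found, falling back to CPU.")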

    Then append .to(device) to the model/Net you defined; this call recursively traverses all submodules and converts their parameters and buffers to CUDA tensors:

        model = ConvNet(num_classes).to(device)  # assuming class ConvNet(nn.Module) was defined earlier
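
    To confirm the move actually happened, you can inspect where the parameters now live. A minimal check, assuming the ConvNet above:

        model = ConvNet(num_classes).to(device)
        # After .to(device), every parameter tensor should report a CUDA device:
        print(next(model.parameters()).device)   # e.g. cuda:0
        print(next(model.parameters()).is_cuda)  # True on a CUDA machine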

    Remember to also call .to(device) on the inputs and targets (labels) at every step:

        images = images.to(device)
        labels = labels.to(device)

    That's all it takes to run with GPU acceleration.
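
    Putting the pieces together, a minimal training-step sketch looks like the one below. The ConvNet class, train_loader, and the hyperparameters are placeholders standing in for whatever you defined in your own script:

        import torch
        import torch.nn as nn

        device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

        model = ConvNet(num_classes).to(device)      # move the model once, before training
        criterion = nn.CrossEntropyLoss()
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

        for images, labels in train_loader:          # train_loader: an assumed DataLoader
            images = images.to(device)               # move every batch to the same device
            labels = labels.to(device)

            outputs = model(images)
            loss = criterion(outputs, labels)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()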
    As for why there is no noticeable speedup compared to the CPU: probably because the network you defined is too small ( ̄▽ ̄)" Why don't I notice MASSIVE speedup compared to CPU? Because your network is really small.
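
    If you want to verify this yourself, a rough timing sketch is shown below. The model and the input shape are assumptions (reuse your own ConvNet and batch size); note that CUDA kernels launch asynchronously, so torch.cuda.synchronize() is needed before reading the clock:

        import time
        import torch

        def avg_forward_time(model, x, n_iters=100):
            model.eval()
            with torch.no_grad():
                # Warm-up so one-time CUDA initialization is not measured.
                for _ in range(10):
                    model(x)
                if x.is_cuda:
                    torch.cuda.synchronize()
                start = time.time()
                for _ in range(n_iters):
                    model(x)
                if x.is_cuda:
                    torch.cuda.synchronize()
            return (time.time() - start) / n_iters

        # Hypothetical comparison of the same small model on CPU vs. GPU.
        x_cpu = torch.randn(64, 3, 32, 32)           # assumed input shape
        x_gpu = x_cpu.to(device)
        print("CPU:", avg_forward_time(ConvNet(num_classes), x_cpu))
        print("GPU:", avg_forward_time(ConvNet(num_classes).to(device), x_gpu))

    For a tiny network, the per-kernel launch overhead and the CPU-to-GPU copies can dominate, so the two numbers end up close; the gap grows as the model and batch size grow.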