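Setting inplace=True on ReLU and similar layers makes them compute in place, overwriting the input tensor instead of allocating a new output. If the model is not built carefully, this can easily cause the following error: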
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 64, 32, 251]], which is output 0 of ReluBackward1, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
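A minimal sketch of one way this error can arise, assuming a toy tensor rather than the original model: autograd saves the ReLU output for the backward pass, so any later in-place write to that tensor (here `h += 1`) bumps its version and backward fails. The helper name `backward_fails` is hypothetical and only serves this illustration.

```python
import torch
import torch.nn as nn


def backward_fails(inplace_add: bool) -> bool:
    """Return True if backward() raises a RuntimeError (hypothetical helper)."""
    x = torch.randn(4, 3, requires_grad=True)
    # In-place ReLU on a non-leaf tensor is fine on its own.
    h = nn.ReLU(inplace=True)(x * 2)
    if inplace_add:
        h += 1     # overwrites the ReLU output that autograd saved for backward
    else:
        h = h + 1  # out-of-place: allocates a new tensor, saved output untouched
    try:
        h.sum().backward()
        return False
    except RuntimeError:
        return True


print(backward_fails(inplace_add=True))
print(backward_fails(inplace_add=False))
```

In this sketch the fix is to replace the in-place write with an out-of-place one. In a real model the offending operation may instead be a second `inplace=True` layer or a `+=` in a residual connection, so treat this as one possible cause rather than the only one.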
When debugging, enable autograd anomaly detection beforehand so that the failing operation can be traced:
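import torch
# Option 1: enable anomaly detection globally at the start of the program
torch.autograd.set_detect_anomaly(True)
# Option 2: enable it only for a specific block
with torch.autograd.set_detect_anomaly(True):
    # Used like `with torch.no_grad():`; put the code to trace in this indented block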
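As a hedged illustration of the context-manager form, the sketch below runs both the forward and the backward pass inside the block. The forward pass has to be inside as well, because anomaly detection records the forward traceback at the time each operation runs. The function name `run_with_anomaly_detection` and the toy tensor are assumptions made for this example.

```python
import torch


def run_with_anomaly_detection() -> str:
    """Trigger the in-place error under anomaly detection and return its message."""
    x = torch.randn(4, 3, requires_grad=True)
    with torch.autograd.set_detect_anomaly(True):
        h = torch.relu(x * 2)  # forward op is recorded with its traceback
        h += 1                 # in-place write to the saved ReLU output
        try:
            h.sum().backward()
        except RuntimeError as err:
            return str(err)
    return ""


message = run_with_anomaly_detection()
print(message)
```

With anomaly detection on, PyTorch additionally emits a warning containing the traceback of the forward call that produced the failing gradient, which points at the line to fix. Anomaly detection slows execution, so it is usually switched off again once the bug is found.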