Keys

Conv2d vs. ConvTranspose2d

  import torch
  import torch.nn as nn

ConvTranspose2d
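  • output_padding resolves an ambiguity of strided Conv2d: with stride > 1, several input sizes map to the same output size, so the inverse size is not unique. output_padding selects which of those sizes ConvTranspose2d produces. It only adjusts the output shape and does not add zero-padding to the output.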
  x = torch.randn((4, 256, 7, 7))
  # with kernel_size=3, stride=2, padding=1, output_padding=1 the spatial size is exactly doubled
  op = nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1, output_padding=1)
  y = op(x)
  print(x.shape, y.shape)

Output:
  torch.Size([4, 256, 7, 7]) torch.Size([4, 128, 14, 14])
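
The 14 above follows from the output-size formula given in the nn.ConvTranspose2d documentation. The sketch below restates that formula in plain Python to show what output_padding changes; the helper name conv_transpose2d_out_size is made up for this note and is not part of torch.

```python
def conv_transpose2d_out_size(n_in, kernel_size, stride=1, padding=0,
                              output_padding=0, dilation=1):
    """Output size of nn.ConvTranspose2d along one spatial dimension.

    Hypothetical helper that mirrors the documented formula;
    it is not a PyTorch function.
    """
    return ((n_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Same settings as the example above, without and with output_padding.
without_op = conv_transpose2d_out_size(7, kernel_size=3, stride=2, padding=1)
with_op = conv_transpose2d_out_size(7, kernel_size=3, stride=2, padding=1,
                                    output_padding=1)
print(without_op, with_op)  # 13 14
```

Without output_padding the 7x7 map would grow to 13x13, so output_padding=1 is what makes the layer an exact 2x upsampling here.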

Conv2d

Two different input sizes can produce the same output size in Conv2d, because the output-size formula rounds down (floor).

  x = torch.randn((4, 256, 8, 8))
  op = nn.Conv2d(256, 128, kernel_size=3, stride=2)
  y = op(x)
  print(x.shape, y.shape)
  x = torch.randn((4, 256, 7, 7))
  op = nn.Conv2d(256, 128, kernel_size=3, stride=2)
  y = op(x)
  print(x.shape, y.shape)

Output:
  torch.Size([4, 256, 8, 8]) torch.Size([4, 128, 3, 3])
  torch.Size([4, 256, 7, 7]) torch.Size([4, 128, 3, 3])
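
The collision can be checked by hand with the output-size formula from the nn.Conv2d documentation. This is a minimal plain-Python sketch; conv2d_out_size is a hypothetical helper written for this note, not a torch function.

```python
def conv2d_out_size(n_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output size of nn.Conv2d along one spatial dimension.

    Hypothetical helper mirroring the documented formula; the integer
    division (//) is the floor that causes different inputs to collide.
    """
    return (n_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# The two inputs from the example above: 7 and 8 both give 3.
no_padding = {n: conv2d_out_size(n, kernel_size=3, stride=2) for n in (7, 8)}
print(no_padding)  # {7: 3, 8: 3}

# With padding=1, inputs of size 13 and 14 both shrink to 7, which is why
# going back from 7 needs output_padding to choose between 13 and 14.
with_padding = [conv2d_out_size(n, kernel_size=3, stride=2, padding=1)
                for n in (13, 14)]
print(with_padding)  # [7, 7]
```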

BUG

torch.argmax

  RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

torch.argmax in PyTorch is not differentiable, so no gradient can be computed through it.
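
A small sketch of the problem and one possible workaround. The workaround shown is a soft-argmax (the expected index under a softmax), which is a substitute technique and not what torch.argmax computes; the temperature value and tensor contents are assumptions chosen for illustration.

```python
import torch

logits = torch.tensor([[1.0, 3.0, 2.0]], requires_grad=True)

# argmax returns integer indices that are cut off from the autograd graph,
# so calling backward() on anything built only from them raises the error above.
hard = torch.argmax(logits, dim=1)
print(hard.dtype, hard.requires_grad)  # torch.int64 False

# Soft-argmax: softmax-weighted average of the index positions.
temperature = 0.1  # assumed value; smaller is closer to the hard argmax
positions = torch.arange(logits.shape[1], dtype=logits.dtype)
soft = (torch.softmax(logits / temperature, dim=1) * positions).sum(dim=1)

soft.sum().backward()  # works: gradients flow back to logits
print(soft.item(), logits.grad is not None)
```

Whether a soft approximation is acceptable depends on the loss being built; when the index is only needed for evaluation metrics, keeping argmax outside the loss is usually enough.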

CUDA error

  ...
  C:/w/1/s/windows/pytorch/aten/src/THCUNN/BCECriterion.cu:57: block: [3,0,0], thread: [347,0,0] Assertion `*input >= 0. && *input <= 1.` failed.
  ...
  RuntimeError: CUDA error: device-side assert triggered

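Guesses:

  • The learning rate is set too high, so after a parameter update the values passed to the loss fall outside the allowed range and CUDA cannot compute it. (The assertion comes from the binary cross-entropy kernel, which requires every input to lie in [0, 1].)
  • Exploding gradients make the output nan, which also fails the range check.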
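
Whatever the root cause, the assertion itself only says that a value outside [0, 1] (or a nan) reached the binary cross-entropy loss. The sketch below shows two common ways to guard against that, assuming the model produces raw logits; the tensors are made-up sample values, and this is not necessarily how the original model was fixed.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[-4.0, 0.0, 7.5]])  # raw, unbounded model outputs (sample values)
target = torch.tensor([[0.0, 1.0, 1.0]])

# Option 1: squash to [0, 1] and validate on the host before nn.BCELoss,
# so a bad value gives a readable Python error instead of a device-side assert.
probs = torch.sigmoid(logits)
assert torch.isfinite(probs).all(), "nan/inf in predictions"
assert ((probs >= 0) & (probs <= 1)).all(), "predictions outside [0, 1]"
loss_bce = nn.BCELoss()(probs, target)

# Option 2: pass the logits directly and let the loss apply the sigmoid.
loss_logits = nn.BCEWithLogitsLoss()(logits, target)

print(loss_bce.item(), loss_logits.item())  # the two values agree closely
```

If the finiteness check fails, the nan is produced upstream (for example by a too-large learning rate, as guessed above), and the loss function is only where it becomes visible.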

tensorboard graph

  • Only a single output is allowed when adding the graph; otherwise a long error message is printed.

tensorboardX

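  Warning: NaN or Inf found in input tensor.

If this warning appears while the model is running, the cause may not be the dataset. It may be that a nan or inf value is passed to add_scalar in tensorboardX (for example, a metric whose denominator is 0).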
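
One way to avoid the warning is to guard both the metric computation and the logging call. The sketch below is only an illustration: safe_ratio, add_scalar_if_finite and RecordingWriter are hypothetical names, and RecordingWriter is a stand-in for tensorboardX.SummaryWriter so the example runs without tensorboardX installed.

```python
import math


def safe_ratio(numerator, denominator, default=0.0):
    """Return numerator / denominator, or `default` when the denominator is 0."""
    return numerator / denominator if denominator != 0 else default


def add_scalar_if_finite(writer, tag, value, step):
    """Call writer.add_scalar only when value is a finite number."""
    if math.isfinite(value):
        writer.add_scalar(tag, value, step)
        return True
    return False


class RecordingWriter:
    """Stand-in for a SummaryWriter; it just remembers what was logged."""

    def __init__(self):
        self.logged = []

    def add_scalar(self, tag, value, step):
        self.logged.append((tag, value, step))


writer = RecordingWriter()
precision = safe_ratio(0, 0)  # e.g. no positive predictions in this batch
add_scalar_if_finite(writer, "val/precision", precision, 1)
add_scalar_if_finite(writer, "val/loss", float("nan"), 1)  # skipped
print(writer.logged)  # [('val/precision', 0.0, 1)]
```

Silently skipping a nan can hide a real training problem, so in practice it may be better to print or raise when add_scalar_if_finite returns False.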

Others

yacs

Forcing a data type conversion in a yaml file:

  str: !!str 3.14
  int: !!int "123"
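
The effect of the tags can be checked directly with PyYAML. This sketch assumes PyYAML is installed and that the config ends up being parsed by it (yacs is believed to load YAML this way, but that is an assumption here); the extra key plain is added only for comparison.

```python
import yaml  # PyYAML

text = 'str: !!str 3.14\nint: !!int "123"\nplain: 3.14\n'
cfg = yaml.safe_load(text)

# The tagged values take the forced type; the untagged one is inferred.
print(cfg)  # {'str': '3.14', 'int': 123, 'plain': 3.14}
print(type(cfg["str"]).__name__, type(cfg["int"]).__name__)  # str int
```

Forcing the type is useful when a loaded value has to match the type of an existing default exactly.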