1. AlexNet

Convolutional block: Conv2d → ReLU → MaxPool2d
Fully connected block (Dense): Dropout → Linear → ReLU
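
A minimal sketch of these two patterns (the channel sizes and layer parameters below are illustrative only, not the exact values used in model.py):

    import torch.nn as nn

    # Illustrative sketch of the two building patterns above; the numbers are
    # made up for the example, not taken from model.py.
    conv_block = nn.Sequential(
        nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2),  # Conv2d
        nn.ReLU(inplace=True),                                   # ReLU
        nn.MaxPool2d(kernel_size=3, stride=2),                   # MaxPool2d
    )
    dense_block = nn.Sequential(
        nn.Dropout(p=0.5),                                       # Dropout
        nn.Linear(48 * 27 * 27, 2048),                           # Linear
        nn.ReLU(inplace=True),                                   # ReLU
    )
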
Excerpt from the source code: model.py

    def forward(self, x):
        x = self.features(x)               # convolutional feature extractor
        x = torch.flatten(x, start_dim=1)  # flatten everything except the batch dimension
        x = self.classifier(x)             # fully connected classifier
        return x

train.py uses net.train() and net.eval() to manage the dropout layers: dropout is enabled in training mode and disabled in evaluation mode (a standalone illustration of this follows the training loop below).

    for epoch in range(10):
        # train
        net.train()  # enable dropout
        running_loss = 0.0  # accumulate the training loss over the epoch
        for step, data in enumerate(train_loader, start=0):
            images, labels = data
            optimizer.zero_grad()  # clear the gradients from the previous step
            outputs = net(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()   # back-propagation
            optimizer.step()  # update the parameters
            running_loss += loss.item()
            # print training progress
            rate = (step + 1) / len(train_loader)
            a = "*" * int(rate * 50)
            b = "." * int((1 - rate) * 50)
            print("\rtrain loss: {:^3.0f}%[{}->{}]{:.3f}".format(int(rate * 100), a, b, loss), end="")
        print()

        # validate
        net.eval()  # disable dropout
        acc = 0.0  # accumulate the number of correct predictions per epoch
        # torch.no_grad() stops the framework from tracking gradients
        with torch.no_grad():
            for data_test in validate_loader:
                test_images, test_labels = data_test
                outputs = net(test_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]
                acc += (predict_y == test_labels.to(device)).sum().item()
            accurate_test = acc / val_num
            if accurate_test > best_acc:
                best_acc = accurate_test
                torch.save(net.state_dict(), save_path)
            print('[epoch %d] train_loss: %.3f  test_accuracy: %.3f'
                  % (epoch + 1, running_loss / step, accurate_test))
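
As a quick, standalone illustration (not part of train.py) of what net.train() and net.eval() toggle: a dropout layer zeroes activations only while in training mode.

    import torch
    import torch.nn as nn

    drop = nn.Dropout(p=0.5)
    x = torch.ones(1, 8)

    drop.train()     # training mode: roughly half the values are zeroed, the rest scaled by 2
    print(drop(x))

    drop.eval()      # evaluation mode: dropout is a no-op
    print(drop(x))   # identical to x
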

2. GoogLeNet

Excerpt from the source code: model.py

    # Inception module
    class Inception(nn.Module):
        def __init__(self, in_channels, ch1x1, ch3x3red, ch3x3, ch5x5red, ch5x5, pool_proj):
            super(Inception, self).__init__()
            self.branch1 = BasicConv2d(in_channels, ch1x1, kernel_size=1)
            self.branch2 = nn.Sequential(
                BasicConv2d(in_channels, ch3x3red, kernel_size=1),
                BasicConv2d(ch3x3red, ch3x3, kernel_size=3, padding=1)
            )
            self.branch3 = nn.Sequential(
                BasicConv2d(in_channels, ch5x5red, kernel_size=1),
                BasicConv2d(ch5x5red, ch5x5, kernel_size=5, padding=2)
            )
            self.branch4 = nn.Sequential(
                nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                BasicConv2d(in_channels, pool_proj, kernel_size=1)
            )

        def forward(self, x):
            branch1 = self.branch1(x)
            branch2 = self.branch2(x)
            branch3 = self.branch3(x)
            branch4 = self.branch4(x)
            outputs = [branch1, branch2, branch3, branch4]
            return torch.cat(outputs, 1)  # concatenate along the channel dimension

    # auxiliary classifier
    class InceptionAux(nn.Module):
        def __init__(self, in_channels, num_classes):
            super(InceptionAux, self).__init__()
            self.averagePool = nn.AvgPool2d(kernel_size=5, stride=3)
            self.conv = BasicConv2d(in_channels, 128, kernel_size=1)
            self.fc1 = nn.Linear(2048, 1024)
            self.fc2 = nn.Linear(1024, num_classes)

        def forward(self, x):
            x = self.averagePool(x)
            x = self.conv(x)
            x = torch.flatten(x, 1)
            x = F.dropout(x, 0.5, training=self.training)  # dropout only while training
            x = F.relu(self.fc1(x), inplace=True)
            x = F.dropout(x, 0.5, training=self.training)
            x = self.fc2(x)
            return x
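
Because torch.cat joins the four branches along dim=1 (the channel dimension), an Inception block outputs ch1x1 + ch3x3 + ch5x5 + pool_proj channels. A small shape check, assuming the Inception class above (and its BasicConv2d helper) can be imported from model.py, using the inception3a parameters from the paper:

    import torch
    from model import Inception  # assumes the class shown above lives in model.py

    # inception3a parameters: ch1x1=64, ch3x3=128, ch5x5=32, pool_proj=32
    inc = Inception(in_channels=192, ch1x1=64, ch3x3red=96, ch3x3=128,
                    ch5x5red=16, ch5x5=32, pool_proj=32)
    x = torch.randn(1, 192, 28, 28)   # one 28x28 feature map with 192 channels
    print(inc(x).shape)               # torch.Size([1, 256, 28, 28]); 64 + 128 + 32 + 32 = 256
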

train.py again uses net.train() and net.eval() to manage dropout, and also to control whether the auxiliary classifiers are used.
In training mode the forward pass returns aux_logits2 and aux_logits1 in addition to the main logits, so there are three outputs.
Following the original paper, the three outputs are weighted into the final loss as loss = loss0 + loss1 * 0.3 + loss2 * 0.3.

    for epoch in range(2):
        # train
        net.train()
        running_loss = 0.0
        for step, data in enumerate(train_loader, start=0):
            images, labels = data
            optimizer.zero_grad()
            logits, aux_logits2, aux_logits1 = net(images.to(device))
            loss0 = loss_function(logits, labels.to(device))
            loss1 = loss_function(aux_logits1, labels.to(device))
            loss2 = loss_function(aux_logits2, labels.to(device))
            loss = loss0 + loss1 * 0.3 + loss2 * 0.3  # weighted sum from the paper
            loss.backward()   # back-propagation
            optimizer.step()  # update the parameters
            # accumulate the running loss
            running_loss += loss.item()

        # validate
        net.eval()
        acc = 0.0  # accumulate the number of correct predictions per epoch
        with torch.no_grad():
            for data_test in validate_loader:
                test_images, test_labels = data_test
                outputs = net(test_images.to(device))  # eval mode only returns the final output layer
                predict_y = torch.max(outputs, dim=1)[1]
                acc += (predict_y == test_labels.to(device)).sum().item()
            accurate_test = acc / val_num
            if accurate_test > best_acc:
                best_acc = accurate_test
                torch.save(net.state_dict(), save_path)
            print('[epoch %d] train_loss: %.3f  test_accuracy: %.3f'
                  % (epoch + 1, running_loss / step, accurate_test))
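
The reason eval mode only returns the final output layer is that the auxiliary classifiers are evaluated only while training. A toy module (not the real GoogLeNet) that mimics this pattern:

    import torch
    import torch.nn as nn

    class TinyAuxNet(nn.Module):
        """Toy stand-in for GoogLeNet's forward logic: aux outputs only while training."""
        def __init__(self):
            super().__init__()
            self.backbone = nn.Linear(8, 8)
            self.aux = nn.Linear(8, 3)    # stands in for an auxiliary classifier
            self.head = nn.Linear(8, 3)   # stands in for the main classifier

        def forward(self, x):
            x = self.backbone(x)
            if self.training:             # mirrors the aux_logits branch in model.py
                return self.head(x), self.aux(x)
            return self.head(x)           # eval mode: only the final output

    net_demo = TinyAuxNet()
    x = torch.randn(2, 8)
    net_demo.train()
    print(len(net_demo(x)))               # 2 outputs while training
    net_demo.eval()
    print(net_demo(x).shape)              # a single tensor in eval mode
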

3. ResNet

1. Use Batch Normalization to accelerate training (dropout is no longer used)

[figure: Batch Normalization]
The goal of the BN layer is to make every channel of the feature map follow a distribution with mean 0 and variance 1.
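
A quick numerical check of this claim (illustrative only): with nn.BatchNorm2d in training mode and its affine parameters at their initial values (gamma=1, beta=0), every channel of the output has roughly zero mean and unit variance.

    import torch
    import torch.nn as nn

    bn = nn.BatchNorm2d(num_features=16)
    x = torch.randn(8, 16, 32, 32) * 3.0 + 5.0   # input with arbitrary mean and variance
    y = bn(x)                                    # bn is in training mode by default
    print(y.mean(dim=(0, 2, 3)))                 # close to 0 for every channel
    print(y.var(dim=(0, 2, 3)))                  # close to 1 for every channel
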

2. Introduce the residual module

[figure: residual block]

Network structure:
[figures: ResNet network architecture tables]
Note: in the structure above, conv3_1, conv4_1, and conv5_1 (the first block of each of those stages) are modified.
If the shortcut branch is drawn as a solid line, the identity mapping and the residual mapping have the same number of channels; a dashed line means the channel counts differ, so a 1x1 convolution is used on the shortcut branch to downsample and adjust the channel dimension so that the two can be added (see the sketch after this paragraph).
In addition, the first residual block of each repeated stage sets its stride to 2, so that the height and width of the previous stage's output are reduced to match the current stage; otherwise the input and output shapes would not line up.
The original authors proposed several options, and after experiments option B turned out to work best.
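
A hedged sketch of what such a "dashed" shortcut can look like (channel counts and stride are illustrative, not taken from model.py):

    import torch
    import torch.nn as nn

    # 1x1 convolution with stride 2: halves the height/width and changes the
    # channel count so the shortcut can be added to the residual branch.
    downsample = nn.Sequential(
        nn.Conv2d(64, 128, kernel_size=1, stride=2, bias=False),
        nn.BatchNorm2d(128),
    )
    x = torch.randn(1, 64, 56, 56)
    print(downsample(x).shape)   # torch.Size([1, 128, 28, 28])
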
For ResNet-18 and ResNet-34, only the conv3, conv4, and conv5 stages are modified.
[figure: modified residual blocks in ResNet-18/34]

For ResNet-50/101/152, the conv2, conv3, conv4, and conv5 stages are all modified.
[figure: modified residual blocks in ResNet-50/101/152]

Excerpt from the source code: model.py
Notes on BN: 1) when BN is used, the bias of the convolution is set to False; 2) the BN layer is placed between the conv layer and the ReLU layer.

    def forward(self, x):
        identity = x  # identity is the shortcut branch; without downsampling the input passes through unchanged
        if self.downsample is not None:
            identity = self.downsample(x)

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        out += identity   # add the shortcut branch
        out = self.relu(out)

        return out
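
To connect the BN notes above with this forward pass, a hypothetical __init__ for such a basic block could look like the following (not the exact model.py code; it only illustrates note 1, bias=False on the convolutions, and note 2, the Conv → BN → ReLU ordering):

    import torch.nn as nn

    class BasicBlock(nn.Module):
        def __init__(self, in_channels, out_channels, stride=1, downsample=None):
            super().__init__()
            # note 1: bias=False, because the following BN layer has its own shift parameter
            self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                                   stride=stride, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(out_channels)   # note 2: BN sits between conv and ReLU
            self.relu = nn.ReLU(inplace=True)
            self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                                   stride=1, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(out_channels)
            self.downsample = downsample
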