ResNet Paper
Kaiming He (何恺明), field: computer vision & deep learning
7.6. Residual Networks (ResNet)


Background

The degradation problem in deep networks

As the number of layers in a neural network keeps growing, vanishing/exploding gradients make training difficult, and beyond a certain depth both training and test error start to increase again. This is the degradation problem: because the deeper model performs worse even on the training set, it is not simply overfitting.
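To see why gradients can vanish, note that backpropagation multiplies one derivative factor per layer; with sigmoid activations each factor is at most 0.25, so the product shrinks exponentially with depth. A minimal numerical sketch (plain NumPy, illustrative only, not from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def chain_gradient(depth, z=0.0):
    """Gradient magnitude through a chain of scalar sigmoid layers.

    Each layer contributes a factor sigmoid'(z) = s * (1 - s) <= 0.25.
    """
    grad = 1.0
    for _ in range(depth):
        s = sigmoid(z)
        grad *= s * (1.0 - s)  # derivative of sigmoid at z
    return grad

for depth in (5, 20, 50):
    print(depth, chain_gradient(depth))
```

At z = 0 each factor is exactly 0.25, so a 50-layer chain scales the gradient by 0.25^50, far below float precision of any useful signal. Skip connections sidestep this by giving the gradient an additive identity path.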

In 2015, Kaiming He and colleagues at Microsoft Research introduced the residual network (ResNet).
[Figure 1: residual learning building block]

  1. Instead of having the stacked layers learn the underlying mapping H(x) directly, we let the network fit the residual mapping F(x) := H(x) - x, so the original mapping becomes H(x) = F(x) + x.
  2. If any layer hurts performance, it can effectively be skipped: the weights of its residual branch can be driven toward zero (e.g., by regularization), so the block reduces to the identity mapping.
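The two points above can be made concrete in a few lines. The sketch below (plain NumPy, a toy illustration rather than the paper's code) builds a residual block with a one-layer branch: when the branch's weights are zero, the block is exactly the identity, which is why adding residual blocks cannot easily make the representable solutions worse.

```python
import numpy as np

def residual_block(x, W, b):
    """y = F(x) + x, with a one-layer residual branch F(x) = relu(x @ W + b)."""
    f = np.maximum(0.0, x @ W + b)  # residual branch F(x)
    return f + x                    # identity shortcut

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))

# With zero weights the branch outputs 0, so the block is the identity.
W0 = np.zeros((8, 8)); b0 = np.zeros(8)
print(np.allclose(residual_block(x, W0, b0), x))

# With small nonzero weights the block is a perturbation of the identity.
W = 0.1 * rng.standard_normal((8, 8)); b = np.zeros(8)
y = residual_block(x, W, b)
print(np.abs(y - x).max())  # output stays close to the input
```

The block only has to learn the residual F(x), a small correction on top of the identity, rather than relearn the full mapping H(x) from scratch.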

Network architecture

A TensorFlow implementation of ResNet-50

```python
import tensorflow.compat.v1 as tf  # TF1-style graph API (works under TF2 as well)

# conv2d, batch_norm, max_pool, avg_pool and fc are thin helper wrappers
# (defined elsewhere in the original code) around the corresponding tf ops.

class ResNet50(object):
    def __init__(self, inputs, num_classes=1000, is_training=True,
                 scope="resnet50"):
        self.inputs = inputs
        self.is_training = is_training
        self.num_classes = num_classes
        with tf.variable_scope(scope):
            # Stem: 7x7/2 conv -> BN -> ReLU -> 3x3/2 max pool
            net = conv2d(inputs, 64, 7, 2, scope="conv1")  # -> [batch, 112, 112, 64]
            net = tf.nn.relu(batch_norm(net, is_training=self.is_training, scope="bn1"))
            net = max_pool(net, 3, 2, scope="maxpool1")    # -> [batch, 56, 56, 64]
            # Four stages of bottleneck units: 3, 4, 6, 3
            net = self._block(net, 256, 3, init_stride=1, is_training=self.is_training,
                              scope="block2")              # -> [batch, 56, 56, 256]
            net = self._block(net, 512, 4, is_training=self.is_training, scope="block3")
            # -> [batch, 28, 28, 512]
            net = self._block(net, 1024, 6, is_training=self.is_training, scope="block4")
            # -> [batch, 14, 14, 1024]
            net = self._block(net, 2048, 3, is_training=self.is_training, scope="block5")
            # -> [batch, 7, 7, 2048]
            net = avg_pool(net, 7, scope="avgpool5")       # -> [batch, 1, 1, 2048]
            net = tf.squeeze(net, [1, 2], name="SpatialSqueeze")  # -> [batch, 2048]
            self.logits = fc(net, self.num_classes, "fc6")        # -> [batch, num_classes]
            self.predictions = tf.nn.softmax(self.logits)

    def _block(self, x, n_out, n, init_stride=2, is_training=True, scope="block"):
        """A stage of n bottleneck units; only the first unit may downsample."""
        with tf.variable_scope(scope):
            h_out = n_out // 4  # bottleneck width is a quarter of the output channels
            out = self._bottleneck(x, h_out, n_out, stride=init_stride,
                                   is_training=is_training, scope="bottleneck1")
            for i in range(1, n):
                out = self._bottleneck(out, h_out, n_out, is_training=is_training,
                                       scope=("bottleneck%s" % (i + 1)))
            return out

    def _bottleneck(self, x, h_out, n_out, stride=None, is_training=True, scope="bottleneck"):
        """A residual bottleneck unit: 1x1 reduce -> 3x3 -> 1x1 expand, plus shortcut."""
        n_in = x.get_shape()[-1]
        if stride is None:
            stride = 1 if n_in == n_out else 2
        with tf.variable_scope(scope):
            h = conv2d(x, h_out, 1, stride=stride, scope="conv_1")
            h = batch_norm(h, is_training=is_training, scope="bn_1")
            h = tf.nn.relu(h)
            h = conv2d(h, h_out, 3, stride=1, scope="conv_2")
            h = batch_norm(h, is_training=is_training, scope="bn_2")
            h = tf.nn.relu(h)
            h = conv2d(h, n_out, 1, stride=1, scope="conv_3")
            h = batch_norm(h, is_training=is_training, scope="bn_3")
            if n_in != n_out:
                # Projection shortcut (1x1 conv) to match channels and stride
                shortcut = conv2d(x, n_out, 1, stride=stride, scope="conv_4")
                shortcut = batch_norm(shortcut, is_training=is_training, scope="bn_4")
            else:
                shortcut = x  # identity shortcut
            return tf.nn.relu(shortcut + h)
```