Abstract:

What did we do?
Trained a CNN on a dataset of 1.2 million images to solve a 1000-class classification problem.
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes.
How well does the network perform?
Considerably better than prior work.
On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art.
What is the shape of the network?
60 million parameters, 650,000 neurons, five convolutional layers, three fully-connected layers.
The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
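A quick back-of-the-envelope check of that parameter count (a sketch: the layer sizes below follow the commonly cited AlexNet configuration and ignore the paper's two-GPU grouping in conv2/4/5, so the total lands slightly above 60 million):

```python
# Hypothetical layer sizes following the commonly cited AlexNet
# configuration; the paper's two-GPU grouping would roughly halve
# the weight counts of conv2, conv4, and conv5.
conv_layers = [
    # (out_channels, kernel_size, in_channels)
    (96, 11, 3),    # conv1
    (256, 5, 96),   # conv2
    (384, 3, 256),  # conv3
    (384, 3, 384),  # conv4
    (256, 3, 384),  # conv5
]
fc_layers = [
    # (out_features, in_features)
    (4096, 256 * 6 * 6),  # fc1: flattened 6x6x256 conv output
    (4096, 4096),         # fc2
    (1000, 4096),         # fc3, feeding the 1000-way softmax
]

# Each conv layer has out*k*k*in weights plus out biases;
# each fully-connected layer has out*in weights plus out biases.
conv_params = sum(o * k * k + o if False else o * k * k * i + o
                  for o, k, i in conv_layers)
fc_params = sum(o * i + o for o, i in fc_layers)
total = conv_params + fc_params
print(f"conv: {conv_params:,}  fc: {fc_params:,}  total: {total:,}")
```

The three fully-connected layers contribute the overwhelming majority of the roughly 60 million parameters, which is exactly where the paper applies dropout.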
With so many parameters, how do we speed up training?
Use non-saturating neurons plus a GPU implementation of the convolution operation.
To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation.
With so many parameters, how do we prevent overfitting?
Use a regularization method called "dropout".
To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective.
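A minimal NumPy sketch of that dropout scheme, assuming the paper's setup: during training each hidden unit is zeroed independently with probability 0.5, and at test time all units are used but their outputs are multiplied by 0.5:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_train(x, p=0.5):
    # Zero each activation independently with probability p, so units
    # cannot rely on the presence of particular other units.
    mask = rng.random(x.shape) >= p
    return x * mask

def dropout_test(x, p=0.5):
    # At test time use every unit but scale outputs by the keep
    # probability, approximating an average over the thinned networks.
    return x * (1 - p)

activations = np.ones(10_000)
dropped = dropout_train(activations)
print(f"fraction kept during training: {dropped.mean():.2f}")  # close to 0.5
print(dropout_test(activations)[0])  # 0.5
```

Note this is classic (non-inverted) dropout as described in the paper; modern frameworks usually rescale at training time instead so the test-time network is unchanged.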

Summary:

Rather than a conventional conclusion, this paper ends with a discussion section, which is somewhat unusual.
It notes, for example, that removing a single convolutional layer degrades accuracy, and the discussion is framed around a purely supervised model.

Main contributions:

Used a non-saturating neuron activation function, which the authors call ReLU.
Because the 3 GB of GPU memory available at the time could not hold the model, the network was split into parts and trained across two GPUs.
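The "non-saturating" point about ReLU can be seen numerically (a sketch comparing ReLU with the saturating tanh it replaced):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x): for any positive input the gradient is exactly 1,
    # so the unit never saturates no matter how large the activation is.
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 3.0, 10.0])
print(relu(x))               # zero for negative inputs, identity for positive
print(np.tanh(x))            # tanh(10) is essentially 1.0: the unit saturates
print(1 - np.tanh(x) ** 2)   # tanh gradient is nearly 0 there, slowing training
```

This is why ReLU networks in the paper trained several times faster than equivalent tanh networks on CIFAR-10.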