Structure of a multilayer neural network
- The inputs to the network correspond to the attributes measured for each training tuple
- Inputs are fed simultaneously into the units making up the input layer
- They are then weighted and fed simultaneously to a hidden layer
- The number of hidden layers is arbitrary (1 hidden layer in the example above).
- The number of hidden units is arbitrary (3 hidden units in the example above).
- The weighted outputs of the last hidden layer are input to units making up the output layer, which emits the network’s prediction
- The network is feed-forward: none of the weights cycles back to an input unit or to an output unit of a previous layer
- From a statistical point of view, networks perform nonlinear regression: given enough hidden units and enough training samples, they can closely approximate any function (see the sketch below)
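To make the feed-forward computation concrete, here is a minimal sketch in Python/NumPy of one forward pass through a network with one hidden layer. The layer sizes (4 inputs, 3 hidden units, 1 output), the sigmoid activation, and the random weight ranges are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation: squashes a weighted sum into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative topology: 4 input units (one per attribute),
# 3 hidden units, 1 output unit.
rng = np.random.default_rng(0)
W_hidden = rng.uniform(-0.5, 0.5, size=(4, 3))   # input -> hidden weights
b_hidden = np.zeros(3)                           # hidden-layer biases
W_output = rng.uniform(-0.5, 0.5, size=(3, 1))   # hidden -> output weights
b_output = np.zeros(1)

def forward(x):
    # Inputs are fed to the input layer, weighted, and passed to the
    # hidden layer; the weighted hidden outputs feed the output layer,
    # which emits the network's prediction.
    h = sigmoid(x @ W_hidden + b_hidden)          # hidden-layer activations
    y = sigmoid(h @ W_output + b_output)          # output-layer prediction
    return y

x = np.array([0.2, 0.7, 0.1, 0.9])               # attributes of one training tuple
print(forward(x))                                 # a value in (0, 1)
```

Because no weight cycles back to an earlier layer, the prediction is computed in a single left-to-right pass.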
Defining a network topology
- Decide the network topology
- Specify the number of units in the input layer: usually *one input unit per attribute* in the data (but nominal attributes can have one input per value)
- the number of hidden layers (at least one, and commonly only one)
- the number of units in each hidden layer
- and the number of units in the output layer
- For the output layer, use one unit per response variable. In a typical binary classification problem only one output unit is used, and a threshold is applied to the output value to select the class label. The value can be interpreted as the probability of belonging to the positive class, so a neural network can also serve as a probabilistic model. For classification with more than two classes, one output unit per class is used and the class with the highest output value is selected as the label (see the sketch after this list).
- Choose an activation function for each hidden and output unit (explained later)
- Determine initial values for the weights, usually small random numbers.
- Once a network has been trained, if its accuracy is unacceptable, try a different network topology, a different set of initial weights, or different activation functions. This can be done by trial and error, although there are also methods that systematically search for a high-performing topology.
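As a rough illustration of these topology choices, the sketch below counts input units from the attributes (one per numeric attribute, one per value of a nominal attribute), uses a single hidden layer, and shows the two output conventions described above: one thresholded unit for binary classification and one unit per class otherwise. The attribute names, layer sizes, and the 0.5 threshold are assumptions made for the example, not prescriptions from the text.

```python
import numpy as np

# Assumed example data: two numeric attributes and one nominal attribute
# with three values, so the input layer gets 2 + 3 = 5 units.
numeric_attributes = ["age", "income"]
nominal_values = {"color": ["red", "green", "blue"]}

n_inputs = len(numeric_attributes) + sum(len(v) for v in nominal_values.values())
n_hidden = 4          # one hidden layer; size chosen by trial and error
n_classes = 3         # e.g. a three-class problem -> 3 output units

def one_hot(value, values):
    # One input unit per value of a nominal attribute.
    return [1.0 if value == v else 0.0 for v in values]

def output_to_label(outputs, class_names, threshold=0.5):
    # Binary case: a single output unit, thresholded; its value can be
    # read as the probability of the positive class.
    if len(outputs) == 1:
        return class_names[1] if outputs[0] >= threshold else class_names[0]
    # Multi-class case: one output unit per class; pick the highest value.
    return class_names[int(np.argmax(outputs))]

x = [0.35, 0.80] + one_hot("green", nominal_values["color"])
print(n_inputs, len(x))                                             # 5 5
print(output_to_label(np.array([0.9]), ["neg", "pos"]))             # pos
print(output_to_label(np.array([0.1, 0.7, 0.2]), ["A", "B", "C"]))  # B
```

If this hypothetical topology performs poorly after training, the hidden-layer size or the initial weights would be the first things to vary, as noted above.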