[Figure: a multilayer feed-forward neural network with an input layer, one hidden layer of 3 units, and an output layer]

Structure of a multilayer neural network

  • The inputs to the network correspond to the attributes measured for each training tuple
  • Inputs are fed simultaneously into the units making up the input layer
  • They are then weighted and fed simultaneously to a hidden layer
    • The number of hidden layers is arbitrary (1 hidden layer in the example above).
    • The number of hidden units is arbitrary (3 hidden units in the example above).
  • The weighted outputs of the last hidden layer are input to units making up the output layer, which emits the network’s prediction
  • The network is feed-forward: none of the weights cycles back to an input unit or to an output unit of a previous layer
  • From a statistical point of view, networks perform nonlinear regression: given enough hidden units and enough training samples, they can closely approximate any function (output); a minimal forward-pass sketch follows this list
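
A minimal sketch of one feed-forward pass through such a network, assuming 4 input attributes, one hidden layer with 3 units, a single output unit, and sigmoid activations (all sizes, names, and values here are illustrative, not taken from the figure):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layer sizes: 4 input attributes, 3 hidden units, 1 output unit.
n_inputs, n_hidden, n_outputs = 4, 3, 1
W_hidden = rng.normal(scale=0.1, size=(n_inputs, n_hidden))   # input -> hidden weights
b_hidden = np.zeros(n_hidden)
W_output = rng.normal(scale=0.1, size=(n_hidden, n_outputs))  # hidden -> output weights
b_output = np.zeros(n_outputs)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """One feed-forward pass: the inputs are weighted and fed to the hidden
    layer, and the weighted hidden outputs are fed to the output layer."""
    hidden = sigmoid(x @ W_hidden + b_hidden)        # hidden-layer activations
    output = sigmoid(hidden @ W_output + b_output)   # network's prediction
    return output

x = np.array([0.2, 0.5, 0.1, 0.9])   # one training tuple (4 attribute values)
print(forward(x))                    # output value in (0, 1)
```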

Defining a network topology

  • Decide the network topology
    • Specify the number of units in the input layer: usually one input unit per attribute in the data (but a nominal attribute can have one input unit per value).
    • the number of hidden layers (at least one, and commonly only one)
      • the number of units in each hidden layer
    • and the number of units in the output layer
  • Output layer: one unit per response variable. In a typical binary classification problem, only one output unit is used, and a threshold is applied to the output value to select the class label. The value can be interpreted as the probability of belonging to the positive class, so a neural network can also be used as a probabilistic model. For classification with more than two classes, one output unit per class is used, and the class with the highest value is selected as the label.
  • Choose an activation function for each hidden and output unit (explained later)
  • Determine initial values for the weights, usually chosen at random
  • Once a network has been trained, if its accuracy is unacceptable, try a different network topology, a different set of initial weights, or different activation functions. Use trial and error, or use methods that systematically search for a high-performing topology (a rough sketch of these choices follows this list).
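
As a rough illustration of how these choices fit together (not taken from the text), here is how they map onto scikit-learn's MLPClassifier; the data, layer sizes, and parameter values are placeholders:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Placeholder data: 200 tuples, 6 numeric attributes, binary class label.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Topology choices: 6 input units (one per attribute), one hidden layer with
# 3 units, one output unit (binary classification, thresholded internally).
# Initial weights are set at random; random_state fixes that choice.
clf = MLPClassifier(hidden_layer_sizes=(3,),   # one hidden layer, 3 units
                    activation="logistic",     # sigmoid activation
                    random_state=0,
                    max_iter=2000)
clf.fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))

# If accuracy is unacceptable, retrain with a different topology or a
# different set of initial weights (trial and error over candidates):
for hidden in [(3,), (5,), (5, 3)]:
    alt = MLPClassifier(hidden_layer_sizes=hidden, activation="logistic",
                        random_state=1, max_iter=2000).fit(X_train, y_train)
    print(hidden, alt.score(X_test, y_test))
```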