First, what is a parameter?

    1. W: weight
    2. b: bias

    What is a hyperparameter?

    1. alpha: learning rate
    2. number of iterations
    3. number of hidden layers
    4. number of hidden units
    5. choice of activation function

    As well as:

    1. momentum
    2. minibatch size
    3. regularization
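
To make the distinction concrete, here is a minimal sketch, assuming a one-variable linear model trained with plain batch gradient descent. The toy data, the `train` helper, and the specific values of `alpha` and `num_iterations` are all hypothetical and chosen only for illustration. The hyperparameters are fixed by hand before training starts, while the parameters `W` and `b` are what the training loop learns.

```python
# Hyperparameters: chosen by hand before training (values here are arbitrary).
hyperparameters = {
    "alpha": 0.1,            # learning rate
    "num_iterations": 1000,  # number of gradient descent steps
}


def train(xs, ys, alpha, num_iterations):
    """Fit y = W * x + b by batch gradient descent on mean squared error."""
    # Parameters: initialized here, then updated by the training loop.
    W, b = 0.0, 0.0
    n = len(xs)
    for _ in range(num_iterations):
        # Gradients of (1/n) * sum((W*x + b - y)^2) with respect to W and b.
        dW = sum(2 * (W * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum(2 * (W * x + b - y) for x, y in zip(xs, ys)) / n
        # The learning rate (a hyperparameter) scales each parameter update.
        W -= alpha * dW
        b -= alpha * db
    return W, b


# Toy data generated from y = 2x + 1 (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

W, b = train(xs, ys, **hyperparameters)
print(f"learned parameters: W={W:.3f}, b={b:.3f}")
```

Changing `alpha` or `num_iterations` changes how training proceeds, but neither value is updated by gradient descent. Only `W` and `b` are.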