CE / MSE
- Why use cross-entropy (CE) for classification problems, but mean squared error (MSE) for regression problems?
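CE loss:
$$L_{CE} = -\sum_i y_i \log p_i$$
where $p_i$ is the predicted probability of class $i$ and $y_i$ is the one-hot label. For the true class ($y_i = 1$), if $p_i$ is closer to 0, $\log p_i$ is getting smaller and the rate at which it decreases is increasing, because $\frac{d}{dp}\log p = \frac{1}{p}$. That means that the larger the distance between p and y, the faster the loss grows.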
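The growth of the CE loss for confidently wrong predictions can be checked numerically. The sketch below is only an illustration under assumptions not stated in the notes: a single binary sample with label y = 1, p as the predicted probability of class 1, and hypothetical helper names (`cross_entropy`, `squared_error`, `ce_grad`, `se_grad`). It puts the squared error next to the CE loss for contrast.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one sample; p is the predicted probability of class 1."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def squared_error(p, y):
    """Squared difference between the prediction p and the label y."""
    return (p - y) ** 2


def ce_grad(p, y):
    """Derivative of the binary cross-entropy with respect to p."""
    return -(y / p) + (1 - y) / (1 - p)


def se_grad(p, y):
    """Derivative of the squared error with respect to p."""
    return 2 * (p - y)


y = 1  # assumed true label
for p in (0.9, 0.5, 0.1, 0.01):
    print(
        f"p={p:<5} CE={cross_entropy(p, y):.4f} dCE/dp={ce_grad(p, y):.2f} "
        f"SE={squared_error(p, y):.4f} dSE/dp={se_grad(p, y):.2f}"
    )
```

As p moves toward 0 with y = 1, the CE loss and the magnitude of its derivative grow without bound, while the squared error stays below 1 and its derivative stays within [-2, 0].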
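For MSE, the loss corresponds to the squared difference between p and y, $L_{MSE} = (p - y)^2$, so when p is a probability the loss stays bounded however wrong the prediction is.
LSTM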
GRU