Softmax Function and Its Gradient Derivation

Clipped from: https://www.cnblogs.com/alexanderkun/p/8098781.html

Definition of the Softmax Function

Given an array $V = (V_1, \dots, V_K)$, the softmax output for its $i$-th element $V_i$ is:

$$S_i = \frac{e^{V_i}}{\sum_{j=1}^{K} e^{V_j}}$$

where $K$ is the length of the array.
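As a quick numerical check of this definition, here is a minimal sketch in Python (NumPy, the `softmax` helper, and the sample values are my own illustration, not from the original post):

    import numpy as np

    def softmax(v):
        # S_i = e^{V_i} / sum_j e^{V_j}; subtracting max(v) first is a
        # common numerical-stability trick and does not change the result
        e = np.exp(v - np.max(v))
        return e / e.sum()

    v = np.array([1.0, 2.0, 3.0])
    print(softmax(v))        # [0.09003057 0.24472847 0.66524096]
    print(softmax(v).sum())  # 1.0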

The softmax function in PyTorch

torch.nn.Softmax(dim=None)
Applies the Softmax function to an n-dimensional input Tensor rescaling them so that the elements of the n-dimensional output Tensor lie in the range [0,1] and sum to 1.
Softmax is defined as:
$$\text{Softmax}(x_i) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$$

  • Shape:
    - Input: (*), where * means any number of additional dimensions
    - Output: (*), same shape as the input

  • Returns:

a Tensor of the same dimension and shape as the input with values in the range [0, 1]

  • Arguments:

dim (int): A dimension along which Softmax will be computed (so every slice along dim will sum to 1).

Note:
This module doesn’t work directly with NLLLoss, which expects the Log to be computed between the Softmax and itself. Use LogSoftmax instead (it’s faster and has better numerical properties).
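To make the note concrete, here is a short sketch (the tensor shapes, values, and variable names are my own assumptions, not from the docs) showing NLLLoss paired with LogSoftmax, and that the pair matches CrossEntropyLoss applied to raw logits:

    import torch
    import torch.nn as nn

    logits = torch.randn(2, 3)      # 2 samples, 3 classes
    target = torch.tensor([0, 2])   # one class index per sample

    # NLLLoss expects log-probabilities, so feed it LogSoftmax output
    log_probs = nn.LogSoftmax(dim=1)(logits)
    loss = nn.NLLLoss()(log_probs, target)

    # LogSoftmax + NLLLoss is equivalent to CrossEntropyLoss on logits
    loss_ce = nn.CrossEntropyLoss()(logits, target)
    print(torch.allclose(loss, loss_ce))  # True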

  • Examples:

    >>> import torch
    >>> import torch.nn as nn
    >>> m = nn.Softmax(dim=1)  # softmax rescales along one chosen dimension only
    >>> input = torch.randn(2, 3)
    >>> output = m(input)
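Continuing the session above (these extra lines are my addition, not part of the quoted docs), every slice along dim=1, i.e. each row, sums to 1, just as the dim argument description promises:

    >>> torch.allclose(output.sum(dim=1), torch.ones(2))  # rows sum to 1
    True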