标准RNN

https://pytorch.org/docs/stable/generated/torch.nn.RNN.html?highlight=rnn#torch.nn.RNN

API介绍

循环神经网络的RNN实现 - 图1

where ht is the hidden state _at time t, xt is the **_input at time t, and h(t−1) is the hidden state of the previous layer **at time t-1 or the initial hidden state at time 0. If nonlinearity is ‘relu’, then ReLU is used instead of tanh

torch.nn.RNN(*args, **kwargs**)

input_size – The number of expected features in the input x 输入xt的特征维度
hidden_size – The number of features in the hidden state h ht的特征维度
num_layers – Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two RNNs together to form a stacked RNN, with the second RNN taking in outputs of the first RNN and computing the final results. Default: 1 表示网络层数
nonlinearity – The non-linearity to use. Can be either ‘tanh’ or ‘relu’. Default: ‘tanh’
bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True

网络的输入和输出

网络会接收一个序列输入xt和记忆输入h0
xt 的维度（Seq, batch, input_size）
分别表示：序列长度 sequence length, 批量batch size, 输入的特征维度
h0 的维度（layers*direction, batch, hidden_size）

卷积神经网络中输入batch在前面，循环网络中batch在中间

basic_rnn = nn.RNN(input_size=20, hidden_size=50, num_layers=2)

输入是一个长为100，批量为32，维度为20的张量

toy_input = Variable(torch.randn(100,32,20))
h_0 = Variable(torch.randn(2,32,50))
toy_output, h_n = basic_rnn(toy_input, h_0)

LSTM

Example

将图片数据转化为一个序列数据，MNIST手写数字的图片大小是28x28,可以将每张图片看作是长为28的序列，序列中每个元素的特征维度是28