
LSTMCell#

class torch.nn.modules.rnn.LSTMCell(input_size, hidden_size, bias=True, device=None, dtype=None)[source]#

A long short-term memory (LSTM) cell.

$$
\begin{array}{ll}
i = \sigma(W_{ii} x + b_{ii} + W_{hi} h + b_{hi}) \\
f = \sigma(W_{if} x + b_{if} + W_{hf} h + b_{hf}) \\
g = \tanh(W_{ig} x + b_{ig} + W_{hg} h + b_{hg}) \\
o = \sigma(W_{io} x + b_{io} + W_{ho} h + b_{ho}) \\
c' = f \odot c + i \odot g \\
h' = o \odot \tanh(c')
\end{array}
$$

where $\sigma$ is the sigmoid function, and $\odot$ is the Hadamard product.
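To make the mapping from these equations to the module's parameters explicit, here is a minimal sketch (not part of the upstream documentation) that recomputes one step by hand from weight_ih, weight_hh, bias_ih, and bias_hh; it assumes the four gates are stacked row-wise in i, f, g, o order, matching the parameter layout described under Variables below:

>>> import torch
>>> import torch.nn as nn
>>> cell = nn.LSTMCell(10, 20)
>>> x, h, c = torch.randn(3, 10), torch.randn(3, 20), torch.randn(3, 20)
>>> # All four gate pre-activations at once: shape (batch, 4*hidden_size)
>>> gates = x @ cell.weight_ih.T + cell.bias_ih + h @ cell.weight_hh.T + cell.bias_hh
>>> i, f, g, o = gates.chunk(4, dim=1)  # assumed i, f, g, o row order
>>> i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
>>> c_next = f * c + i * g            # c' = f ⊙ c + i ⊙ g
>>> h_next = o * torch.tanh(c_next)   # h' = o ⊙ tanh(c')
>>> h_ref, c_ref = cell(x, (h, c))
>>> bool(torch.allclose(h_next, h_ref, atol=1e-6) and torch.allclose(c_next, c_ref, atol=1e-6))
True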

Parameters
  • input_size (int) – The number of expected features in the input x

  • hidden_size (int) – The number of features in the hidden state h

  • bias (bool) – If False, then the layer does not use bias weights b_ih and b_hh. Default: True
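As a quick illustration of the bias flag (a sketch, not from the upstream docs): constructing the cell with bias=False leaves only the two weight parameters.

>>> import torch.nn as nn
>>> cell = nn.LSTMCell(10, 20, bias=False)
>>> [name for name, _ in cell.named_parameters()]
['weight_ih', 'weight_hh']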

Inputs: input, (h_0, c_0)
  • input of shape (batch, input_size) or (input_size): tensor containing input features

  • h_0 of shape (batch, hidden_size) or (hidden_size): tensor containing the initial hidden state

  • c_0 of shape (batch, hidden_size) or (hidden_size): tensor containing the initial cell state

    If (h_0, c_0) is not provided, both h_0 and c_0 default to zero; a sketch after the Outputs list demonstrates this.

Outputs: (h_1, c_1)
  • h_1 of shape (batch, hidden_size) or (hidden_size): tensor containing the next hidden state

  • c_1 of shape (batch, hidden_size) or (hidden_size): tensor containing the next cell state
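These shape and default rules can be exercised directly. A short sketch (shapes chosen arbitrarily) showing a batched call with the state omitted, the equivalent explicit zero state, and the unbatched form:

>>> import torch
>>> import torch.nn as nn
>>> cell = nn.LSTMCell(10, 20)
>>> x = torch.randn(3, 10)            # batched: (batch, input_size)
>>> h1, c1 = cell(x)                  # (h_0, c_0) omitted: both default to zero
>>> h1.shape, c1.shape
(torch.Size([3, 20]), torch.Size([3, 20]))
>>> h1z, c1z = cell(x, (torch.zeros(3, 20), torch.zeros(3, 20)))
>>> bool(torch.equal(h1, h1z) and torch.equal(c1, c1z))
True
>>> hu, cu = cell(torch.randn(10))    # unbatched: (input_size,)
>>> hu.shape, cu.shape
(torch.Size([20]), torch.Size([20]))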

Variables
  • weight_ih (torch.Tensor) – the learnable input-hidden weights, of shape (4*hidden_size, input_size)

  • weight_hh (torch.Tensor) – the learnable hidden-hidden weights, of shape (4*hidden_size, hidden_size)

  • bias_ih (torch.Tensor) – the learnable input-hidden bias, of shape (4*hidden_size)

  • bias_hh (torch.Tensor) – the learnable hidden-hidden bias, of shape (4*hidden_size)
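To make the 4*hidden_size stacking concrete, a small sketch listing the parameters for input_size=10, hidden_size=20 (so 4*hidden_size is 80):

>>> import torch.nn as nn
>>> cell = nn.LSTMCell(input_size=10, hidden_size=20)
>>> for name, p in cell.named_parameters():
...     print(name, tuple(p.shape))
weight_ih (80, 10)
weight_hh (80, 20)
bias_ih (80,)
bias_hh (80,)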

Note

All the weights and biases are initialized from $\mathcal{U}(-\sqrt{k}, \sqrt{k})$ where $k = \frac{1}{\text{hidden\_size}}$.
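This bound can be checked empirically; with hidden_size=20, k = 1/20, so every parameter should lie within ±√(1/20) ≈ ±0.2236 (a sketch, not from the upstream docs):

>>> import math
>>> import torch.nn as nn
>>> hidden_size = 20
>>> cell = nn.LSTMCell(10, hidden_size)
>>> bound = math.sqrt(1.0 / hidden_size)  # sqrt(k), with k = 1/hidden_size
>>> all(p.abs().max().item() <= bound for p in cell.parameters())
True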

On certain ROCm devices, when using float16 inputs this module will use different precision for backward.

Examples:

>>> rnn = nn.LSTMCell(10, 20)  # (input_size, hidden_size)
>>> input = torch.randn(2, 3, 10)  # (time_steps, batch, input_size)
>>> hx = torch.randn(3, 20)  # (batch, hidden_size)
>>> cx = torch.randn(3, 20)
>>> output = []
>>> for i in range(input.size()[0]):
...     hx, cx = rnn(input[i], (hx, cx))
...     output.append(hx)
>>> output = torch.stack(output, dim=0)
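The loop above is essentially what a single-layer nn.LSTM computes over the time dimension. A sketch (not part of the upstream docs) copying the cell's parameters into the LSTM's layer-0 slots and comparing the stacked hidden states:

>>> import torch
>>> import torch.nn as nn
>>> cell = nn.LSTMCell(10, 20)
>>> lstm = nn.LSTM(10, 20)            # one layer, sequence-first input
>>> with torch.no_grad():             # copy the cell's parameters into layer 0
...     lstm.weight_ih_l0.copy_(cell.weight_ih)
...     lstm.weight_hh_l0.copy_(cell.weight_hh)
...     lstm.bias_ih_l0.copy_(cell.bias_ih)
...     lstm.bias_hh_l0.copy_(cell.bias_hh)
>>> inp = torch.randn(5, 3, 10)       # (time_steps, batch, input_size)
>>> hx, cx = torch.zeros(3, 20), torch.zeros(3, 20)
>>> outs = []
>>> for t in range(inp.size(0)):
...     hx, cx = cell(inp[t], (hx, cx))
...     outs.append(hx)
>>> manual = torch.stack(outs, dim=0)
>>> ref, _ = lstm(inp)                # nn.LSTM also defaults to a zero state
>>> bool(torch.allclose(manual, ref, atol=1e-6))
True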