Gated recurrent unit

From Wikipedia, the free encyclopedia
Memory unit used in neural networks

In artificial neural networks, the gated recurrent unit (GRU) is a gating mechanism used in recurrent neural networks, introduced in 2014 by Kyunghyun Cho et al.[1] The GRU is like a long short-term memory (LSTM) with a gating mechanism to input or forget certain features,[2] but lacks a context vector or output gate, resulting in fewer parameters than the LSTM.[3] The GRU's performance on certain tasks of polyphonic music modeling, speech signal modeling and natural language processing was found to be similar to that of the LSTM.[4][5] GRUs showed that gating is indeed helpful in general, and Bengio's team came to no concrete conclusion on which of the two gating units was better.[6][7]

Architecture


There are several variations on the full gated unit, with gating done using the previous hidden state and the bias in various combinations, as well as a simplified form called the minimal gated unit.[8]

In the following, the operator $\odot$ denotes the Hadamard product.

Fully gated unit

Gated Recurrent Unit, fully gated version

Initially, for $t = 0$, the output vector is $h_0 = 0$.

$$
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) \\
\hat{h}_t &= \phi(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t
\end{aligned}
$$

Variables ($d$ denotes the number of input features and $e$ the number of output features):

$x_t$: input vector
$h_t$: output vector
$\hat{h}_t$: candidate activation vector
$z_t$: update gate vector
$r_t$: reset gate vector
$W \in \mathbb{R}^{e \times d}$, $U \in \mathbb{R}^{e \times e}$ and $b \in \mathbb{R}^{e}$: parameter matrices and vector

Activation functions

$\sigma$: the original is the sigmoid function.
$\phi$: the original is the hyperbolic tangent.

Alternative activation functions are possible, provided that $\sigma(x) \in [0,1]$.
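The update rule above translates directly into code. The following NumPy sketch (function and parameter names are illustrative, not taken from the cited papers) runs one fully gated GRU step per input vector; element-wise multiplication of arrays plays the role of the Hadamard product $\odot$.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x_t, h_prev, p):
    """One step of the fully gated GRU defined above.

    x_t    : input vector x_t, shape (d,)
    h_prev : previous output vector h_{t-1}, shape (e,)
    p      : dict with matrices W_* (e, d), U_* (e, e) and biases b_* (e,)
    """
    z_t = sigmoid(p["W_z"] @ x_t + p["U_z"] @ h_prev + p["b_z"])            # update gate
    r_t = sigmoid(p["W_r"] @ x_t + p["U_r"] @ h_prev + p["b_r"])            # reset gate
    h_hat = np.tanh(p["W_h"] @ x_t + p["U_h"] @ (r_t * h_prev) + p["b_h"])  # candidate activation
    return (1.0 - z_t) * h_prev + z_t * h_hat                               # new output vector h_t

# Usage on random data: d = 3 input features, e = 2 output features.
rng = np.random.default_rng(0)
d, e = 3, 2
p = {k: rng.standard_normal((e, d)) for k in ("W_z", "W_r", "W_h")}
p.update({k: rng.standard_normal((e, e)) for k in ("U_z", "U_r", "U_h")})
p.update({k: np.zeros(e) for k in ("b_z", "b_r", "b_h")})

h = np.zeros(e)                         # h_0 = 0, as stated above
for x in rng.standard_normal((5, d)):   # a length-5 input sequence
    h = gru_cell(x, h, p)
print(h)
```

Setting $z_t = 1$ replaces the state entirely with the candidate $\hat{h}_t$, while $z_t = 0$ carries $h_{t-1}$ forward unchanged.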

Alternate forms can be created by changing $z_t$ and $r_t$:[9]

Type 1, each gate depends only on the previous hidden state and the bias.
Type 2, each gate depends only on the previous hidden state.
Type 3, each gate is computed using only the bias.

Minimal gated unit


The minimal gated unit (MGU) is similar to the fully gated unit, except that the update and reset gate vectors are merged into a single forget gate. This also implies that the equation for the output vector must be changed:[10]

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
\hat{h}_t &= \phi(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h) \\
h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \hat{h}_t
\end{aligned}
$$

Variables are as in the fully gated unit, with the single forget gate vector $f_t$ taking the place of the update and reset gate vectors $z_t$ and $r_t$.
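As a sketch of how the merge simplifies the computation, the following NumPy function (names are illustrative) performs one MGU step with a single gate and one fewer group of weight matrices than the fully gated unit above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mgu_cell(x_t, h_prev, W_f, U_f, b_f, W_h, U_h, b_h):
    """One step of the minimal gated unit: a single forget gate f_t
    replaces the update and reset gates of the fully gated GRU."""
    f_t = sigmoid(W_f @ x_t + U_f @ h_prev + b_f)            # forget gate
    h_hat = np.tanh(W_h @ x_t + U_h @ (f_t * h_prev) + b_h)  # candidate activation
    return (1.0 - f_t) * h_prev + f_t * h_hat                # new output vector h_t
```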

Light gated recurrent unit


The light gated recurrent unit (LiGRU)[4] removes the reset gate altogether, replaces tanh with the ReLU activation, and applies batch normalization (BN):

$$
\begin{aligned}
z_t &= \sigma(\operatorname{BN}(W_z x_t) + U_z h_{t-1}) \\
\tilde{h}_t &= \operatorname{ReLU}(\operatorname{BN}(W_h x_t) + U_h h_{t-1}) \\
h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t
\end{aligned}
$$
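A minimal NumPy sketch of one LiGRU step follows (names are illustrative; batch normalization is shown in its training-time form, computed over a batch of inputs, with no learned scale/shift and no running statistics):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def batch_norm(a, eps=1e-5):
    # Per-feature normalization over the batch axis (training-time sketch only).
    return (a - a.mean(axis=0)) / np.sqrt(a.var(axis=0) + eps)

def ligru_cell(x_t, h_prev, W_z, U_z, W_h, U_h):
    """One LiGRU step for a batch of inputs x_t (batch, d) and states h_prev (batch, e).
    There is no reset gate; BN is applied to the input projections only."""
    z_t = sigmoid(batch_norm(x_t @ W_z.T) + h_prev @ U_z.T)              # update gate
    h_tilde = np.maximum(0.0, batch_norm(x_t @ W_h.T) + h_prev @ U_h.T)  # ReLU candidate
    return z_t * h_prev + (1.0 - z_t) * h_tilde                          # new state h_t
```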

LiGRU has been studied from a Bayesian perspective.[11] This analysis yielded a variant called light Bayesian recurrent unit (LiBRU), which showed slight improvements over the LiGRU on speech recognition tasks.

References

  1. Cho, Kyunghyun; van Merrienboer, Bart; Gulcehre, Caglar; Bahdanau, Dzmitry; Bougares, Fethi; Schwenk, Holger; Bengio, Yoshua (2014). "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation". Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP): 1724–1734. arXiv:1406.1078. doi:10.3115/v1/D14-1179.
  2. Felix Gers; Jürgen Schmidhuber; Fred Cummins (1999). "Learning to forget: Continual prediction with LSTM". 9th International Conference on Artificial Neural Networks: ICANN '99. Vol. 1999. pp. 850–855. doi:10.1049/cp:19991218. ISBN 0-85296-721-7.
  3. "Recurrent Neural Network Tutorial, Part 4 – Implementing a GRU/LSTM RNN with Python and Theano – WildML". Wildml.com. 2015-10-27. Archived from the original on 2021-11-10. Retrieved May 18, 2016.
  4. Ravanelli, Mirco; Brakel, Philemon; Omologo, Maurizio; Bengio, Yoshua (2018). "Light Gated Recurrent Units for Speech Recognition". IEEE Transactions on Emerging Topics in Computational Intelligence. 2 (2): 92–102. arXiv:1803.10225. Bibcode:2018ITECI...2...92R. doi:10.1109/TETCI.2017.2762739. S2CID 4402991.
  5. Su, Yuahang; Kuo, Jay (2019). "On extended long short-term memory and dependent bidirectional recurrent neural network". Neurocomputing. 356: 151–161. arXiv:1803.01686. doi:10.1016/j.neucom.2019.04.044. S2CID 3675055.
  6. Chung, Junyoung; Gulcehre, Caglar; Cho, KyungHyun; Bengio, Yoshua (2014). "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling". arXiv:1412.3555 [cs.NE].
  7. Gruber, N.; Jockisch, A. (2020). "Are GRU cells more specific and LSTM cells more sensitive in motive classification of text?". Frontiers in Artificial Intelligence. 3: 40. doi:10.3389/frai.2020.00040. PMC 7861254. PMID 33733157. S2CID 220252321.
  8. Chung, Junyoung; Gulcehre, Caglar; Cho, KyungHyun; Bengio, Yoshua (2014). "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling". arXiv:1412.3555 [cs.NE].
  9. Dey, Rahul; Salem, Fathi M. (2017-01-20). "Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks". arXiv:1701.05923 [cs.NE].
  10. Heck, Joel; Salem, Fathi M. (2017-01-12). "Simplified Minimal Gated Unit Variations for Recurrent Neural Networks". arXiv:1701.03452 [cs.NE].
  11. Bittar, Alexandre; Garner, Philip N. (May 2021). "A Bayesian Interpretation of the Light Gated Recurrent Unit". ICASSP 2021 – 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, ON, Canada: IEEE. pp. 2965–2969. doi:10.1109/ICASSP39728.2021.9414259.