In artificial neural networks, the gated recurrent unit (GRU) is a gating mechanism used in recurrent neural networks, introduced in 2014 by Kyunghyun Cho et al.[1] The GRU is like a long short-term memory (LSTM) with a gating mechanism to input or forget certain features,[2] but lacks a context vector or output gate, resulting in fewer parameters than LSTM.[3] GRU's performance on certain tasks of polyphonic music modeling, speech signal modeling and natural language processing was found to be similar to that of LSTM.[4][5] GRUs showed that gating is indeed helpful in general, but Bengio's team came to no concrete conclusion on which of the two gating units was better.[6][7]
There are several variations on the full gated unit, with gating done using the previous hidden state and the bias in various combinations, as well as a simplified form called the minimal gated unit.[8]
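One common formulation of the fully gated unit is sketched below; the notation is illustrative rather than canonical. Here $x_t$ is the input vector, $h_t$ the output (hidden state) vector, $\hat{h}_t$ the candidate activation, $z_t$ the update gate, $r_t$ the reset gate, $W$, $U$ and $b$ the learned parameters, $\sigma$ the logistic sigmoid, and $\odot$ the Hadamard product:

\begin{aligned}
z_t &= \sigma\!\left(W_z x_t + U_z h_{t-1} + b_z\right) \\
r_t &= \sigma\!\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
\hat{h}_t &= \tanh\!\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \hat{h}_t
\end{aligned}

Conventions differ in whether $z_t$ weights the candidate activation or the previous state; the two choices are equivalent up to relabeling the gate.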
The minimal gated unit (MGU) is similar to the fully gated unit, except the update and reset gate vectors are merged into a single forget gate. This also implies that the equation for the output vector must be changed.[10]
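A minimal sketch of the resulting MGU equations, with a single forget gate $f_t$ taking the place of both $z_t$ and $r_t$ (notation as above):

\begin{aligned}
f_t &= \sigma\!\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
\hat{h}_t &= \tanh\!\left(W_h x_t + U_h \left(f_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= \left(1 - f_t\right) \odot h_{t-1} + f_t \odot \hat{h}_t
\end{aligned}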
The light gated recurrent unit (LiGRU)[4] removes the reset gate altogether, replaces tanh with the ReLU activation, and applies batch normalization (BN).
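Under that description, the LiGRU state update can be sketched as follows, where BN denotes batch normalization and, in this convention, the update gate $z_t$ weights the previous state:

\begin{aligned}
z_t &= \sigma\!\left(\mathrm{BN}\!\left(W_z x_t\right) + U_z h_{t-1}\right) \\
\tilde{h}_t &= \operatorname{ReLU}\!\left(\mathrm{BN}\!\left(W_h x_t\right) + U_h h_{t-1}\right) \\
h_t &= z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t
\end{aligned}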
LiGRU has been studied from a Bayesian perspective.[11] This analysis yielded a variant called the light Bayesian recurrent unit (LiBRU), which showed slight improvements over the LiGRU on speech recognition tasks.
^ Cho, Kyunghyun; van Merrienboer, Bart; Gulcehre, Caglar; Bahdanau, Dzmitry; Bougares, Fethi; Schwenk, Holger; Bengio, Yoshua (2014). "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation". Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP): 1724–1734. arXiv:1406.1078. doi:10.3115/v1/D14-1179.
^ Chung, Junyoung; Gulcehre, Caglar; Cho, KyungHyun; Bengio, Yoshua (2014). "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling". arXiv:1412.3555 [cs.NE].
^ Dey, Rahul; Salem, Fathi M. (2017-01-20). "Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks". arXiv:1701.05923 [cs.NE].
^ Heck, Joel; Salem, Fathi M. (2017-01-12). "Simplified Minimal Gated Unit Variations for Recurrent Neural Networks". arXiv:1701.03452 [cs.NE].
^ Bittar, Alexandre; Garner, Philip N. (May 2021). "A Bayesian Interpretation of the Light Gated Recurrent Unit". ICASSP 2021 – 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, ON, Canada: IEEE. pp. 2965–2969. doi:10.1109/ICASSP39728.2021.9414259.