DOI: 10.1613/jair.4992 - Corpus ID: 8273530
A Primer on Neural Network Models for Natural Language Processing
@article{Goldberg2015APO,
  title={A Primer on Neural Network Models for Natural Language Processing},
  author={Yoav Goldberg},
  journal={ArXiv},
  year={2015},
  volume={abs/1510.00726},
  url={https://api.semanticscholar.org/CorpusID:8273530}
}
- Yoav Goldberg
- Published in Journal of Artificial Intelligence Research, 2 October 2015
- Computer Science, Linguistics
This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.
Topics
Natural Language Processing, Recursive Networks, Speech Processing, Neural Network Models, Convolutional Networks, Recurrent Networks, Image Recognition, Feed-forward Network, Neural Network
1,181 Citations
Neural Network Methods for Natural Language Processing
- Yoav Goldberg
- 2017
Computer Science, Linguistics
This book focuses on the application of neural network models to natural language data, and introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models.
An analysis of convolutional neural networks for sentence classification
- João Paulo Albuquerque Vieira, R. Moura
- 2017
Computer Science
A series of experiments with convolutional neural networks for sentence-level classification tasks under different hyperparameter settings is presented, showing how sensitive model performance is to changes in these configurations.
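The hyperparameters such sentence-classification experiments sweep (filter widths, number of feature maps, dropout rate) map directly onto a standard one-layer sentence CNN. A minimal PyTorch sketch of that kind of model; every name and size below is illustrative, not taken from the paper:

```python
import torch
import torch.nn as nn

class SentenceCNN(nn.Module):
    """One-layer sentence CNN; every size here is an illustrative
    hyperparameter of the kind such studies vary."""
    def __init__(self, vocab_size=10_000, emb_dim=128,
                 filter_widths=(3, 4, 5), n_filters=100,
                 n_classes=2, dropout=0.5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # One 1D convolution per filter width, each producing n_filters maps.
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, w) for w in filter_widths)
        self.drop = nn.Dropout(dropout)
        self.out = nn.Linear(n_filters * len(filter_widths), n_classes)

    def forward(self, token_ids):                 # (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)   # (batch, emb_dim, seq_len)
        # Max-over-time pooling collapses each feature map to one value.
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.out(self.drop(torch.cat(pooled, dim=1)))

logits = SentenceCNN()(torch.randint(0, 10_000, (4, 20)))
print(logits.shape)  # torch.Size([4, 2])
```

Each constructor argument is one of the configurations whose sensitivity such studies measure.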
Word Based Prediction Using LSTM Neural Networks for Language Modeling
- Aswini Thota, P. S. Reddy
- 2021
Computer Science, Linguistics
This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task, gaining considerable improvements in WER on top of a state-of-the-art speech recognition system.
Natural language generation as neural sequence learning and beyond
- Xingxing Zhang
- 2017
Computer Science
A deep reinforcement learning framework is proposed to inject prior knowledge into neural NLG models, with application to sentence simplification, along with a novel hierarchical recurrent neural network architecture to represent and generate multiple sentences.
Empirical Exploration of Novel Architectures and Objectives for Language Models
- Gakuto Kurata, A. Sethy, B. Ramabhadran, G. Saon
- 2017
Computer Science, Linguistics
An empirical comparison of LSTM and CNN language models on English broadcast news and various conversational telephone speech transcription tasks is presented, and a novel criterion for training language models that combines word and class prediction in a multi-task learning framework is proposed.
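A multi-task word-plus-class criterion can be pictured as a shared recurrent encoder with two softmax heads whose cross-entropy losses are summed. The sketch below is a hedged illustration of that idea, not the paper's exact setup; in particular the toy class assignment (`word_id % 100`) and the equal loss weighting are my inventions:

```python
import torch
import torch.nn as nn

class MultiTaskLM(nn.Module):
    """Shared LSTM with two heads: next-word and next-class prediction."""
    def __init__(self, vocab_size=10_000, n_classes=100, emb=128, hid=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.rnn = nn.LSTM(emb, hid, batch_first=True)
        self.word_head = nn.Linear(hid, vocab_size)
        self.class_head = nn.Linear(hid, n_classes)

    def forward(self, ids):
        h, _ = self.rnn(self.emb(ids))
        return self.word_head(h), self.class_head(h)

model = MultiTaskLM()
ids = torch.randint(0, 10_000, (4, 12))
word_logits, class_logits = model(ids[:, :-1])

ce = nn.CrossEntropyLoss()
word_targets = ids[:, 1:]
class_targets = word_targets % 100   # toy stand-in for a real class assignment
# Joint objective: word loss plus class loss (the weighting is a free choice).
loss = ce(word_logits.reshape(-1, 10_000), word_targets.reshape(-1)) \
     + ce(class_logits.reshape(-1, 100), class_targets.reshape(-1))
loss.backward()
```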
Deep Learning for Natural Language Processing: A Survey
- E. O. Arkhangelskaya, S. I. Nikolenko
- 2023
Computer Science, Linguistics
Various deep architectures that have either arisen specifically for NLP tasks or have become a method of choice for them are surveyed; the tasks include sentiment analysis, dependency parsing, machine translation, dialog and conversational models, question answering, and other applications.
Chapter 1 Analyzing Neural Network Optimization with Gradient Tracing
- Brian DuSell
- 2018
Computer Science
In this chapter, the effect of neural network components on trainability is explored with a graph-traversal technique dubbed “gradient tracing,” which analyzes the extent to which certain components influence parameter updates and facilitate learning.
The Implications of Linguistic Characteristics in the Definition of a Learning Model
- Diego Uribe, Enrique Cuan, E. Urquizo
- 2021
Linguistics, Computer Science
2021 International Conference on Mechatronics…
This article shows that determining the linguistic characteristics of a text is fundamental to defining the deep learning model to be used, and applies quantitative methods and complexity indicators to provide empirical evidence for the adequacy of recurrent neural models and their variants.
Natural Language Understanding with Distributed Representation
- Kyunghyun Cho
- 2015
Computer Science, Linguistics
This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing; it first covers the basics of machine learning and neural networks at length, and only then introduces how they are applied to natural language.
Deep learning models to study sentence comprehension in the human brain
- S. Arana, Jacques Pesnot Lerousseau, P. Hagoort
- 2023
Computer Science, Linguistics
Two main results emerge: the neural representation of word meaning aligns with the context-dependent, dense word vectors used by the artificial neural networks, and the processing hierarchy that emerges within artificial neural networks broadly matches the brain, but is surprisingly inconsistent across studies.
...
184 References
LSTM Neural Networks for Language Modeling
- M. Sundermeyer, Ralf Schlüter, H. Ney
- 2012
Computer Science, Linguistics
This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task, gaining considerable improvements in WER on top of a state-of-the-art speech recognition system.
Statistical Language Models Based on Neural Networks
- Tomáš Mikolov (PhD thesis, Brno University of Technology)
- 2012
Computer Science, Linguistics
Although these models are computationally more expensive than n-gram models, the presented techniques make it possible to apply them efficiently to state-of-the-art systems, achieving the best published performance on the well-known Penn Treebank setup.
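One efficiency technique in this line of work is factorizing the output softmax by word classes, so that p(w | h) = p(c(w) | h) · p(w | c(w), h): scoring a word then needs one distribution over classes plus one within the word's class, rather than one over the full vocabulary. A toy sketch of the factorization with invented probabilities:

```python
# Class-factored softmax: p(word | h) = p(class | h) * p(word | class, h).
# With V words in C classes, scoring one word costs O(C + V/C) instead of O(V).
# All probabilities below are invented for illustration.

p_class = {"animals": 0.6, "verbs": 0.4}            # p(class | history)
p_word_in_class = {
    "animals": {"cat": 0.7, "dog": 0.3},            # p(word | class, history)
    "verbs":   {"runs": 0.5, "sleeps": 0.5},
}

def p_word(word, cls):
    return p_class[cls] * p_word_in_class[cls][word]

print(p_word("cat", "animals"))   # ≈ 0.42
total = sum(p_word(w, c) for c, ws in p_word_in_class.items() for w in ws)
print(total)                      # ≈ 1.0 -- a proper distribution over all words
```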
Recurrent neural network based language model
- Tomas Mikolov, M. Karafiát, L. Burget, J. Černocký, S. Khudanpur
- 2010
Computer Science
Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
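The "mixture" here is linear interpolation of per-word probabilities from several models, and perplexity is the exponentiated average negative log-probability; interpolation helps most when the component models err on different words. A small numeric sketch with invented probabilities:

```python
import math

# Per-word probabilities assigned to the same four-word test sentence by
# two hypothetical models: a backoff n-gram LM and an RNN LM.
p_ngram = [0.30, 0.01, 0.30, 0.01]
p_rnn   = [0.01, 0.30, 0.01, 0.30]

def perplexity(probs):
    # PPL = exp(-(1/N) * sum_i log p_i)
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# Linear interpolation with weight lam on the RNN model.
lam = 0.5
p_mix = [lam * r + (1 - lam) * n for r, n in zip(p_rnn, p_ngram)]

print(f"n-gram PPL:  {perplexity(p_ngram):.1f}")   # 18.3
print(f"RNN PPL:     {perplexity(p_rnn):.1f}")     # 18.3
print(f"mixture PPL: {perplexity(p_mix):.1f}")     # 6.5 -- the models err on different words
```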
A unified architecture for natural language processing: deep neural networks with multitask learning
- R. Collobert, J. Weston
- 2008
Computer Science, Linguistics
We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic…
Character-Aware Neural Language Models
- Yoon Kim, Yacine Jernite, D. Sontag, Alexander M. Rush
- 2016
Computer Science, Linguistics
A simple neural language model that relies only on character-level inputs is presented; it is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.
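"Character-level inputs" means each word vector is computed from its characters, e.g. by a convolution with max-over-time pooling, instead of being looked up in a word table. A minimal sketch of such an encoder (the published model additionally uses multiple filter widths and a highway layer; sizes here are illustrative):

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Builds a word vector from character embeddings via a 1D CNN
    with max-over-time pooling; all sizes are illustrative."""
    def __init__(self, n_chars=128, char_dim=16, n_filters=64, width=3):
        super().__init__()
        self.emb = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, n_filters, width)

    def forward(self, char_ids):                      # (batch, word_len)
        x = self.emb(char_ids).transpose(1, 2)        # (batch, char_dim, word_len)
        return self.conv(x).relu().max(dim=2).values  # (batch, n_filters)

# "cat" and "cart" as byte sequences, zero-padded to the same length.
words = torch.tensor([[99, 97, 116, 0], [99, 97, 114, 116]])
vecs = CharWordEncoder()(words)
print(vecs.shape)  # torch.Size([2, 64]) -- one vector per word, no word lookup table
```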
Natural Language Understanding with Distributed Representation
- Kyunghyun Cho
- 2015
Computer Science, Linguistics
This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing; it first covers the basics of machine learning and neural networks at length, and only then introduces how they are applied to natural language.
Learning Longer Memory in Recurrent Neural Networks
- Tomas Mikolov, Armand Joulin, S. Chopra, Michaël Mathieu, Marc'Aurelio Ranzato
- 2015
Computer Science
This paper shows that learning longer-term patterns in real data, such as natural language, is perfectly possible using gradient descent, via a slight structural modification of the simple recurrent neural network architecture.
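The modification adds a slowly-changing context state alongside the ordinary fast hidden state, so gradients can flow farther back in time. The recurrence below is a rough sketch in that spirit, with all sizes and the decay constant chosen for illustration rather than taken from the paper:

```python
import torch
import torch.nn as nn

class SlowFastRNN(nn.Module):
    """Simple RNN plus a slowly-updated context state; an illustrative
    sketch of the idea, not a verified reimplementation of the paper."""
    def __init__(self, emb=64, hid=128, ctx=32, alpha=0.95):
        super().__init__()
        self.alpha = alpha                 # fixed decay near 1 -> long memory
        self.B = nn.Linear(emb, ctx, bias=False)
        self.A = nn.Linear(emb, hid, bias=False)
        self.P = nn.Linear(ctx, hid, bias=False)
        self.R = nn.Linear(hid, hid, bias=False)

    def forward(self, x):                  # x: (batch, seq_len, emb)
        b, t, _ = x.shape
        h = x.new_zeros(b, self.R.in_features)
        s = x.new_zeros(b, self.B.out_features)
        outs = []
        for i in range(t):
            # The context changes slowly, so its gradients reach far back
            # without vanishing as fast as in the plain recurrence.
            s = self.alpha * s + (1 - self.alpha) * self.B(x[:, i])
            h = torch.sigmoid(self.A(x[:, i]) + self.P(s) + self.R(h))
            outs.append(torch.cat([h, s], dim=-1))
        return torch.stack(outs, dim=1)    # (batch, seq_len, hid + ctx)

y = SlowFastRNN()(torch.randn(2, 10, 64))
print(y.shape)  # torch.Size([2, 10, 160])
```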
A Convolutional Neural Network for Modelling Sentences
- Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom
- 2014
Computer Science
A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described and adopted for the semantic modelling of sentences; it induces a feature graph over the sentence that is capable of explicitly capturing short- and long-range relations.
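The DCNN's characteristic operation is (dynamic) k-max pooling: keep the k largest activations of each feature map in their original left-to-right order, so intermediate layers retain positional structure. A sketch of the pooling step alone, with the "dynamic" choice of k (a function of sentence length and layer depth) reduced to a fixed k:

```python
import torch

def k_max_pool(x, k):
    """Keep the k largest values along the last axis of x, preserving
    their original left-to-right order. x: (batch, channels, seq_len)."""
    top = x.topk(k, dim=-1).indices               # positions of the k maxima
    return x.gather(-1, top.sort(dim=-1).values)  # re-sort by position

x = torch.tensor([[[1.0, 9.0, 2.0, 8.0, 3.0]]])   # one feature map, 5 positions
print(k_max_pool(x, 3))  # tensor([[[9., 8., 3.]]]) -- order preserved
```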
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
- Kyunghyun Cho, B. V. Merrienboer, Dzmitry Bahdanau, Yoshua Bengio
- 2014
Computer Science
SSST@EMNLP
It is shown that neural machine translation performs relatively well on short sentences without unknown words, but that its performance degrades rapidly as the length of the sentence and the number of unknown words increase.
Joint Language and Translation Modeling with Recurrent Neural Networks
- Michael Auli, Michel Galley, Chris Quirk, G. Zweig
- 2013
Computer Science, Linguistics
This work presents a joint language and translation model based on a recurrent neural network which predicts target words from an unbounded history of both source and target words, and which shows competitive accuracy compared to traditional channel model features.
...