A Primer on Neural Network Models for Natural Language Processing

@article{Goldberg2015APO,
  title={A Primer on Neural Network Models for Natural Language Processing},
  author={Yoav Goldberg},
  journal={ArXiv},
  year={2015},
  volume={abs/1510.00726},
  url={https://api.semanticscholar.org/CorpusID:8273530}
}
This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.

1,181 Citations

Neural Network Methods for Natural Language Processing

This book focuses on the application of neural network models to natural language data, and introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models.

An analysis of convolutional neural networks for sentence classification

A series of experiments with Convolutional Neural Networks for sentence-level classification tasks is presented, examining different hyperparameter settings and how sensitive model performance is to changes in these configurations.

Word Based Prediction Using LSTM Neural Networks for Language Modeling

This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task and gains considerable improvements in WER on top of a state-of-the-art speech recognition system.

Natural language generation as neural sequence learning and beyond

A deep reinforcement learning framework is proposed to inject prior knowledge into neural based NLG models and apply it to sentence simplification, and a novel hierarchical recurrent neural network architecture to represent and generate multiple sentences is proposed.

Empirical Exploration of Novel Architectures and Objectives for Language Models

An empirical comparison of LSTM and CNN language models on English broadcast news and various conversational telephone speech transcription tasks is presented and a novel criterion for training language models that combines word and class prediction in a multi-task learning framework is proposed.

Deep Learning for Natural Language Processing: A Survey

Various deep architectures that have either arisen specifically for NLP tasks or have become a method of choice for them are surveyed; the tasks include sentiment analysis, dependency parsing, machine translation, dialog and conversational models, question answering, and other applications.

Chapter 1 Analyzing Neural Network Optimization with Gradient Tracing

Brian DuSell · Computer Science · 2018
In this chapter, the effect of neural network components on trainability is explored with a graph-traversal technique dubbed “gradient tracing,” which analyzes the extent to which certain components influence parameter updates and facilitate learning.

The Implications of Linguistic Characteristics in the Definition of a Learning Model

This article shows that determining the linguistic characteristics of a text is fundamental to defining the deep learning model to be used, and it applies quantitative methods and complexity indicators to provide empirical evidence for the adequacy of recurrent neural models and their corresponding variants.

Natural Language Understanding with Distributed Representation

This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing; it spends considerable time on the basics of machine learning and neural networks, and only afterwards introduces how they are applied to natural language.

Deep learning models to study sentence comprehension in the human brain

Two main results emerge: the neural representation of word meaning aligns with the context-dependent, dense word vectors used by artificial neural networks, and the processing hierarchy that emerges within artificial neural networks broadly matches that of the brain, but is surprisingly inconsistent across studies.
...

184 References

LSTM Neural Networks for Language Modeling

This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task and gains considerable improvements in WER on top of a state-of-the-art speech recognition system.

Statistical Language Models Based on Neural Networks

Although these models are computationally more expensive than N-gram models, the presented techniques make it possible to apply them efficiently to state-of-the-art systems, achieving the best published performance on the well-known Penn Treebank setup.

Recurrent neural network based language model

Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
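Mixing several RNN language models amounts to averaging their per-word probability distributions before computing perplexity. A minimal sketch of linear interpolation, with toy hand-set distributions standing in for trained RNN outputs:

```python
import math

def interpolate(dists, weights):
    """Linearly interpolate per-word probability distributions
    from several language models (weights must sum to 1)."""
    mixed = {}
    for dist, w in zip(dists, weights):
        for word, p in dist.items():
            mixed[word] = mixed.get(word, 0.0) + w * p
    return mixed

def perplexity(probs):
    """Perplexity of a sequence given its per-token probabilities."""
    nll = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(nll)

# Two toy "models" disagree on the next word; the mixture hedges.
m1 = {"cat": 0.7, "dog": 0.3}
m2 = {"cat": 0.2, "dog": 0.8}
mix = interpolate([m1, m2], [0.5, 0.5])
print(round(mix["cat"], 2), round(mix["dog"], 2))  # 0.45 0.55
```

In practice the mixture weights are tuned on held-out data; equal weights are used here only for illustration.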

A unified architecture for natural language processing: deep neural networks with multitask learning

We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic ...

Character-Aware Neural Language Models

A simple neural language model that relies only on character-level inputs is described; it is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.
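A character-aware model builds each word's representation from its characters alone, e.g. by a convolution over character embeddings followed by max-over-time pooling. A toy sketch with random embeddings and a sum-over-window filter standing in for a learned CNN filter (all dimensions and names are illustrative, not the paper's architecture):

```python
import random

random.seed(0)
DIM = 4
# Toy character embedding table, filled lazily.
char_emb = {}

def embed(ch):
    if ch not in char_emb:
        char_emb[ch] = [random.uniform(-1, 1) for _ in range(DIM)]
    return char_emb[ch]

def word_vector(word, width=3):
    """Max-over-time pooling of summed character embeddings in each
    window of `width` characters -- a stand-in for a CNN filter."""
    word = "^" + word + "$"          # word-boundary markers
    windows = [word[i:i + width] for i in range(len(word) - width + 1)]
    feats = [[sum(embed(c)[d] for c in w) for d in range(DIM)]
             for w in windows]
    return [max(f[d] for f in feats) for d in range(DIM)]

# Orthographically similar words share characters, hence features.
print(word_vector("run"), word_vector("runs"))
```

Because the representation is built from characters, out-of-vocabulary words still receive a vector, which is the key property the paper exploits.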

Natural Language Understanding with Distributed Representation

This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing; it spends considerable time on the basics of machine learning and neural networks, and only afterwards introduces how they are applied to natural language.

Learning Longer Memory in Recurrent Neural Networks

This paper shows that learning longer-term patterns in real data, such as natural language, is perfectly possible using gradient descent, by means of a slight structural modification of the simple recurrent neural network architecture.

A Convolutional Neural Network for Modelling Sentences

A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described and adopted for the semantic modelling of sentences; it induces a feature graph over the sentence that is capable of explicitly capturing short- and long-range relations.
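The DCNN's central pooling operation, k-max pooling, keeps the k largest values in a feature sequence while preserving their original order. A minimal sketch of the non-dynamic variant (pure Python, not the authors' code):

```python
def k_max_pool(seq, k):
    """Return the k largest values of seq, in their original order.
    This is (non-dynamic) k-max pooling as used in the DCNN."""
    if k >= len(seq):
        return list(seq)
    # Indices of the k largest activations, then restore sequence order.
    top = sorted(range(len(seq)), key=lambda i: seq[i], reverse=True)[:k]
    return [seq[i] for i in sorted(top)]

print(k_max_pool([1, 5, 2, 9, 3], 2))  # [5, 9]
```

In the dynamic variant, k is itself a function of sentence length and network depth, so longer sentences keep proportionally more activations in lower layers.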

On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

It is shown that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.

Joint Language and Translation Modeling with Recurrent Neural Networks

This work presents a joint language and translation model based on a recurrent neural network, which predicts target words from an unbounded history of both source and target words and shows competitive accuracy compared to traditional channel-model features.
...
