A Primer on Neural Network Models for Natural Language Processing

@article{Goldberg2015APO,
  title={A Primer on Neural Network Models for Natural Language Processing},
  author={Yoav Goldberg},
  journal={ArXiv},
  year={2015},
  volume={abs/1510.00726},
  url={https://api.semanticscholar.org/CorpusID:8273530}
}
This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.

1,181 Citations

Neural Network Methods for Natural Language Processing

This book focuses on the application of neural network models to natural language data, and introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models.

An analysis of convolutional neural networks for sentence classification

A series of experiments with Convolutional Neural Networks for sentence-level classification tasks is presented, examining different hyperparameter settings and how sensitive model performance is to changes in these configurations.

Word Based Prediction Using LSTM Neural Networks for Language Modeling

This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task and gains considerable improvements in WER on top of a state-of-the-art speech recognition system.

Natural language generation as neural sequence learning and beyond

A deep reinforcement learning framework is proposed to inject prior knowledge into neural based NLG models and apply it to sentence simplification, and a novel hierarchical recurrent neural network architecture to represent and generate multiple sentences is proposed.

Empirical Exploration of Novel Architectures and Objectives for Language Models

An empirical comparison of LSTM and CNN language models on English broadcast news and various conversational telephone speech transcription tasks is presented and a novel criterion for training language models that combines word and class prediction in a multi-task learning framework is proposed.

Deep Learning for Natural Language Processing: A Survey

Various deep architectures that have either arisen specifically for NLP tasks or have become a method of choice for them are surveyed; the tasks include sentiment analysis, dependency parsing, machine translation, dialog and conversational models, question answering, and other applications.

Chapter 1 Analyzing Neural Network Optimization with Gradient Tracing

Brian DuSell · Computer Science · 2018
In this chapter, the effect of neural network components on trainability is explored with a graph-traversal technique dubbed “gradient tracing,” which analyzes the extent to which certain components influence parameter updates and facilitate learning.

The Implications of Linguistic Characteristics in the Definition of a Learning Model

This article shows that determining the linguistic characteristics of a text is fundamental to defining the deep learning model to be used, and it applies quantitative methods and complexity indicators to provide empirical evidence for the adequacy of recurrent neural models and their corresponding variants.

Natural Language Understanding with Distributed Representation

This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing; it spends considerable time on the basics of machine learning and neural networks, and only afterwards introduces how they are applied to natural language.

Deep learning models to study sentence comprehension in the human brain

Two main results emerge: the neural representation of word meaning aligns with the context-dependent, dense word vectors used by artificial neural networks, and the processing hierarchy that emerges within artificial neural networks broadly matches that of the brain, but is surprisingly inconsistent across studies.
...

184 References

LSTM Neural Networks for Language Modeling

This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task and gains considerable improvements in WER on top of a state-of-the-art speech recognition system.

Statistical Language Models Based on Neural Networks

Although these models are computationally more expensive than N-gram models, the presented techniques make it possible to apply them efficiently to state-of-the-art systems, achieving the best published performance on the well-known Penn Treebank setup.

Recurrent neural network based language model

Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
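Mixing several RNN language models amounts to averaging their per-word probability distributions before computing perplexity. A minimal sketch of linear interpolation, with toy hand-set distributions standing in for trained RNN outputs:

```python
import math

def interpolate(dists, weights):
    """Linearly interpolate per-word probability distributions
    from several language models (weights must sum to 1)."""
    mixed = {}
    for dist, w in zip(dists, weights):
        for word, p in dist.items():
            mixed[word] = mixed.get(word, 0.0) + w * p
    return mixed

def perplexity(probs):
    """Perplexity of a sequence given its per-token probabilities."""
    nll = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(nll)

# Two toy "models" disagree on the next word; the mixture hedges.
m1 = {"cat": 0.7, "dog": 0.3}
m2 = {"cat": 0.2, "dog": 0.8}
mix = interpolate([m1, m2], [0.5, 0.5])
print(round(mix["cat"], 2), round(mix["dog"], 2))  # 0.45 0.55
```

In practice the mixture weights are tuned on held-out data; equal weights are used here only for illustration.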

A unified architecture for natural language processing: deep neural networks with multitask learning

We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic ...

Character-Aware Neural Language Models

A simple neural language model that relies only on character-level inputs is described; it is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.
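A character-aware model builds each word's representation from its characters alone, e.g. by a convolution over character embeddings followed by max-over-time pooling. A toy sketch with random embeddings and a sum-over-window filter standing in for a learned CNN filter (all dimensions and names are illustrative, not the paper's architecture):

```python
import random

random.seed(0)
DIM = 4
# Toy character embedding table, filled lazily.
char_emb = {}

def embed(ch):
    if ch not in char_emb:
        char_emb[ch] = [random.uniform(-1, 1) for _ in range(DIM)]
    return char_emb[ch]

def word_vector(word, width=3):
    """Max-over-time pooling of summed character embeddings in each
    window of `width` characters -- a stand-in for a CNN filter."""
    word = "^" + word + "$"          # word-boundary markers
    windows = [word[i:i + width] for i in range(len(word) - width + 1)]
    feats = [[sum(embed(c)[d] for c in w) for d in range(DIM)]
             for w in windows]
    return [max(f[d] for f in feats) for d in range(DIM)]

# Orthographically similar words share characters, hence features.
print(word_vector("run"), word_vector("runs"))
```

Because the representation is built from characters, out-of-vocabulary words still receive a vector, which is the key property the paper exploits.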

Natural Language Understanding with Distributed Representation

This lecture note introduces readers to a neural-network-based approach to natural language understanding/processing; it spends considerable time on the basics of machine learning and neural networks, and only afterwards introduces how they are applied to natural language.

Learning Longer Memory in Recurrent Neural Networks

This paper shows that learning longer-term patterns in real data, such as natural language, is perfectly possible using gradient descent, by means of a slight structural modification of the simple recurrent neural network architecture.

A Convolutional Neural Network for Modelling Sentences

A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described and adopted for the semantic modelling of sentences; it induces a feature graph over the sentence that is capable of explicitly capturing short- and long-range relations.
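The DCNN's central pooling operation, k-max pooling, keeps the k largest values in a feature sequence while preserving their original order. A minimal sketch of the non-dynamic variant (pure Python, not the authors' code):

```python
def k_max_pool(seq, k):
    """Return the k largest values of seq, in their original order.
    This is (non-dynamic) k-max pooling as used in the DCNN."""
    if k >= len(seq):
        return list(seq)
    # Indices of the k largest activations, then restore sequence order.
    top = sorted(range(len(seq)), key=lambda i: seq[i], reverse=True)[:k]
    return [seq[i] for i in sorted(top)]

print(k_max_pool([1, 5, 2, 9, 3], 2))  # [5, 9]
```

In the dynamic variant, k is itself a function of sentence length and network depth, so longer sentences keep proportionally more activations in lower layers.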

On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

It is shown that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.

Joint Language and Translation Modeling with Recurrent Neural Networks

This work presents a joint language and translation model based on a recurrent neural network, which predicts target words from an unbounded history of both source and target words and shows competitive accuracy compared to traditional channel-model features.
...
