Nativeatom/NaturalLanguageProcessing
This repository collects basic concepts of Natural Language Processing, well-regarded textbooks and blogs, popular papers, and related resources.
This is also the Natural Language Processing part of Machine Learning Resources, created by a group of people including jindongwang.
Contributors are welcome to work together and make it BETTER!
Linear Algebra
Matrix Analysis
Convex Optimization
- The Elements of Statistical Learning (ESL) - Hastie, Tibshirani, Friedman (HTF)
- CS228 Probabilistic Graphical Model - Stanford
- 10708 Probabilistic Graphical Model - CMU
- Deep Learning - Ian Goodfellow, Yoshua Bengio, Aaron Courville
- CS231n Convolutional Neural Networks for Visual Recognition - Stanford
- Foundations of Statistical Natural Language Processing - Chris Manning
- Speech and Language Processing - Daniel Jurafsky and James H. Martin
- Statistical Learning Methods (统计学习方法) - Hang Li (李航)
- Advanced Natural Language Processing - MIT
- CS 224n Natural Language Processing with Deep Learning - Stanford
- Deep Learning for NLP at Oxford with Deepmind - Oxford
- 11-747 NN4NLP
- 11-737 Multilingual NLP
- Some Knowledge about Machine Learning
- A list of datasets
Probabilistic Graphical Model
- Hidden Markov Model
- Conditional Random Fields
Topic Model
- Latent Dirichlet Allocation (paper)
Deep Learning Model
- Long Short-Term Memory (LSTM), Sepp Hochreiter, 1997
- Interpretation, Omer Levy, UWashington, 2018
- Recurrent Neural Network - Seq2Seq (TensorFlow Tutorial) - Machine Translation TensorFlow implementation
- Convolutional Neural Network
- Attention Model
- Overview(Chinese)
- Generative Adversarial Network (GAN)
- Transformer
- Training Tips
- Bidirectional Encoder Representations from Transformers (BERT), Jacob Devlin, Google, 2018
- Long Short-Term Memory (LSTM), Sepp Hochreiter, 1997
- TensorFlow implementation of RNNs and undocumented features
- The Unreasonable Effectiveness of Recurrent Neural Networks
Areas are categorized according to the tracks of ACL 2018, ACL 2020, and EMNLP 2020.
- Task
- Summarization
- Opinion Summarization
- Evaluation
- Model
- Extractive
- Generative
- Hybrid
- Dataset
- XSum, EMNLP2018 [paper]
- CNN/DailyMail
- NEWSROOM
- Multi-News
- Gigaword
- arXiv
- PubMed
- BIGPATENT
- WikiHow
- Reddit TIFU (long, short)
- AESLC
- BillSum
- Model
- Word2Vec
- Pre-trained Embedding
- GloVe
- word2vec
- FastText
- Contextual Word Embedding
- ELMo
- GPT
- BERT
- XLNet
- BART
- T5
- Task
- Word Segmentation
- Syntactic Parsing
- Model
- Hidden Markov Model (HMM)
- Conditional Random Fields (CRFs)
- Finetuned Language Models
- Task
- Constituency Parsing
- Dependency Parsing
- Visually Grounded Syntactic Acquisition
- Model
- Dataset
Tasks
- Semantic Parsing
- AMR-to-text
- Text-to-AMR
- Table-to-text
- Code Generation
- Semantic Parsing
Model
Dataset
- Tasks
- Word Sense Disambiguation
- Tasks
- Topic Extraction
- Sentiment Extraction
- Aspect Extraction
- Task
- Machine Translation
- Non-autoregressive Machine Translation
- Word-alignment
- Model
- Dataset
- WMT
- Task
- SPAM Classification
- Sentiment Analysis
- Model
- Dataset
- Task
- Dataset
- CNN/DailyMail
- SQuAD
- Benchmark: F1 86.967, BERT + Synthetic Self-Training (ensemble), Jan 10, 2019
- RACE
- Benchmark: RACE 83.2, RACE-M 86.5, RACE-H 81.3, RoBERTa, July 2019
- Task
- Code-Switching
- Multilingual Translation
- Model
- Dataset
- Tasks
- Model
- N-gram
- ELMo, NAACL2018
- GPT
- GPT-2, arXiv2019
- GPT-3, NeurIPS2020
- BERT, NAACL2019
- RoBERTa, arXiv 2019
- SpanBERT, TACL 2020
- Efficient
- Domain Specific
- Language Specific [Latin BERT, German BERT, Italian BERT, Chinese BERT]
- BERTology, TACL 2020
- XLNet, NeurIPS2019
- MASS, ICML2019 [code]
- ELECTRA, ICLR2020 [code]
- T5, JMLR2020
- BART, ACL2020
- Finetuning
- Invasive (LM not fixed)
- Regular finetuning
- Re-initialization for few-shot learning, ICLR2021
- Non-invasive (LM fixed)
- Prefix-tuning, arXiv2021
- Invasive (LM not fixed)
- Language Model as
- BERTScore, ICLR2020
- Few-shot learner
- Bias in few-shot examples, arXiv2021
- Knowledge base, EMNLP2019, Tutorial@AAAI2021
- Dataset
- CommonCrawl
- Wiki-Text
- STORIES
- C4 [huggingface]
Tasks
- Fact Verification
- Commonsense Reasoning
- Word-level Rationales
- Factually Consistent Generation
Model
Dataset
- Tasks
- Grammatical Error Correction (GEC) [BEA@NAACL2018, BEA@ACL2019, BEA@ACL2020, BEA@EACL2021]
- Lexical Substitution
- Lexical Simplification
- Model
- Dataset
- Machine Learning Package and Framework
- scikit-learn
- TensorFlow
- Caffe2
- PyTorch
- MXNet
- NLTK
- gensim
- jieba
- Stanford NLP
- Transformers (huggingface)
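If you are interested in this project, you are very welcome to join!
- Regular participation: simply fork the repository and open a pull request.
- To upload files: please do not upload them directly into the project, as that would make the git repository too large. The correct approach is to upload a hyperlink to the file. If the file is already available on the web (papers, for example, always have a link), just add the link directly; if it is a file or dataset of your own that you want to share, then, given the state of file-hosting services in China, please upload it as follows:
Learn how to collaborate through GitHub
Keep your fork up to date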