YSDA course in Natural Language Processing
- This is the 2024 version. For previous years' course materials, go to this branch
- Lecture and seminar materials for each week are in the ./week* folders; see each week's README.md for materials and instructions
- For any technical issues, ideas, bugs in course materials, or contribution ideas, open an issue
- Installing libraries and troubleshooting: see this thread.
week01: Word Embeddings
- Lecture: Word embeddings. Distributional semantics. Count-based (pre-neural) methods. Word2Vec: learn vectors. GloVe: count, then learn. Evaluation: intrinsic vs extrinsic. Analysis and Interpretability. Interactive lecture materials and more.
- Seminar: Playing with word and sentence embeddings
- Homework: Embedding-based machine translation system
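For reference, a minimal word2vec sketch using gensim (which may differ from the toolkit used in the seminar; the toy corpus below is made up):

```python
# A minimal word2vec sketch with gensim; the corpus is a made-up toy example.
from gensim.models import Word2Vec

toy_corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# Train skip-gram vectors on the toy corpus (sg=1 selects skip-gram over CBOW).
model = Word2Vec(sentences=toy_corpus, vector_size=32, window=2, min_count=1, sg=1, seed=0)

# Intrinsic check: nearest neighbours by cosine similarity.
print(model.wv.most_similar("cat", topn=3))
```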
week02: Text Classification
- Lecture: Text classification: introduction and datasets. General framework: feature extractor + classifier. Classical approaches: Naive Bayes, MaxEnt (Logistic Regression), SVM. Neural Networks: General View, Convolutional Models, Recurrent Models. Practical Tips: Data Augmentation. Analysis and Interpretability. Interactive lecture materials and more.
- Seminar: Text classification with convolutional NNs.
- Homework: Statistical & neural text classification.
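A minimal illustration of the "feature extractor + classifier" framework with scikit-learn; the texts and labels are invented, and the seminar itself works with convolutional networks:

```python
# Bag-of-words features + MaxEnt (logistic regression) classifier; toy data only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great movie, loved it", "terrible plot and bad acting",
         "what a wonderful film", "boring and way too long"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["such a bad movie", "loved the acting"]))
```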
week03: Language Modeling
- Lecture: Language Modeling: what does it mean? Left-to-right framework. N-gram language models. Neural Language Models: General View, Recurrent Models, Convolutional Models. Evaluation. Practical Tips: Weight Tying. Analysis and Interpretability. Interactive lecture materials and more.
- Seminar: Build an N-gram language model from scratch
- Homework: Neural LMs & smoothing in count-based models.
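A bare-bones bigram model with add-k smoothing, in the spirit of the from-scratch seminar (toy corpus, not the homework data):

```python
# Count bigrams and estimate smoothed conditional probabilities.
from collections import Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = set(corpus)
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev, word, k=1.0):
    # P(word | prev) with add-k smoothing over the vocabulary.
    return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * len(vocab))

print(bigram_prob("the", "cat"), bigram_prob("the", "dog"))
```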
week04: Seq2seq and Attention
- Lecture: Seq2seq Basics: Encoder-Decoder framework, Training, Simple Models, Inference (e.g., beam search). Attention: general, score functions, models. Transformer: self-attention, masked self-attention, multi-head attention; model architecture. Subword Segmentation (BPE). Analysis and Interpretability: functions of attention heads; probing for linguistic structure. Interactive lecture materials and more.
- Seminar: Basic sequence to sequence model
- Homework: Machine translation with attention
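A minimal scaled dot-product attention sketch in PyTorch; shapes and names are illustrative, not the seminar's exact model:

```python
# Scaled dot-product attention: softmax(QK^T / sqrt(d)) V, with an optional mask.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: [batch, seq_len, d_model]
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention distribution over source positions
    return weights @ v

q = k = v = torch.randn(1, 5, 16)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 5, 16])
```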
week05: Transfer Learning
- Lecture: What is Transfer Learning? Great idea 1: From Words to Words-in-Context (CoVe, ELMo). Great idea 2: From Replacing Embeddings to Replacing Models (GPT, BERT). (A Bit of) Adaptors. Analysis and Interpretability. Interactive lecture materials and more.
- Homework: fine-tuning a pre-trained BERT model
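A sketch of loading a pre-trained BERT for sequence classification with HuggingFace transformers; the checkpoint name and label count are placeholders, not the homework's exact setup:

```python
# Load a pre-trained encoder plus a fresh classification head to fine-tune.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

batch = tokenizer(["this course is great"], return_tensors="pt", padding=True, truncation=True)
logits = model(**batch).logits  # fine-tune these weights on the downstream task
print(logits.shape)  # (1, 2)
```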
week06: LLMs and Prompting
- Lecture: Scaling laws. Emergent abilities. Prompting (aka "in-context learning"): techniques that work; questioning whether the model "understands" prompts. Hypotheses for why and how in-context learning works. Analysis and Interpretability.
- Homework: manual prompt engineering and chain-of-thought reasoning
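A toy illustration of few-shot / chain-of-thought prompt construction; the examples and wording are invented, and the homework may require a different format:

```python
# Build a few-shot prompt ending with a chain-of-thought cue (toy example).
examples = [
    ("Q: There are 3 cars and each car has 4 wheels. How many wheels? ",
     "A: Each car has 4 wheels, 3 * 4 = 12. The answer is 12."),
]
question = "Q: A box holds 6 eggs. How many eggs are in 5 boxes? "

prompt = "\n".join(q + a for q, a in examples)
prompt += "\n" + question + "A: Let's think step by step."
print(prompt)  # send this string to the LLM of your choice
```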
week07: Transformer Architecture and Training
- Lecture: training tips for transformers; the evolution of the transformer architecture from Vaswani et al. (2017) to modern LLMs; parameter-efficient fine-tuning (PEFT)
- Homework: fine-tuning a large language model with PEFT algorithms
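A hedged LoRA sketch with HuggingFace peft; the base checkpoint and target_modules are assumptions that depend on the architecture you actually fine-tune:

```python
# Wrap a base causal LM with low-rank adapters; only the adapters are trained.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder checkpoint
config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                    target_modules=["c_attn"],  # module names depend on the architecture
                    task_type="CAUSAL_LM")
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```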
week08: Reinforcement Learning from Human Feedback
- Lecture: model alignment, RLHF, case study of InstructGPT and ChatGPT
- Homework: fine-tune your own language model with RL (using HuggingFace trl)
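Since trl's trainer API changes between versions, here is only the KL-penalized reward at the core of PPO-style RLHF, as a small PyTorch illustration (all numbers are made up):

```python
# Illustrative only: r_total = r_reward_model - beta * (log pi(y|x) - log pi_ref(y|x)).
import torch

def rlhf_reward(reward_model_score, logprobs_policy, logprobs_ref, beta=0.1):
    kl_penalty = (logprobs_policy - logprobs_ref).sum(dim=-1)  # per-sequence KL estimate
    return reward_model_score - beta * kl_penalty

scores = torch.tensor([1.5])                 # reward model output for a sampled reply
lp_pi = torch.tensor([[-2.1, -0.8, -1.3]])   # token log-probs under the tuned policy
lp_ref = torch.tensor([[-2.0, -1.0, -1.2]])  # token log-probs under the frozen reference
print(rlhf_reward(scores, lp_pi, lp_ref))
```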
week09 (extra): Domain Adaptation in NLP
- Lecture: why do domain adaptation? Methods: reweighting, proxy labels, adversarial domain adaptation
- Optional homework: implement domain adaptation when fine-tuning BERT models
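A minimal gradient-reversal layer for adversarial domain adaptation in PyTorch; the class name and lambda value are illustrative:

```python
# Gradient reversal: identity on the forward pass, negated gradient on the backward pass.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd=1.0):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients so the feature extractor fools the domain classifier.
        return -ctx.lambd * grad_output, None

features = torch.randn(4, 8, requires_grad=True)
reversed_features = GradReverse.apply(features)  # feed these to the domain classifier
reversed_features.sum().backward()
print(features.grad[0, :3])  # gradients are negated
```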
week10: Efficient Inference in NLP
- Lecture: how NLP models are deployed, a survey of compression and acceleration: quantization, sparsification, ACT & more
- Practice: implement RTN and GPTQ for 4-bit LLM quantization
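A tiny round-to-nearest (RTN) sketch with per-row symmetric scales in PyTorch; GPTQ itself is more involved and is left to the practice session:

```python
# Quantize weights to a symmetric int4 grid and dequantize back (per output row).
import torch

def rtn_quantize(w, bits=4):
    qmax = 2 ** (bits - 1) - 1                        # e.g. 7 for symmetric int4
    scale = w.abs().amax(dim=1, keepdim=True) / qmax  # one scale per output row
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale                                  # dequantized weights

w = torch.randn(8, 16)
w_q = rtn_quantize(w)
print((w - w_q).abs().mean())  # quantization error
```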
week11 (extra): Retrieval-Augmented Language Models
- Guest lecture: retrieval in LMs, token-level retrieval (KNNLM & more), RAG, RETRO, tools: langchain, HF Agents, open problems
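A minimal dense-retrieval sketch in numpy: cosine similarity between a query embedding and a document matrix (the embeddings here are random placeholders standing in for a real encoder):

```python
# Rank "documents" by cosine similarity to a query vector; embeddings are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
doc_embeddings = rng.normal(size=(100, 64))  # pretend these come from an encoder
query = rng.normal(size=64)

def top_k(query, docs, k=3):
    sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))
    return np.argsort(-sims)[:k]             # indices of the k most similar documents

print(top_k(query, doc_embeddings))          # these documents would be fed to the LM
```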
Course materials and teaching performed by
- Elena Voita - course author
- Mikhail Diskin, Ignat Romanov, Ruslan Svirschevski - lectures
- Valentina Broner - course admin for on-campus students
- Boris Kovarsky, David Talbot, Sergey Gubanov, Just Heuristic - helped build course materials and/or held some classes
- 30+ volunteers who contributed and refined the notebooks and course materials. Without their help, the course would not be what it is today
- A mighty host of TAs who stoically grade hundreds of homework submissions from on-campus students each year