flypythoncom/LLM101n_CN (forked from karpathy/LLM101n)
What I cannot create, I do not understand. - Richard Feynman
In this course we will build a Storyteller AI Large Language Model (LLM). Hand in hand, you'll be able to create, refine, and illustrate little stories with the AI. We are going to build everything end-to-end, from the basics to a functioning web app similar to ChatGPT, from scratch in Python, C, and CUDA, with minimal computer science prerequisites. By the end you should have a relatively deep understanding of AI, LLMs, and deep learning more generally.
Syllabus
- Chapter 01 Bigram Language Model (language modeling)
- Chapter 02 Micrograd (machine learning, backpropagation)
- Chapter 03 N-gram model (multi-layer perceptron, matmul, gelu)
- Chapter 04 Attention (attention, softmax, positional encoder)
- Chapter 05 Transformer (transformer, residual, layernorm, GPT-2)
- Chapter 06 Tokenization (minBPE, byte pair encoding)
- Chapter 07 Optimization (initialization, optimization, AdamW)
- Chapter 08 Need for Speed I: Device (device, CPU, GPU, ...)
- Chapter 09 Need for Speed II: Precision (mixed precision training, fp16, bf16, fp8, ...)
- Chapter 10 Need for Speed III: Distributed (distributed optimization, DDP, ZeRO)
- Chapter 11 Datasets (datasets, data loading, synthetic data generation)
- Chapter 12 Inference I: kv-cache (kv-cache)
- Chapter 13 Inference II: Quantization (quantization)
- Chapter 14 Finetuning I: SFT (supervised finetuning SFT, PEFT, LoRA, chat)
- Chapter 15 Finetuning II: RL (reinforcement learning, RLHF, PPO, DPO)
- Chapter 16 Deployment (API, web app)
- Chapter 17 Multimodal (VQVAE, diffusion transformer)
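To give a flavor of where the syllabus starts, here is a minimal sketch of the kind of bigram language model Chapter 01 covers: count character-to-character transitions in a tiny corpus, then normalize the counts into next-character probabilities. The corpus and function names here are hypothetical illustrations, not the course's actual code.

```python
from collections import defaultdict

def bigram_counts(text):
    # Count character-level bigrams per word, with start/end markers
    # so the model also learns where words begin and end.
    counts = defaultdict(lambda: defaultdict(int))
    for word in text.split():
        chars = ["<s>"] + list(word) + ["</s>"]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts

def next_char_probs(counts, ch):
    # Normalize the counts for context `ch` into a probability distribution.
    total = sum(counts[ch].values())
    return {c: n / total for c, n in counts[ch].items()}

counts = bigram_counts("a cat sat on a mat")
probs = next_char_probs(counts, "a")
# After "a" the corpus contains "t" three times ("cat", "sat", "mat")
# and end-of-word twice (the two standalone words "a").
```

Sampling repeatedly from these distributions, starting at `<s>`, already generates (crude) new words; the rest of the course replaces the count table with progressively more capable neural networks.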
Appendix
Further topics to work into the progression above:
- Programming languages: Assembly, C, Python
- Data types: Integer, Float, String (ASCII, Unicode, UTF-8)
- Tensor: shapes, views, strides, contiguous, ...
- Deep Learning frameworks: PyTorch, JAX
- Neural Net Architecture: GPT (1,2,3,4), Llama (RoPE, RMSNorm, GQA), MoE, ...
- Multimodal: Images, Audio, Video, VQVAE, VQGAN, diffusion
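The tensor topics above (shapes, views, strides, contiguity) can be illustrated without any framework: a contiguous row-major tensor stores its data in one flat buffer, and strides say how far to jump per index. This is a hypothetical pure-Python sketch of that bookkeeping, not PyTorch's actual implementation.

```python
def row_major_strides(shape):
    # Strides (in elements) for a contiguous row-major tensor:
    # the last dimension varies fastest, so its stride is 1.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides

def index_to_offset(idx, strides):
    # Flat-buffer offset of a multi-dimensional index.
    return sum(i * s for i, s in zip(idx, strides))

strides = row_major_strides((2, 3, 4))      # [12, 4, 1]
offset = index_to_offset((1, 2, 3), strides)  # 1*12 + 2*4 + 3*1 = 23

# A transpose is just a view: swap shape and strides, leave the buffer
# alone. The swapped strides are no longer row-major, which is why a
# transposed view is typically non-contiguous.
s2 = row_major_strides((3, 4))  # [4, 1]
s2_T = s2[::-1]                 # [1, 4]: same data, different walk order
```

Understanding this is what makes operations like `.view()`, `.transpose()`, and `.contiguous()` in PyTorch unsurprising.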
About
Chinese translation of LLM101n.