A collection of tricks to simplify and speed up transformer models:
- Slim attention: paper, video, podcast, notebook, code-readme, 🤗 article, reddit
- FlashNorm: paper, video, podcast, notebook, code-readme
- Matrix-shrink [work in progress]: paper
- Precomputing the first layer: paper, video, podcast
- KV-weights only for skipless transformers: paper, video, podcast, notebook
These transformer tricks extend a recent trend in neural network design toward architectural parsimony, in which unnecessary components are removed to create more efficient models. Notable examples include RMSNorm's simplification of LayerNorm by removing mean centering, PaLM's elimination of bias parameters, and the decoder-only transformer's omission of the encoder stack. This trend began with the original transformer model's removal of recurrence and convolutions.
For example, our FlashNorm removes the weights from RMSNorm and merges them with the next linear layer, and slim attention removes the entire V-cache from the context memory of MHA transformers.
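As a rough illustration of the FlashNorm idea, the sketch below folds RMSNorm's scale vector into the weight matrix of the following linear layer. This is a minimal NumPy sketch, not the package's actual API; all names (`rms_norm`, `W_folded`, the dimensions) are made up for this example.

```python
# Minimal sketch of the FlashNorm fold (illustrative only, not the library API):
# RMSNorm's elementwise scale g can be merged into the weight matrix of the
# linear layer that follows it, because  W @ (g * x) == (W * g) @ x.

import numpy as np

def rms_norm(x, g, eps=1e-6):
    """Standard RMSNorm: normalize by root-mean-square, then scale by g."""
    return x / np.sqrt(np.mean(x**2) + eps) * g

d, d_out = 8, 16
rng = np.random.default_rng(0)
x = rng.standard_normal(d)
g = rng.standard_normal(d)           # RMSNorm scale weights
W = rng.standard_normal((d_out, d))  # weights of the next linear layer

# Baseline: RMSNorm with weights, followed by the linear layer
y_ref = W @ rms_norm(x, g)

# FlashNorm-style fold: drop g from the norm and bake it into W instead
W_folded = W * g                     # scale column j of W by g[j]
y_fold = W_folded @ rms_norm(x, np.ones(d))

assert np.allclose(y_ref, y_fold)    # same output, one fewer elementwise multiply
```

The fold works because the elementwise scale commutes into the matrix multiply, so the norm weights can be merged into the linear layer's weights once, offline, before inference.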
Install the transformer tricks package:
```
pip install transformer-tricks
```
Alternatively, to run from the latest repo:

```shell
git clone https://github.com/OpenMachine-ai/transformer-tricks.git
python3 -m venv .venv
source .venv/bin/activate
pip3 install --quiet -r requirements.txt
```

Follow the links below for documentation of the Python code in this directory:
The papers are accompanied by the following Jupyter notebooks:
Please subscribe to our newsletter on Substack to get the latest news about this project. We will never send you more than one email per month.

We pay cash for high-impact contributions. Please check out CONTRIBUTING for how to get involved.

The Transformer Tricks project is currently sponsored by OpenMachine. We'd love to hear from you if you'd like to join us in supporting this project.

Please give us a ⭐ if you like this repo, and check out TinyFive.