In the command above we set the Triton library to the commit used in our experiments. Feel free to experiment with later Triton versions.
Reproducing our LM experiments on OpenWebText2
GPU requirements: Preferably, you need at least one A100. Some of our experiments use data parallelism with up to 3 A100s. The experiments should run on any GPU supporting bfloat16, although you might have to change the model parameters to fit the available memory.
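If you are unsure whether your GPU supports bfloat16, a quick check along the following lines may help. This is a minimal sketch and not part of the repository: the helper name bf16_status is hypothetical, and it assumes PyTorch is the framework in use. It relies on torch.cuda.is_bf16_supported(), which in recent PyTorch versions may also report emulated support, so treat a positive answer as a hint and not a guarantee of good performance.

```python
def bf16_status() -> str:
    """Report whether bfloat16 can be used on the current CUDA device.

    Hypothetical helper for illustration; not part of the repository.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "torch not installed"

    if not torch.cuda.is_available():
        return "no CUDA device"

    # Queries the currently selected CUDA device.
    if torch.cuda.is_bf16_supported():
        return "bfloat16 supported"
    return "bfloat16 not supported"


if __name__ == "__main__":
    print(bf16_status())
```

If the check reports that bfloat16 is not supported, the experiments are unlikely to run as configured.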
Go to the openwebtext2-experiments folder and run the script/train-LMs.sh command.
Reproducing our runtime results
GPU requirements: We used one A100.
For the Hash-sparse and QK-sparse results, go to the runtime-experiments folder and check the timeperf-hash-and-qk-sparse.ipynb notebook.