IST-DASLab

@IST-DASLab

IST Austria Distributed Algorithms and Systems Lab

Popular repositories

  1. gptq (Public)

    Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

    Python · 2.1k stars · 176 forks

  2. marlin (Public)

    FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

    Python · 851 stars · 70 forks

  3. sparsegpt (Public)

    Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

    Python · 812 stars · 107 forks

  4. PanzaMail (Public)

    Python · 291 stars · 19 forks

  5. qmoe (Public)

    Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

    Python · 277 stars · 22 forks

  6. QUIK (Public)

    Repository for the QUIK project, which enables the use of 4-bit kernels for generative inference (EMNLP 2024).

    C++ · 180 stars · 13 forks

Repositories

Showing 10 of 61 repositories
