# llm-compression

Here are 9 public repositories matching this topic...

A curated list for Efficient Large Language Models

  • Updated Mar 14, 2025
  • Python

Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs

  • Updated Nov 25, 2024
  • Python

D^2-MoE: Delta Decompression for MoE-based LLMs Compression

  • Updated Mar 3, 2025
  • Python

LLM Compression Benchmark

  • Updated Feb 11, 2025
  • Python

[ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The Truth is Rarely Pure and Never Simple.

  • Updated Mar 13, 2024
  • Python

LLM Inference on AWS Lambda

  • Updated Jun 3, 2024
  • Python

[CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation

  • Updated Mar 12, 2025
  • Python

This repository contains the official implementation of "iShrink: Making 1B Models Even Smaller and Faster". iShrink is a structured pruning approach that effectively compresses 1B-parameter language models while maintaining their performance and improving efficiency.

  • Updated Jan 14, 2025
  • Python


Add this topic to your repo

To associate your repository with the llm-compression topic, visit your repo's landing page and select "manage topics."



[8]ページ先頭

©2009-2025 Movatter.jp