#mla

Here are 34 public repositories matching this topic...

Awesome-LLM-Inference

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

  • Updated Nov 28, 2025
  • Python

TransMLA: Multi-Head Latent Attention Is All You Need (NeurIPS 2025 Spotlight)

  • Updated Sep 23, 2025
  • Python

ffpa-attn

🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.

  • Updated Nov 18, 2025
  • Cuda

[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection

  • Updated Feb 20, 2025
  • Python

Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA, using CUDA cores for the decoding stage of LLM inference.

  • Updated Jun 11, 2025
  • C++

xKV: Cross-Layer SVD for KV-Cache Compression

  • Updated Nov 30, 2025
  • Python

MLA-style citations and bibliographies using Biblatex

  • Updated Jul 29, 2022
  • TeX

🍺 CLI for quickly generating citations for websites and books

  • Updated Nov 14, 2018
  • JavaScript

An efficient, scalable attention module designed to reduce memory usage and improve inference speed in large language models. It implements Multi-Head Latent Attention (MLA) as a drop-in replacement for traditional multi-head attention (MHA).

  • Updated Jun 25, 2025
  • Python

Principles and brief implementations of MHA, MQA, GQA, and MLA

  • Updated Jan 23, 2025
  • Python

Predicting the energy efficiency of buildings with machine learning algorithms.

  • Updated Mar 9, 2023
  • Jupyter Notebook

Exam-focused study notes for the AWS Certified Machine Learning - Associate (MLA-C01) certification exam.

  • Updated Oct 22, 2025

A Google Apps Script whose sole purpose is to make MLA writing easier.

  • Updated Feb 28, 2023
  • HTML

Make your BibTeX perfect. Auto-clean entries, unify conference names (e.g., CVPR, NeurIPS), and generate citation keys for LaTeX & Word.

  • Updated Dec 12, 2025
  • JavaScript

A Mixture of Experts model with latent attention designed for efficient training and inference.

  • Updated Oct 19, 2025
  • Python

mla-terminal

A recreation of the terminal interface from the video game The Talos Principle.

  • Updated Feb 25, 2023
  • C#

Code examples from the Graphics, Touch, Sound and USB book ported to the PIC32Mikromedia board

  • Updated Sep 6, 2019
  • C

Automatic, ad-free citations

  • Updated Sep 13, 2017
  • JavaScript
