mla
Here are 34 public repositories matching this topic...
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
Updated Nov 28, 2025 - Python
🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
Updated Nov 18, 2025 - Cuda
Light-field imaging application for plenoptic cameras
Updated Sep 10, 2025 - Python
[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection
Updated Feb 20, 2025 - Python
Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference.
Updated Jun 11, 2025 - C++
Light field geometry estimator for plenoptic cameras
Updated Mar 11, 2025 - Python
xKV: Cross-Layer SVD for KV-Cache Compression
Updated Nov 30, 2025 - Python
🍺 CLI for quickly generating citations for websites and books
Updated Nov 14, 2018 - JavaScript
An efficient, scalable Multi-Head Latent Attention (MLA) module that reduces memory usage and improves inference speed in large language models, implemented as a drop-in replacement for traditional multi-head attention (MHA).
Updated Jun 25, 2025 - Python
Predicting the energy efficiency of buildings with machine learning algorithms.
Updated Mar 9, 2023 - Jupyter Notebook
Exam-focused study notes for the AWS Certified Machine Learning - Associate (MLA-C01) certification exam.
Updated Oct 22, 2025
A Google Apps Script whose sole purpose is to make MLA writing easier.
Updated Feb 28, 2023 - HTML
Make your BibTeX perfect. Auto-clean entries, unify conference names (e.g., CVPR, NeurIPS), and generate citation keys for LaTeX & Word.
Updated Dec 12, 2025 - JavaScript
A Mixture of Experts model with latent attention designed for efficient training and inference.
Updated Oct 19, 2025 - Python
A recreation of the terminal interface from the video game The Talos Principle.
Updated Feb 25, 2023 - C#
Code examples from the Graphics, Touch, Sound and USB book ported to the PIC32Mikromedia board
Updated Sep 6, 2019 - C