moe
Here are 149 public repositories matching this topic...
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
- Updated Apr 22, 2025 - Python
SGLang is a fast serving framework for large language models and vision language models.
- Updated Apr 22, 2025 - Python
An unofficial, UI-first https://bgm.tv app client for Android and iOS, built with React Native. An ad-free, hobby-driven, non-profit, Douban-like anime-tracking third-party client for bgm.tv, dedicated to ACG. Redesigned for mobile, it ships many enhanced features that are hard to implement on the web version and offers extensive customization options. It currently supports iOS / Android / WSA, mobile / basic tablet layouts, light / dark themes, and the mobile web.
- Updated Apr 18, 2025 - TypeScript
Mixture-of-Experts for Large Vision-Language Models
- Updated Dec 3, 2024 - Python
MoBA: Mixture of Block Attention for Long-Context LLMs
- Updated Apr 3, 2025 - Python
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. (https://arxiv.org/abs/1701.06538)
- Updated Apr 19, 2024 - Python
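
The entry above reimplements the sparsely-gated mixture-of-experts layer, in which a learned gate assigns each token to a small number of expert feed-forward networks and combines their outputs with the gate weights. Below is a minimal sketch of that idea in PyTorch, not code from the repository or the paper: the class name, sizes, and the plain per-expert loop are illustrative assumptions, and practical implementations add noisy gating, load-balancing losses, and capacity limits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    """Minimal top-k gated MoE layer (illustrative sketch, not the repo's code)."""
    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); each token is routed to its top-k experts.
        logits = self.gate(x)                          # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)     # keep the k highest-scoring experts
        weights = F.softmax(weights, dim=-1)           # renormalize over the selected experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            rows, slot = (idx == e).nonzero(as_tuple=True)
            if rows.numel():                           # only run experts that received tokens
                out[rows] += weights[rows, slot].unsqueeze(-1) * expert(x[rows])
        return out

# Example: y = SparseMoE(d_model=512, d_hidden=2048)(torch.randn(16, 512))
```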
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
- Updated Dec 6, 2024 - Python
Tutel MoE: Optimized Mixture-of-Experts Library, with support for DeepSeek FP8/FP4
- Updated Apr 22, 2025 - Python
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
- Updated Jul 2, 2024 - Python
An open-source solution for full-parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as practical experiences and conclusions gathered along the way.
- Updated Mar 13, 2025 - Python
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
- Updated Apr 30, 2024 - Python
MindSpore online courses: Step into LLM
- Updated Jan 6, 2025 - Jupyter Notebook
Official LISTEN.moe Android app
- Updated Apr 20, 2025 - Kotlin
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
- Updated Mar 15, 2024 - C++
MoH: Multi-Head Attention as Mixture-of-Head Attention
- Updated Oct 29, 2024 - Python
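
MoH's premise is to treat the attention heads themselves as experts, so a router sends each token to only a subset of heads and weights their outputs. The sketch below illustrates that general idea only; it is not the paper's implementation, and the class name, router design, and hyperparameters are assumptions made here for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfHeadAttention(nn.Module):
    """Illustrative sketch: route each token to its top-k attention heads."""
    def __init__(self, d_model: int = 512, n_heads: int = 8, k: int = 4):
        super().__init__()
        self.h, self.k, self.d_head = n_heads, k, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.router = nn.Linear(d_model, n_heads)      # one score per head per token
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape                               # x: (batch, seq, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (z.reshape(b, t, self.h, self.d_head).transpose(1, 2) for z in (q, k, v))
        heads = F.scaled_dot_product_attention(q, k, v)     # (b, h, t, d_head)
        heads = heads.transpose(1, 2)                       # (b, t, h, d_head)
        scores = self.router(x)                             # (b, t, h)
        topv, topi = scores.topk(self.k, dim=-1)            # pick the k best heads per token
        gate = torch.zeros_like(scores).scatter_(-1, topi, F.softmax(topv, dim=-1))
        mixed = (heads * gate.unsqueeze(-1)).reshape(b, t, -1)  # weight and concatenate heads
        return self.out(mixed)

# Example: y = MixtureOfHeadAttention()(torch.randn(2, 16, 512))
```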
A libGDX cross-platform API for in-app purchasing.
- Updated Jan 2, 2025 - Java
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
- Updated Apr 10, 2024 - Python
[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
- Updated Oct 16, 2024 - Python