
Posts by Jun Chen

Practical, Fault‑Robust Distributed Inference for DeepSeek on AMD MI300X

As large-scale LLM inference moves beyond a single server, engineering teams face a familiar trifecta of challenges: performance, fault isolation, and operational efficiency. DeepSeek‑V3/R1’s high‑sparsity Mixture‑of‑Experts (MoE) architecture can deliver excellent throughput, but only when computation, memory, and communication are orchestrated with care, especially across multiple nodes [1].
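The sparsity comes from token-level expert routing: each token is dispatched to only a few experts, so most expert weights stay idle for any given token while the dispatch fans tokens out across devices. Below is a minimal sketch of top-k routing; the sizes and the plain softmax gate are illustrative assumptions, not DeepSeek's actual router.

```python
# Minimal sketch of top-k expert routing in a sparse MoE layer.
# All dimensions and the softmax gating scheme are illustrative
# assumptions, not the DeepSeek-V3/R1 implementation.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_route(tokens, gate_w, top_k=2):
    """Return, per token, the selected expert ids and their gate weights.

    tokens: (num_tokens, hidden) activations
    gate_w: (hidden, num_experts) router weights
    """
    logits = tokens @ gate_w                       # (num_tokens, num_experts)
    probs = softmax(logits, axis=-1)
    top_ids = np.argsort(-probs, axis=-1)[:, :top_k]
    top_p = np.take_along_axis(probs, top_ids, axis=-1)
    top_p = top_p / top_p.sum(axis=-1, keepdims=True)  # renormalize gates
    return top_ids, top_p

# Example: 4 tokens, hidden size 8, 16 experts, each token visits 2 experts.
rng = np.random.default_rng(0)
ids, gates = moe_route(rng.normal(size=(4, 8)), rng.normal(size=(8, 16)))
print(ids)    # which experts each token is dispatched to
print(gates)  # how their outputs are weighted when combined
```

In a multi-node deployment, the expert ids chosen by this routing step determine which device each token must be sent to, which is why communication has to be scheduled alongside computation and memory.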

Read more ...


