👋 Hi, I’m Xiao and I recently graduated from SJTU. I'm currently seeking a PhD position in the USA starting Fall 2025.
- Cloud Computing
- Machine Learning Systems
- Currently working on LLM Serving Systems.
- ICSE-SEIP'23
- EuroSys'24
- ASPLOS'24
- RagInfer (OSDI'25 submission)
- AgentServing (OSDI'25 submission, co-first author)
- Aceso: Auto Parallel DNN Training
- RagInfer: low-latency RAG inference system
- Autellix: high-throughput LLM agent serving system
- DeepScaler: RL LLM training
📫 Feel free to email me at lambda7xx@gmail.com if you are interested in my work.
Pinned
- chinese-poetry
Public, forked from LC-John/chinese-poetry
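The most comprehensive database of classical Chinese poetry: nearly 14,000 poets from the Tang and Song dynasties, close to 55,000 Tang poems plus 260,000 Song poems, and 21,050 ci poems by 1,564 ci poets of the Song period.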
Python
- vllm-project/vllm
Public. A high-throughput and memory-efficient inference and serving engine for LLMs
- deepscaler
Public, forked from agentica-project/deepscaler
Democratizing Reinforcement Learning for LLMs
Python
- sgl-project/sglang
Public. SGLang is a fast serving framework for large language models and vision language models.
- TensorRT-LLM
Public, forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
C++