distributed-inference
Here are 6 public repositories matching this topic...
FlashInfer: Kernel Library for LLM Serving
- Updated Jul 18, 2025 - Cuda
Simple, scalable AI model deployment on GPU clusters
- Updated Jul 19, 2025 - Python
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
- Updated Jul 18, 2025 - C++
Source code of the paper "Private Collaborative Edge Inference via Over-the-Air Computation".
- Updated Jan 14, 2025 - Python
Official implementation of the ACM MM paper "Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds". A distributed inference model for pedestrian attribute recognition with re-identification (Re-ID) in an MEC-enabled camera monitoring system. Pedestrian attribute recognition and Re-ID are trained jointly.
- Updated Apr 26, 2020 - Python