inference-scaling
Here are 6 public repositories matching this topic...
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
Updated May 18, 2025 - Python
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
Updated Oct 29, 2024 - Python
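The entry above refers to best-of-N decoding with speculative rejection: partial continuations are scored as they are generated, and unpromising ones are discarded early so compute concentrates on the strongest candidates. Below is a minimal sketch of that idea; `generate_step` and `partial_reward` are toy stand-ins for a real LLM and reward model, not the paper's or repository's code.

```python
# Minimal sketch of best-of-N decoding with early (speculative) rejection.
# `generate_step` and `partial_reward` are illustrative placeholders.
import random

def generate_step(prefix: str) -> str:
    """Placeholder 'LLM' that appends one random character per call."""
    return prefix + random.choice(["a", "b", "c", " "])

def partial_reward(text: str) -> float:
    """Placeholder reward model: prefers texts containing more 'a' characters."""
    return text.count("a") / max(len(text), 1)

def best_of_n_with_rejection(prompt: str, n: int = 16, steps: int = 32,
                             keep_fraction: float = 0.5) -> str:
    """Sample n continuations in parallel and periodically reject the
    lowest-scoring partial ones, so compute goes to promising candidates."""
    candidates = [prompt] * n
    for step in range(steps):
        candidates = [generate_step(c) for c in candidates]
        # Every 8 steps, rank partial continuations and drop the weakest half.
        if step % 8 == 7 and len(candidates) > 1:
            candidates.sort(key=partial_reward, reverse=True)
            candidates = candidates[:max(1, int(len(candidates) * keep_fraction))]
    return max(candidates, key=partial_reward)

if __name__ == "__main__":
    print(best_of_n_with_rejection("prompt: "))
```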
Stable Latent Reasoning: Enhancing Inference in Large Language Models through Iterative Latent Space Refinement
Updated Dec 22, 2024
Implemented a recurrent-depth LLM (PyTorch) based on arXiv:2502.05171. Demonstrated that scaling inference compute increased arithmetic reasoning accuracy from 8% to 100% without additional parameters.
Updated Nov 27, 2025 - Jupyter Notebook
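The recurrent-depth idea from arXiv:2502.05171 reuses one shared block for a variable number of iterations, so inference compute can be scaled by iterating longer rather than adding parameters. The PyTorch sketch below only illustrates that shape; the layer choices and sizes are illustrative assumptions, not the repository's architecture.

```python
# Sketch of a recurrent-depth language model: one shared block applied
# `depth_iters` times, so test-time compute scales with the iteration count
# while the parameter count stays fixed. Sizes here are illustrative.
import torch
import torch.nn as nn

class RecurrentDepthLM(nn.Module):
    def __init__(self, vocab_size: int = 256, d_model: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # A single block, reused at every depth step.
        self.block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens: torch.Tensor, depth_iters: int = 4) -> torch.Tensor:
        h = self.embed(tokens)
        for _ in range(depth_iters):  # more iterations = more inference compute
            h = self.block(h)
        return self.head(h)

if __name__ == "__main__":
    model = RecurrentDepthLM()
    x = torch.randint(0, 256, (2, 16))
    # Same parameters, different amounts of test-time compute:
    logits_shallow = model(x, depth_iters=2)
    logits_deep = model(x, depth_iters=32)
    print(logits_shallow.shape, logits_deep.shape)
```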
Deep Research capability with reasoning models, CoT prompting, and inference-time scaling
Updated Feb 7, 2026 - Python
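Chain-of-thought (CoT) prompting, mentioned in the entry above, simply asks the model to reason step by step before committing to an answer. The sketch below shows one way such a prompt might be built; the wording is an illustrative assumption, not this repository's implementation.

```python
# Minimal sketch of building a chain-of-thought prompt. The exact phrasing is
# an assumption; any chat-completion API could consume the resulting string.
def build_cot_prompt(question: str) -> str:
    # Asking the model to reason step by step before answering is the core of CoT prompting.
    return (f"Question: {question}\n"
            "Think through the problem step by step, then give the final answer "
            "on a new line starting with 'Answer:'.")

if __name__ == "__main__":
    print(build_cot_prompt("A train travels 60 km in 45 minutes. What is its average speed in km/h?"))
```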
Production-ready test-time compute optimization framework for LLM inference. Implements Best-of-N, Sequential Revision, and Beam Search strategies. Validated with models up to 7B parameters.
Updated Jan 27, 2026 - Python
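Of the three strategies this entry lists, sequential revision is the simplest to sketch: draft an answer, repeatedly ask the model to revise its previous attempt, and keep the highest-scoring version according to a verifier. The Python below is a sketch under those assumptions; `llm` and `score` are hypothetical stand-ins, not the framework's API.

```python
# Sketch of a sequential-revision loop: draft, revise, keep only improvements.
# `llm` and `score` are hypothetical callables standing in for a model client
# and a verifier/reward model.
from typing import Callable

def sequential_revision(prompt: str,
                        llm: Callable[[str], str],
                        score: Callable[[str], float],
                        rounds: int = 3) -> str:
    best = llm(prompt)
    best_score = score(best)
    for _ in range(rounds):
        revision_prompt = (f"{prompt}\n\nPrevious attempt:\n{best}\n\n"
                           "Revise the attempt, fixing any mistakes.")
        candidate = llm(revision_prompt)
        cand_score = score(candidate)
        if cand_score > best_score:  # keep only revisions the verifier prefers
            best, best_score = candidate, cand_score
    return best

if __name__ == "__main__":
    # Toy stand-ins so the sketch runs: the "model" echoes the last line,
    # the "verifier" prefers longer text.
    toy_llm = lambda p: p.splitlines()[-1] + "!"
    toy_score = lambda s: float(len(s))
    print(sequential_revision("Explain test-time compute.", toy_llm, toy_score))
```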