🎯
Focusing
Incoming Ph.D. @ HKU & Master's student @ THU
- Tsinghua University
- Qingdao
- UTC +08:00
- https://ryanliu112.github.io
- https://scholar.google.com/citations?user=LiIfGakAAAAJ
Highlights
- Pro
Pinned
- compute-optimal-tts (Public)
  Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".
- Awesome-Process-Reward-Models (Public)
  A comprehensive collection of process reward models.
- wizard-III/ArcherCodeR (Public)
  ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement learning.
  Python 5
- ChangWinde/RAT (Public)
  [AAAI 2025 Oral] Official code for "RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors"
  Python 14