PKU-Alignment

Loves Sharing and Open-Source, Making AI Safer.

PKU-Alignment Team

Large language models (LLMs) have immense potential for general intelligence but also carry significant risks. We are a research team at Peking University focused on alignment techniques for LLMs, such as safety alignment, that make models safer and less toxic.

Follow our AI safety projects:

Pinned

omnisafe Public
JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
Python 919 123
safety-gymnasium Public
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
Python 435 59
safe-rlhf Public
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Python 1.4k 118
Safe-Policy-Optimization Public
NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
Python 345 49

Repositories

Showing 10 of 19 repositories

align-anything Public
Align Anything: Training All-modality Model with Feedback
Python 2,868 Apache-2.0 372 16 0 Updated Mar 18, 2025
omnisafe Public
JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
Python 919 Apache-2.0 123 13 3 Updated Mar 17, 2025
s1-m Public Forked from PKU-Alignment/align-anything
S1-M: Simple Test-time Scaling in Multimodal Reasoning
Python 0 Apache-2.0 380 0 0 Updated Mar 15, 2025
SafeVLA Public
Python 22 1 1 0 Updated Mar 7, 2025
ProAgent Public
AAAI24 (Oral) ProAgent: Building Proactive Cooperative Agents with Large Language Models
JavaScript 75 MIT 8 0 0 Updated Mar 4, 2025
safety-gymnasium Public
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
Python 435 Apache-2.0 59 4 1 Updated Feb 27, 2025
Beaver-zh-hk Public
Python 0 0 0 0 Updated Feb 23, 2025
aligner Public
[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct
Python 165 8 1 0 Updated Jan 16, 2025
.github Public
0 0 0 0 Updated Jan 16, 2025
ProgressGym Public
Alignment with a millennium of moral progress. Spotlight@NeurIPS 2024 Track on Datasets and Benchmarks.
Python 22 MIT 3 0 0 Updated Jan 2, 2025