# sft-data
Here are 4 public repositories matching this topic...
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
qa, knowledge-graph, data-generation, question-answering, data-synthesis, sft, pretrain, pretraining, graphgen, ai4science, llm, llm-training, qwen, xtuner, llama-factory, sft-data
- Updated Dec 17, 2025 - Python
[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation)
- Updated Sep 27, 2025 - Python
Code LLM pretraining, fine-tuning, and DPO data processing: state-of-the-art industry processing pipelines
- Updated Jul 25, 2024 - Python
SyGra - Graph-oriented Synthetic data generation Pipeline
python, open-source, ai, multimodality, synthetic-data, synthetic-dataset-generation, dpo, image-datasets, low-code-no-code, llm-datasets, llm-framework, sft-data, llm-training-data
- Updated Dec 17, 2025 - Python