data-augmentation
Here are 1,241 public repositories matching this topic...
A system for quickly generating training data with weak supervision
Updated May 2, 2024 - Python
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Updated Jul 14, 2025 - C++
🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥
Updated Mar 20, 2025 - Python
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP. https://textattack.readthedocs.io/en/master/
Updated Jul 10, 2025 - Python
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
Updated Jun 19, 2025 - Python
Medical imaging processing for AI applications.
Updated Jul 17, 2025 - Python
A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.
Updated Jul 4, 2025 - Python
One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda
Updated Mar 18, 2025 - Python
fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.
Updated Jul 17, 2025 - Python
Data augmentation for NLP, presented at EMNLP 2019
Updated Mar 19, 2023 - Python
List of useful data augmentation resources. Here you will find some uncommon techniques, libraries, links to GitHub repos, papers, and more.
Updated Aug 14, 2024
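Natural language processing (NLP), Xiaojiang robot (retrieval-based chit-chat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNet sentence embeddings and similarity (text xlnet embedding), text classification, entity extraction (NER, bert+bilstm+crf), data augmentation (text augment, data enhance), paraphrase and synonym generation, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, keras-http-service invocation
Updated Sep 23, 2021 - Python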
Code for TKDE paper "Self-supervised learning on graphs: Contrastive, generative, or predictive"
Updated Aug 15, 2024
An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese text. NLP data augmentation. Paper reading notes.
Updated May 31, 2022 - Python
Data Augmentation For Object Detection
Updated Apr 14, 2020 - Jupyter Notebook
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & Vertical Distillation of LLMs.
Updated Mar 9, 2025
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Updated Jan 15, 2025 - Python
Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)
Updated Nov 18, 2024
Natural Language Toolkit for Indic Languages aims to provide out-of-the-box support for various NLP tasks that an application developer might need.
Updated Jan 20, 2024 - Python
Collection of papers and resources for data augmentation for NLP.
Updated Aug 12, 2022