data-filtering
Here are 135 public repositories matching this topic...
Official Repository of "LLM × DATA" Survey Paper
- Updated
Jan 28, 2026
DSIR large-scale data selection framework for language model training
- Updated
Apr 7, 2024 - Python
Graphical tool for data manipulation written in C++/Qt.
- Updated
Jan 18, 2026 - C++
GUNDAM is a data management system that prioritizes data using language models.
- Updated
Aug 2, 2025 - Python
⏳ Provides filtering, sanitizing, and conversion of Golang data.
- Updated
Dec 25, 2025 - Go
A GraphQL-like interface to map a request to an Eloquent query with data transformation for Laravel.
- Updated
Sep 3, 2017 - PHP
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
- Updated
Apr 14, 2025 - Python
Exponentially Weighted Moving Average Filter
- Updated
Jul 20, 2024 - C++
A tool for modular management of front-end requests.
- Updated
Dec 3, 2022 - JavaScript
[ACL 2025 main] SCAR: Data Selection via Style Consistency-Aware Response Ranking for Efficient Instruction-Tuning of Large Language Models
- Updated
Aug 6, 2025 - Python
Framework for processing and filtering datasets
- Updated
Aug 1, 2024 - Python
JSON MCP server to filter only relevant data for your LLM
- Updated
Oct 23, 2025 - JavaScript
R Tutorial: useful R code for cleaning and filtering data from Qualtrics surveys, and for creating new variables in the dataframe. With step-by-step explanations.
- Updated
Jan 18, 2023 - R
This repository contains all (Python 3) code and libraries required for the 2022-2023 Notre Dame Rocketry Team (NDRT) Apogee Control System (ACS). It also contains sensor/actuator example code and flight data.
- Updated
Apr 30, 2023 - Python
EpiMethEx (Epigenetic Methylation and Expression), an R package to perform a large-scale integrated analysis by cyclic correlation analyses between methylation and gene expression data.
- Updated
Oct 10, 2018 - R
Data extraction from smartphones, and "fusion" of GPS and accelerometer data with a Kalman filter.
- Updated
Nov 22, 2022 - Java
Base-call error-filtering and read preprocessing pipeline for fastq libraries
- Updated
Jun 8, 2021 - Python
Anonymises data inside text files and sheet files. It recognises and removes various sorts of personally identifiable information (PII). Each removed part is replaced with a suitable pseudonym, depending on the type of removed data. Currently English and Russian are supported. Russian works with both Cyrillic and Latin characters.
- Updated
Jan 15, 2026 - Python
A powerful tool that allows users to query JSON data using SQL-like syntax. Effortlessly search, filter, and manipulate your JSON data with familiar SQL queries.
- Updated
Oct 4, 2023 - Python
CDC Connect is a cross-platform mobile application built in React Native using JavaScript. The app is designed for data collection with a focus on surveys.
- Updated
Mar 11, 2024 - JavaScript