leakR

Welcome to leakR, an R package designed to help researchers, data scientists, and machine learning practitioners rigorously detect and diagnose data leakage in their workflows.

Data leakage is a pervasive yet often overlooked issue that undermines the integrity and reproducibility of predictive models by allowing unintended information to “leak” between training and testing phases. leakR provides a modular, extensible toolkit for detecting the most common and impactful forms of leakage, starting with tabular data contamination, target leakage, and temporal misalignments, while laying the foundation for a universal leakage detection framework across diverse data domains.
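To make the three leakage types named above concrete, here is a minimal base R sketch of what each one looks like on a toy dataset. It does not use leakR and is not how the package implements its detectors. The data, the column names (`id`, `date`, `x`, `leaky`, `outcome`), and the 0.95 correlation threshold are all assumptions made for illustration.

```r
# Base R only; nothing here is part of the leakR API.
set.seed(42)

# Toy data: 100 daily rows, a binary outcome, and one honest feature.
n <- 100
df <- data.frame(
  id      = seq_len(n),
  date    = as.Date("2024-01-01") + seq_len(n) - 1,
  x       = rnorm(n),
  outcome = rep(c(0, 1), length.out = n)
)

# A feature computed from the outcome itself: target leakage.
df$leaky <- df$outcome * 2 + 1

# A careless split: rows 61 to 70 end up in both sets.
train <- df[1:70, ]
test  <- df[61:100, ]

# 1. Train/test contamination: ids that appear in both sets.
overlap_ids <- intersect(train$id, test$id)

# 2. Target leakage: features almost perfectly correlated with the target.
features   <- c("x", "leaky")
target_cor <- sapply(features, function(f) abs(cor(train[[f]], train$outcome)))
suspicious <- names(target_cor)[target_cor > 0.95]  # threshold is arbitrary

# 3. Temporal misalignment: training rows dated on or after the first test row.
n_future_train <- sum(train$date >= min(test$date))

cat("Overlapping rows:      ", length(overlap_ids), "\n")
cat("Suspicious features:   ", suspicious, "\n")
cat("Train rows in test era:", n_future_train, "\n")
```

Hand-rolled checks like these only catch the most obvious cases, which is the gap a dedicated audit tool is meant to fill.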

Installation

From CRAN (Recommended)

install.packages("leakr")

From GitHub (Development Version)

For the latest features and bug fixes:

# Install devtools if you don't have it
install.packages("devtools")

# Install leakR from GitHub
devtools::install_github("cherylisabella/leakR")

Quick Start

library(leakr)

# Basic audit of your dataset
report <- leakr_audit(iris, target = "Species")

# View summary of issues found
leakr_summarise(report)

# Generate diagnostic visualizations
leakr_plot(report)

# Access detailed results
print(report)
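A quick way to try the audit on data with a known problem is to plant a leak yourself. The sketch below adds a hypothetical `leaky_copy` column to `iris` that encodes the target directly, then reuses the same call form as the Quick Start. Whether and how the report flags this column is not shown in this README, so treat the outcome as something to inspect rather than a guaranteed result.

```r
# Hypothetical toy data: `leaky_copy` is a deliberate leak, not part of iris.
toy <- iris
toy$leaky_copy <- as.integer(toy$Species)  # integer code of the target

# Run the audit only if leakr is installed.
# The call form mirrors the Quick Start example above.
if (requireNamespace("leakr", quietly = TRUE)) {
  report <- leakr::leakr_audit(toy, target = "Species")
  leakr::leakr_summarise(report)
}
```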

Main Functions

| Function | Purpose |
| --- | --- |
| leakr_audit() | Main auditing function: detects leakage across your dataset |
| leakr_summarise() | Generate human-readable summaries of detected issues |
| leakr_plot() | Create diagnostic visualizations highlighting problems |
| leakr_from_caret() | Import and audit caret workflow objects |
| leakr_from_tidymodels() | Import and audit tidymodels workflow objects |
| leakr_from_mlr3() | Import and audit mlr3 workflow objects |

Learn More

Get started with the comprehensive vignettes:

# Getting started guide
vignette("getting-started", package = "leakr")

# Advanced detection techniques
vignette("advanced-detection", package = "leakr")

# Framework integration examples
vignette("framework-integration", package = "leakr")

Why leakR?

What leakR Detects

Key Features

Development Roadmap

Citation

If you use leakR in your research, please cite:

@Manual{leakr2025,
  title = {leakR: Data Leakage Detection Tools for Machine Learning},
  author = {Cheryl Isabella Lim},
  year = {2025},
  note = {R package version 0.1.0},
  url = {https://github.com/cherylisabella/leakR},
}

License

This project is licensed under the MIT License. See the LICENSE file for details.

leakR is currently under development. Feedback and contributions are welcome from the community!

