Streamlining the algorithm design and testing process of data linkage by removing the programmatic requirements from data analysts through R scripts and a database of linkage tests.


CHIMB/autolink


Introduction

autolink provides an easy, user-friendly way for data analysts to develop linkage algorithms and use them to perform data linkage tests. The package allows multiple algorithms to be tested per dataset, helping data analysts achieve an ideal linkage rate.

This package is most beneficial in data science, specifically data linkage and data analysis. It streamlines algorithm design and testing because the end user does not need to make any programmatic changes to an R script. Instead, they only need to mix and match different blocking and matching variables and choose the rules for each, repeating until the desired linkage rate is achieved.
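To make the terms concrete: a blocking variable restricts which record pairs are compared at all, and matching variables decide whether a candidate pair counts as a link. The base R sketch below illustrates that idea on two toy data frames. It is not autolink's implementation; the function `link_pass`, the exact-agreement rule, and all column names are hypothetical.

```r
# Toy illustration of one linkage pass: block on some variables, then require
# exact agreement on the matching variables. Not autolink's actual code.
link_pass <- function(left, right, blocking_vars, matching_vars) {
  # Candidate pairs: only records that agree on every blocking variable.
  pairs <- merge(left, right, by = blocking_vars, suffixes = c(".l", ".r"))

  # Keep the pairs that also agree on every matching variable.
  agrees <- rep(TRUE, nrow(pairs))
  for (v in matching_vars) {
    agrees <- agrees & pairs[[paste0(v, ".l")]] == pairs[[paste0(v, ".r")]]
  }
  pairs[agrees, , drop = FALSE]
}

left <- data.frame(
  id_left    = 1:3,
  birth_year = c(1980, 1980, 1975),
  surname    = c("smith", "jones", "lee"),
  stringsAsFactors = FALSE
)
right <- data.frame(
  id_right   = 11:13,
  birth_year = c(1980, 1975, 1975),
  surname    = c("smith", "lee", "li"),
  stringsAsFactors = FALSE
)

links <- link_pass(left, right,
                   blocking_vars = "birth_year",
                   matching_vars = "surname")

# Linkage rate: share of left records that found at least one link.
linkage_rate <- length(unique(links$id_left)) / nrow(left)
```

Swapping the blocking or matching variables and re-running the pass is the manual equivalent of the mix-and-match workflow described above.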

Installation

RStudio Installation

To install autolink from GitHub, begin by installing and loading the devtools package:

# install.packages("devtools")
library(devtools)

Afterwards, you may install the automated data linkage package using install_github():

devtools::install_github("CHIMB/autolink")
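The two steps above can be wrapped so that devtools is installed only when it is missing. This is a minimal sketch using base R's `requireNamespace()`; the helper names `missing_packages` and `install_autolink` are hypothetical and not part of autolink.

```r
# Return the subset of `pkgs` that cannot be loaded from the current library.
missing_packages <- function(pkgs) {
  is_available <- vapply(pkgs, requireNamespace, logical(1),
                         quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!is_available]
}

# Install devtools only when it is missing, then install autolink from GitHub.
install_autolink <- function() {
  to_install <- missing_packages("devtools")
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  devtools::install_github("CHIMB/autolink")
}

# install_autolink()  # Uncomment to run; requires internet access.
```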

Local Installation

To install autolink locally from GitHub, select the most recent release from the right-hand tab on the GitHub repository page. Download the Source code (zip) file, then switch to RStudio and run:

path_to_pkg <- file.choose() # Select the unmodified package you downloaded from GitHub.
devtools::install_local(path_to_pkg)

Usage

Generating Empty Metadata File

To begin working with the autolink package, create an empty linkage metadata file:

output_dir <- choose.dir() # Select the output directory where the .SQLite file should go.
autolink::create_new_metadata("linkage_metadata", output_dir)
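To confirm the metadata file was created, it can be opened and its tables listed. The sketch below assumes the file is a standard SQLite database readable by the DBI and RSQLite packages, which must be installed. The helper `list_sqlite_tables` is hypothetical, and the file name in the usage comment is an assumption about what `create_new_metadata()` writes.

```r
library(DBI)

# List the tables stored in a SQLite file, closing the connection afterwards.
list_sqlite_tables <- function(sqlite_path) {
  stopifnot(file.exists(sqlite_path))
  con <- dbConnect(RSQLite::SQLite(), sqlite_path)
  on.exit(dbDisconnect(con), add = TRUE)
  dbListTables(con)
}

# Hypothetical usage: the file name and extension are assumptions.
# list_sqlite_tables(file.path(output_dir, "linkage_metadata.sqlite"))
```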

Working With The GUI

With an empty file, you may begin adding datasets, algorithms, and iteration-specific information to the metadata file by using the provided R Shiny application. To start the application, make the following call in your R environment:

linkage_file <- file.choose() # Select the .SQLite file you wish to modify.
autolink::start_linkage_metadata_ui(linkage_file, "Data Analyst")

Within the GUI, first add the file paths of the datasets you wish to use for the linkage process. Once they are uploaded, select a pair of datasets to add algorithms to; for each algorithm you can add, modify, and disable any number of passes. If you are uncertain about what the GUI has to offer, consider reading the User Documentation linked below.

Running Algorithms

Once your algorithms have been created, you may run them either through the GUI or programmatically:

left_dataset <- file.choose() # The left dataset you plan on using.
right_dataset <- file.choose() # The right dataset (spine) you plan on using.
metadata_file <- file.choose() # The .SQLite file that contains all saved information.
algorithm_ids <- c(1, 3, 4) # The ID(s) of the algorithm(s) to run for the dataset pair.
extra_params <- create_extra_parameters_list(...) # Any optional/extra parameters you may want (export options & data).
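Before starting a run, it can help to check these inputs, since a mistyped path or ID would otherwise surface only partway through linkage. The base R sketch below is one way to do that. `validate_linkage_inputs` is a hypothetical helper, not an autolink function, and the rule that IDs are positive whole numbers is an assumption based on the example values above.

```r
# Hypothetical helper: check the inputs before starting a linkage run.
validate_linkage_inputs <- function(left_dataset, right_dataset,
                                    metadata_file, algorithm_ids) {
  paths <- c(left = left_dataset, right = right_dataset,
             metadata = metadata_file)
  missing_paths <- names(paths)[!file.exists(paths)]
  if (length(missing_paths) > 0) {
    stop("File not found for: ", paste(missing_paths, collapse = ", "))
  }

  # Assumption: algorithm IDs are positive whole numbers, e.g. c(1, 3, 4).
  ids_ok <- is.numeric(algorithm_ids) && length(algorithm_ids) > 0 &&
    !anyNA(algorithm_ids) &&
    all(algorithm_ids > 0 & algorithm_ids == round(algorithm_ids))
  if (!ids_ok) {
    stop("algorithm_ids must be a non-empty vector of positive whole numbers.")
  }
  invisible(TRUE)
}

# Usage with the variables defined above:
# validate_linkage_inputs(left_dataset, right_dataset, metadata_file, algorithm_ids)
```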

Additional Information & Documentation

For more details on how the package is structured and how the stored algorithms are pulled and used to link data, consider reading the Developer Facing Documentation (474KB).

For more details on how to use the function calls, how to navigate the pages of the user interface, and how to change or add information in the metadata, consider reading the User Facing Documentation (978KB).
