Highlight Text Based on Note Frequency
License
Unknown, MIT licenses found
rachelesrogers/highlightr
This package can be used to create a highlighted source document based on the frequency of phrases found in one or more note sheets. The goal of this method is to indicate the portions of the source document that individuals felt were most worth copying into notes, based on phrase frequency. The inputs necessary for this procedure are a notes document and a source document. The output will be HTML code for generating the highlighted text.
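To make the idea concrete, here is a minimal base R sketch of frequency-based highlighting. It is a simplification and not the package's implementation: it matches single words exactly, whereas highlightr works with collocations and fuzzy matching. The function `highlight_by_frequency()`, its arguments, and the example text are all hypothetical.

```r
# Conceptual sketch only: word-level, exact-match highlighting in base R.
# highlightr itself uses collocations and fuzzy matching; every name
# below is hypothetical and not part of the package.

highlight_by_frequency <- function(source_text, notes) {
  # split the source into words, keeping the original form for display
  source_words <- strsplit(source_text, "\\s+")[[1]]

  # normalise a word for matching: lower-case and strip punctuation
  normalise <- function(x) gsub("[^a-z0-9']", "", tolower(x))

  # unique normalised words found in each note
  note_words <- lapply(
    notes,
    function(n) unique(normalise(strsplit(n, "\\s+")[[1]]))
  )

  # for each source word, count how many notes contain it
  counts <- vapply(
    normalise(source_words),
    function(w) sum(vapply(note_words, function(nw) w %in% nw, logical(1))),
    integer(1),
    USE.NAMES = FALSE
  )

  # scale counts to a background opacity between 0 and 1
  alpha <- if (max(counts) > 0) counts / max(counts) else as.numeric(counts)

  # wrap each word in a span whose background intensity reflects its count
  spans <- sprintf(
    "<span style=\"background-color: rgba(255, 200, 0, %.2f);\">%s</span>",
    alpha, source_words
  )

  list(counts = counts, html = paste(spans, collapse = " "))
}

source_text <- "The witness saw a red car leave the scene"
notes <- c("red car left the scene", "witness saw red car", "car was red")

result <- highlight_by_frequency(source_text, notes)
result$counts
cat(result$html)
```

In this toy example "red" and "car" appear in all three notes, so they receive the strongest highlight, while "a" and "leave" appear in none and stay unhighlighted.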
This work was funded (or partially funded) by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through Cooperative Agreements 70NANB15H176 and 70NANB20H019 between NIST and Iowa State University, which includes activities carried out at Carnegie Mellon University, Duke University, University of California Irvine, University of Virginia, West Virginia University, University of Pennsylvania, Swarthmore College and University of Nebraska, Lincoln.
You can install from CRAN with:

    install.packages("highlightr")

You can install the development version of highlightr from GitHub with:

    # install.packages("devtools")
    devtools::install_github("rachelesrogers/highlightr")

    # load library
    library(highlightr)

    # rename desired column of derivative documents to 'page_notes'
    comment_example_rename <- dplyr::rename(comment_example, page_notes = Notes)

    # tokenize derivative documents
    toks_comment <- token_comments(comment_example_rename)

    # rename desired column of source document to 'text'
    transcript_example_rename <- dplyr::rename(transcript_example, text = Text)

    # tokenize source document
    toks_transcript <- token_transcript(transcript_example_rename)

    # use fuzzy matching in collocation
    collocation_object <- collocate_comments_fuzzy(toks_transcript, toks_comment)
    #> Warning in join_func(a = a, b = b, by_a = by_a, by_b = by_b, block_by_a = block_by_a, : A pair of records at the threshold (0.7) have only a 95% chance of being compared.
    #> Please consider changing `n_bands` and `band_width`.

    # connect collocation frequencies to source document
    merged_frequency <- transcript_frequency(transcript_example_rename, collocation_object)

    # create `ggplot` object of the transcript
    freq_plot <- collocation_plot(merged_frequency)

    # add html tags to source document
    page_highlight <- highlighted_text(freq_plot)

    page_highlight

`page_highlight` will produce HTML output that can then be rendered into highlighted text. This can be done in R Markdown by specifying the object outside of a code chunk as `r page_highlight`, and knitting the document to HTML.
Alternatively, the xml2 package can be used to save the output as an HTML file, as shown in the following code:

    # load `xml2` library
    library(xml2)

    # save html output to desired location
    xml2::write_html(xml2::read_html(page_highlight), "filename.html")

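If xml2 is not available, the same kind of file can be produced with base R. The sketch below assumes the highlighted output is HTML held in a character string (it is passed to `xml2::read_html()` above, which suggests so, but this is an assumption). The object `page_highlight_demo` is a hypothetical stand-in, not real package output.

```r
# Stand-in for the output of highlighted_text(); the real object is
# assumed (not confirmed) to be HTML stored in a character string.
page_highlight_demo <- paste(
  "<span style=\"background-color: #ffd54f;\">highlighted phrase</span>",
  "plain text"
)

# wrap the fragment in a minimal page and write it to a temporary file
out_file <- tempfile(fileext = ".html")
writeLines(
  c(
    "<!DOCTYPE html>",
    "<html><body><p>",
    page_highlight_demo,
    "</p></body></html>"
  ),
  out_file
)

# open the saved page in the default browser when running interactively
if (interactive()) utils::browseURL(out_file)
```

Writing to `tempfile()` keeps the example from touching the working directory; replace `out_file` with a permanent path to keep the result.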
The image below is generated from the resulting HTML output (as seen in `vignette("highlightr")`).
