LBDiscover is an R package for literature-based discovery (LBD) in biomedical research. It provides a comprehensive suite of tools for retrieving scientific articles, extracting biomedical entities, building co-occurrence networks, and applying various discovery models to uncover hidden connections in the scientific literature.
The package implements several literature-based discovery approaches, including:
ABC model (Swanson’s discovery model)
AnC model (improved version with better biomedical term filtering)
Latent Semantic Indexing (LSI)
BITOLA-style approaches
LBDiscover also features powerful visualization tools for exploring discovered connections using networks, heatmaps, and interactive diagrams.
Installation
```r
# Install from CRAN
install.packages("LBDiscover")

# Or install the development version from GitHub
# install.packages("devtools")
devtools::install_github("chaoliu-cl/LBDiscover")
```
Key Features
LBDiscover provides a complete workflow for literature-based discovery:
Data Retrieval: Query and retrieve scientific articles from PubMed and other NCBI databases
Text Preprocessing: Clean and prepare text for analysis
Entity Extraction: Identify biomedical entities in text (diseases, drugs, genes, etc.)
Co-occurrence Analysis: Build networks of entity co-occurrences
Discovery Models: Apply various discovery algorithms to find hidden connections
Validation: Validate discoveries through statistical tests
Visualization: Explore results through network graphs, heatmaps, and more
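To make the co-occurrence step above concrete, here is a minimal base R sketch that does not use LBDiscover at all. It assumes a toy table with one row per (document, entity) mention; the data and the names `mentions`, `incidence`, and `toy_comat` are made up for illustration, and the package's own matrix may be built and normalized differently.

```r
# Toy mentions table: one row per (document, entity) pair (made-up data)
mentions <- data.frame(
  doc_id = c("d1", "d1", "d2", "d2", "d3", "d3", "d3"),
  entity = c("migraine", "serotonin",
             "serotonin", "sumatriptan",
             "migraine", "serotonin", "magnesium"),
  stringsAsFactors = FALSE
)

docs  <- sort(unique(mentions$doc_id))
terms <- sort(unique(mentions$entity))

# Binary document-by-entity incidence matrix
incidence <- matrix(0, nrow = length(docs), ncol = length(terms),
                    dimnames = list(docs, terms))
incidence[cbind(mentions$doc_id, mentions$entity)] <- 1

# Entity-by-entity co-occurrence: number of documents mentioning both terms.
# The diagonal holds each entity's document frequency.
toy_comat <- crossprod(incidence)

toy_comat["migraine", "serotonin"]    # 2 (documents d1 and d3)
toy_comat["migraine", "sumatriptan"]  # 0 (never mentioned together)
```

A zero entry such as migraine/sumatriptan is what a discovery model later treats as "not directly connected in the literature".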
Quick Start Example
```r
library(LBDiscover)

# Retrieve articles from PubMed
articles <- pubmed_search("migraine treatment", max_results = 100)

# Preprocess article text
preprocessed <- vec_preprocess(
  articles,
  text_column = "abstract",
  remove_stopwords = TRUE
)

# Extract biomedical entities
entities <- extract_entities_workflow(
  preprocessed,
  text_column = "abstract",
  entity_types = c("disease", "drug", "gene")
)

# Create co-occurrence matrix
co_matrix <- create_comat(
  entities,
  doc_id_col = "doc_id",
  entity_col = "entity",
  type_col = "entity_type"
)

# Apply the ABC model to find new connections
abc_results <- abc_model(
  co_matrix,
  a_term = "migraine",
  n_results = 50,
  scoring_method = "combined"
)

# Visualize the results
vis_abc_network(abc_results, top_n = 20)
```
Discovery Models
ABC Model
The ABC model is based on Swanson’s discovery paradigm. If concept A is related to concept B, and concept B is related to concept C, but A and C are not directly connected in the literature, then A may have a hidden relationship with C.
```r
# Apply the ABC model
abc_results <- abc_model(
  co_matrix,
  a_term = "migraine",
  min_score = 0.1,
  n_results = 50
)

# Visualize as a network
vis_abc_network(abc_results)

# Or as a heatmap
vis_heatmap(abc_results)
```
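The A-B-C reasoning can also be traced by hand on a tiny matrix. The following base R sketch is independent of the package: the co-occurrence counts are invented, and the score (the sum over B terms of the A-B count times the B-C count) is a simple stand-in chosen for illustration, not necessarily one of the scoring methods that `abc_model()` implements.

```r
# Toy symmetric co-occurrence counts (made-up numbers)
terms <- c("migraine", "serotonin", "inflammation",
           "sumatriptan", "magnesium", "riboflavin")
edges <- data.frame(
  x = c("migraine", "migraine", "migraine",
        "serotonin", "inflammation", "inflammation"),
  y = c("serotonin", "inflammation", "sumatriptan",
        "magnesium", "magnesium", "riboflavin"),
  n = c(4, 2, 3, 2, 1, 1),
  stringsAsFactors = FALSE
)
co <- matrix(0, length(terms), length(terms), dimnames = list(terms, terms))
co[cbind(edges$x, edges$y)] <- edges$n
co[cbind(edges$y, edges$x)] <- edges$n

a_term <- "migraine"
a_row  <- co[a_term, ]

# B terms: directly co-occur with A
b_terms <- names(a_row)[a_row > 0]
# C candidates: never co-occur with A (and are not A itself)
c_terms <- setdiff(names(a_row)[a_row == 0], a_term)

# Illustrative score: sum over B of count(A, B) * count(B, C)
scores <- colSums(co[a_term, b_terms] * co[b_terms, c_terms, drop = FALSE])
scores <- sort(scores[scores > 0], decreasing = TRUE)
scores
# magnesium: 4*2 + 2*1 = 10, riboflavin: 2*1 = 2
```

Sumatriptan does not appear among the candidates because it already co-occurs with migraine directly, so it acts as a B term rather than a hidden C term.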
AnC Model
The AnC model is an extension of the ABC model that uses multiple B terms to establish stronger connections between A and C.
```r
# Apply the AnC model
anc_results <- anc_model(
  co_matrix,
  a_term = "migraine",
  n_b_terms = 5,
  min_score = 0.1
)
```
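One way to read "multiple B terms" is to keep a candidate C term only when several distinct B terms link it to A. The base R sketch below illustrates that idea on invented counts. The threshold `min_b_support` is a hypothetical name introduced here, and it is an assumption that this resembles what `anc_model()` does; the meaning of the package's `n_b_terms` argument may differ.

```r
# Toy symmetric co-occurrence counts (made-up numbers)
terms <- c("migraine", "serotonin", "inflammation",
           "sumatriptan", "magnesium", "riboflavin")
edges <- data.frame(
  x = c("migraine", "migraine", "migraine",
        "serotonin", "inflammation", "inflammation"),
  y = c("serotonin", "inflammation", "sumatriptan",
        "magnesium", "magnesium", "riboflavin"),
  n = c(4, 2, 3, 2, 1, 1),
  stringsAsFactors = FALSE
)
co <- matrix(0, length(terms), length(terms), dimnames = list(terms, terms))
co[cbind(edges$x, edges$y)] <- edges$n
co[cbind(edges$y, edges$x)] <- edges$n

a_term  <- "migraine"
a_row   <- co[a_term, ]
b_terms <- names(a_row)[a_row > 0]
c_terms <- setdiff(names(a_row)[a_row == 0], a_term)

# Number of distinct B terms that connect A to each candidate C
support <- colSums(co[b_terms, c_terms, drop = FALSE] > 0)

# Hypothetical threshold: require at least two independent B-term bridges
min_b_support  <- 2
anc_candidates <- names(support)[support >= min_b_support]
anc_candidates
```

Here magnesium is reached through both serotonin and inflammation and is kept, while riboflavin is reached only through inflammation and is dropped.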
LSI Model
The Latent Semantic Indexing model identifies semantically related terms using dimensionality reduction techniques.
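As a rough illustration of the idea, the base R sketch below applies a truncated singular value decomposition to a tiny, invented term-by-document matrix and compares terms by cosine similarity in the reduced space. It does not call any LBDiscover function; the matrix, the choice of `k = 2`, and the helper `cosine()` are assumptions made for the example.

```r
# Toy term-by-document matrix (1 = term appears in document; made-up data)
tdm <- matrix(
  c(1, 0, 0, 0,
    0, 1, 0, 0,
    1, 1, 0, 0,
    0, 0, 1, 1,
    0, 0, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("migraine", "headache", "serotonin", "insulin", "glucose"),
    paste0("doc", 1:4)
  )
)

# Truncated SVD: keep the k largest singular values
k   <- 2
dec <- svd(tdm)
term_vecs <- dec$u[, 1:k, drop = FALSE] %*% diag(dec$d[1:k], nrow = k)
rownames(term_vecs) <- rownames(tdm)

cosine <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# "migraine" and "headache" never share a document, so raw similarity is 0
raw_sim <- cosine(tdm["migraine", ], tdm["headache", ])

# In the reduced space both terms are pulled together via "serotonin"
lsi_sim <- cosine(term_vecs["migraine", ], term_vecs["headache", ])

# An unrelated term stays far away in the reduced space
unrelated_sim <- cosine(term_vecs["migraine", ], term_vecs["insulin", ])
```

In this toy case the two headache-related terms become near-identical after reduction even though they never co-occur directly, which is the kind of indirect relatedness LSI is meant to surface.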
Citation
If you use LBDiscover in your research, please cite:
Liu, C. (2025). LBDiscover: Literature-Based Discovery Tools for Biomedical Research. R package version 0.1.0. https://github.com/chaoliu-cl/LBDiscover
License
This project is licensed under the GPL-3 License - see the LICENSE file for details.