Turbocharge a PubMed literature search rather than clicking and clicking and clicking on Google Scholar
dvklopfenstein/pmidcite
Turbocharge a PubMed literature search with the command, `icite`, rather than clicking and clicking and clicking on Google Scholar "Cited by N" links.
This open-source project is part of a peer-reviewed commentary that was invited by the editors of Research Synthesis Methods. Please cite if you use pmidcite in your research or literature search.
Contact: dvklopfenstein@protonmail.com
PubMed contains peer-reviewed research papers in biomedicine, biochemistry, chemistry, behavioral science, and other life sciences.
Citation data is downloaded from the National Institutes of Health (NIH) each time `icite` is run and includes:
- Citation counts of all papers and clinical papers
- Performance of a paper among its peer papers
- Existence of MeSH terms for the human, animal, and molecular/cellular categories
- Quickstart on the command line
  - 1) Download citation counts and data for a research paper
  - 2) Forward citation search: following a paper's Cited by links or Forward snowballing
  - 3) Backward citation search: following the links to a paper's references or Backward snowballing
  - 4) Summarize a group of citations
  - 5) Download citations for all papers returned from a PubMed search
- Examples in Jupyter notebooks using the pmidcite Python library
- Installation & citation:
- References
$ icite -H 26032263
- This paper (PMID 26032263) has `25` citations, `10` references, and `4` authors.
- This paper is performing well (`74`th percentile in column `%`) compared to its peers.
The NIH percentile grouping (column `G`) helps to highlight the better performing papers in groups `2`, `3`, and `4` by sorting the citing papers by group first, then publication year.
The sort places the lower performing papers in groups `0` or `1` at the back.
New papers appear at the beginning of a sorted list, no matter how many citations they have, to better help researchers find the latest discoveries.
The grouping of papers by NIH percentile grouping is a novel feature created by dvklopfenstein for this project.
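The ordering described above can be sketched in a few lines of Python. This is an illustration only: the records are made-up placeholders rather than NIH data, `GROUP_ORDER` and `sort_by_group_then_year` are names invented for this sketch and are not part of pmidcite, and "newest year first within a group" is an assumption based on the reverse sort used in the commands in this README.

```python
# Hypothetical records: (PMID label, NIH group, publication year). Not real citation data.
papers = [
    ("pmid_a", "2", 2016),
    ("pmid_b", "i", 2024),
    ("pmid_c", "0", 2019),
    ("pmid_d", "4", 2017),
    ("pmid_e", "2", 2020),
    ("pmid_f", "1", 2021),
]

# Position of each group from the front of the list to the back:
# new papers (i) first, then groups 4 down to 0.
GROUP_ORDER = {group: rank for rank, group in enumerate("i43210")}


def sort_by_group_then_year(records):
    """Sort by NIH group (i, 4, 3, 2, 1, 0), then newest year first."""
    return sorted(records, key=lambda rec: (GROUP_ORDER[rec[1]], -rec[2]))


sorted_papers = sort_by_group_then_year(papers)
for pmid, group, year in sorted_papers:
    print(group, year, pmid)
```

The two-part sort key keeps the group as the primary criterion, so a well-cited older paper in group `4` still lands behind a brand-new paper in group `i`.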
Also known as following a paper's Cited by links or Forward snowballing
$ icite -H; icite 26032263 --load_citations | sort -k6 -r
or
$ icite -H; icite 26032263 -c | sort -k6 -r
Also known as following links to a paper's references or Backward snowballing
$ icite -H; icite 26032263 --load_references | sort -k6 -r
or
$ icite -H; icite 26032263 -r | sort -k6 -r
Create a file containing numerous PMIDs annotated with icite info
$ icite 30022098 -c -o goatools_cites.txt
WROTE: goatools_cites.txt
Count the number of lines in the file
$ wc -l goatools_cites.txt
468 goatools_cites.txt
Summarize the papers in "goatools_cites.txt"
$ sumpaps goatools_cites.txt
i=026.9% 4=003.0% 3=018.9% 2=028.8% 1=015.9% 0=006.5% 6 years:2018-2024 465 papers goatools_cites.txt
- The output is on one line so many files containing sets of PMIDs may be compared
- The groups are from newest (`i`) to top-performing (`4`), great (`3`), very good (`2`), and overlooked (`1` and `0`)
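Because each summary is a single line, several of them can be compared programmatically. The sketch below parses the summary line from the example above into a dictionary. It assumes every summary has exactly the layout shown there; `parse_summary` is a name invented for this illustration and is not part of pmidcite.

```python
import re

# One-line summary copied from the sumpaps example above
SUMMARY = ("i=026.9% 4=003.0% 3=018.9% 2=028.8% 1=015.9% 0=006.5% "
           "6 years:2018-2024 465 papers goatools_cites.txt")


def parse_summary(line):
    """Split a summary line into group percentages, year range, and paper count."""
    # Each group appears as GROUP=PERCENT%, e.g. "i=026.9%" or "4=003.0%"
    groups = {grp: float(pct)
              for grp, pct in re.findall(r"\b([i0-4])=(\d+(?:\.\d+)?)%", line)}
    years = re.search(r"years:(\d{4})-(\d{4})", line)
    papers = re.search(r"(\d+) papers", line)
    return {
        "groups": groups,
        "first_year": int(years.group(1)),
        "last_year": int(years.group(2)),
        "num_papers": int(papers.group(1)),
        "filename": line.split()[-1],
    }


summary = parse_summary(SUMMARY)
better = sum(summary["groups"][grp] for grp in "234")
print(f"{better:.1f}% of {summary['num_papers']} papers are in groups 2, 3, or 4")
```

Parsing a list of such lines gives one dictionary per PMID file, which can then be sorted or tabulated by any group's percentage.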
- Do a search in PubMed
- Save all results into a file containing all PMIDs found by the search
- Download the list of PMIDs
- Run icite to analyze all the PMIDs
1. Do a search in PubMed
4. Run icite to analyze all the PMIDs
$ icite -i pmid-HIVANDDNAm-set.txt -o pmid-HIVANDDNAm-icite.txt
$ grep TOP pmid-HIVANDDNAm-icite.txt | sort -k6
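For readers who prefer to post-process the icite output file in Python, the `grep TOP ... | sort -k6` step can be approximated as below. This is a rough sketch: the sample lines are invented placeholders (only the `TOP` marker and the use of the sixth field come from the command above), `grep_and_sort` is a hypothetical helper, and GNU `sort` applies locale-specific collation that plain Python string comparison does not reproduce exactly.

```python
def grep_and_sort(lines, pattern="TOP", first_field=6):
    """Keep lines containing `pattern`, sorted by the text from `first_field` onward."""
    def sort_key(line):
        # Split into at most `first_field` pieces; the last piece is the
        # remainder of the line starting at field number `first_field`.
        fields = line.split(None, first_field - 1)
        return fields[first_field - 1] if len(fields) >= first_field else ""

    matching = (line.strip() for line in lines if pattern in line)
    return sorted(matching, key=sort_key)


# Placeholder lines, NOT real icite output: type, id, two filler columns,
# a percentile-like number, a group, and a year.
LINES = [
    "TOP 1001 .. ..... 50 2 2019",
    "CIT 1002 .. ..... 80 3 2020",
    "TOP 1003 .. ..... 10 0 2021",
    "TOP 1004 .. ..... 99 4 2018",
]

result = grep_and_sort(LINES)
for line in result:
    print(line)
```

To use it on a real file, pass the open file object instead of `LINES`, for example `grep_and_sort(open("pmid-HIVANDDNAm-icite.txt"))`.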
A Command-Line Interface (CLI) can be preferable to a Graphical User Interface (GUI) because:
- processing can be automated from a script
- time-consuming mouse clicking is reduced
- more data can be seen at once on a text screen than in a browser, giving the researcher a better overall impression of the full set of information [1]
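As a small illustration of the first point, a script can prepare one `icite` call per paper instead of repeating the command by hand. The sketch below uses only the `-c` and `-o` options shown in this README. The helper names and the `<PMID>_cites.txt` output file names are made up for the example, and nothing is executed unless `run=True` is passed and `icite` is found on the `PATH`.

```python
import shutil
import subprocess


def build_icite_cmd(pmid, outfile):
    """Return the argument list for: icite PMID -c -o OUTFILE"""
    return ["icite", str(pmid), "-c", "-o", outfile]


def download_citing_papers(pmids, run=False):
    """Build one icite command per PMID and optionally run each of them."""
    cmds = [build_icite_cmd(pmid, f"{pmid}_cites.txt") for pmid in pmids]
    if run and shutil.which("icite") is not None:
        for cmd in cmds:
            # Runs the real command; this needs network access
            subprocess.run(cmd, check=True)
    return cmds


# The two PMIDs used in the examples in this README
commands = download_citing_papers([30022098, 26032263])
for cmd in commands:
    print(" ".join(cmd))
```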
Researchers who use Linux or Mac already work from the command line. Researchers who use Windows can get that Linux-like command line feeling while still running native Windows programs by downloading Cygwin from https://www.cygwin.com/ [1].
In 2013, Boeker et al. [6] recommended that a scientific search interface contain five integrated search criteria. PubMed implements all five; Google did not in 2013 and still does not today.
Google's highly popular implementation of the forward citation search through their ubiquitous "Cited by N" links is a "Better" experience than PubMed's "forward citation search" implementation.
But if your research is in the health sciences and you are amenable to working from the command line, you can use PubMed in your browser plus citation data downloaded from the NIH on the command line using pmidcite. The NIH's citation data includes a paper's ranking among its co-citation network.
What is in PubMed? Take a quick tour
PubMed is a search interface and toolset used to access over 30.5 million article records from databases such as:
- MEDLINE: a highly selective database started in the 1960s
- PubMed Central (PMC): an open-access database for full-text papers that are free of cost
- Additional content such as books and articles published before the 1960s
To install from PyPI
$ pip3 install pmidcite
To install locally
$ git clone https://github.com/dvklopfenstein/pmidcite.git
$ cd ./pmidcite
$ pip3 install .
Save your literature search in a GitHub repo.
1. Add a pmidcite init file
Add a .pmidciterc init file to a non-git managed directory, such as home (~)
$ icite --generate-rcfile | tee ~/.pmidciterc
[pmidcite]
email = myname@email.edu
# To download PubMed search results, get an NCBI API key here:
# https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities
apikey = MY_LONG_HEX_NCBI_API_KEY
tool = my_scripts
$ export PMIDCITECONF=~/.pmidciterc
Do not put the `.pmidciterc` file under version control or push it to a service such as GitHub, because it contains your personal email and your private NCBI API key.
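Before running a long search, it can help to confirm that the init file has been filled in. The sketch below reads text in the same layout as the generated file using Python's standard `configparser`. The helper names are invented for this example, the fallback to `~/.pmidciterc` when `PMIDCITECONF` is unset is an assumption, and none of this is meant to describe how pmidcite itself loads the file.

```python
import configparser
import os

# Same layout as the generated init file; the values are the placeholders
EXAMPLE_RC = """\
[pmidcite]
email = myname@email.edu
apikey = MY_LONG_HEX_NCBI_API_KEY
tool = my_scripts
"""


def read_rc_text(text):
    """Parse init-file text and return the [pmidcite] section as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["pmidcite"])


def find_rcfile():
    """Return the path in $PMIDCITECONF, or ~/.pmidciterc if it is unset (assumed)."""
    return os.environ.get("PMIDCITECONF", os.path.expanduser("~/.pmidciterc"))


def has_placeholder_key(cfg):
    """True if the API key is missing or still the generated placeholder."""
    return cfg.get("apikey", "") in ("", "MY_LONG_HEX_NCBI_API_KEY")


cfg = read_rc_text(EXAMPLE_RC)
print("init file:", find_rcfile())
print("API key still needs to be set:", has_placeholder_key(cfg))
```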
To download PubMed abstracts and PubMed search results using NCBI's E-Utils, get an NCBI API key using these instructions:
https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities
Set the `apikey` value in the config file: `~/.pmidciterc`
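To show where the `email`, `apikey`, and `tool` values end up, here is a minimal sketch that builds an E-utilities ESearch URL by hand. It is an illustration under stated assumptions and does not show how pmidcite issues its own requests: `esearch_url` is a hypothetical helper, the search term is an arbitrary example, and no network request is made.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, email, apikey, tool, retmax=100):
    """Return an ESearch URL for a PubMed query (the request is not sent)."""
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # the query, as typed into the PubMed search box
        "retmax": retmax,    # maximum number of PMIDs to return
        "retmode": "json",
        "email": email,
        "api_key": apikey,
        "tool": tool,
    }
    return f"{ESEARCH}?{urlencode(params)}"


# Placeholder values matching the generated init file; the term is only an example
url = esearch_url("HIV AND DNA methylation", "myname@email.edu",
                  "MY_LONG_HEX_NCBI_API_KEY", "my_scripts")
print(url)
```

Since the API key becomes part of the URL, such URLs should be kept out of shared logs and repositories, for the same reason the init file is.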
See the contributing guide for detailed instructions on how to get started contributing to the pmidcite project.
email: dvklopfenstein@protonmail.com
https://orcid.org/0000-0003-0161-7603
If you use pmidcite in your research or literature search, please cite paper 1 (pmidcite) and paper 3 (NIH citation data).
Please also consider reading and citing Gusenbauer's response (paper 2) about improving search for all during the information avalanche of these times:
1. The pmidcite paper:
   Commentary to Gusenbauer and Haddaway 2020: Evaluating Retrieval Qualities of PubMed and Google Scholar
   Klopfenstein DV and Dampier W
   2020 | Research Synthesis Methods | PMID:33031632 | DOI:10.1002/jrsm.1456
2. Gusenbauer's response to the pmidcite paper:
   What every Researcher should know about Searching – Clarified Concepts, Search Advice, and an Agenda to improve Finding in Academia
   Gusenbauer M and Haddaway N
   2020 | Research Synthesis Methods | PMID:33031639 | DOI:10.1002/jrsm.1457
3. The NIH citation data used by pmidcite -- Scientific Influence, Translation, and Citation counts:
   The NIH Open Citation Collection: A public access, broad coverage resource
   Hutchins BI ... Santangelo GM
   2019 | PLoS Biology | PMID:31600197 | DOI:10.1371/journal.pbio.3000385
Please consider reading and citing the paper [4] which inspired the creation of pmidcite [1] and the authors' response to our paper [2]:
4. Which Academic Search Systems are Suitable for Systematic Reviews or Meta-Analyses? Evaluating Retrieval Qualities of Google Scholar, PubMed and 26 other Resources
   Gusenbauer M and Haddaway N
   2019 | Research Synthesis Methods | PMID:31614060 | DOI:10.1002/jrsm.1378
Mentioned in this README are also these outstanding contributions:
5. Relative Citation Ratio (RCR): A New Metric That Uses Citation Rates to Measure Influence at the Article Level
   Hutchins BI, Yuan X, Anderson JM, and Santangelo GM
   2016 | PLoS Biology | PMID:27599104 | DOI:10.1371/journal.pbio.1002541
6. Google Scholar as replacement for systematic literature searches: good relative recall and precision are not enough
   Boeker M et al.
   2013 | BMC Medical Research Methodology | PMID:24160679 | DOI:10.1186/1471-2288-13-131
7. Best Match: New relevance search for PubMed
   Fiorini N ... Lu Z
   2018 | PLoS Biology | PMID:30153250 | DOI:10.1371/journal.pbio.2005343
- PMIDCITE Manuscript with the original text box formatting
- Gusenbauer's Response
Copyright (C) 2019-present pmidcite, DV Klopfenstein, PhD. All rights reserved.