A Reference Manager in R
oeysan/c2z
The observant reader has already identified the brilliant word play on Psalm 72:8 (King James Version): “He shall have dominion also from sea to sea, and from the river unto the ends of the earth”. c2z aims at obtaining total dominion over Cristin (Current Research Information SysTem in Norway) and Zotero. The package enables manipulating Zotero libraries using R. Import, in batch, references from Cristin, regjeringen.no, CRAN, ISBN (currently, Alma and Library of Congress), and DOI (currently, CrossRef and DataCite) to a Zotero library. Add, edit, copy, or delete items, including attachments and collections, and export references to BibLaTeX (and other formats) directly in R (see Figure 1).
Figure 1. c2z flowchart.
Anyone using Zotero, or similar reference management software. However, the project is probably of extra interest to researchers, students, bibliomaniacs and others working in library-type services. Though the project is grounded in a Norwegian context (with apologies to Åse Wetås for writing the documentation in (American) English), international publications are easily available through DOI and ISBN, and the Zotero functions are independent of acquiring metadata from external services.
Should you require a specific international/national/regional library or database, please make a request here or open a pull request. The only requirement is that the library services have open access and serve MARC 21 / DOI type metadata (or have fairly structured XML/JSON). Okay, okay, there are no requirements, I’ll look into any request and try to make it work.
Hoarding references in Zotero, obviously. However, c2z also has more practical purposes, especially in combination with other packages. You are probably the right kind of weirdo (since you are reading this) and you could use c2z to easily handle your references while writing and preparing manuscripts (e.g., papaja), or for use on a (personal) webpage (e.g., blogdown or bookdown). If you need to work for a living, or just like to show off, you could automate the publication list for your résumé (e.g., vitae). If the Man pays you to keep track of publications, you could schedule a script (e.g., cronR, taskscheduleR or GitHub Actions) to keep track of new publications from an institution or research group and email you (or the Man) recent publications on a monthly or weekly (or hourly) basis (e.g., emayili or mailR). If you really feel like it you could use Home Assistant to play Tina Turner - The Best (Official Music Video) whenever one of your publications is registered on Cristin.
The sky is the limit!
- Add, edit, copy, and delete (nested) Zotero collections.
- Add, edit, copy, and delete Zotero items, including attachments.
- Export Zotero items in R as BibLaTeX (and other formats).
- Batch import common references from Cristin.
  - Currently supported formats: books (e.g., anthologies), book chapters, journal articles, presentations (e.g., lectures), and opinion pieces.
- Batch import references from ISBN and DOI.
  - Currently supported formats (CrossRef): books, book chapters, conference papers, journal articles.
  - DataCite references are treated as preprints and store the reference type (e.g., dataset) as Genre.
- Batch import Norwegian white papers and official Norwegian reports.
- Batch import R packages from CRAN.
- Search CrossRef, automatically and manually, by author(s), title, and year.
- Augment Cristin references through ISBN, DOI, or CrossRef search.
- Create month-to-month newsletter for registered publications in Cristin.
The project strives to keep the number of dependencies at a minimum. However, c2z is highly dependent on dplyr, httr, purrr, rvest, tibble, and jsonlite.
Dependencies are automatically installed from CRAN. By default, outdated dependencies are automatically upgraded.
You probably want to access a restricted Zotero library. Please see the short tutorial on how to create a Zotero API key and how to define it in your .Renviron.
You can install c2z from GitHub. If you already have a previous version of c2z installed, using the command below will update to the latest development version.
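Before calling the package, it can help to confirm that the credentials defined in .Renviron are actually visible to the R session. The sketch below uses only base R. The variable names `ZOTERO_USER` and `ZOTERO_API` are placeholders assumed for illustration; use whatever names the tutorial prescribes.

```r
# Hypothetical variable names; replace with the names used in the tutorial.
required_vars <- c("ZOTERO_USER", "ZOTERO_API")

# Return the names of environment variables that are unset or empty.
missing_env <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

missing <- missing_env(required_vars)
if (length(missing) > 0) {
  message("Missing in .Renviron: ", paste(missing, collapse = ", "))
} else {
  message("All Zotero credentials found.")
}
```

Remember that R reads .Renviron at startup, so a restart of the session is needed after editing the file.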
Development version (GitHub)
devtools::install_github("oeysan/c2z")
Please note that stable versions are hosted at CRAN, whereas GitHub versions are in active development.
Stable version (CRAN)
utils::install.packages("c2z")
Also, please see the magnificent vignette and other documentation.
I work as an associate professor at a department of teacher education in Norway. As such, one of my responsibilities is, surprisingly enough, teaching. Even more surprising, most of the literature is in Norwegian, and in the form of monographs or anthologies. Unfortunately, Zotero is not well-adapted to importing Norwegian books through ISBN (see Figure 2). In the example below, Imsen (2020) is imported using the Zotero magic wand (left) and c2z (right). Similarly, Zotero is unable to import Johannessen et al. (2021) using ISBN (cf. lookup failed). Evidently, Alma (47BIBSYS) is superior to Open WorldCat and similar when it comes to identifying (most) Norwegian books.
Figure 2. Zotero vs. c2z example.
The following example of c2z addresses this issue. The Zotero function acts as a wrapper by 1) connecting to the Zotero API, 2) creating a collection called “c2z-example”, 3) searching for items using two ISBN identifiers (i.e., Imsen, 2020; Johannessen et al., 2021), 4) posting the items to the defined collection, 5) creating a bibliography in HTML format using the APA7 reference style (the items could also be exported to any supported Zotero export format, e.g., BibLaTeX), and 6) cleaning up the example by deleting the collection and the two items. The R output is rather noisy and can be disabled by adding silent = TRUE.
```r
library(c2z)
example <- Zotero(
  collection.names = "c2z-example",
  library = TRUE,
  library.type = "data,bib",
  create = TRUE,
  isbn = c("9788215040561", "9788279354048"),
  post = TRUE,
  post.collections = FALSE,
  style = "apa-single-spaced",
  delete = TRUE,
  delete.collections = TRUE,
  delete.items = TRUE,
  index = TRUE,
  post.token = TRUE
)
#> Searching for collections
#> Found 0 collections
#> Adding 1 collection to library using 1 POST request
#> —————————————————Process: 100.00% (1/1). Elapsed time: 00:00:00—————————————————
#> $post.status.collections
#> # A tibble: 1 × 2
#>   status  key
#>   <fct>   <chr>
#> 1 success EA952IJ4
#>
#> $post.summary.collections
#> # A tibble: 1 × 2
#>   status  summary
#>   <fct>     <int>
#> 1 success       1
#>
#>
#> The Zotero list contains: 1 collection, 0 items, and 0 attachments
#> Searching 2 items using ISBN
#> Adding 2 items to library using 1 POST request
#> —————————————————Process: 100.00% (1/1). Elapsed time: 00:00:00—————————————————
#> $post.status.items
#> # A tibble: 2 × 2
#>   status  key
#>   <fct>   <chr>
#> 1 success HDQQQVFR
#> 2 success 7HVP3JGJ
#>
#> $post.summary.items
#> # A tibble: 1 × 2
#>   status  summary
#>   <fct>     <int>
#> 1 success       2
#>
#>
#> Searching for items using 1 collection
#> Found 2 items
#> The Zotero list contains: 1 collection, 2 items, and 0 attachments
#> Deleting 1 collection using 1 DELETE request
#> —————————————————Process: 100.00% (1/1). Elapsed time: 00:00:00—————————————————
#> Deleting 2 items using 1 DELETE request
#> —————————————————Process: 100.00% (1/1). Elapsed time: 00:00:00—————————————————
#> Creating index for items
```
The example will yield the following HTML output:
Imsen, G. (2020). Elevens verden: innføring i pedagogisk psykologi (6th ed.). Universitetsforlaget.
Johannessen, A., Christoffersen, L., & Tufte, P. A. (2021). Introduksjon til samfunnsvitenskapelig metode (6th ed.). Abstrakt forlag.
Johannessen et al. (2021) is an interesting (well, perhaps not interesting to all people) example of the nasty business that is metadata. In Cristin the authors are listed as Christoffersen, Johannessen, and Tufte, in Alma the authors are listed as Johannessen, Christoffersen, and Tufte, whereas the book itself lists the authors as Johannessen, Tufte, and Christoffersen (interesting, right?). c2z amends the conflicting results provided by Cristin and Alma by parsing the statement of responsibility field (if it exists) in MARC 21.
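The idea of reordering authors by the statement of responsibility (MARC 21 field 245, subfield c) can be illustrated with a small base R sketch. This is not the code c2z uses internally; the function name `order_by_responsibility` and the example statement string are hypothetical, made up for illustration.

```r
# Reorder surnames by their first appearance in a statement of responsibility.
# Surnames that are not found keep their relative order and are placed last.
order_by_responsibility <- function(surnames, statement) {
  positions <- vapply(
    surnames,
    function(s) as.integer(regexpr(s, statement, fixed = TRUE)),
    integer(1),
    USE.NAMES = FALSE
  )
  surnames[order(positions < 0, positions)]
}

# Hypothetical inputs: the order registered in Cristin and an illustrative
# statement of responsibility following the order printed on the book.
registered <- c("Christoffersen", "Johannessen", "Tufte")
statement <- "A. Johannessen, P. A. Tufte og L. Christoffersen"

order_by_responsibility(registered, statement)
```

Plain substring matching is fragile (one surname may be contained in another, and spellings vary), so any real implementation would need more careful parsing.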
Despite several innovative, creative and valiant efforts to mitigate common weaknesses in CrossRef, DataCite, MARC 21, and especially Cristin, c2z cannot always create order in a chaotic metadata world. A major limitation of any reference management software scraping metadata through databases is poorly registered data. GIGO will happen and manual inspection is required to assure that the references are correct.
Moreover, the project stands or falls by its relationship with the APIs, meaning that c2z is likely a high maintenance project. For instance, Cris/NVA is planned to replace Cristin during 2023, which is likely to cause some headache.
Finally (not really, there are probably several other limitations), c2z is not built for speed. The project tries to wrangle data from strange and exotic beasts, while simultaneously hoping to avoid exploding kittens. Isolated, wrangling data from Cristin, ISBN, or DOI is not very time-consuming (though downloading the entire Cristin database (> 300 MB) and importing to Zotero will take some time). One reason for the lack of speed is that Cristin, for some reason, keeps a separate table containing contributors, meaning that each reference needs two API calls. Book chapters are even more time-consuming, as Cristin also keeps the book metadata in another table, totaling four (4) API calls.
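To get a feel for what this means in practice, the request count can be estimated with simple arithmetic. The helper below is a hypothetical back-of-the-envelope sketch, assuming exactly two calls per ordinary reference and four per book chapter, and ignoring retries and any augmentation.

```r
# Rough number of Cristin API calls: 2 per ordinary reference, 4 per book chapter.
# n_references is the total number of references, n_chapters the book chapters among them.
estimate_cristin_calls <- function(n_references, n_chapters = 0) {
  stopifnot(n_chapters <= n_references)
  2 * (n_references - n_chapters) + 4 * n_chapters
}

# 100 references, of which 20 are book chapters.
estimate_cristin_calls(100, n_chapters = 20)
```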
Enabling data-augmentation through DOI or ISBN demands even more API calls, and if Crossref search is enabled, with no prior identification through DOI or ISBN, the process can take a long, long time (… and totally hammer the Crossref API, please don’t do it!). For example, downloading and converting 50 random items (n = 1600) from each of the, for now, supported Cristin categories (k = 32) takes approximately 3.12 minutes without any augmentation, 39.02 minutes with DOI/ISBN look-up, and 177.54 minutes with Crossref search enabled. Please note that run-time is dependent on bandwidth and the response-time of the APIs (Alma has especially high latency), and that c2z uses exponential backoff depending on the API response.
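For readers unfamiliar with exponential backoff, here is a generic base R sketch of the technique. It is not taken from the c2z source; the function names, the default base delay, and the cap are all assumptions chosen for illustration.

```r
# Delay (in seconds) before retry number `attempt` (1-based): doubles each
# time, starting at `base`, and never exceeds `cap`.
backoff_delay <- function(attempt, base = 1, cap = 60) {
  min(cap, base * 2^(attempt - 1))
}

# Call `fun` until it succeeds or `max_tries` is reached, sleeping
# progressively longer between failed attempts.
with_backoff <- function(fun, max_tries = 5, base = 1, cap = 60) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (attempt < max_tries) Sys.sleep(backoff_delay(attempt, base, cap))
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Delays for the first five retries with the default settings.
vapply(1:5, backoff_delay, numeric(1))
```

Here `fun` would be a function wrapping a single API request that raises an error on failure. Doubling the wait after each failure gives a struggling service room to recover instead of hitting it again at a fixed rate.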
Please report any bugs/issues/requests here, and feel free to make a pull request.
Your R code seems to be a mash-up of different styles, not adhering to Google’s R style guide or Tidyverse’s style guide. In addition, you combine both HTML/CSS/JS and Markdown, violating the Markdown philosophy. What’s your thought on this breach of tradition?
–“Thank you, but I prefer it my way.”
Don’t be evil. Please read the Code of Conduct.
This project is licensed under the MIT License - see LICENSE for details.
Henrik Karlstrøm for his work on rcristin.

