Commit d9d85c2
Precompute vignette following advice on "https://ropensci.org/blog/2019/12/08/precompute-vignettes/". This avoids calling the Unpaywall API on CRAN.
1 parent b541014 commit d9d85c2
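The precompute step named in the commit message can be sketched roughly as follows. This is a minimal sketch that follows the linked rOpenSci post, not necessarily the exact command the author ran. `precompute_vignette()` is a hypothetical helper name, and its default paths are the two vignette files touched by this commit. It assumes the knitr package is installed and that the call is made from the package root.

```r
# Sketch: knit the API-dependent source once, locally, so that the committed
# intro.Rmd holds static code and output only.
# `precompute_vignette` is a hypothetical helper, not part of roadoi or knitr.
precompute_vignette <- function(orig = "vignettes/intro.Rmd.orig",
                                output = "vignettes/intro.Rmd") {
  stopifnot(file.exists(orig))
  # knit() runs every chunk in `orig` and writes plain Markdown
  # (code plus captured results) to `output`.
  knitr::knit(input = orig, output = output, quiet = TRUE)
  invisible(output)
}

# Run from the package root whenever the vignette source changes.
if (file.exists("vignettes/intro.Rmd.orig")) {
  precompute_vignette()
}
```

Because the chunks are executed locally, the knitted `intro.Rmd` contains only fenced code and pre-rendered output, so building the vignette on CRAN triggers no Unpaywall request.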

3 files changed: +202 -22 lines changed
‎.Rbuildignore‎

Lines changed: 1 addition & 0 deletions
@@ -12,3 +12,4 @@ cran-comments.md
 ^revdep$
 ^codemeta\.json$
 ^\.github$
+^vignettes/intro\.Rmd\.orig$
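As an illustration of what the new `.Rbuildignore` entry does: entries in that file are Perl-like regular expressions, matched case-insensitively against paths relative to the package root. The check below only approximates that matching with `grepl()`; it is not how `R CMD build` is implemented, and the second path is included purely for contrast.

```r
# The pattern added to .Rbuildignore in this commit.
pattern <- "^vignettes/intro\\.Rmd\\.orig$"

# Paths relative to the package root (the second one is for contrast only).
paths <- c("vignettes/intro.Rmd.orig", "vignettes/intro.Rmd")

# Approximate the build-time match: Perl-like regex, case-insensitive.
ignored <- grepl(pattern, paths, perl = TRUE, ignore.case = TRUE)
names(ignored) <- paths
print(ignored)
```

Only the `.orig` source matches, so the precomputed `intro.Rmd` still ships with the package while the API-calling source stays out of the tarball.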

‎vignettes/intro.Rmd‎

Lines changed: 73 additions & 22 deletions
@@ -1,7 +1,7 @@
 ---
 title: "Introduction"
 author: "Najko Jahn"
-date: "`r Sys.Date()`"
+date: "2022-02-28"
 vignette: >
   %\VignetteIndexEntry{Introduction}
   %\VignetteEngine{knitr::rmarkdown}
@@ -24,13 +24,29 @@ See [Piwowar et al. (2018)](https://doi.org/10.7717/peerj.4375) for a comprehens

 There is one major function to talk with Unpaywall, `oadoi_fetch()`, taking a character vector of DOIs and your email address as required arguments.

-```{r}
+
+```r
 library(roadoi)
 roadoi::oadoi_fetch(dois = c("10.1186/s12864-016-2566-9",
-                             "10.1103/physreve.88.012814"),
+                             "10.1103/physreve.88.012814"),
                     email = "najko.jahn@gmail.com")
 ```

+```
+## # A tibble: 2 × 21
+## doi best_oa_location oa_locations oa_locations_emba…
+## <chr> <list> <list> <list>
+## 1 10.1186/s128… <tibble [1 × 10]> <tibble [5 ×… <tibble [0 × 0]>
+## 2 10.1103/phys… <tibble [1 × 10]> <tibble [2 ×… <tibble [0 × 0]>
+## # … with 17 more variables: data_standard <int>, is_oa <lgl>,
+## # is_paratext <lgl>, genre <chr>, oa_status <chr>,
+## # has_repository_copy <lgl>, journal_is_oa <lgl>,
+## # journal_is_in_doaj <lgl>, journal_issns <chr>,
+## # journal_issn_l <chr>, journal_name <chr>, publisher <chr>,
+## # published_date <chr>, year <chr>, title <chr>,
+## # updated_resource <chr>, authors <list>
+```
+
 #### What's returned?

 The client supports API version 2. According to the [Unpaywall Data Format](https://unpaywall.org/data-format), the following variables with the following definitions are returned:
@@ -39,24 +55,24 @@ The client supports API version 2. According to the [Unpaywall Data Format](http
 |:------------|:----------------------------------------------
 `doi`|DOI (always in lowercase)
 `best_oa_location`|list-column describing the best OA location. Algorithm prioritizes publisher hosted content (e.g. Hybrid or Gold)
-`oa_locations`|list-column of all the OA locations.
+`oa_locations`|list-column of all the OA locations.
 `oa_locations_embargoed` | list-column of locations expected to be available in the future based on information like license metadata and journals' delayed OA policies
-`data_standard`|Indicates the data collection approaches used for this resource. `1` mostly uses Crossref for hybrid detection. `2` uses more comprehensive hybrid detection methods.
-`is_oa`|Is there an OA copy (logical)?
-`is_paratext`| Is the item an ancillary part of a journal, like a table of contents? See here for more information <https://support.unpaywall.org/support/solutions/articles/44001894783>.
+`data_standard`|Indicates the data collection approaches used for this resource. `1` mostly uses Crossref for hybrid detection. `2` uses more comprehensive hybrid detection methods.
+`is_oa`|Is there an OA copy (logical)?
+`is_paratext`| Is the item an ancillary part of a journal, like a table of contents? See here for more information <https://support.unpaywall.org/support/solutions/articles/44001894783>.
 `genre`|Publication type
 `oa_status`|Classifies OA resources by location and license terms as one of: gold, hybrid, bronze, green or closed. See here for more information <https://support.unpaywall.org/support/solutions/articles/44001777288-what-do-the-types-of-oa-status-green-gold-hybrid-and-bronze-mean->.
 `has_repository_copy`|Is a full-text available in a repository?
-`journal_is_oa`|Is the article published in a fully OA journal? Uses the Directory of Open Access Journals (DOAJ) as source.
+`journal_is_oa`|Is the article published in a fully OA journal? Uses the Directory of Open Access Journals (DOAJ) as source.
 `journal_is_in_doaj`|Is the journal listed in the Directory of Open Access Journals (DOAJ)?
 `journal_issns`|ISSNs, i.e. unique code to identify journals.
 `journal_issn_l`|Linking ISSN.
 `journal_name`|Journal title
 `publisher`|Publisher
 `published_date`|Date published
-`year`|Year published.
-`title`|Publication title.
-`updated_resource`|Time when the data for this resource was last updated.
+`year`|Year published.
+`title`|Publication title.
+`updated_resource`|Time when the data for this resource was last updated.
 `authors`|Lists authors (if available)

 The columns `best_oa_location` and `oa_locations` are list-columns that contain useful metadata about the OA sources found by Unpaywall. These are:
@@ -65,7 +81,7 @@ The columns `best_oa_location` and `oa_locations` are list-columns that contai
 |:------------|:----------------------------------------------
 `endpoint_id`|Unique repository identifier
 `evidence`|How the OA location was found and how Unpaywall characterizes it
-`host_type`|OA full-text provided by `publisher` or `repository`.
+`host_type`|OA full-text provided by `publisher` or `repository`.
 `is_best`|Is this location the `best_oa_location` for its resource?
 `license`|The license under which this copy is published
 `oa_date`|When this document first became available at this location
@@ -79,31 +95,46 @@ The columns `best_oa_location` and `oa_locations` are list-columns that contai

 The Unpaywall schema is also described here: <https://unpaywall.org/data-format>.

-The columns `best_oa_location`, `oa_locations` and `oa_locations_embargoed` are list-columns that contain useful metadata about the OA sources found by Unpaywall.
+The columns `best_oa_location`, `oa_locations` and `oa_locations_embargoed` are list-columns that contain useful metadata about the OA sources found by Unpaywall.

 If `.flatten = TRUE`, the list-column `oa_locations` is restructured into a long format in which each OA full text is represented by one row, so that a data analysis can take all OA locations found by Unpaywall into account.

-```{r}
+
+```r
 library(dplyr)
 roadoi::oadoi_fetch(dois = c("10.1186/s12864-016-2566-9",
                              "10.1103/physreve.88.012814",
                              "10.1093/reseval/rvaa038",
                              "10.1101/2020.05.22.111294",
-                             "10.1093/bioinformatics/btw541"),
+                             "10.1093/bioinformatics/btw541"),
                     email = "najko.jahn@gmail.com", .flatten = TRUE) %>%
-  dplyr::count(is_oa, evidence, is_best)
+  dplyr::count(is_oa, evidence, is_best)
+```
+
+```
+## # A tibble: 8 × 4
+## is_oa evidence is_best n
+## <lgl> <chr> <lgl> <int>
+## 1 FALSE <NA> NA 1
+## 2 TRUE oa journal (via doaj) TRUE 2
+## 3 TRUE oa repository (semantic scholar lookup) FALSE 1
+## 4 TRUE oa repository (via OAI-PMH doi match) FALSE 7
+## 5 TRUE oa repository (via page says license) FALSE 1
+## 6 TRUE oa repository (via pmcid lookup) FALSE 2
+## 7 TRUE open (via crossref license, author manuscri… TRUE 1
+## 8 TRUE open (via page says license) TRUE 1
 ```


 #### Any API restrictions?

-There are no API restrictions. However, Unpaywall requires an email address when using its API. If you would rather not type your email address every time, you can store it in your `.Renviron` file under the name `roadoi_email`:
+There are no API restrictions. However, Unpaywall requires an email address when using its API. If you would rather not type your email address every time, you can store it in your `.Renviron` file under the name `roadoi_email`:

 ```
 roadoi_email = "najko.jahn@gmail.com"
 ```

-You can open your `.Renviron` file by calling
+You can open your `.Renviron` file by calling

 ```r
 file.edit("~/.Renviron")
@@ -115,14 +146,34 @@ Save the file and restart your R session. To stop sharing the email when using r

 To follow your API call, and to estimate the time until completion, use the `.progress` parameter inherited from `plyr` to display a progress bar.

-```{r}
+
+```r
 roadoi::oadoi_fetch(dois = c("10.1186/s12864-016-2566-9",
-                             "10.1103/physreve.88.012814"),
-                    email = "najko.jahn@gmail.com",
+                             "10.1103/physreve.88.012814"),
+                    email = "najko.jahn@gmail.com",
                     .progress = "text")
 ```

+```
+## | | | 0% | |============================ | 50% | |========================================================| 100%
+```
+
+```
+## # A tibble: 2 × 21
+## doi best_oa_location oa_locations oa_locations_emba…
+## <chr> <list> <list> <list>
+## 1 10.1186/s128… <tibble [1 × 10]> <tibble [5 ×… <tibble [0 × 0]>
+## 2 10.1103/phys… <tibble [1 × 10]> <tibble [2 ×… <tibble [0 × 0]>
+## # … with 17 more variables: data_standard <int>, is_oa <lgl>,
+## # is_paratext <lgl>, genre <chr>, oa_status <chr>,
+## # has_repository_copy <lgl>, journal_is_oa <lgl>,
+## # journal_is_in_doaj <lgl>, journal_issns <chr>,
+## # journal_issn_l <chr>, journal_name <chr>, publisher <chr>,
+## # published_date <chr>, year <chr>, title <chr>,
+## # updated_resource <chr>, authors <list>
+```
+

-### References
+### References

 Piwowar, H., Priem, J., Larivière, V., Alperin, J. P., Matthias, L., Norlander, B., … Haustein, S. (2018). The state of OA: a large-scale analysis of the prevalence and impact of Open Access articles. PeerJ, 6, e4375. <https://doi.org/10.7717/peerj.4375>

‎vignettes/intro.Rmd.orig‎

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
+---
+title: "Introduction"
+author: "Najko Jahn"
+date: "`r Sys.Date()`"
+vignette: >
+  %\VignetteIndexEntry{Introduction}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+### About Unpaywall
+
+[Unpaywall](https://unpaywall.org/), developed and maintained by the [team of OurResearch](https://ourresearch.org/team/about), is a non-profit service that finds open access copies of scholarly literature by looking up a DOI (Digital Object Identifier). It not only returns open access full-text links, but also helpful metadata about the open access status of a publication such as licensing or provenance information.
+
+Unpaywall uses different data sources to find open access full-texts including:
+
+- [Crossref](https://www.crossref.org/): a DOI registration agency serving major scholarly publishers.
+- [Directory of Open Access Journals (DOAJ)](https://doaj.org/): a registry of open access journals
+- Various OAI-PMH metadata sources. OAI-PMH is a protocol often used by open access journals and repositories such as arXiv and PubMed Central.
+
+See [Piwowar et al. (2018)](https://doi.org/10.7717/peerj.4375) for a comprehensive overview of Unpaywall.
+
+### Basic usage
+
+There is one major function to talk with Unpaywall, `oadoi_fetch()`, taking a character vector of DOIs and your email address as required arguments.
+
+```{r}
+library(roadoi)
+roadoi::oadoi_fetch(dois = c("10.1186/s12864-016-2566-9",
+                             "10.1103/physreve.88.012814"),
+                    email = "najko.jahn@gmail.com")
+```
+
+#### What's returned?
+
+The client supports API version 2. According to the [Unpaywall Data Format](https://unpaywall.org/data-format), the following variables with the following definitions are returned:
+
+**Column**|**Description**
+|:------------|:----------------------------------------------
+`doi`|DOI (always in lowercase)
+`best_oa_location`|list-column describing the best OA location. Algorithm prioritizes publisher hosted content (e.g. Hybrid or Gold)
+`oa_locations`|list-column of all the OA locations.
+`oa_locations_embargoed` | list-column of locations expected to be available in the future based on information like license metadata and journals' delayed OA policies
+`data_standard`|Indicates the data collection approaches used for this resource. `1` mostly uses Crossref for hybrid detection. `2` uses more comprehensive hybrid detection methods.
+`is_oa`|Is there an OA copy (logical)?
+`is_paratext`| Is the item an ancillary part of a journal, like a table of contents? See here for more information <https://support.unpaywall.org/support/solutions/articles/44001894783>.
+`genre`|Publication type
+`oa_status`|Classifies OA resources by location and license terms as one of: gold, hybrid, bronze, green or closed. See here for more information <https://support.unpaywall.org/support/solutions/articles/44001777288-what-do-the-types-of-oa-status-green-gold-hybrid-and-bronze-mean->.
+`has_repository_copy`|Is a full-text available in a repository?
+`journal_is_oa`|Is the article published in a fully OA journal? Uses the Directory of Open Access Journals (DOAJ) as source.
+`journal_is_in_doaj`|Is the journal listed in the Directory of Open Access Journals (DOAJ).
+`journal_issns`|ISSNs, i.e. unique code to identify journals.
+`journal_issn_l`|Linking ISSN.
+`journal_name`|Journal title
+`publisher`|Publisher
+`published_date`|Date published
+`year`|Year published.
+`title`|Publication title.
+`updated_resource`|Time when the data for this resource was last updated.
+`authors`|Lists authors (if available)
+
+The columns `best_oa_location` and `oa_locations` are list-columns that contain useful metadata about the OA sources found by Unpaywall. These are:
+
+**Column**|**Description**
+|:------------|:----------------------------------------------
+`endpoint_id`|Unique repository identifier
+`evidence`|How the OA location was found and how Unpaywall characterizes it
+`host_type`|OA full-text provided by `publisher` or `repository`.
+`is_best`|Is this location the `best_oa_location` for its resource?
+`license`|The license under which this copy is published
+`oa_date`|When this document first became available at this location
+`pmh_id`|OAI-PMH endpoint where we found this location
+`repository_institution`|Hosting institution of the repository.
+`updated`|Time when the data for this location was last updated
+`url`|The URL where you can find this OA copy.
+`url_for_landing_page`| The URL for a landing page describing this OA copy.
+`url_for_pdf`|The URL with a PDF version of this OA copy.
+`versions`|The content version accessible at this location following the DRIVER 2.0 Guidelines (<https://wiki.surfnet.nl/display/DRIVERguidelines/DRIVER-VERSION+Mappings>)
+
+The Unpaywall schema is also described here: <https://unpaywall.org/data-format>.
+
+The columns `best_oa_location`, `oa_locations` and `oa_locations_embargoed` are list-columns that contain useful metadata about the OA sources found by Unpaywall.
+
+If `.flatten = TRUE`, the list-column `oa_locations` is restructured into a long format in which each OA full text is represented by one row, so that a data analysis can take all OA locations found by Unpaywall into account.
+
+```{r, message = FALSE}
+library(dplyr)
+roadoi::oadoi_fetch(dois = c("10.1186/s12864-016-2566-9",
+                             "10.1103/physreve.88.012814",
+                             "10.1093/reseval/rvaa038",
+                             "10.1101/2020.05.22.111294",
+                             "10.1093/bioinformatics/btw541"),
+                    email = "najko.jahn@gmail.com", .flatten = TRUE) %>%
+  dplyr::count(is_oa, evidence, is_best)
+```
+
+
+#### Any API restrictions?
+
+There are no API restrictions. However, Unpaywall requires an email address when using its API. If you would rather not type your email address every time, you can store it in your `.Renviron` file under the name `roadoi_email`:
+
+```
+roadoi_email = "najko.jahn@gmail.com"
+```
+
+You can open your `.Renviron` file by calling
+
+```r
+file.edit("~/.Renviron")
+```
+
+Save the file and restart your R session. To stop sharing the email when using roadoi, delete it from your `.Renviron` file.
+
+#### Keeping track of crawling
+
+To follow your API call, and to estimate the time until completion, use the `.progress` parameter inherited from `plyr` to display a progress bar.
+
+```{r}
+roadoi::oadoi_fetch(dois = c("10.1186/s12864-016-2566-9",
+                             "10.1103/physreve.88.012814"),
+                    email = "najko.jahn@gmail.com",
+                    .progress = "text")
+```
+
+
+### References
+
+Piwowar, H., Priem, J., Larivière, V., Alperin, J. P., Matthias, L., Norlander, B., … Haustein, S. (2018). The state of OA: a large-scale analysis of the prevalence and impact of Open Access articles. PeerJ, 6, e4375. <https://doi.org/10.7717/peerj.4375>
