OAI-PMH R client

ropensci/oai

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

oai is an R client for OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) services. The protocol was developed by the Open Archives Initiative (https://en.wikipedia.org/wiki/Open_Archives_Initiative). OAI-PMH transports XML-formatted data over HTTP.
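To make "XML over HTTP" concrete: an OAI-PMH request is an ordinary HTTP request whose query string carries a `verb` plus any verb-specific arguments. The sketch below only builds such URLs in base R and sends nothing. `oai_request_url` is a hypothetical helper written for illustration, not a function exported by oai; the base URLs and the `oai_dc` prefix are taken from the examples in this README.

```r
# Hypothetical helper (not part of the oai package): assemble an OAI-PMH
# request URL from a base URL, a verb, and optional named arguments.
oai_request_url <- function(base_url, verb, ...) {
  args <- c(list(verb = verb), list(...))
  # Percent-encode each value so reserved characters survive in the query
  values <- vapply(
    args,
    function(x) utils::URLencode(as.character(x), reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste0(names(args), "=", values, collapse = "&"))
}

identify_url <- oai_request_url("http://oai.datacite.org/oai", "Identify")
list_url <- oai_request_url(
  "http://api.gbif.org/v1/oai-pmh/registry", "ListIdentifiers",
  metadataPrefix = "oai_dc", from = "2018-05-01"
)

identify_url
#> [1] "http://oai.datacite.org/oai?verb=Identify"
```

In normal use the package functions shown below build and send these requests for you.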

OAI-PMH Info:

oai is built on xml2 and httr. In addition, we give back data.frames whenever possible to make data comprehension, manipulation, and visualization easier. We also have functions to fetch a large directory of OAI-PMH services. It isn’t exhaustive, but it does contain a lot.

Instead of paging with e.g. page and per_page parameters, OAI-PMH (optionally) uses resumptionTokens, which may carry an expiration date. These tokens can be used to continue on to the next chunk of data if the first request did not get to the end. Often, OAI-PMH services limit each request to 50 records, but the limit is set by each provider and may vary. The API of this package is such that we run the while loop for you internally until we get all records. We may in the future expose e.g. a limit parameter so you can say how many records you want, but we haven’t done this yet.
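The token-following loop described above can be sketched in a few lines of base R. This is not the package's internal code: `fake_fetch` is a hypothetical stand-in for a single OAI-PMH request that serves 120 made-up records in chunks of 50, and `harvest_all` is an illustrative name. The idea is only to show the shape of the loop: request, collect, and repeat while a token comes back.

```r
# Hypothetical stand-in for one OAI-PMH request: returns up to `size`
# records plus a resumption token ("" once the list is complete).
fake_fetch <- function(token = "", records = sprintf("rec-%03d", 1:120), size = 50) {
  start <- if (nzchar(token)) as.integer(token) else 1L
  end <- min(start + size - 1L, length(records))
  next_token <- if (end < length(records)) as.character(end + 1L) else ""
  list(records = records[start:end], token = next_token)
}

# Keep requesting chunks for as long as the provider hands back a token.
harvest_all <- function(fetch = fake_fetch) {
  page <- fetch("")
  chunks <- list(page$records)
  while (nzchar(page$token)) {
    page <- fetch(page$token)
    chunks[[length(chunks) + 1L]] <- page$records
  }
  unlist(chunks)
}

all_records <- harvest_all()
length(all_records)
#> [1] 120
```

Three requests are made here (50, 50, and 20 records). A real harvester would also need to handle expired tokens and HTTP errors, which this sketch leaves out.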

Install

Install from CRAN

install.packages("oai")

Development version

devtools::install_github("ropensci/oai")
library("oai")

Identify

id("http://oai.datacite.org/oai")
#>   repositoryName                      baseURL protocolVersion
#> 1       DataCite https://oai.datacite.org/oai             2.0
#>             adminEmail    earliestDatestamp deletedRecord          granularity
#> 1 support@datacite.org 2011-01-01T00:00:00Z    persistent YYYY-MM-DDThh:mm:ssZ
#>   compression compression.1                                    description
#> 1        gzip       deflate oaioai.datacite.org:oai:oai.datacite.org:12425

ListIdentifiers

list_identifiers(from = '2018-05-01T', until = '2018-06-01T')
#> # A tibble: 75 × 5
#>    identifier                           datestamp        setSpec setSp…¹ setSp…²
#>    <chr>                                <chr>            <chr>   <chr>   <chr>
#>  1 4b64d1f2-31c2-40c9-80aa-bb7ddb424684 2018-05-30T13:5… instal… datase… countr…
#>  2 884378d6-d591-4760-bb70-7b4851784d96 2018-05-29T19:1… instal… datase… countr…
#>  3 18799ce9-1a66-40fc-ad18-5ac54cd3417b 2018-05-14T12:1… instal… datase… countr…
#>  4 7e91aacb-c994-41ee-a7b7-bd23c02cd5bf 2018-05-21T10:5… instal… datase… countr…
#>  5 f83746ee-4cf2-4e60-a720-dd508b559794 2018-05-08T09:4… instal… datase… countr…
#>  6 a3533a61-6f88-443e-89ae-37611ea88267 2018-05-08T13:5… instal… datase… countr…
#>  7 ba9b66a3-2d11-4193-922e-ace4d5909239 2018-05-05T23:5… instal… datase… countr…
#>  8 78b696d9-8f0d-41ab-9c23-1c3547da411d 2018-05-05T23:0… instal… datase… countr…
#>  9 c791b255-a184-4600-b828-ef9d4092a212 2018-05-05T14:2… instal… datase… countr…
#> 10 b929ccda-03b1-4166-9e5b-34588339d61d 2018-05-09T02:5… instal… datase… countr…
#> # … with 65 more rows, and abbreviated variable names ¹setSpec.1, ²setSpec.2
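The Identify output reports each repository's datestamp granularity (`YYYY-MM-DDThh:mm:ssZ` for DataCite, `YYYY-MM-DD` for the Biodiversity Heritage Library). As a rough sketch, a date or time can be formatted at either granularity in base R before being passed as `from` or `until`. `oai_datestamp` is a hypothetical helper for illustration, not part of oai, and it assumes the input can be parsed by `as.POSIXct` and should be read as UTC.

```r
# Hypothetical helper (not part of oai): format a date or time as an
# OAI-PMH datestamp at day or second granularity, always in UTC.
oai_datestamp <- function(x, granularity = c("YYYY-MM-DD", "YYYY-MM-DDThh:mm:ssZ")) {
  granularity <- match.arg(granularity)
  fmt <- if (granularity == "YYYY-MM-DD") "%Y-%m-%d" else "%Y-%m-%dT%H:%M:%SZ"
  format(as.POSIXct(x, tz = "UTC"), fmt, tz = "UTC")
}

day_stamp <- oai_datestamp("2018-05-01")
second_stamp <- oai_datestamp("2018-05-01 13:50:00", "YYYY-MM-DDThh:mm:ssZ")

day_stamp
#> [1] "2018-05-01"
second_stamp
#> [1] "2018-05-01T13:50:00Z"
```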

Count Identifiers

count_identifiers()
#>                            url   count
#> 1 http://export.arxiv.org/oai2 2158148

ListRecords

list_records(from = '2018-05-01T', until = '2018-05-15T')
#> # A tibble: 41 × 26
#>    identi…¹ dates…² setSpec setSp…³ setSp…⁴ title publi…⁵ ident…⁶ subject source
#>    <chr>    <chr>   <chr>   <chr>   <chr>   <chr> <chr>   <chr>   <chr>   <chr>
#>  1 18799ce… 2018-0… instal… datase… countr… Bird… Sokoin… https:… Occurr… ""
#>  2 f83746e… 2018-0… instal… datase… countr… NDFF… Dutch … https:… Metada… "http…
#>  3 a3533a6… 2018-0… instal… datase… countr… EDP … EDP - … https:… Occurr… ""
#>  4 ba9b66a… 2018-0… instal… datase… countr… Ende… Sokoin… https:… Occurr… ""
#>  5 78b696d… 2018-0… instal… datase… countr… Ende… Sokoin… https:… Occurr… ""
#>  6 c791b25… 2018-0… instal… datase… countr… Ende… Sokoin… https:… Occurr… ""
#>  7 b929ccd… 2018-0… instal… datase… countr… List… Sokoin… https:… Occurr… ""
#>  8 da285c2… 2018-0… instal… datase… countr… Moni… Corpor… https:… seguim… ""
#>  9 8737287… 2018-0… instal… datase… countr… Moni… Corpor… https:… seguim… ""
#> 10 ed7d4c2… 2018-0… instal… datase… countr… Samo… Minist… https:… Occurr… ""
#> # … with 31 more rows, 16 more variables: description <chr>,
#> #   description.1 <chr>, type <chr>, creator <chr>, date <chr>, language <chr>,
#> #   coverage <chr>, coverage.1 <chr>, format <chr>, source.1 <chr>,
#> #   subject.1 <chr>, creator.1 <chr>, coverage.2 <chr>, description.2 <chr>,
#> #   creator.2 <chr>, subject.2 <chr>, and abbreviated variable names
#> #   ¹identifier, ²datestamp, ³setSpec.1, ⁴setSpec.2, ⁵publisher, ⁶identifier.1

GetRecords

ids <- c("87832186-00ea-44dd-a6bf-c2896c4d09b4", "d981c07d-bc43-40a2-be1f-e786e25106ac")
get_records(ids)
#> $`87832186-00ea-44dd-a6bf-c2896c4d09b4`
#> $`87832186-00ea-44dd-a6bf-c2896c4d09b4`$header
#> # A tibble: 1 × 3
#>   identifier                           datestamp            setSpec
#>   <chr>                                <chr>                <chr>
#> 1 87832186-00ea-44dd-a6bf-c2896c4d09b4 2018-06-29T12:08:17Z installation:729a73…
#>
#> $`87832186-00ea-44dd-a6bf-c2896c4d09b4`$metadata
#> # A tibble: 0 × 0
#>
#>
#> $`d981c07d-bc43-40a2-be1f-e786e25106ac`
#> $`d981c07d-bc43-40a2-be1f-e786e25106ac`$header
#> # A tibble: 1 × 3
#>   identifier                           datestamp            setSpec
#>   <chr>                                <chr>                <chr>
#> 1 d981c07d-bc43-40a2-be1f-e786e25106ac 2021-09-28T13:58:57Z installation:804b8d…
#>
#> $`d981c07d-bc43-40a2-be1f-e786e25106ac`$metadata
#> # A tibble: 1 × 12
#>   title       publi…¹ ident…² subject source descr…³ type  creator date  langu…⁴
#>   <chr>       <chr>   <chr>   <chr>   <chr>  <chr>   <chr> <chr>   <chr> <chr>
#> 1 Peces de l… Instit… https:… Occurr… http:… Caract… Data… Fernan… 2021… es
#> # … with 2 more variables: coverage <chr>, format <chr>, and abbreviated
#> #   variable names ¹publisher, ²identifier, ³description, ⁴language
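The result above is a named list with one element per identifier, each holding a `header` and a `metadata` table. A minimal sketch of working with that shape follows. It uses a hand-built mock rather than a live request: `res` is an assumption that mirrors the printed structure with plain data.frames and only a few of the columns shown.

```r
# Mock object shaped like the get_records() result printed above.
# Values are copied from that output; only a few columns are reproduced.
res <- list(
  `87832186-00ea-44dd-a6bf-c2896c4d09b4` = list(
    header = data.frame(
      identifier = "87832186-00ea-44dd-a6bf-c2896c4d09b4",
      datestamp = "2018-06-29T12:08:17Z"
    ),
    metadata = data.frame()  # 0 x 0, as in the first record above
  ),
  `d981c07d-bc43-40a2-be1f-e786e25106ac` = list(
    header = data.frame(
      identifier = "d981c07d-bc43-40a2-be1f-e786e25106ac",
      datestamp = "2021-09-28T13:58:57Z"
    ),
    metadata = data.frame(language = "es")
  )
)

# Stack every header into one table, one row per record
headers <- do.call(rbind, lapply(res, function(r) r$header))

# Flag which records actually came back with metadata
has_metadata <- vapply(res, function(r) nrow(r$metadata) > 0, logical(1))

nrow(headers)
#> [1] 2
unname(has_metadata)
#> [1] FALSE  TRUE
```

Checking for empty metadata before binding the metadata tables together avoids surprises when a record returns only a header.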

List MetadataFormats

list_metadataformats(id = "87832186-00ea-44dd-a6bf-c2896c4d09b4")
#> $`87832186-00ea-44dd-a6bf-c2896c4d09b4`
#>   metadataPrefix                                                   schema
#> 1         oai_dc           http://www.openarchives.org/OAI/2.0/oai_dc.xsd
#> 2            eml http://rs.gbif.org/schema/eml-gbif-profile/1.0.2/eml.xsd
#>                             metadataNamespace
#> 1 http://www.openarchives.org/OAI/2.0/oai_dc/
#> 2          eml://ecoinformatics.org/eml-2.1.1

List Sets

list_sets("http://api.gbif.org/v1/oai-pmh/registry")
#> # A tibble: 621 × 2
#>    setSpec                     setName
#>    <chr>                       <chr>
#>  1 dataset_type                per dataset type
#>  2 dataset_type:OCCURRENCE     occurrence
#>  3 dataset_type:CHECKLIST      checklist
#>  4 dataset_type:METADATA       metadata
#>  5 dataset_type:SAMPLING_EVENT sampling_event
#>  6 country                     per country
#>  7 country:AD                  Andorra
#>  8 country:AM                  Armenia
#>  9 country:AO                  Angola
#> 10 country:AQ                  Antarctica
#> # … with 611 more rows
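The `setSpec` values above are hierarchical, with levels separated by colons (`country` is the parent of `country:AD`). A small base R sketch of splitting them into a group and a value is shown below. The `set_spec` vector is a hand-copied sample of the output above, and the `group` and `value` column names are illustrative choices, not something oai returns.

```r
# Sample of setSpec values copied from the list_sets() output above
set_spec <- c(
  "dataset_type", "dataset_type:OCCURRENCE",
  "country", "country:AD", "country:AM"
)

# Split each spec on ":" into its top-level group and (optional) value
parts <- strsplit(set_spec, ":", fixed = TRUE)
sets <- data.frame(
  setSpec = set_spec,
  group = vapply(parts, `[`, character(1), 1),
  value = vapply(
    parts,
    function(p) if (length(p) > 1) p[2] else NA_character_,
    character(1)
  )
)

# Keep only the leaf sets under the "country" group
country_sets <- sets$setSpec[sets$group == "country" & !is.na(sets$value)]
country_sets
#> [1] "country:AD" "country:AM"
```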

Examples of other OAI providers

Biodiversity Heritage Library

Identify

id("http://www.biodiversitylibrary.org/oai")
#>                                 repositoryName
#> 1 Biodiversity Heritage Library OAI Repository
#>                                   baseURL protocolVersion
#> 1 https://www.biodiversitylibrary.org/oai             2.0
#>                    adminEmail earliestDatestamp deletedRecord granularity
#> 1 oai@biodiversitylibrary.org        2006-01-01            no  YYYY-MM-DD
#>                                                        description
#> 1 oaibiodiversitylibrary.org:oai:biodiversitylibrary.org:item/1000

Get records

get_records(c("oai:biodiversitylibrary.org:item/7", "oai:biodiversitylibrary.org:item/9"),
            url = "http://www.biodiversitylibrary.org/oai")
#> $`oai:biodiversitylibrary.org:item/7`
#> $`oai:biodiversitylibrary.org:item/7`$header
#> # A tibble: 1 × 3
#>   identifier                         datestamp            setSpec
#>   <chr>                              <chr>                <chr>
#> 1 oai:biodiversitylibrary.org:item/7 2016-01-26T06:05:19Z item
#>
#> $`oai:biodiversitylibrary.org:item/7`$metadata
#> # A tibble: 1 × 11
#>   title    creator subject descr…¹ publi…² contr…³ type  ident…⁴ langu…⁵ relat…⁶
#>   <chr>    <chr>   <chr>   <chr>   <chr>   <chr>   <chr> <chr>   <chr>   <chr>
#> 1 Die Mus… Fleisc… Bogor;… pt.5:v… Leiden… Missou… text… https:… Dutch   https:…
#> # … with 1 more variable: rights <chr>, and abbreviated variable names
#> #   ¹description, ²publisher, ³contributor, ⁴identifier, ⁵language, ⁶relation
#>
#>
#> $`oai:biodiversitylibrary.org:item/9`
#> $`oai:biodiversitylibrary.org:item/9`$header
#> # A tibble: 1 × 3
#>   identifier                         datestamp            setSpec
#>   <chr>                              <chr>                <chr>
#> 1 oai:biodiversitylibrary.org:item/9 2016-01-26T06:05:19Z item
#>
#> $`oai:biodiversitylibrary.org:item/9`$metadata
#> # A tibble: 1 × 11
#>   title    creator subject descr…¹ publi…² contr…³ type  ident…⁴ langu…⁵ relat…⁶
#>   <chr>    <chr>   <chr>   <chr>   <chr>   <chr>   <chr> <chr>   <chr>   <chr>
#> 1 Die Mus… Fleisc… Bogor;… pt.5:v… Leiden… Missou… text… https:… Dutch   https:…
#> # … with 1 more variable: rights <chr>, and abbreviated variable names
#> #   ¹description, ²publisher, ³contributor, ⁴identifier, ⁵language, ⁶relation

Acknowledgements

Michał Bojanowski thanks the National Science Centre for support through grant 2012/07/D/HS6/01971.

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for oai in R by running citation(package = 'oai')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
