OJS Scraper for R

The aim of this package is to aid you in crawling OJS archives, issues, articles, galleys, and search results, and retrieving/scraping metadata from articles. ojsr functions rely on OJS routing conventions to compose the URL for different scraping scenarios.
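To make the idea of routing conventions concrete, here is a minimal base R sketch of how such URLs can be composed from a journal's base URL. The helper names `compose_archive_url` and `compose_article_url` are hypothetical and are not part of ojsr. The sketch assumes the common OJS routes `<base>/issue/archive` and `<base>/article/view/<id>`, and the article ID `123` is invented for illustration. The package's own internals may differ.

```r
# Hypothetical helpers (NOT ojsr functions): compose OJS-style URLs
# from a journal base URL, assuming the usual OJS routes.

# Drop any trailing slashes so paths can be appended cleanly
strip_trailing_slash <- function(url) sub("/+$", "", url)

# Assumed route for the list of issues: <base>/issue/archive
compose_archive_url <- function(base_url) {
  paste0(strip_trailing_slash(base_url), "/issue/archive")
}

# Assumed route for an article landing page: <base>/article/view/<id>
compose_article_url <- function(base_url, article_id) {
  paste0(strip_trailing_slash(base_url), "/article/view/", article_id)
}

base <- "https://revistas.unc.edu.ar/index.php/revaluar"
archive_url <- compose_archive_url(base)
article_url <- compose_article_url(base, 123)  # 123 is a made-up ID

print(archive_url)
print(article_url)
```

Because both helpers are vectorised through `paste0`, a vector of base URLs yields a vector of composed URLs.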

Installation

From CRAN:

install.packages('ojsr')

From Github:

install.packages('devtools')
devtools::install_github("gastonbecerra/ojsr")

ojsr functions

Example

Let’s say we want to collect metadata from some journals to compare their top keywords. We have the journals’ names and URLs, and can use ojsr to scrape their issues, articles and metadata.

library(dplyr)
library(ojsr)

journals <- data.frame(
  cbind(
    name = c("Revista Evaluar", "PSocial"),
    url = c(
      "https://revistas.unc.edu.ar/index.php/revaluar",
      "https://publicaciones.sociales.uba.ar/index.php/psicologiasocial"
    )
  ),
  stringsAsFactors = FALSE
)

# we are using the journal URL as input to retrieve the issues
issues <- ojsr::get_issues_from_archive(input_url = journals$url)

# we are using the issues URL we just scraped as an input to retrieve the articles
articles <- ojsr::get_articles_from_issue(input_url = issues$output_url)

# we are using the articles URL we just scraped as an input to retrieve the metadata
metadata <- ojsr::get_html_meta_from_article(input_url = articles$output_url)

# let's parse the base URLs from journals and metadata, so we can bind by journal
journals$base_url <- ojsr::parse_base_url(journals$url)
metadata$base_url <- ojsr::parse_base_url(metadata$input_url)

metadata %>%
  filter(meta_data_name == "citation_keywords") %>% # filtering only keywords
  left_join(journals) %>% # include journal names
  group_by(base_url, keyword = meta_data_content) %>%
  tally(sort = TRUE)
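The final filter, join, and tally step can be tried offline without scraping anything. The sketch below uses base R only and a toy data frame whose journal names, URLs, and keywords are all invented for illustration. Only the column names `meta_data_name`, `meta_data_content`, and `base_url` are taken from the example above.

```r
# Toy inputs: every value here is made up, only the column names follow the example
url_a <- "https://example.org/index.php/a"
url_b <- "https://example.org/index.php/b"

journals <- data.frame(
  name = c("Journal A", "Journal B"),
  base_url = c(url_a, url_b),
  stringsAsFactors = FALSE
)

metadata <- data.frame(
  base_url = c(url_a, url_a, url_a, url_b, url_b),
  meta_data_name = c("citation_keywords", "citation_keywords", "citation_title",
                     "citation_keywords", "citation_keywords"),
  meta_data_content = c("identity", "identity", "Some title", "identity", "memory"),
  stringsAsFactors = FALSE
)

# Keep only keyword rows, then attach journal names by base URL
keywords <- metadata[metadata$meta_data_name == "citation_keywords", ]
keywords <- merge(keywords, journals, by = "base_url")

# Count each keyword per journal and sort by descending frequency
keywords$n <- 1L
counts <- aggregate(n ~ name + meta_data_content, data = keywords, FUN = sum)
names(counts)[names(counts) == "meta_data_content"] <- "keyword"
counts <- counts[order(-counts$n), ]

print(counts)
```

The long format (one row per metadata field per article) is what makes this a plain filter followed by a count, which is the same shape the dplyr pipeline relies on.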
