Title: Processing Agro-Environmental Data
Version: 0.2.0
Description: A set of tools for processing and analyzing data developed in the context of the "Who Has Eaten the Planet" (WHEP) project, funded by the European Research Council (ERC). For more details on multi-regional input–output model "Food and Agriculture Biomass Input–Output" (FABIO) see Bruckner et al. (2019) <doi:10.1021/acs.est.9b03554>.
License: MIT + file LICENSE
Imports: cli, dplyr, fs, FAOSTAT, httr, mipfp, nanoparquet, pins, purrr, readr, rlang, stringr, tidyr, withr, yaml, zoo
Encoding: UTF-8
RoxygenNote: 7.3.2
Suggests: ggplot2, googlesheets4, here, knitr, pointblank, rmarkdown, testthat (≥ 3.0.0), tibble
Config/testthat/edition: 3
VignetteBuilder: knitr
URL: https://eduaguilera.github.io/whep/, https://github.com/eduaguilera/whep
BugReports: https://github.com/eduaguilera/whep/issues
Depends: R (≥ 4.2.0)
LazyData: true
NeedsCompilation: no
Packaged: 2025-10-15 13:57:15 UTC; catalin
Author: Catalin Covaci [aut, cre], Eduardo Aguilera [aut, cph], João Serra [ctb], European Research Council [fnd]
Maintainer: Catalin Covaci <catalin.covaci@csic.es>
Repository: CRAN
Date/Publication: 2025-10-15 15:20:02 UTC

whep: Processing Agro-Environmental Data

Description

A set of tools for processing and analyzing data developed in the context of the "Who Has Eaten the Planet" (WHEP) project, funded by the European Research Council (ERC). For more details on the multi-regional input–output model "Food and Agriculture Biomass Input–Output" (FABIO), see Bruckner et al. (2019), doi:10.1021/acs.est.9b03554.

Author(s)

Maintainer: Catalin Covaci <catalin.covaci@csic.es> (ORCID)

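Authors:

  • Eduardo Aguilera [aut, cph]

Other contributors:

  • João Serra [ctb]

  • European Research Council [fnd]

See Also

Useful links:

  • https://eduaguilera.github.io/whep/

  • https://github.com/eduaguilera/whep

  • Report bugs at https://github.com/eduaguilera/whep/issues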


Get area codes from area names

Description

Add a new column to an existing tibble with the corresponding code for each name. The codes are assumed to be from those defined by the FABIO model.

Usage

add_area_code(table, name_column = "area_name", code_column = "area_code")

Arguments

table

The table that will be modified with a new column.

name_column

The name of the column in table containing the names.

code_column

The name of the output column containing the codes.

Value

A tibble with all the contents of table and an extra column named code_column, which contains the codes. If there is no code match, an NA is included.

Examples

table <- tibble::tibble(
  area_name = c("Armenia", "Afghanistan", "Dummy Country", "Albania")
)

add_area_code(table)

table |>
  dplyr::rename(my_area_name = area_name) |>
  add_area_code(name_column = "my_area_name")

add_area_code(table, code_column = "my_custom_code")
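The name-to-code lookup described above behaves like a left join against a correspondence table. The following base R sketch illustrates that idea only; area_lookup and add_code_sketch() are hypothetical names, the three-row lookup is a toy stand-in for the package's real correspondences, and the actual implementation may differ.

```r
# Hypothetical lookup table; the real correspondences ship with the package.
area_lookup <- data.frame(
  area_name = c("Afghanistan", "Albania", "Armenia"),
  area_code = c(2, 3, 1)
)

# Add a code column by matching names; names without a match yield NA.
add_code_sketch <- function(table,
                            name_column = "area_name",
                            code_column = "area_code") {
  idx <- match(table[[name_column]], area_lookup$area_name)
  table[[code_column]] <- area_lookup$area_code[idx]
  table
}

input <- data.frame(
  area_name = c("Armenia", "Afghanistan", "Dummy Country", "Albania")
)
result <- add_code_sketch(input)
print(result)
```

The same pattern, with the roles of the name and code columns swapped, covers the code-to-name direction.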

Get area names from area codes

Description

Add a new column to an existing tibble with the corresponding name for each code. The codes are assumed to be from those defined by the FABIO model, which themselves come from FAOSTAT internal codes. Equivalences with ISO 3166-1 numeric can be found in the Area Codes CSV from the zip file that can be downloaded from FAOSTAT. TODO: Think about this; it would be nice to use ISO3 codes, but they won't be enough for our periods.

Usage

add_area_name(table, code_column = "area_code", name_column = "area_name")

Arguments

table

The table that will be modified with a new column.

code_column

The name of the column in table containing the codes.

name_column

The name of the output column containing the names.

Value

A tibble with all the contents of table and an extra column named name_column, which contains the names. If there is no name match, an NA is included.

Examples

table <- tibble::tibble(area_code = c(1, 2, 4444, 3))

add_area_name(table)

table |>
  dplyr::rename(my_area_code = area_code) |>
  add_area_name(code_column = "my_area_code")

add_area_name(table, name_column = "my_custom_name")

Get commodity balance sheet item codes from item names

Description

Add a new column to an existing tibble with the corresponding code for each commodity balance sheet item name. The codes are assumed to be from those defined by FAOSTAT.

Usage

add_item_cbs_code(
  table,
  name_column = "item_cbs_name",
  code_column = "item_cbs_code"
)

Arguments

table

The table that will be modified with a new column.

name_column

The name of the column in table containing the names.

code_column

The name of the output column containing the codes.

Value

A tibble with all the contents of table and an extra column named code_column, which contains the codes. If there is no code match, an NA is included.

Examples

table <- tibble::tibble(
  item_cbs_name = c("Cottonseed", "Eggs", "Dummy Item")
)

add_item_cbs_code(table)

table |>
  dplyr::rename(my_item_cbs_name = item_cbs_name) |>
  add_item_cbs_code(name_column = "my_item_cbs_name")

add_item_cbs_code(table, code_column = "my_custom_code")

Get commodity balance sheet item names from item codes

Description

Add a new column to an existing tibble with the corresponding name for each commodity balance sheet item code. The codes are assumed to be from those defined by FAOSTAT.

Usage

add_item_cbs_name(
  table,
  code_column = "item_cbs_code",
  name_column = "item_cbs_name"
)

Arguments

table

The table that will be modified with a new column.

code_column

The name of the column in table containing the codes.

name_column

The name of the output column containing the names.

Value

A tibble with all the contents of table and an extra column named name_column, which contains the names. If there is no name match, an NA is included.

Examples

table <- tibble::tibble(item_cbs_code = c(2559, 2744, 9876))

add_item_cbs_name(table)

table |>
  dplyr::rename(my_item_cbs_code = item_cbs_code) |>
  add_item_cbs_name(code_column = "my_item_cbs_code")

add_item_cbs_name(table, name_column = "my_custom_name")

Get production item codes from item names

Description

Add a new column to an existing tibble with the corresponding code for each production item name. The codes are assumed to be from those defined by FAOSTAT.

Usage

add_item_prod_code(
  table,
  name_column = "item_prod_name",
  code_column = "item_prod_code"
)

Arguments

table

The table that will be modified with a new column.

name_column

The name of the column in table containing the names.

code_column

The name of the output column containing the codes.

Value

A tibble with all the contents of table and an extra column named code_column, which contains the codes. If there is no code match, an NA is included.

Examples

table <- tibble::tibble(
  item_prod_name = c("Rice", "Cabbages", "Dummy Item")
)

add_item_prod_code(table)

table |>
  dplyr::rename(my_item_prod_name = item_prod_name) |>
  add_item_prod_code(name_column = "my_item_prod_name")

add_item_prod_code(table, code_column = "my_custom_code")

Get production item names from item codes

Description

Add a new column to an existing tibble with the corresponding name for each production item code. The codes are assumed to be from those defined by FAOSTAT.

Usage

add_item_prod_name(
  table,
  code_column = "item_prod_code",
  name_column = "item_prod_name"
)

Arguments

table

The table that will be modified with a new column.

code_column

The name of the column in table containing the codes.

name_column

The name of the output column containing the names.

Value

A tibble with all the contents of table and an extra column named name_column, which contains the names. If there is no name match, an NA is included.

Examples

table <- tibble::tibble(item_prod_code = c(27, 358, 12345))

add_item_prod_name(table)

table |>
  dplyr::rename(my_item_prod_code = item_prod_code) |>
  add_item_prod_name(code_column = "my_item_prod_code")

add_item_prod_name(table, name_column = "my_custom_name")

Supply and use tables

Description

Create a table with processes, their inputs (use) and their outputs (supply).

Usage

build_supply_use(
  cbs_version = NULL,
  feed_intake_version = NULL,
  primary_prod_version = NULL,
  primary_residues_version = NULL,
  processing_coefs_version = NULL
)

Arguments

cbs_version

File version passed to get_wide_cbs() call.

feed_intake_version

File version passed to get_feed_intake() call.

primary_prod_version

File version passed to get_primary_production() call.

primary_residues_version

File version passed to get_primary_residues() call.

processing_coefs_version

File version passed to get_processing_coefs() call.

Value

A tibble with the supply and use data for processes. It contains the following columns:

Examples

# Note: These are smaller samples to show outputs, not the real data.
# For all data, call the function with default versions (i.e. no arguments).
build_supply_use(
  cbs_version = "example",
  feed_intake_version = "example",
  primary_prod_version = "example",
  primary_residues_version = "example",
  processing_coefs_version = "example"
)

Trade data sources

Description

Convert a data frame in which each row has a year range into one in which each row is a single year, effectively 'expanding' the whole year range.

Usage

expand_trade_sources(trade_sources)

Arguments

trade_sources

A tibble data frame where each row contains the year range.

Value

A tibble data frame where each row corresponds to a single year for a given source.

Examples

trade_sources <- tibble::tibble(
  Name = c("a", "b", "c"),
  Trade = c("t1", "t2", "t3"),
  Info_Format = c("year", "partial_series", "year"),
  Timeline_Start = c(1, 1, 2),
  Timeline_End = c(3, 4, 5),
  Timeline_Freq = c(1, 1, 2),
  `Imp/Exp` = "Imp",
  SACO_link = NA,
)

expand_trade_sources(trade_sources)
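To make the 'expansion' concrete, here is a minimal base R sketch that turns each year range into one row per year. It is an illustration under assumptions, not the package's implementation: expand_years_sketch() and the output column Year are hypothetical names, every range is stepped one year at a time, and Timeline_Freq and Info_Format are deliberately ignored because their exact effect is not described here.

```r
# Expand each (Timeline_Start, Timeline_End) range into one row per year.
# Assumption: a step of one year; the real function may treat
# Timeline_Freq and Info_Format differently.
expand_years_sketch <- function(sources) {
  rows <- lapply(seq_len(nrow(sources)), function(i) {
    years <- seq(sources$Timeline_Start[i], sources$Timeline_End[i], by = 1)
    data.frame(Name = sources$Name[i], Year = years)
  })
  do.call(rbind, rows)
}

sources <- data.frame(
  Name = c("a", "b"),
  Timeline_Start = c(1, 2),
  Timeline_End = c(3, 5)
)
expanded <- expand_years_sketch(sources)
print(expanded)
```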

Bilateral trade data

Description

Reports trade between pairs of countries in given years.

Usage

get_bilateral_trade(trade_version = NULL, cbs_version = NULL)

Arguments

trade_version

File version used for bilateral trade input. See whep_inputs for version details.

cbs_version

File version passed to get_wide_cbs() call.

Value

A tibble with the reported trade between countries. For efficient memory usage, the tibble is not exactly in tidy format. It contains the following columns:

The step by step approach to obtain this data tries to follow the FABIO model and is explained below. All the steps are performed separately for each group of year and item.

Examples

# Note: These are smaller samples to show outputs, not the real data.
# For all data, call the function with default versions (i.e. no arguments).
get_bilateral_trade(
  trade_version = "example",
  cbs_version = "example"
)

Scrapes activity_data from FAOSTAT and slightly post-processes it

Description

Important: Subsets can be specified dynamically through "...".

Note: There is some overhead from individually scraping the FAOSTAT QCL dataset for crop data, but it is acceptable.

Usage

get_faostat_data(activity_data, ...)

Arguments

activity_data

Activity data required from FAOSTAT; needs to be one of c('livestock', 'crop_area', 'crop_yield', 'crop_production').

...

Can be any column name from get_faostat_bulk, particularly year, area or ISO3_CODE.

Value

A data.frame of FAOSTAT data for activity_data; by default, it covers all years and countries.

Examples

get_faostat_data("livestock", year = 2010, area = "Portugal")

Livestock feed intake

Description

Get amount of items used for feeding livestock.

Usage

get_feed_intake(version = NULL)

Arguments

version

File version to use as input. See whep_inputs for details.

Value

A tibble with the feed intake data. It contains the following columns:

Examples

# Note: These are smaller samples to show outputs, not the real data.
# For all data, call the function with default version (i.e. no arguments).
get_feed_intake(version = "example")

Primary items production

Description

Get amount of crops, livestock and livestock products.

Usage

get_primary_production(version = NULL)

Arguments

version

File version to use as input. See whep_inputs for details.

Value

A tibble with the item production data. It contains the following columns:

Examples

# Note: These are smaller samples to show outputs, not the real data.
# For all data, call the function with default version (i.e. no arguments).
get_primary_production(version = "example")

Crop residue items

Description

Get type and amount of residue produced for each crop production item.

Usage

get_primary_residues(version = NULL)

Arguments

version

File version to use as input. See whep_inputs for details.

Value

A tibble with the crop residue data. It contains the following columns:

Examples

# Note: These are smaller samples to show outputs, not the real data.
# For all data, call the function with default version (i.e. no arguments).
get_primary_residues(version = "example")

Processed products share factors

Description

Reports quantities of commodity balance sheet items used for processing and quantities of their corresponding processed output items.

Usage

get_processing_coefs(version = NULL)

Arguments

version

File version to use as input. See whep_inputs for details.

Value

A tibble with the quantities for each processed product. It contains the following columns:

For the final data obtained, the quantities final_value_processed are balanced in the following sense: the total sum of final_value_processed for each unique tuple of (year, area_code, item_cbs_code_processed) should be exactly the quantity reported for that year, country and item_cbs_code_processed item in the production column obtained from get_wide_cbs(). This is because they are not primary products, so the amount from 'production' is actually the amount of subproduct obtained. TODO: Fix the few data points where this doesn't hold.
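The balance condition can be checked by aggregating final_value_processed per (year, area_code, item_cbs_code_processed) and comparing the totals with the production column. The base R sketch below uses toy data frames coefs and cbs with invented values. The column name item_cbs_code in cbs is an assumption made for illustration, not taken from the package outputs.

```r
# Toy stand-in for get_processing_coefs() output (values are invented).
coefs <- data.frame(
  year = c(2000, 2000, 2000),
  area_code = c(1, 1, 1),
  item_cbs_code_processed = c(10, 10, 20),
  final_value_processed = c(30, 70, 50)
)

# Toy stand-in for get_wide_cbs() output; 'item_cbs_code' is an assumed name.
cbs <- data.frame(
  year = c(2000, 2000),
  area_code = c(1, 1),
  item_cbs_code = c(10, 20),
  production = c(100, 50)
)

# Total processed quantity per (year, area, processed item).
totals <- aggregate(
  final_value_processed ~ year + area_code + item_cbs_code_processed,
  data = coefs,
  FUN = sum
)

# Join the totals with reported production and flag balanced rows.
check <- merge(
  totals, cbs,
  by.x = c("year", "area_code", "item_cbs_code_processed"),
  by.y = c("year", "area_code", "item_cbs_code")
)
check$balanced <- abs(check$final_value_processed - check$production) < 1e-8
print(check)
```

Rows where balanced is FALSE would correspond to the cases mentioned in the TODO.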

Examples

# Note: These are smaller samples to show outputs, not the real data.
# For all data, call the function with default version (i.e. no arguments).
get_processing_coefs(version = "example")

Commodity balance sheet data

Description

States supply and use parts for each commodity balance sheet (CBS) item.

Usage

get_wide_cbs(version = NULL)

Arguments

version

File version to use as input. See whep_inputs for details.

Value

A tibble with the commodity balance sheet data in wide format. It contains the following columns:

The other columns are quantities (measured in tonnes), where total supply and total use should be balanced.

For supply:

For use:

There is an additional column domestic_supply which is computed as the total use excluding export.

Examples

# Note: These are smaller samples to show outputs, not the real data.
# For all data, call the function with default version (i.e. no arguments).
get_wide_cbs(version = "example")

Commodity balance sheet items

Description

Defines name/code correspondences for commodity balance sheet (CBS) items.

Usage

items_cbs

Format

A tibble where each row corresponds to one CBS item. It contains the following columns:

Source

Inspired by FAOSTAT data.


Primary production items

Description

Defines name/code correspondences for production items.

Usage

items_prod

Format

A tibble where each row corresponds to one production item. It contains the following columns:

Source

Inspired by FAOSTAT data.


Fill gaps by linear interpolation, or carrying forward or backward.

Description

Fills gaps (NA values) in a time-dependent variable by linear interpolation between two points, or carrying forward or backwards the last or initial values, respectively. It also creates a new variable indicating the source of the filled values.

Usage

linear_fill(
  df,
  var,
  time_index,
  interpolate = TRUE,
  fill_forward = TRUE,
  fill_backward = TRUE,
  .by = NULL
)

Arguments

df

A tibble data frame containing one observation per row.

var

The variable of df containing gaps to be filled.

time_index

The time index variable (usually year).

interpolate

Logical. If TRUE (default), performs linear interpolation.

fill_forward

Logical. If TRUE (default), carries last value forward.

fill_backward

Logical. If TRUE (default), carries first value backward.

.by

A character vector with the grouping variables (optional).

Value

A tibble data frame (ungrouped) where gaps in var have been filled, and a new "source" variable has been created indicating if the value is original or, in case it has been estimated, the gap-filling method that has been used.

Examples

sample_tibble <- tibble::tibble(
  category = c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b"),
  year = c(
    "2015", "2016", "2017", "2018", "2019", "2020",
    "2015", "2016", "2017", "2018", "2019", "2020"
  ),
  value = c(NA, 3, NA, NA, 0, NA, 1, NA, NA, NA, 5, NA),
)

linear_fill(sample_tibble, value, year, .by = c("category"))

linear_fill(
  sample_tibble,
  value,
  year,
  interpolate = FALSE,
  .by = c("category"),
)
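The three filling modes can be illustrated for a single series with base R's stats::approx(), where rule = 2 extends the first and last observed values to the ends of the series. This is only a sketch of the technique, not the package's code: linear_fill_sketch() is a hypothetical name, the labels "original" and "estimated" are placeholders for whatever linear_fill() writes to its source column, and grouping via .by is left out.

```r
# Fill NA gaps in one series: linear interpolation between observed points,
# with the first/last observed value carried to the ends (rule = 2).
linear_fill_sketch <- function(time, value) {
  observed <- !is.na(value)
  filled <- approx(
    x = time[observed],
    y = value[observed],
    xout = time,
    method = "linear",
    rule = 2
  )$y
  # Placeholder labels; the real function reports the method used.
  source_label <- ifelse(observed, "original", "estimated")
  data.frame(time = time, value = filled, source = source_label)
}

year <- 2015:2020
value <- c(NA, 3, NA, NA, 0, NA)
filled <- linear_fill_sketch(year, value)
print(filled)
```

With these inputs, the gap between 2016 and 2019 is interpolated, while 2015 and 2020 take the nearest observed value.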

Polities

Description

Defines name/code correspondences for polities (political entities).

Usage

polities

Format

A tibble where each row corresponds to one polity. It contains the following columns:

TODO: On polities Pull Request, coming soon


Fill gaps using a proxy variable

Description

Fills gaps in a variable based on changes in a proxy variable, using ratios between the filled variable and the proxy variable, and labels output accordingly.

Usage

proxy_fill(df, var, proxy_var, time_index, ...)

Arguments

df

A tibble data frame containing one observation per row.

var

The variable of df containing gaps to be filled.

proxy_var

The variable to be used as proxy.

time_index

The time index variable (usually year).

...

Optionally, additional arguments that will be passed to linear_fill() with the ratios. See that function to know the accepted arguments.

Value

A tibble data frame (ungrouped) where gaps in var have been filled, a new proxy_ratio variable has been created, and a new "source" variable has been created indicating if the value is original or, in case it has been estimated, the gap-filling method that has been used.

Examples

sample_tibble <- tibble::tibble(
  category = c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b"),
  year = c(
    "2015", "2016", "2017", "2018", "2019", "2020",
    "2015", "2016", "2017", "2018", "2019", "2020"
  ),
  value = c(NA, 3, NA, NA, 0, NA, 1, NA, NA, NA, 5, NA),
  proxy_variable = c(1, 2, 2, 2, 2, 2, 1, 2, 3, 4, 5, 6)
)

proxy_fill(sample_tibble, value, proxy_variable, year, .by = c("category"))
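The ratio-based approach can be sketched for one series as follows: compute the ratio of the variable to its proxy where both are observed, fill the gaps in that ratio, and multiply the filled ratio by the proxy. This is a simplified illustration under assumptions: proxy_fill_sketch() is a hypothetical name, the ratio gaps are filled with stats::approx() rather than linear_fill(), and grouping and source labelling are omitted.

```r
# Fill gaps in 'value' by scaling 'proxy' with a gap-filled value/proxy ratio.
proxy_fill_sketch <- function(time, value, proxy) {
  ratio <- value / proxy
  observed <- !is.na(ratio)
  # Interpolate the ratio and carry it to both ends (rule = 2).
  ratio_filled <- approx(
    x = time[observed],
    y = ratio[observed],
    xout = time,
    rule = 2
  )$y
  data.frame(
    time = time,
    value = ifelse(is.na(value), ratio_filled * proxy, value),
    proxy_ratio = ratio_filled
  )
}

year <- 2015:2020
value <- c(1, NA, NA, NA, 5, NA)
proxy <- c(1, 2, 3, 4, 5, 6)
result <- proxy_fill_sketch(year, value, proxy)
print(result)
```

Here the observed ratio is 1 at both known points, so every filled value simply follows the proxy.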

Fill gaps by summing the previous value of a variable and the value of another variable.

Description

Fills gaps in a variable with the sum of its previous value and the value of another variable. When a gap has multiple observations, the values are accumulated along the series. When there is a gap at the start of the series, it can either remain unfilled or assume an invisible 0 value before the first observation and start filling with cumulative sum.

Usage

sum_fill(df, var, change_var, start_with_zero = TRUE, .by = NULL)

Arguments

df

A tibble data frame containing one observation per row.

var

The variable of df containing gaps to be filled.

change_var

The variable whose values will be used to fill the gaps.

start_with_zero

Logical. If TRUE (default), assumes an invisible 0 value before the first observation and fills with cumulative sum starting from the first change_var value. If FALSE, starting NA values remain unfilled.

.by

A character vector with the grouping variables (optional).

Value

A tibble data frame (ungrouped) where gaps in var have been filled, and a new "source" variable has been created indicating if the value is original or, in case it has been estimated, the gap-filling method that has been used.

Examples

sample_tibble <- tibble::tibble(
  category = c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b"),
  year = c(
    "2015", "2016", "2017", "2018", "2019", "2020",
    "2015", "2016", "2017", "2018", "2019", "2020"
  ),
  value = c(NA, 3, NA, NA, 0, NA, 1, NA, NA, NA, 5, NA),
  change_variable = c(1, 2, 3, 4, 1, 1, 0, 0, 0, 0, 0, 1)
)

sum_fill(
  sample_tibble,
  value,
  change_variable,
  start_with_zero = FALSE,
  .by = c("category")
)

sum_fill(
  sample_tibble,
  value,
  change_variable,
  start_with_zero = TRUE,
  .by = c("category")
)
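The accumulation rule can be written as a simple loop over one series: each missing value becomes the previous (possibly filled) value plus the current change value. The sketch below is an illustration only; sum_fill_sketch() is a hypothetical name, and grouping and the source column are omitted.

```r
# Fill each NA with previous value + current change, accumulating along gaps.
sum_fill_sketch <- function(value, change, start_with_zero = TRUE) {
  filled <- value
  # An invisible 0 before the series, or NA to leave leading gaps unfilled.
  previous <- if (start_with_zero) 0 else NA_real_
  for (i in seq_along(filled)) {
    if (is.na(filled[i])) {
      filled[i] <- previous + change[i]
    }
    previous <- filled[i]
  }
  filled
}

value <- c(NA, 3, NA, NA, 0, NA)
change <- c(1, 2, 3, 4, 1, 1)

with_zero <- sum_fill_sketch(value, change, start_with_zero = TRUE)
without_zero <- sum_fill_sketch(value, change, start_with_zero = FALSE)
print(with_zero)
print(without_zero)
```

The two results differ only in the leading gap, which stays NA when start_with_zero is FALSE.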

External inputs

Description

The information needed for accessing external datasets used as inputs in our modeling.

Usage

whep_inputs

Format

A tibble where each row corresponds to one external input dataset. It contains the following columns:

Source

Created by the package authors.


Input file versions

Description

Lists all existing versions of an input file from whep_inputs.

Usage

whep_list_file_versions(file_alias)

Arguments

file_alias

Internal name of the requested file. You can find the possible values in the whep_inputs dataset.

Value

A tibble where each row is a version. For details about its format, see pins::pin_versions().

Examples

whep_list_file_versions("read_example")

Download, cache and read files

Description

Used to fetch input files that are needed for the package's functions and that were built in external sources and are too large to include directly. This is a public function for transparency purposes, so that users can inspect the original inputs of this package that were not directly processed here.

If the requested file doesn't exist locally, it is downloaded from a public link and cached before reading it. This is all implemented using the pins package. It supports multiple file formats and file versioning.

Usage

whep_read_file(file_alias, type = "parquet", version = NULL)

Arguments

file_alias

Internal name of the requested file. You can find the possible values in the alias column of the whep_inputs dataset.

type

The extension of the file that must be read. Possible values:

  • parquet: This is the default value for code efficiency reasons.

  • csv: Mainly available for those who want a more human-readable option. If the parquet version is available, this option offers no benefit, because this function already returns the dataset as an R object, so the origin is irrelevant, and parquet is read faster.

Saving each file in both formats is for transparency and accessibility purposes, e.g., having to share the data with non-programmers who can easily import a CSV into a spreadsheet. You will most likely never have to set this option manually, unless for some reason a file could not be supplied in e.g. parquet format but was supplied in another one.

version

The version of the file that must be read. Possible values:

  • NULL: This is the default value. A frozen version is chosen to make the code reproducible. Each release will have its own frozen versions. The version is the string that can be found in whep_inputs in the version column.

  • "latest": This overrides the frozen version and instead fetches the latest one that is available. This might or might not match the frozen version.

  • Other: A specific version can also be used. For more details read the version column information from whep_inputs.

Value

A tibble with the dataset. Some information about each dataset can be found in the code where it's used as input for further processing.

Examples

whep_read_file("read_example")

whep_read_file("read_example", type = "parquet", version = "latest")

whep_read_file(
  "read_example",
  type = "csv",
  version = "20250721T152646Z-ce61b"
)
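The three cases for the version argument can be summarised as a small resolution rule. The sketch below is hypothetical and does not reproduce the package's internals: resolve_version_sketch() is an invented name, the second version string is made up, and it assumes version strings start with a timestamp so that the lexicographically largest one is the latest.

```r
# Decide which version string to read, following the three documented cases.
# Assumption: versions sort chronologically as plain strings.
resolve_version_sketch <- function(version, frozen, available) {
  if (is.null(version)) {
    return(frozen)          # default: frozen version for reproducibility
  }
  if (identical(version, "latest")) {
    return(max(available))  # newest available version
  }
  version                   # an explicitly requested version
}

available <- c(
  "20250721T152646Z-ce61b",
  "20250801T000000Z-aaaaa"  # hypothetical newer version
)
frozen <- "20250721T152646Z-ce61b"

default_choice <- resolve_version_sketch(NULL, frozen, available)
latest_choice <- resolve_version_sketch("latest", frozen, available)
explicit_choice <- resolve_version_sketch("20250721T152646Z-ce61b", frozen, available)
print(c(default_choice, latest_choice, explicit_choice))
```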
