| Type: | Package |
| Title: | Download and Tidy Time Series Data from the Australian Bureau of Statistics |
| Version: | 0.4.19 |
| Maintainer: | Matt Cowgill <mattcowgill@gmail.com> |
| Description: | Downloads, imports, and tidies time series data from the Australian Bureau of Statistics <https://www.abs.gov.au/>. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| Depends: | R (≥ 3.5) |
| Imports: | readxl (≥ 1.2.0), dplyr (≥ 0.8.0), hutils (≥ 1.5.0), fst, purrr (≥ 1.0.0), tidyr (≥ 1.0.0), stringi, tools, glue, httr, rvest, xml2, rlang, labelled |
| URL: | https://github.com/mattcowgill/readabs |
| BugReports: | https://github.com/mattcowgill/readabs/issues |
| RoxygenNote: | 7.3.2 |
| VignetteBuilder: | knitr |
| Suggests: | knitr, rmarkdown, markdown, testthat (≥ 2.1.0), ggplot2 |
| NeedsCompilation: | no |
| Packaged: | 2025-05-18 06:40:18 UTC; mattcowgill |
| Author: | Matt Cowgill |
| Repository: | CRAN |
| Date/Publication: | 2025-05-18 07:00:02 UTC |
ABS.Stat API functions
Description
These experimental functions provide a minimal interface to the ABS.Stat API.
More information on the ABS.Stat API can be found on the ABS website.
Note that an ABS.Stat 'dataflow' is like a table. A 'datastructure' contains metadata that describes the variables in the dataflow. The functions fit together as follows:
Using read_api_dataflows() you can get information on the available dataflows.
Using read_api_datastructure() you can get metadata relating to a specific dataflow, including the variables available in each dataflow.
Using read_api() you can get the data belonging to a given dataflow.
Using read_api_url() you can get the data for a given query url generated using the online data viewer.
Usage
read_api_dataflows()
read_api(
  id,
  datakey = NULL,
  start_period = NULL,
  end_period = NULL,
  version = NULL
)
read_api_url(url)
read_api_datastructure(id)
Arguments
id | A dataflow id. Use |
datakey | A named list matching filter variables to codes. All variables with a |
start_period | The start period (used to filter by time). This is inclusive. The supported formats are:
|
end_period | The end period (used to filter on time). This is inclusive. The supported formats are the same as for |
version | A version number. If unspecified, the latest version of the dataset is used. Use |
url | A complete query url |
Details
Note that the API enforces a reasonably strict gateway timeout policy. This means that, if you're trying to access a reasonably large dataset, you will need to filter it on the server side using the datakey. You might like to review the data manually via the ABS website to figure out what subset of the data you require.
Note, furthermore, that the datastructure contains a complete codebook for the variables appearing in the relevant dataflow. Since some variables are shared across multiple dataflows, this means that the datastructure corresponding to a particular id may contain values for a given variable which are not in the corresponding dataflow.
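The codes needed for datakey can be looked up by filtering the datastructure on its var and label columns. The following is a minimal sketch of that lookup, run on a toy data frame rather than a real read_api_datastructure() result. The code column name, the lookup_code() helper and the toy rows are assumptions made for illustration only.

```r
# Toy stand-in for the output of read_api_datastructure().
# The `code` column name and the rows are assumptions, not real API output.
ds <- data.frame(
  var   = c("asgs_2016", "asgs_2016", "sex_abs", "sex_abs"),
  code  = c("0", "1", "1", "3"),
  label = c("Australia", "New South Wales", "Males", "Persons"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the single code matching a variable/label pair
lookup_code <- function(ds, var, label) {
  hit <- ds$code[ds$var == var & ds$label == label]
  if (length(hit) != 1) {
    stop("Expected exactly one match for ", var, " / ", label)
  }
  hit
}

# Assemble a named list in the shape that `datakey` expects
datakey <- list(
  asgs_2016 = lookup_code(ds, "asgs_2016", "Australia"),
  sex_abs   = lookup_code(ds, "sex_abs", "Persons")
)
str(datakey)
```

A list built this way could then be passed as read_api(id, datakey = datakey). Because the codebook can contain values that are absent from the dataflow, a successful lookup does not guarantee that the query returns data.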
Value
A data.frame
Examples
## Not run: 
# List available dataflows
read_api_dataflows()
# Say we want the "Estimated resident population, Country of birth"
# data flow, with the id ERP_COB. We load the data like this:
# Get full data set for a given flow by providing id and start period:
read_api("ERP_COB", start_period = 2020)
# In some cases, loading a whole dataflow (as above) won't work.
# For example, the `ABS_C16_T10_SA` dataflow is very large,
# so the gateway will time out if we try to collect the full data set
try(read_api("ABS_C16_T10_SA"))
# We need to filter the dataflow before downloading it.
# To figure out how to filter it, we get metadata ('datastructure').
ds <- read_api_datastructure("ABS_C16_T10_SA")
# The `asgs_2016` code for 'Australia' is 0
ds[ds$var == "asgs_2016" & ds$label == "Australia", ]
# The `sex_abs` code for 'Persons' (i.e. all persons) is 3
ds[ds$var == "sex_abs" & ds$label == "Persons", ]
# So we have:
x <- read_api("ABS_C16_T10_SA", datakey = list(asgs_2016 = 0, sex_abs = 3))
unique(x["asgs_2016"]) # Confirming only 'Australia' level records came through
unique(x["sex_abs"]) # Confirming only 'Persons' level records came through
# Please note however that not all values in the datastructure necessarily
# appear in the data. You get 404s in this case
ds[ds$var == "regiontype" & ds$label == "Destination Zones", ]
try(read_api("ABS_C16_T10_SA", datakey = list(regiontype = "DZN")))
# If you already have a query url, then use `read_api_url()`
wpi_url <- "https://data.api.abs.gov.au/rest/data/ABS,WPI/all"
read_api_url(wpi_url)
## End(Not run)
Get date of most recent observation(s) in ABS time series
Description
This function returns the most recent observation date for a specified ABS time series catalogue number (as a whole), individual tables, or series IDs.
Usage
check_latest_date(cat_no = NULL, tables = "all", series_id = NULL)
Arguments
cat_no | ABS catalogue number, as a string, including the extension. For example, "6202.0". |
tables | numeric. Time series tables in |
series_id | (optional) character. Supply an ABS unique time series identifier (such as "A2325807L") to get only that series. This is an alternative to specifying |
Details
Where the individual time series in your request have multiple dates, only the most recent will be returned.
Value
Date vector of length one. Date corresponds to the most recent observation date for any of the time series in the table(s) requested.
Examples
## Not run: 
# Check a whole catalogue number; return the latest release date for any
# time series in the number
check_latest_date("6345.0")
# Return latest release date for a table within a catalogue number - note
# the function will return the release date
# of the most-recently-updated series within the tables
check_latest_date("6345.0", tables = 1)
# Or for multiple tables - note the function will return the release date
# of the most-recently-updated series within the tables
check_latest_date("6345.0", tables = c("1", "5a"))
# Or for an individual time series
check_latest_date(series_id = "A2713849C")
## End(Not run)
Internal function to check if the data frame returned by read_lfs_grossflows() contains expected unique values in key columns
Description
Internal function to check if the data frame returned by read_lfs_grossflows() contains expected unique values in key columns
Usage
check_lfs_grossflows(df)
Arguments
df | data frame containing gross flows data |
Experimental helper function to download ABS data cubes that are not compatible with read_abs.
Description
download_abs_data_cube() downloads the latest ABS data cubes based on the catalogue name (from the website url) and cube. The function downloads the file to disk.
Unlike read_abs(), this function doesn't import or tidy the data. Convenience functions are provided to import and tidy key data cubes; see ?read_payrolls() and ?read_lfs_grossflows().
Usage
download_abs_data_cube(
  catalogue_string,
  cube,
  path = Sys.getenv("R_READABS_PATH", unset = tempdir())
)
Arguments
catalogue_string | ABS catalogue name as a string from the ABS website. For example, Labour Force, Australia, Detailed is "labour-force-australia-detailed". The possible catalogues can be obtained using the helper function |
cube | character. A character string that is either the complete filename or a fragment that (uniquely) appears in the filename of the data cube you want to download, e.g. "EQ09". The available filenames can be obtained using the helper function |
path | Local directory in which downloaded files should be stored. By default, |
Details
download_abs_data_cube() downloads an Excel spreadsheet from the ABS.
The file needs to be saved somewhere on your disk. This local directory can be controlled using the path argument to download_abs_data_cube(). If the path argument is not set, download_abs_data_cube() will store the files in a directory set in the "R_READABS_PATH" environment variable. If this variable isn't set, files will be saved in a temporary directory.
To check the value of the "R_READABS_PATH" variable, run Sys.getenv("R_READABS_PATH"). You can set the value of this variable for a single session using Sys.setenv(R_READABS_PATH = <path>). If you would like to change this variable for all future R sessions, edit your .Renviron file and add the line R_READABS_PATH = <path>. The easiest way to edit this file is using usethis::edit_r_environ().
The filepath is returned invisibly, which enables piping to unzip() or readxl::read_excel().
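The default for path in the Usage section is Sys.getenv("R_READABS_PATH", unset = tempdir()). The following base R sketch shows how that expression resolves with and without the variable set. The abs_data directory name is a hypothetical choice for illustration.

```r
# Remember any existing value so it can be restored afterwards
old <- Sys.getenv("R_READABS_PATH", unset = NA)

# With the variable unset, the default falls back to the session temp directory
Sys.unsetenv("R_READABS_PATH")
default_path <- Sys.getenv("R_READABS_PATH", unset = tempdir())

# Set the variable for this session only ("abs_data" is a hypothetical name)
my_dir <- file.path(tempdir(), "abs_data")
dir.create(my_dir, showWarnings = FALSE)
Sys.setenv(R_READABS_PATH = my_dir)
session_path <- Sys.getenv("R_READABS_PATH", unset = tempdir())

print(default_path)
print(session_path)

# Restore the original state of the environment variable
if (is.na(old)) Sys.unsetenv("R_READABS_PATH") else Sys.setenv(R_READABS_PATH = old)
```

Setting the variable with Sys.setenv() lasts only for the current session; the .Renviron route described above is the persistent alternative.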
See Also
Other data cube functions: search_catalogues(), show_available_catalogues(), show_available_files()
Examples
## Not run: 
download_abs_data_cube(
  catalogue_string = "labour-force-australia-detailed",
  cube = "EQ09"
)
## End(Not run)
This function is temporarily necessary while the readabs maintainer attempts to resolve an issue with the ABS. The ABS as at late March 2021 stopped including Table 5 of the Weekly Payrolls release with each new release of the data. This function finds the link from the previous release and attempts to download it. This function will no longer be required if/when the ABS reverts to the previous release arrangements. The function is internal and is called by read_payrolls().
Description
This function is temporarily necessary while the readabs maintainer attempts to resolve an issue with the ABS. The ABS as at late March 2021 stopped including Table 5 of the Weekly Payrolls release with each new release of the data. This function finds the link from the previous release and attempts to download it. This function will no longer be required if/when the ABS reverts to the previous release arrangements. The function is internal and is called by read_payrolls().
Usage
download_previous_payrolls(cube_name, path)
Arguments
cube_name | eg. DO004 for table 4 |
path | Directory in which to download payrolls cube |
Value
A list containing two elements: result (will contain path + filename to downloaded file if download was successful); and error (NULL if file downloaded successfully; character otherwise).
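A return value of this shape is typically handled by checking error before using result. The sketch below builds such a list with base R's tryCatch(). The capture_download() wrapper and the file name are hypothetical, and this is not presented as the package's internal implementation.

```r
# Hypothetical wrapper: run `f` and capture either its result or its error
# message, returning a list with `result` and `error` elements
capture_download <- function(f) {
  tryCatch(
    list(result = f(), error = NULL),
    error = function(e) list(result = NULL, error = conditionMessage(e))
  )
}

# A call that succeeds and a call that fails (no real download is attempted)
ok  <- capture_download(function() file.path(tempdir(), "example.xlsx"))
bad <- capture_download(function() stop("link not found"))

# Branch on `error` before using `result`
if (is.null(ok$error)) message("Downloaded to: ", ok$result)
if (!is.null(bad$error)) message("Download failed: ", bad$error)
```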
Extract data sheets from an ABS timeseries workbook saved locally as an Excel file.
Description
Note that this function will not tidy the data for you. Use read_abs_local() to import and tidy data from local ABS time series spreadsheets or read_abs() to download, import and tidy ABS time series.
Usage
extract_abs_sheets(
  filename,
  table_title = NULL,
  path = Sys.getenv("R_READABS_PATH", unset = tempdir())
)
Arguments
filename | Filename for an ABS time series spreadsheet (as string) |
table_title | String giving the full title of the ABS table, such as "Table 1. Employed persons, Australia" |
path | Local directory in which an ABS time series is stored. Default is |
Very slightly faster version of stringr's str_squish()
Description
Very slightly faster version of stringr's str_squish()
Usage
fast_str_squish(string)
Arguments
string | A string to squish (remove whitespace) |
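For readers unfamiliar with squishing, the behaviour of stringr's str_squish() is to collapse internal runs of whitespace to a single space and trim both ends. A base R sketch of that behaviour follows. The squish_base() name is hypothetical, and this is not claimed to be how fast_str_squish() is implemented.

```r
# Hypothetical base R equivalent of "squishing" a string:
# collapse runs of whitespace to one space, then trim both ends
squish_base <- function(string) {
  trimws(gsub("\\s+", " ", string))
}

squished <- squish_base("  Employed   total ;\n Persons  ")
print(squished)
```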
Show the available Labour Force, Australia, detailed data cubes that can be downloaded
Description
Show the available Labour Force, Australia, detailed data cubes that can be downloaded
Usage
get_available_lfs_cubes()
Details
Intended to be used with read_lfs_datacube(). Call get_available_lfs_cubes() interactively, find the table of interest (e.g. "LM1"), then use read_lfs_datacube().
Examples
get_available_lfs_cubes()
Download, extract, and tidy ABS time series spreadsheets
Description
read_abs() downloads ABS time series spreadsheets, then extracts the data from those spreadsheets, then tidies the data. The result is a single data frame (tibble) containing tidied data.
Usage
read_abs(
  cat_no = NULL,
  tables = "all",
  series_id = NULL,
  path = Sys.getenv("R_READABS_PATH", unset = tempdir()),
  metadata = TRUE,
  show_progress_bars = TRUE,
  retain_files = TRUE,
  check_local = TRUE,
  release_date = "latest"
)
read_abs_series(series_id, ...)
Arguments
cat_no | ABS catalogue number, as a string, including the extension. For example, "6202.0". |
tables | numeric. Time series tables in |
series_id | (optional) character. Supply an ABS unique time series identifier (such as "A2325807L") to get only that series. This is an alternative to specifying |
path | Local directory in which downloaded ABS time series spreadsheets should be stored. By default, |
metadata | logical. If |
show_progress_bars | TRUE by default. If set to FALSE, progress bars will not be shown when ABS spreadsheets are downloading. |
retain_files | when TRUE (the default), the spreadsheets downloaded from the ABS website will be saved in the directory specified with |
check_local | If |
release_date | Either |
... | Arguments to |
Details
read_abs_series() is a wrapper around read_abs(), with series_id as the first argument.
read_abs() downloads spreadsheet(s) from the ABS containing time series data. These files need to be saved somewhere on your disk. This local directory can be controlled using the path argument to read_abs(). If the path argument is not set, read_abs() will store the files in a directory set in the "R_READABS_PATH" environment variable. If this variable isn't set, files will be saved in a temporary directory.
To check the value of the "R_READABS_PATH" variable, run Sys.getenv("R_READABS_PATH"). You can set the value of this variable for a single session using Sys.setenv(R_READABS_PATH = <path>). If you would like to change this variable for all future R sessions, edit your .Renviron file and add the line R_READABS_PATH = <path>. The easiest way to edit this file is using usethis::edit_r_environ().
Certain corporate networks restrict your ability to download files in an R session. On some of these networks, the "wininet" method must be used when downloading files. Users can specify the method that will be used to download files by setting the "R_READABS_DL_METHOD" environment variable.
For example, the following code sets the environment variable for your current session: Sys.setenv("R_READABS_DL_METHOD" = "wininet"). You can add R_READABS_DL_METHOD = "wininet" to your .Renviron to have this persist across sessions.
The release_date argument allows you to download table(s) other than the latest release. This is useful for examining revisions to time series, or for obtaining the version of series that were available on a given date. Note that you cannot supply more than one date to release_date. Note also that any dates prior to mid-2019 (the exact date varies by series) will fail. Specifying release_date only reliably works for monthly, and some quarterly, data. It does not work for annual data.
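One way to examine revisions is to fetch the same table twice, once with release_date = "latest" and once with an earlier date, then join the two results. The sketch below shows only the join and uses toy data frames with made-up values in place of real downloads. It assumes the tidied output has series_id, date and value columns.

```r
# Toy stand-ins for two read_abs() results; all values are made up.
# `vintage` plays the role of an earlier release, `latest` the current one.
vintage <- data.frame(
  series_id = "A2713849C",
  date      = as.Date(c("2019-03-01", "2019-06-01")),
  value     = c(131.0, 132.0)
)
latest <- data.frame(
  series_id = "A2713849C",
  date      = as.Date(c("2019-03-01", "2019-06-01")),
  value     = c(131.5, 132.0)
)

# Match observations by series and date, keeping both vintages side by side
revisions <- merge(
  vintage, latest,
  by = c("series_id", "date"),
  suffixes = c("_vintage", "_latest")
)

# A non-zero difference means the observation was revised between releases
revisions$revision <- revisions$value_latest - revisions$value_vintage
print(revisions)
```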
Value
A data frame (tibble) containing the tidied data from the ABS time series table(s).
Examples
# Download and tidy all time series spreadsheets
# from the Wage Price Index (6345.0)
## Not run: 
wpi <- read_abs("6345.0")
## End(Not run)
# Download table 1 from the Wage Price Index
## Not run: 
wpi_t1 <- read_abs("6345.0", tables = "1")
## End(Not run)
# Or table 1 as in the Sep 2019 release of the WPI:
## Not run: 
wpi_t1_sep2019 <- read_abs("6345.0", tables = "1", release_date = "2019-09-01")
## End(Not run)
# Or tables 1 and 2a from the WPI
## Not run: 
wpi_t1_t2a <- read_abs("6345.0", tables = c("1", "2a"))
## End(Not run)
# Get two specific time series, based on their time series IDs
## Not run: 
cpi <- read_abs(series_id = c("A2325806K", "A2325807L"))
## End(Not run)
# Get series IDs using the `read_abs_series()` wrapper function
## Not run: 
cpi <- read_abs_series(c("A2325806K", "A2325807L"))
## End(Not run)
Extracts ABS time series data from local Excel spreadsheets and converts to long format.
Description
read_abs_data() is soft deprecated and will be removed in a future version. Please use read_abs_local() to import and tidy locally-stored ABS time series spreadsheets, or read_abs() to download, import, and tidy time series spreadsheets from the ABS website.
Usage
read_abs_data(path, sheet)
Arguments
path | Filepath to Excel spreadsheet. |
sheet | Sheet name or number. |
Value
Long-format dataframe
Read and tidy locally-saved ABS time series spreadsheet(s)
Description
If you need to download and tidy time series data from the ABS, use read_abs(). read_abs_local() imports and tidies data from ABS time series spreadsheets that are already saved to your local drive.
Usage
read_abs_local(
  cat_no = NULL,
  filenames = NULL,
  path = Sys.getenv("R_READABS_PATH", unset = tempdir()),
  use_fst = TRUE,
  metadata = TRUE
)
Arguments
cat_no | character; a single catalogue number such as "6202.0". When |
filenames | character vector of at least one filename of a locally-stored ABS time series spreadsheet. For example, "6202001.xls" or c("6202001.xls", "6202005.xls"). Ignored if a value is supplied to |
path | path to local directory containing ABS time series file(s). Default is |
use_fst | logical. If |
metadata | logical. If |
Details
Unlike read_abs(), the table_title column in the data frame returned by read_abs_local() is blank. If you require table_title, please use read_abs() instead.
Examples
# Load and tidy two specified files from the "data/ABS" subdirectory
# of your working directory
## Not run: 
lfs <- read_abs_local(c("6202001.xls", "6202005.xls"))
## End(Not run)
Extracts ABS series metadata directly from Excel spreadsheets and converts to long-form.
Description
Extracts ABS series metadata directly from Excel spreadsheets and converts to long-form.
Usage
read_abs_metadata(path, sheet)
Arguments
path | Filepath to Excel spreadsheet. |
sheet | Sheet name or number. |
Value
Long-form dataframe
Download and import an ABS time series spreadsheet from a given URL
Description
Download and import an ABS time series spreadsheet from a given URL
Usage
read_abs_url(
  url,
  path = Sys.getenv("R_READABS_PATH", unset = tempdir()),
  show_progress_bars = TRUE,
  ...
)
Arguments
url | Character vector of url(s) to ABS time series spreadsheet(s). |
path | Local directory in which downloaded ABS time series spreadsheets should be stored. By default, |
show_progress_bars | TRUE by default. If set to FALSE, progress bars will not be shown when ABS spreadsheets are downloading. |
... | Additional arguments passed to |
Details
If you have a specific URL to the time series spreadsheet you wish to download, read_abs_url() will download, import and tidy it. This is useful for older vintages of data, or discontinued data.
Examples
## Not run: 
url <- paste0(
  "https://www.abs.gov.au/statistics/labour/",
  "employment-and-unemployment/labour-force-australia/aug-2022/6202001.xlsx"
)
read_abs_url(url)
## End(Not run)
read_awe
Description
Convenience function to obtain wage levels from ABS 6302.0, Average Weekly Earnings, Australia.
Usage
read_awe(
  wage_measure = c("awote", "ftawe", "awe"),
  sex = c("persons", "males", "females"),
  sector = c("total", "private", "public"),
  state = c("all", "nsw", "vic", "qld", "sa", "wa", "tas", "nt", "act"),
  na.rm = FALSE,
  path = Sys.getenv("R_READABS_PATH", unset = tempdir()),
  show_progress_bars = FALSE,
  check_local = FALSE
)
Arguments
wage_measure | Character of length 1. Must be one of:
|
sex | Character of length 1. Must be one of: |
sector | Character of length 1. Must be one of: |
state | Character of length 1. Must be one of: |
na.rm | Logical. |
path | See |
show_progress_bars | See |
check_local | See |
Details
The latest AWE data is available using read_abs(cat_no = "6302.0", tables = 2). However, this time series only goes back to 2012, when the ABS switched from quarterly to biannual collection and release of the AWE data. The read_awe() function assembles one time series back to the November 1983 quarter; it is quarterly to 2012 and biannual from then. Note that the data returned with this function is consistently quarterly; any quarters for which there are no observations are recorded as NA unless na.rm = TRUE.
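The effect of placing biannual observations on a consistently quarterly grid can be sketched in base R as follows. The dates and values are made up, and the code illustrates the resulting shape only, not the internals of read_awe().

```r
# Toy biannual observations (made-up values)
biannual <- data.frame(
  date  = as.Date(c("2012-11-01", "2013-05-01", "2013-11-01")),
  value = c(1400, 1420, 1440)
)

# A consistently quarterly grid covering the same period
grid <- data.frame(
  date = seq(as.Date("2012-11-01"), as.Date("2013-11-01"), by = "quarter")
)

# Quarters without an observation are filled with NA,
# comparable to the na.rm = FALSE behaviour described above
quarterly <- merge(grid, biannual, by = "date", all.x = TRUE)
print(quarterly)

# Dropping the NA rows is comparable to na.rm = TRUE
observed_only <- quarterly[!is.na(quarterly$value), ]
```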
Value
A tbl_df with four columns: date, sex, wage_measure and value. The data is nominal and seasonally adjusted.
Examples
## Not run: 
read_awe("awote", "persons")
## End(Not run)
Download a tidy tibble containing the Consumer Price Index from the ABS
Description
read_cpi() uses the read_abs() function to download, import, and tidy the Consumer Price Index from the ABS. It returns a tibble containing two columns: the date and the CPI index value that corresponds to that date. This makes joining the CPI to another dataframe easy. read_cpi() returns the original (i.e. not seasonally adjusted) all groups CPI for Australia. If you want the analytical series (e.g. seasonally adjusted CPI, or trimmed mean CPI), you can use read_abs().
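Joining the CPI to another data frame is usually a step towards deflating nominal values. A minimal sketch of that step follows, using a toy data frame in place of the read_cpi() output. The cpi column name, the wages data frame and all values are assumptions for illustration.

```r
# Toy stand-in for read_cpi() output; column names and values are assumed
cpi <- data.frame(
  date = as.Date(c("2020-03-01", "2021-03-01")),
  cpi  = c(100, 110)
)

# Hypothetical nominal series to be deflated
wages <- data.frame(
  date    = as.Date(c("2020-03-01", "2021-03-01")),
  nominal = c(1000, 1100)
)

# Join on date, then express every value in prices of the latest period
joined   <- merge(wages, cpi, by = "date")
base_cpi <- joined$cpi[joined$date == max(joined$date)]
joined$real <- joined$nominal * base_cpi / joined$cpi
print(joined)
```

In this toy data the nominal series grows exactly in line with the CPI, so the real series is flat.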
Usage
read_cpi(
  path = Sys.getenv("R_READABS_PATH", unset = tempdir()),
  show_progress_bars = TRUE,
  check_local = FALSE,
  retain_files = FALSE
)
Arguments
path | character; default is "data/ABS". Only used if retain_files is set to TRUE. Local directory in which to save downloaded ABS time series spreadsheets. |
show_progress_bars | logical; TRUE by default. If set to FALSE, progress bars will not be shown when ABS spreadsheets are downloading. |
check_local | logical; FALSE by default. See |
retain_files | logical; FALSE by default. When TRUE, the spreadsheets downloaded from the ABS website will be saved in the directory specified with 'path'. |
Examples
# Create a tibble called 'cpi' that contains the CPI index
# numbers for each quarter
cpi <- read_cpi()
# This tibble can now be joined to another to help streamline the process of
# deflating nominal values.
Download a tidy tibble containing the Estimated Resident Population from the ABS
Description
read_erp() uses the read_abs() function to download, import, and tidy the Estimated Resident Population from the ABS. It allows the user to specify age, sex and states/territories of interest. It returns a tibble containing five columns: the date, the age range, sex and states that the ERP corresponds to. This makes joining the ERP to another dataframe easy.
Usage
read_erp(
  age_range = 0:100,
  sex = "Persons",
  states = c("Australia", "New South Wales", "Victoria", "Queensland",
    "South Australia", "Western Australia", "Tasmania", "Northern Territory",
    "Australian Capital Territory"),
  path = Sys.getenv("R_READABS_PATH", unset = tempdir()),
  show_progress_bars = TRUE,
  check_local = FALSE,
  retain_files = FALSE
)
Arguments
age_range | numeric; default is "0:100". A vector containing ages in single years for which an ERP is sought. The ABS top-code ages at 100. |
sex | character; default is "Persons". Other values are "Male" and "Female". Multiple values allowed. |
states | character; default is "Australia". Other values are the full or abbreviated names of the states and self-governing territories. Multiple values allowed. |
path | character; default is "data/ABS". Only used if retain_files is set to TRUE. Local directory in which to save downloaded ABS time series spreadsheets. |
show_progress_bars | logical; TRUE by default. If set to FALSE, progress bars will not be shown when ABS spreadsheets are downloading. |
check_local | logical; FALSE by default. See |
retain_files | logical; FALSE by default. When TRUE, the spreadsheets downloaded from the ABS website will be saved in the directory specified with 'path'. |
Examples
# Create a tibble called 'erp' that contains the ERP index
# numbers for 30 June each year for Australia.
erp <- read_erp()
Download and tidy ABS Job Mobility tables
Description
Import a tidy tibble of ABS Job Mobility data
Usage
read_job_mobility(
  tables = "all",
  path = Sys.getenv("R_READABS_PATH", unset = tempdir())
)
Arguments
tables | Either |
path | Local directory in which downloaded ABS time series spreadsheets should be stored. By default, 'path' takes the value set in the environment variable "R_READABS_PATH". If this variable is not set, any files downloaded by read_abs() will be stored in a temporary directory (tempdir()). |
Examples
## Not run: 
# Get all tables from the ABS Job Mobility series
read_job_mobility()
# Get tables 1 and 2
read_job_mobility(c(1, 2))
## End(Not run)
Convenience function to download and tidy data cubes from ABS Labour Force, Australia, Detailed.
Description
Convenience function to download and tidy data cubes from ABS Labour Force, Australia, Detailed.
Usage
read_lfs_datacube(cube, path = Sys.getenv("R_READABS_PATH", unset = tempdir()))
Arguments
cube | character. A character string that is either the complete filename or (uniquely) in the filename of the data cube you want to download. Use |
path | Local directory in which downloaded files should be stored. |
Value
A tibble with the data from the data cube. Column names are tidied and dates are converted to Date class.
Examples
read_lfs_datacube("EQ02")
Download, import and tidy 'gross flows' data cube from the monthly ABS Labour Force survey.
Description
This convenience function downloads, imports and tidies the 'gross flows' data cube from the monthly ABS Labour Force survey. The gross flows data cube (GM1) shows estimates of the number of people who transitioned from one labour force status to another between two months.
Usage
read_lfs_grossflows(
  weights = c("current", "previous"),
  path = Sys.getenv("R_READABS_PATH", unset = tempdir())
)
Arguments
weights | either |
path | Local directory in which downloaded files should be stored. By default, 'path' takes the value set in the environment variable "R_READABS_PATH". If this variable is not set, any files downloaded will be stored in a temporary directory ( |
Value
A tibble containing data cube GM1 from the monthly Labour Force survey.
Examples
## Not run: 
read_lfs_grossflows()
## End(Not run)
Download and tidy ABS payroll jobs and wages data
Description
Import a tidy tibble of ABS Payroll Jobs data.
Usage
read_payrolls(
  series = c("industry_jobs", "subindustry_jobs", "empsize_jobs"),
  path = Sys.getenv("R_READABS_PATH", unset = tempdir())
)
Arguments
series | Character. Must be one of:
The default is "industry_jobs". |
path | Local directory in which downloaded ABS time series spreadsheets should be stored. By default, |
Details
The ABS Payroll Jobs dataset draws upon data collected by the Australian Taxation Office as part of its Single-Touch Payroll initiative and supplements the monthly Labour Force Survey. Unfortunately, the data as published by the ABS (1) is not in a standard time series spreadsheet; and (2) is messy in various ways that make it hard to read in R. This convenience function uses download_abs_data_cube() to import the payrolls data, and then tidies it up.
Note that this ABS release used to be called Weekly Payroll Jobs and Wages Australia. The total wages series were removed from this release in mid-2023 and it was renamed to Weekly Payroll Jobs. The ability to read total wages indexes using this function was therefore also removed. It was then renamed Payroll Jobs and the frequency was reduced, with further modifications to the data released.
Value
A tidy (long) tbl_df. The number of columns differs based on the series.
Examples
## Not run: 
# Fetch payroll jobs by industry and state (the default, "industry_jobs")
read_payrolls()
# Payroll jobs by employer size
read_payrolls("empsize_jobs")
## End(Not run)
Helper function for download_abs_data_cube to scrape the available catalogues from the ABS website.
Description
This function downloads a new version of the lookup table used by show_available_catalogues.
Usage
scrape_abs_catalogues()
Value
A tibble containing the catalogues and how they are organised on the ABS website.
Search for ABS catalogues that match a string
Description
Helper function to use with download_abs_data_cube().
download_abs_data_cube() requires that you specify a catalogue. search_catalogues() helps you find the catalogue you want, by searching for a given string in the catalogue names, product title, and broad topic.
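The kind of search described here, a case-insensitive match of one string against several columns, can be sketched in base R as below. The toy lookup table and the search_toy() helper are assumptions for illustration; the column names follow the Value section of this page, and the package's own implementation may differ.

```r
# Toy lookup table; the rows are made up for illustration
lookup <- data.frame(
  heading     = c("Labour", "Economy"),
  sub_heading = c("Employment and unemployment", "Price indexes and inflation"),
  catalogue   = c("labour-force-australia-detailed",
                  "consumer-price-index-australia"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where any searched column contains the string,
# ignoring case
search_toy <- function(string, tbl) {
  needle <- tolower(string)
  cols <- c("heading", "sub_heading", "catalogue")
  hits <- lapply(tbl[cols], function(col) grepl(needle, tolower(col), fixed = TRUE))
  tbl[Reduce(`|`, hits), , drop = FALSE]
}

res <- search_toy("LABOUR", lookup)
print(res)
```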
Usage
search_catalogues(string, refresh = FALSE)
Arguments
string | Character. A word or phrase you want to search for, such as "labour" or "union". Not case sensitive. |
refresh | Logical. |
Value
A data frame (tibble) containing the topic (heading), product title (sub_heading), catalogue (catalogue) and URL (URL) of any catalogues that match the provided string.
See Also
Other data cube functions: download_abs_data_cube(), show_available_catalogues(), show_available_files()
Examples
search_catalogues("labour")
Search for a file within an ABS catalogue
Description
Search for a file within an ABS catalogue
Usage
search_files(string, catalogue, refresh = FALSE)
Arguments
string | String to search for among filenames in a catalogue |
catalogue | Name of catalogue |
refresh | logical; |
Examples
## Not run: 
search_files("GM1", "labour-force-australia")
## End(Not run)
Separate the series column in a tidy ABS time series data frame
Description
Separate the 'series' column in a data frame (tibble) downloaded using read_abs() into multiple columns using the ";" separator.
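The idea of splitting on ";" can be illustrated in base R on a toy series vector. This is a sketch of the concept rather than the package's implementation; the example strings and the series_1, series_2 output column names are assumptions.

```r
# Toy 'series' values in the ";"-delimited style (made up for illustration)
series <- c("Employed total ;  Persons ;", "Unemployed total ;  Males ;")

# Split on the separator, trim whitespace, and drop any empty fragments
parts <- strsplit(series, ";", fixed = TRUE)
parts <- lapply(parts, function(p) {
  p <- trimws(p)
  p[nzchar(p)]
})

# Bind the pieces into columns; this assumes every row has the same number
# of components
separated <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(separated) <- paste0("series_", seq_len(ncol(separated)))
print(separated)
```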
Usage
separate_series(
  data,
  column_names = NULL,
  remove_totals = FALSE,
  remove_nas = FALSE
)
Arguments
data | A data frame (tibble) containing tidied data from the ABS time series table(s). |
column_names | (optional) character vector. Supply a vector of column names, such as |
remove_totals | logical. FALSE by default. If set to TRUE, any series rows that contain the word "total" will be removed. |
remove_nas | logical. FALSE by default. If set to TRUE, any rows containing an NA in at least one of the separated series columns will be removed. |
Value
A data frame (tibble) containing the tidied data from the ABS time series table(s).
Examples
## Not run: 
wpi <- read_abs("6345.0", 1) %>%
  separate_series()
## End(Not run)
Helper function for download_abs_data_cube to show the available catalogues.
Description
This function lists the possible catalogues that are available on the ABS website. These catalogues must be specified as a string as an argument to download_abs_data_cube.
Usage
show_available_catalogues(selected_heading = NULL, refresh = FALSE)
Arguments
selected_heading | optional character string specifying the heading on the ABS statistics webpage, e.g. "Earnings and work hours" |
refresh | logical; |
Value
a character vector of catalogues.
See Also
Other data cube functions: download_abs_data_cube(), search_catalogues(), show_available_files()
Examples
show_available_catalogues("Earnings and work hours")
Helper function to show the files available in a particular catalogue number.
Description
To be used in conjunction with download_abs_data_cube().
This function lists the possible files that are available in a catalogue. The filename (or an unambiguous part of the filename) must be specified as a string as an argument to download_abs_data_cube.
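What "an unambiguous part of the filename" means in practice can be sketched as a partial match that must hit exactly one file. The filenames and the resolve_cube() helper below are hypothetical, and the package's own matching rules may differ.

```r
# Toy filenames standing in for the files listed for a catalogue
files <- c("EQ08.xlsx", "EQ09.xlsx", "LM1.xlsx")

# Hypothetical helper: accept a fragment only if it matches exactly one file
resolve_cube <- function(cube, files) {
  hit <- files[grepl(cube, files, fixed = TRUE)]
  if (length(hit) != 1) {
    stop("'", cube, "' matched ", length(hit), " files; expected exactly one")
  }
  hit
}

resolved <- resolve_cube("EQ09", files)
print(resolved)

# "EQ" is ambiguous here because it matches two files, so it raises an error
ambiguous <- try(resolve_cube("EQ", files), silent = TRUE)
```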
Usage
show_available_files(catalogue_string, refresh = FALSE)
get_available_files(catalogue_string, refresh = FALSE)
Arguments
catalogue_string | character string specifying the catalogue, e.g. "labour-force-australia-detailed". You can use |
refresh | logical; |
Details
get_available_files() is an alias for show_available_files().
Value
A tibble containing the title of the file, the filename and the complete url.
See Also
Other data cube functions: download_abs_data_cube(), search_catalogues(), show_available_catalogues()
Examples
## Not run: 
show_available_files("labour-force-australia-detailed")
## End(Not run)
Tidy ABS time series data.
Description
Tidy ABS time series data.
Usage
tidy_abs(df, metadata = TRUE)
Arguments
df | A data frame containing ABS time series data that has been extracted using |
metadata | logical. If |
Value
data frame (tibble) in long format.
Examples
# First extract the data from the local spreadsheet
## Not run: 
wpi <- extract_abs_sheets("634501.xls")
## End(Not run)
# Then tidy the data extracted from the spreadsheet. Note that
# extract_abs_sheets() returns a list of data frames, so we need to
# subset the list.
## Not run: 
tidy_wpi <- tidy_abs(wpi[[1]])
## End(Not run)
Tidy multiple dataframes of ABS time series data contained in a list.
Description
Tidy multiple dataframes of ABS time series data contained in a list.
Usage
tidy_abs_list(list_of_dfs, metadata = TRUE)
Arguments
list_of_dfs | A list of dataframes containing extracted ABS time series data. |
metadata | logical. If |
Internal function to tidy a dataframe from ABS 6302
Description
Internal function to tidy a dataframe from ABS 6302
Usage
tidy_awe(df)
Arguments
df | Data frame containing table 2 from ABS 6302, imported using |