Title:Parse 'NOAA' Integrated Surface Data Files
Description:Tools for parsing 'NOAA' Integrated Surface Data ('ISD') files, described at https://www.ncdc.noaa.gov/isd. Data includes, for example, wind speed and direction, temperature, cloud data, sea level pressure, and more. Includes data from approximately 35,000 stations worldwide, though best coverage is in North America/Europe/Australia. Data is stored as variable length ASCII character strings, with most fields optional. Included are tools for parsing entire files, or individual lines of data.
Version:0.4.0
License:MIT + file LICENSE
Encoding:UTF-8
URL:https://docs.ropensci.org/isdparser (docs), https://github.com/ropensci/isdparser (devel)
BugReports:https://github.com/ropensci/isdparser/issues
LazyData:true
VignetteBuilder:knitr
Imports:tibble (≥ 1.2), data.table (≥ 1.10.0), lubridate
Suggests:testthat, rmarkdown, knitr
RoxygenNote:7.0.2
X-schema.org-applicationCategory:Climate
X-schema.org-keywords:climate, NOAA, data, ISD, stations
X-schema.org-isPartOf:https://ropensci.org
NeedsCompilation:no
Packaged:2020-02-17 21:46:55 UTC; sckott
Author:Scott Chamberlain [aut, cre]
Maintainer:Scott Chamberlain <myrmecocystus@gmail.com>
Repository:CRAN
Date/Publication:2020-02-17 22:10:12 UTC

Parse NOAA ISD Files

Description

Parse NOAA ISD Files

Data format

Each record (data.frame row or individual list element) you get via isd_parse or isd_parse_line has all data combined. Control data fields are first, then mandatory fields, then additional data fields and remarks. Control and mandatory fields have column names describing what they are, while additional data fields have a length three character prefix (e.g., AA1) linking the fields to the documentation for the Additional Data Section at ftp://ftp.ncdc.noaa.gov/pub/data/noaa/ish-format-document.pdf

Data size

Each line of an ISD data file has a maximum of 2,844 characters.

Control Data

The beginning of each record provides information about the report including date, time, and station location information. Data fields will be in positions identified in the applicable data definition. The control data section is fixed length and is 60 characters long.

Mandatory data

In each line of an ISD data file, the mandatory data section follows the control data section. The mandatory data section contains meteorological information on the basic elements such as winds, visibility, and temperature. These are the most commonly reported parameters and are available most of the time. The mandatory data section is fixed length and is 45 characters long.
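
Because both sections are fixed length, a raw line can be split by character position. The following is a minimal base R sketch under that assumption, not a description of how isdparser works internally. split_fixed_sections is a hypothetical helper, and the input line is synthetic filler rather than a real ISD record.

```r
# Hypothetical helper: split one raw ISD line into its fixed-length sections.
# Widths come from the text above: 60 characters of control data followed by
# 45 characters of mandatory data.
split_fixed_sections <- function(line, control_width = 60L, mandatory_width = 45L) {
  fixed_width <- control_width + mandatory_width
  stopifnot(is.character(line), length(line) == 1L, nchar(line) >= fixed_width)
  list(
    control   = substr(line, 1L, control_width),
    mandatory = substr(line, control_width + 1L, fixed_width),
    # Whatever follows (additional data, remarks) is variable length
    rest      = substr(line, fixed_width + 1L, nchar(line))
  )
}

# Synthetic line: 60 "C"s, 45 "M"s, then a short variable-length tail
line <- paste0(strrep("C", 60), strrep("M", 45), "ADDAA1")
parts <- split_fixed_sections(line)
vapply(parts, nchar, integer(1))
```

For real files, isd_parse and isd_parse_line do this work and also name the individual fields.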

Additional data

Each line of an ISD data file has an optional additional data section, which follows the mandatory data section. These additional data contain information of significance and/or which are received with varying degrees of frequency. Identifiers are used to note when data are present in the record. If all data fields in a group are missing, the entire group is usually not reported. If no groups are reported the section will be omitted. The additional data section is variable in length with a minimum of 0 characters and a maximum of 637 (634 characters plus a 3 character section identifier) characters.

Remarks data

The numeric and character (plain language) remarks are provided if they exist. The data will vary in length and are identified in the applicable data definition. The remarks section has a maximum length of 515 (512 characters plus a 3 character section identifier) characters.
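
The variable-length part of a record can be separated into its additional data and remarks sections by looking for the 3 character section identifiers. The sketch below assumes the identifiers are "ADD" and "REM" (an assumption not stated on this page; check the ISD format document), and split_tail is a hypothetical helper operating on a synthetic string.

```r
# Hypothetical helper: split the variable-length tail of a record into the
# additional data and remarks sections.
# ASSUMPTION: the sections are introduced by the identifiers "ADD" and "REM",
# and "REM" does not occur inside the additional data itself.
split_tail <- function(x) {
  rem_pos <- as.integer(regexpr("REM", x, fixed = TRUE))
  if (rem_pos == -1L) {
    # No remarks section in this record
    list(additional = x, remarks = "")
  } else {
    list(
      additional = substr(x, 1L, rem_pos - 1L),
      remarks    = substr(x, rem_pos, nchar(x))
    )
  }
}

# Synthetic tail, not taken from a real file
tail_parts <- split_tail("ADDAA1XXXXXXXXREMSYN004TEST")
tail_parts
```

Any section that follows the remarks in a real record would end up in the remarks element here, so treat this only as an illustration of the layout.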

Missing values

Missing values for any non-signed item are filled with 9s (e.g., 999). Missing values for any signed item are positive filled (e.g., +99999).
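
A minimal sketch of turning fill values into NA in base R. fill_to_na is a hypothetical helper, the input vectors are synthetic, and the default fill strings are only the two examples quoted above; the correct fill string depends on the width of each field.

```r
# Hypothetical helper: convert fill values to NA.
# `fill` lists the raw strings that mean "missing" for a given field.
# ASSUMPTION: the defaults below are just the two examples from the text;
# real fields have different widths, so choose the fill string per field.
fill_to_na <- function(x, fill = c("999", "+99999")) {
  x[x %in% fill] <- NA_character_
  x
}

unsigned_item <- c("180", "999", "270")           # synthetic non-signed values
signed_item   <- c("+00123", "+99999", "-00045")  # synthetic signed values

fill_to_na(unsigned_item)
fill_to_na(signed_item)
```

Values are compared as character strings so that a legitimate reading is never confused with a fill value of a different width.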

Longitude and Latitude Coordinates

Longitudes will be reported with negative values representing longitudes west of 0 degrees, and latitudes will be negative south of the equator. Although the data field allows for values to a thousandth of a degree, the values are often only computed to the hundredth of a degree with a 0 entered in the thousandth position.
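
As an illustration of the sign convention and precision described above, the sketch below converts raw coordinate strings to decimal degrees. It assumes the raw values are signed integers in thousandths of a degree; to_degrees is a hypothetical helper and the coordinates are synthetic.

```r
# ASSUMPTION: raw coordinates are signed integer strings in thousandths of a
# degree, so dividing by 1000 gives decimal degrees. Values are synthetic.
raw_latitude  <- "+51320"   # hundredth precision, 0 in the thousandth position
raw_longitude <- "-000950"

to_degrees <- function(x) as.numeric(x) / 1000

to_degrees(raw_latitude)   # positive: north of the equator
to_degrees(raw_longitude)  # negative: west of 0 degrees
```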

Author(s)

Scott Chamberlain <myrmecocystus@gmail.com>


NOAA ISD metadata data.frame

Description

This data.frame includes metadata describing all the data provided in ISD data files, and is used for transforming and scaling variables.

Format

A data frame with 643 rows and 19 columns

Details

Original csv data is in inst/extdata/isd_metadata.csv, collected from

The data.frame has the following columns:


Parse NOAA ISD/ISH data files

Description

Parse NOAA ISD/ISH data files

Usage

isd_parse(
  path,
  additional = TRUE,
  parallel = FALSE,
  cores = getOption("cl.cores", 2),
  progress = FALSE
)

Arguments

path

(character) file path. required

additional

(logical) include additional and remarks data sections in output. Default: TRUE

parallel

(logical) do processing in parallel. Default: FALSE

cores

(integer) number of cores to use. Default: 2. We look in your option "cl.cores", but use the default value if it is not found.

progress

(logical) print progress; ignored if parallel = TRUE. The default is FALSE because printing progress adds a small amount of time, so keep it FALSE if processing time is important.
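
The default for cores in the usage above is getOption("cl.cores", 2). The base R sketch below shows how that lookup behaves with and without the option set; the value 4 is arbitrary and no isdparser function is called.

```r
# getOption() returns its second argument when the option is unset
old <- options(cl.cores = NULL)            # make sure the option is unset
default_cores <- getOption("cl.cores", 2)

options(cl.cores = 4)                      # an arbitrary session-wide value
session_cores <- getOption("cl.cores", 2)

options(old)                               # restore the previous setting
c(default = default_cores, session = session_cores)
```

Setting the option once per session avoids passing cores on every call that uses parallel = TRUE.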

Value

A tibble (data.frame)

References

ftp://ftp.ncdc.noaa.gov/pub/data/noaa

See Also

isd_parse_line

Examples

path <- system.file('extdata/104270-99999-1928.gz', package = "isdparser")
(res <- isd_parse(path))

# with progress
(res2 <- isd_parse(path, progress = TRUE))

# only control + mandatory sections
(res <- isd_parse(path, additional = FALSE))

## Not run: 
# in parallel
(out <- isd_parse(path, parallel = TRUE))

## End(Not run)

Parse NOAA ISD/ISH csv data files

Description

Parse NOAA ISD/ISH csv data files

Usage

isd_parse_csv(path)

Arguments

path

(character) file path. required

Details

Note that the 'rem' (remarks) and 'eqd' columns are not parsed, just as with isd_parse.

Value

A tibble (data.frame)

Column information

- USAF MASTER and NCEI WBAN station identifiers are combined into an 11 character code with the column 'station'
- Date and Time have been combined to the column 'date'
- Call letter is synonymous with 'call_sign' column
- WIND-OBSERVATION is abbreviated as column 'wnd'
- SKY-CONDITION-OBSERVATION is abbreviated as column 'cig'
- VISIBILITY-OBSERVATION is abbreviated as column 'vis'
- AIR-TEMPERATURE-OBSERVATION air temperature is abbreviated as the column 'tmp'
- AIR-TEMPERATURE-OBSERVATION dew point is abbreviated as the column 'dew'
- AIR-PRESSURE-OBSERVATION sea level pressure is abbreviated as the column 'slp'
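
As a small illustration of the 11 character 'station' code, the sketch below combines the two identifiers that appear in this manual's example file names (007026-99999-2017.gz and 00702699999.csv). The 6 + 5 character split used for the reverse direction is an assumption inferred from those file names.

```r
# Station identifier pieces, taken from the example file names in this manual
usaf <- "007026"
wban <- "99999"

station <- paste0(usaf, wban)
station
nchar(station)

# Going the other way
# ASSUMPTION: a 6 + 5 character split, which matches the example file names
substr(station, 1, 6)
substr(station, 7, 11)
```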

References

https://www.ncei.noaa.gov/data/global-hourly/access/
https://www.ncei.noaa.gov/data/global-hourly/doc/CSV_HELP.pdf
https://www.ncei.noaa.gov/data/global-hourly/doc/isd-format-document.pdf

Examples

path <- system.file('extdata/00702699999.csv', package = "isdparser")
(res <- isd_parse_csv(path))

# isd_parse_csv compared to isd_parse
if (interactive()) {
  x="https://www.ncei.noaa.gov/data/global-hourly/access/2017/00702699999.csv"
  download.file(x, (f_csv=file.path(tempdir(), "00702699999.csv")))
  y="ftp://ftp.ncdc.noaa.gov/pub/data/noaa/2017/007026-99999-2017.gz"
  download.file(y, (f_gz=file.path(tempdir(), "007026-99999-2017.gz")))
  from_csv <- isd_parse_csv(f_csv)
  from_gz <- isd_parse(f_gz, parallel = TRUE)

  x="https://www.ncei.noaa.gov/data/global-hourly/access/1913/02982099999.csv"
  download.file(x, (f=file.path(tempdir(), "02982099999.csv")))
  isd_parse_csv(f)

  x="https://www.ncei.noaa.gov/data/global-hourly/access/1923/02970099999.csv"
  download.file(x, (f=file.path(tempdir(), "02970099999.csv")))
  isd_parse_csv(f)

  x="https://www.ncei.noaa.gov/data/global-hourly/access/1945/04390099999.csv"
  download.file(x, (f=file.path(tempdir(), "04390099999.csv")))
  isd_parse_csv(f)

  x="https://www.ncei.noaa.gov/data/global-hourly/access/1976/02836099999.csv"
  download.file(x, (f=file.path(tempdir(), "02836099999.csv")))
  isd_parse_csv(f)
}

Parse NOAA ISD/ISH data files - line by line

Description

Parse NOAA ISD/ISH data files - line by line

Usage

isd_parse_line(x, additional = TRUE, as_data_frame = TRUE)

Arguments

x

(character) a single ISD line

additional

(logical) include additional and remarks data sections in output. Default: TRUE

as_data_frame

(logical) output a tibble. Default: TRUE

Value

A tibble (data.frame), or a list if as_data_frame = FALSE

References

ftp://ftp.ncdc.noaa.gov/pub/data/noaa

See Also

isd_parse

Examples

path <- system.file('extdata/024130-99999-2016.gz', package = "isdparser")
lns <- readLines(path, encoding = "latin1")
isd_parse_line(lns[1])
isd_parse_line(lns[1], FALSE)

res <- lapply(lns[1:1000], isd_parse_line)
library("data.table")
library("tibble")
as_tibble(
  rbindlist(res, use.names = TRUE, fill = TRUE)
)

# only control + mandatory sections
isd_parse_line(lns[10], additional = FALSE)
isd_parse_line(lns[10], additional = TRUE)

Transform ISD data variables

Description

Transform ISD data variables

Usage

isd_transform(x)

Arguments

x

(data.frame/tbl_df or list) a data.frame/tbl from isd_parse, or a data.frame/tbl or list from isd_parse_line

Details

This function helps you clean your ISD data. isd_parse and isd_parse_line give back data without modifying the data. However, you'll likely want to transform some of the variables, in terms of the variable class (character to numeric), accounting for the scaling factor (variable X may need to be multiplied by 1000 according to the ISD docs), and missing values (unfortunately, missing value standards vary across ISD data).
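
To make the three kinds of operation concrete (missing values, variable class, scaling), here is a minimal base R sketch on a synthetic column. It is not the implementation of isd_transform. clean_column is a hypothetical helper, and the fill value and scaling factor used for the temperature example are assumptions that should be checked against the ISD format document; the direction and size of the scaling depend on the variable.

```r
# Hypothetical column cleaner: fill values to NA, character to numeric,
# then scaling.
# ASSUMPTIONS: `fill` and `scale` are illustrative arguments; look up the
# real fill value and scaling factor for each variable in the ISD docs.
clean_column <- function(x, fill, scale = 1) {
  x[x %in% fill] <- NA_character_
  as.numeric(x) / scale
}

raw <- data.frame(
  temperature = c("+0123", "+9999", "-0045"),  # synthetic values
  stringsAsFactors = FALSE
)
raw$temperature_clean <- clean_column(raw$temperature, fill = "+9999", scale = 10)
raw
```

In practice isd_transform applies the appropriate operations for you; the sketch only shows the kind of work involved.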

Value

A tibble (data.frame) or list

operations performed

See Also

isd_parse, isd_parse_line

Examples

path <- system.file('extdata/104270-99999-1928.gz', package = "isdparser")
(res <- isd_parse(path))
isd_transform(res)

lns <- readLines(path, encoding = "latin1")
# data.frame
(res <- isd_parse_line(lns[1]))
isd_transform(res)
# list
(res <- isd_parse_line(lns[1], as_data_frame = FALSE))
isd_transform(res)
