
Schema class

Source: R/schema.R
Schema-class.Rd

A Schema is an Arrow object containing Fields, which map names to Arrow data types. Create a Schema when you want to convert an R data.frame to Arrow but don't want to rely on the default mapping of R types to Arrow types, such as when you want to choose a specific numeric precision, or when creating a Dataset and you want to ensure a specific schema rather than inferring it from the various files.

Many Arrow objects, including Table and Dataset, have a $schema method (active binding) that lets you access their schema.
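As a minimal sketch of the two paragraphs above, assuming the arrow package is installed, an explicit Schema can override the default type mapping and then be read back through $schema. The data.frame and its column names (id, score) are made up for illustration.

```r
library(arrow)

# Hypothetical example data; the column names are illustrative only
df <- data.frame(id = 1:3, score = c(0.5, 1.5, 2.5))

# Default mapping: R integer becomes int32 and R double becomes double (float64)
default_tab <- arrow_table(df)
default_tab$schema

# Explicit schema: widen id to int64 and store score as 32-bit floats
sch <- schema(id = int64(), score = float32())
tab <- arrow_table(df, schema = sch)

# $schema is an active binding, so no parentheses are needed
tab$schema
```

Passing the schema at conversion time avoids a separate cast step after the Table has been built.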

Methods

  • $ToString(): convert to a string

  • $field(i): returns the field at index i (0-based)

  • $GetFieldByName(x): returns the field with name x

  • $WithMetadata(metadata): returns a new Schema with the key-value metadata set. Note that all list elements in metadata will be coerced to character.

  • $code(namespace): returns the R code needed to generate this schema. Use namespace=TRUE to call with arrow::.
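The methods listed above might be exercised as in the following sketch. The schema and the metadata keys (source, version) are hypothetical and chosen only to show the coercion to character.

```r
library(arrow)

# Hypothetical two-field schema used only for illustration
sch <- schema(id = int32(), name = utf8())

# Human-readable description of the schema
cat(sch$ToString(), "\n")

# Field lookup: by 0-based index, or by name
first_field <- sch$field(0)
name_field <- sch$GetFieldByName("name")
first_field$name
name_field$type

# $WithMetadata() returns a new Schema and leaves sch unchanged;
# the numeric value 2 is coerced to character
with_md <- sch$WithMetadata(list(source = "example", version = 2))
with_md$metadata$version

# R code that would recreate the schema
sch$code()
```

Note that $field() counts from 0, unlike ordinary R indexing, so sch$field(0) is the first field.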

Active bindings

  • $names: returns the field names (called in names(Schema))

  • $num_fields: returns the number of fields (called in length(Schema))

  • $fields: returns the list of Fields in the Schema, suitable for iterating over

  • $HasMetadata: logical: does this Schema have extra metadata?

  • $metadata: returns the key-value metadata as a named list. Modify or replace by assigning in (sch$metadata <- new_metadata). All list elements are coerced to string.
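A short sketch of the active bindings above, again on a hypothetical schema. The metadata keys (owner, revision) are assumptions made for the example.

```r
library(arrow)

# Hypothetical schema used only for illustration
sch <- schema(id = int32(), name = utf8())

# Field names and count, via the bindings or the equivalent base R generics
sch$names
names(sch)
sch$num_fields
length(sch)

# Iterate over the Field objects
for (f in sch$fields) {
  cat(f$name, ":", f$type$ToString(), "\n")
}

# No metadata has been attached yet
had_metadata <- sch$HasMetadata

# Assigning into $metadata modifies the schema in place;
# the numeric value 3 is stored as a string
sch$metadata <- list(owner = "analytics", revision = 3)
sch$metadata$revision
```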

R Metadata

When converting a data.frame to an Arrow Table or RecordBatch, attributes from the data.frame are saved alongside tables so that the object can be reconstructed faithfully in R (e.g. with as.data.frame()). This metadata can be at the top level of the data.frame (e.g. attributes(df)), at the column level (e.g. attributes(df$col_a)), or, for list columns only, at the element level (e.g. attributes(df[1, "col_a"])). For example, this allows for storing haven columns in a table and being able to faithfully re-create them when pulled back into R. This metadata is separate from the schema (column names and types), which is compatible with other Arrow clients. The R metadata is only read by R and is ignored by other clients (e.g. Pandas has its own custom metadata). This metadata is stored in $metadata$r.
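The round trip described above can be sketched as follows, assuming a plain data.frame with made-up attributes (note, units). The exact contents of the serialized metadata are an implementation detail and are not shown.

```r
library(arrow)

# Hypothetical data.frame with a top-level and a column-level attribute
df <- data.frame(col_a = 1:3)
attr(df, "note") <- "top-level attribute"
attr(df$col_a, "units") <- "cm"

# Converting to a Table stores the R attributes under the "r" metadata key
tab <- arrow_table(df)
names(tab$schema$metadata)

# Converting back to R restores the attributes
back <- as.data.frame(tab)
attr(back, "note")
attr(back$col_a, "units")
```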

Since Schema metadata keys and values must be strings, this metadata is saved by serializing R's attribute list structure to a string. Starting in version 3.0.0, serialized metadata that exceeds 100Kb in size is compressed by default. To disable this compression (e.g. for tables that must remain compatible with Arrow versions before 3.0.0 and that include large amounts of metadata), set the option arrow.compress_metadata to FALSE. Files with compressed metadata are readable by older versions of arrow, but the metadata is dropped.
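As a small sketch, the option can be set for the duration of a conversion and then restored. The data.frame here is hypothetical and far below the 100Kb threshold, so it only shows where the option is set, not a visible difference in the stored metadata.

```r
library(arrow)

# Turn off compression of the serialized R metadata, keeping the old setting
old_opts <- options(arrow.compress_metadata = FALSE)

# Hypothetical data.frame with a top-level attribute that ends up in $metadata$r
df <- data.frame(x = 1:3)
attr(df, "note") <- "kept uncompressed"
tab <- arrow_table(df)

# Restore the previous setting
options(old_opts)
```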

