FileSystem is an abstract file system API. LocalFileSystem is an implementation accessing files on the local machine. SubTreeFileSystem is an implementation that delegates to another implementation after prepending a fixed base path.
Factory
LocalFileSystem$create() returns the object and takes no arguments.
SubTreeFileSystem$create() takes the following arguments:
base_path, a string path
base_fs, a FileSystem object
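As a minimal sketch of the two factories above, the following wraps a LocalFileSystem in a SubTreeFileSystem so that paths are resolved relative to a base directory. The temporary directory and the file name greeting.txt are illustrative assumptions, not part of the API.

```r
library(arrow)

# A throwaway directory stands in for real data (hypothetical location).
base_dir <- tempfile("arrow-fs-")
dir.create(base_dir)
writeLines("hello", file.path(base_dir, "greeting.txt"))

# LocalFileSystem$create() takes no arguments.
local_fs <- LocalFileSystem$create()

# Paths given to sub_fs are resolved relative to base_dir.
sub_fs <- SubTreeFileSystem$create(base_path = base_dir, base_fs = local_fs)

# Look up the file through the sub-tree view; GetFileInfo() returns a list.
info <- sub_fs$GetFileInfo("greeting.txt")[[1]]
info$size
```

Code that receives sub_fs does not need to know where base_dir lives, which can be convenient when the same code should run against different roots.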
S3FileSystem$create() optionally takes arguments:
anonymous: logical, default FALSE. If true, will not attempt to look up credentials using standard AWS configuration methods.
access_key, secret_key: authentication credentials. If one is provided, the other must be as well. If both are provided, they will override any AWS configuration set at the environment level.
session_token: optional string for authentication along with access_key and secret_key.
role_arn: string AWS ARN of an AccessRole. If provided instead of access_key and secret_key, temporary credentials will be fetched by assuming this role.
session_name: optional string identifier for the assumed role session.
external_id: optional unique string identifier that might be required when you assume a role in another account.
load_frequency: integer, frequency (in seconds) with which temporary credentials from an assumed role session will be refreshed. Default is 900 (i.e. 15 minutes).
region: AWS region to connect to. If omitted, the AWS library will provide a sensible default based on client configuration, falling back to "us-east-1" if no other alternatives are found.
endpoint_override: If non-empty, override region with a connect string such as "localhost:9000". This is useful for connecting to file systems that emulate S3.
scheme: S3 connection transport (default "https").
proxy_options: optional string, URI of a proxy to use when connecting to S3.
background_writes: logical, whether OutputStream writes will be issued in the background, without blocking (default TRUE).
allow_bucket_creation: logical, if TRUE, the filesystem will create buckets if $CreateDir() is called on the bucket level (default FALSE).
allow_bucket_deletion: logical, if TRUE, the filesystem will delete buckets if $DeleteDir() is called on the bucket level (default FALSE).
check_directory_existence_before_creation: logical, whether to check if a directory already exists before creating it. Helpful for cloud storage operations where object mutation operations are rate limited or existing directories are read-only (default FALSE).
request_timeout: Socket read timeout on Windows and macOS, in seconds. If negative, the AWS SDK default is used (typically 3 seconds).
connect_timeout: Socket connection timeout in seconds. If negative, the AWS SDK default is used (typically 1 second).
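The arguments above might be combined as in the sketch below. It assumes an arrow build with S3 support (checked with arrow_with_s3()). The region, endpoint, and credentials are placeholders, not real values.

```r
library(arrow)

if (arrow_with_s3()) {
  # Anonymous access, e.g. for reading a public bucket.
  # The region is a placeholder; use the region of the bucket you read from.
  public_fs <- S3FileSystem$create(anonymous = TRUE, region = "us-east-1")

  # An S3-compatible service running locally.
  # Endpoint and credentials below are hypothetical.
  emulated_fs <- S3FileSystem$create(
    access_key = "example-access-key",
    secret_key = "example-secret-key",
    region = "us-east-1",
    scheme = "http",
    endpoint_override = "localhost:9000"
  )
}
```

Passing region explicitly avoids relying on the AWS library's default lookup. Constructing the objects as shown is not expected to verify that the endpoint is reachable, so connection problems would typically surface on the first read or write.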
GcsFileSystem$create() optionally takes arguments:
anonymous: logical, default FALSE. If true, will not attempt to look up credentials using standard GCS configuration methods.
access_token: optional string for authentication. Should be provided along with expiration.
expiration: POSIXct. Optional datetime representing the point at which access_token will expire.
json_credentials: optional string for authentication. Either a string containing JSON credentials or a path to their location on the filesystem. If a path to credentials is given, the file should be UTF-8 encoded.
endpoint_override: if non-empty, will connect to the provided host name / port, such as "localhost:9001", instead of the default GCS ones. This is primarily useful for testing purposes.
scheme: connection transport (default "https").
default_bucket_location: the default location (or "region") to create new buckets in.
retry_limit_seconds: the maximum amount of time to spend retrying if the filesystem encounters errors. Default is 15 seconds.
default_metadata: default metadata to write in new objects.
project_id: the project to use for creating buckets.
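A short sketch of two ways of constructing a GcsFileSystem, assuming an arrow build with GCS support (checked with arrow_with_gcs()). The token string is a placeholder, and the one-hour expiration is an arbitrary choice for illustration.

```r
library(arrow)

if (arrow_with_gcs()) {
  # Anonymous access, with a shorter retry budget than the 15 second default.
  anon_fs <- GcsFileSystem$create(anonymous = TRUE, retry_limit_seconds = 5)

  # Token-based access: access_token is supplied together with expiration,
  # which must be a POSIXct value. The token here is a placeholder.
  token_fs <- GcsFileSystem$create(
    access_token = "placeholder-token",
    expiration = Sys.time() + 3600
  )
}
```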
Methods
path(x): Create a SubTreeFileSystem from the current FileSystem rooted at the specified path x.
cd(x): Create a SubTreeFileSystem from the current FileSystem rooted at the specified path x.
ls(path, ...): List files or objects at the given path or from the root of the FileSystem if path is not provided. Additional arguments passed to FileSelector$create, see FileSelector.
$GetFileInfo(x): x may be a FileSelector or a character vector of paths. Returns a list of FileInfo.
$CreateDir(path, recursive = TRUE): Create a directory and subdirectories.
$DeleteDir(path): Delete a directory and its contents, recursively.
$DeleteDirContents(path): Delete a directory's contents, recursively. Like $DeleteDir(), but doesn't delete the directory itself. Passing an empty path ("") will wipe the entire filesystem tree.
$DeleteFile(path): Delete a file.
$DeleteFiles(paths): Delete many files. The default implementation issues individual delete operations in sequence.
$Move(src, dest): Move / rename a file or directory. If the destination exists:
    if it is a non-empty directory, an error is returned
    otherwise, if it has the same type as the source, it is replaced
    otherwise, behavior is unspecified (implementation-dependent).
$CopyFile(src, dest): Copy a file. If the destination exists and is a directory, an error is returned. Otherwise, it is replaced.
$OpenInputStream(path): Open an input stream for sequential reading.
$OpenInputFile(path): Open an input file for random access reading.
$OpenOutputStream(path): Open an output stream for sequential writing.
$OpenAppendStream(path): Open an output stream for appending.
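Several of the methods above can be exercised against a temporary local directory, as in the following sketch. The directory layout and file names are made up for illustration. The sub-tree wrapper is used only so that the calls cannot touch anything outside the temporary root.

```r
library(arrow)

# Confine all operations to a throwaway root (hypothetical location).
root <- tempfile("arrow-fs-")
dir.create(root)
fs <- SubTreeFileSystem$create(root, LocalFileSystem$create())

# recursive = TRUE is the default, so intermediate directories are created.
fs$CreateDir("data/raw")

# Write a few bytes through an output stream, then close it.
out <- fs$OpenOutputStream("data/raw/a.txt")
out$write(charToRaw("some bytes"))
out$close()

# Copy the file, then move the copy up one level.
fs$CopyFile("data/raw/a.txt", "data/raw/b.txt")
fs$Move("data/raw/b.txt", "data/c.txt")

# Extra arguments to ls() are passed on to FileSelector$create().
files <- fs$ls("data", recursive = TRUE)

# Clean up: one file first, then the directory and its remaining contents.
fs$DeleteFile("data/c.txt")
fs$DeleteDir("data")
```

Since $DeleteDirContents("") wipes the entire tree, working through a SubTreeFileSystem rooted at a scratch directory, as here, can limit the damage from a mistaken path.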
Active bindings
$type_name: string filesystem type name, such as "local", "s3", etc.
$region: string AWS region, for S3FileSystem and SubTreeFileSystem containing a S3FileSystem
$base_fs: for SubTreeFileSystem, the FileSystem it contains
$base_path: for SubTreeFileSystem, the path in $base_fs which is considered root in this SubTreeFileSystem.
$options: for GcsFileSystem, the options used to create the GcsFileSystem instance as a list
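The bindings that apply to local and sub-tree filesystems can be inspected as in this sketch. Using tempdir() as the base path is an arbitrary choice for illustration.

```r
library(arrow)

local_fs <- LocalFileSystem$create()
sub_fs <- SubTreeFileSystem$create(tempdir(), local_fs)

# Type name of the local filesystem.
local_type <- local_fs$type_name

# The wrapped filesystem and the path treated as root inside it.
wrapped_type <- sub_fs$base_fs$type_name
sub_root <- sub_fs$base_path
```

The $region and $options bindings are only meaningful for S3-backed and GCS filesystems respectively, so they are left out of this local example.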
Notes
On S3FileSystem, $CreateDir() on a top-level directory creates a new bucket. When S3FileSystem creates new buckets (assuming allow_bucket_creation is TRUE), it does not pass any non-default settings. In AWS S3, the bucket and all objects will not be publicly visible, and will have no bucket policies and no resource tags. To have more control over how buckets are created, use a different API to create them.
On S3FileSystem, output is only produced for fatal errors or when printing return values. For troubleshooting, the log level can be set using the environment variable ARROW_S3_LOG_LEVEL (e.g., Sys.setenv("ARROW_S3_LOG_LEVEL" = "DEBUG")). The log level must be set prior to running any code that interacts with S3. Possible values include 'FATAL' (the default), 'ERROR', 'WARN', 'INFO', 'DEBUG' (recommended), 'TRACE', and 'OFF'.