Title: DBI Package for the DuckDB Database Management System
Version: 1.4.3
Description: The DuckDB project is an embedded analytical data management system with support for the Structured Query Language (SQL). This package includes all of DuckDB and an R Database Interface (DBI) connector.
License: MIT + file LICENSE
URL: https://r.duckdb.org/, https://github.com/duckdb/duckdb-r
BugReports: https://github.com/duckdb/duckdb-r/issues
Depends: DBI, R (≥ 4.1.0)
Imports: methods, utils
Suggests: adbcdrivermanager, arrow (≥ 13.0.0), bit64, callr, clock, DBItest, dbplyr, dplyr, rlang, testthat, tibble, vctrs, withr
Config/build/compilation-database: false
Config/build/never-clean: true
Config/comment/compilation-database: Generate manually with pkgload:::generate_db() for faster pkgload::load_all()
Config/gha/extra-packages: arrow=?ignore-before-r=4.2.0 adbcdrivermanager=?ignore-before-r=4.2.0
Config/gha/filter: os != "windows-latest" | r != "4.1"
Config/gha/filter-note: Inexplicable build failures on Windows GHA with R 4.1, works locally
Encoding: UTF-8
RoxygenNote: 7.3.3.9000
SystemRequirements: xz (for building from source)
Biarch: true
NeedsCompilation: yes
Packaged: 2025-12-09 16:11:03 UTC; kirill
Author: Hannes Mühleisen (ORCID iD) [aut], Mark Raasveldt (ORCID iD) [aut], Kirill Müller (ORCID iD) [cre], Stichting DuckDB Foundation [cph], Apache Software Foundation [cph], PostgreSQL Global Development Group [cph], The Regents of the University of California [cph], Cameron Desrochers [cph], Victor Zverovich [cph], RAD Game Tools [cph], Valve Software [cph], Rich Geldreich [cph], Tenacious Software LLC [cph], The RE2 Authors [cph], Google Inc. [cph], Facebook Inc. [cph], Steven G. Johnson [cph], Jiahao Chen [cph], Tony Kelman [cph], Jonas Fonseca [cph], Lukas Fittl [cph], Salvatore Sanfilippo [cph], Art.sy, Inc. [cph], Oran Agra [cph], Redis Labs, Inc. [cph], Melissa O'Neill [cph], PCG Project contributors [cph]
Maintainer: Kirill Müller <kirill@cynkra.com>
Repository: CRAN
Date/Publication: 2025-12-10 06:10:02 UTC

duckdb: DBI Package for the DuckDB Database Management System

Description

The DuckDB project is an embedded analytical data management system with support for the Structured Query Language (SQL). This package includes all of DuckDB and an R Database Interface (DBI) connector.

Author(s)

Maintainer: Kirill Müller <kirill@cynkra.com> (ORCID)

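Authors:

  • Hannes Mühleisen (ORCID)

  • Mark Raasveldt (ORCID)

Other contributors:

  • Stichting DuckDB Foundation [copyright holder]

  • Apache Software Foundation [copyright holder]

  • PostgreSQL Global Development Group [copyright holder]

  • The Regents of the University of California [copyright holder]

  • Cameron Desrochers [copyright holder]

  • Victor Zverovich [copyright holder]

  • RAD Game Tools [copyright holder]

  • Valve Software [copyright holder]

  • Rich Geldreich [copyright holder]

  • Tenacious Software LLC [copyright holder]

  • The RE2 Authors [copyright holder]

  • Google Inc. [copyright holder]

  • Facebook Inc. [copyright holder]

  • Steven G. Johnson [copyright holder]

  • Jiahao Chen [copyright holder]

  • Tony Kelman [copyright holder]

  • Jonas Fonseca [copyright holder]

  • Lukas Fittl [copyright holder]

  • Salvatore Sanfilippo [copyright holder]

  • Art.sy, Inc. [copyright holder]

  • Oran Agra [copyright holder]

  • Redis Labs, Inc. [copyright holder]

  • Melissa O'Neill [copyright holder]

  • PCG Project contributors [copyright holder]

See Also

Useful links:

  • https://r.duckdb.org/

  • https://github.com/duckdb/duckdb-r

  • Report bugs at https://github.com/duckdb/duckdb-r/issues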


DuckDB SQL backend for dbplyr

Description

This is a SQL backend for dbplyr tailored to DuckDB's capabilities. It mainly follows the backend for PostgreSQL, but maps more functions.

tbl_file() is an experimental variant of dplyr::tbl() to directly access files on disk. It is safer than dplyr::tbl() because there is no risk of misinterpreting the request, and paths with special characters are supported.

tbl_function() is an experimental variant of dplyr::tbl() to create a lazy table from a table-generating function, useful for reading nonstandard CSV files or other data sources. It is safer than dplyr::tbl() because there is no risk of misinterpreting the query. See https://duckdb.org/docs/data/overview for details on data importing functions.

As an alternative, use dplyr::tbl(src, dplyr::sql("SELECT ... FROM ...")) for custom SQL queries.

tbl_query() is deprecated in favor of tbl_function().

Use simulate_duckdb() with lazy_frame() to see simulated SQL without opening a DuckDB connection.

Usage

tbl_file(src = NULL, path, ..., cache = FALSE)

tbl_function(src, query, ..., cache = FALSE)

tbl_query(src, query, ...)

simulate_duckdb(...)

Arguments

src

A duckdb connection object, default_conn() if omitted.

path

Path to existing Parquet, CSV or JSON file

...

Any parameters to be forwarded

cache

Enable object cache for Parquet files

query

SQL code, omitting the FROM clause

Examples

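library(dplyr, warn.conflicts = FALSE)
library(duckdb)

# In-memory database; the argument is called dbdir, not path
con <- DBI::dbConnect(duckdb(), dbdir = ":memory:")

db <- copy_to(con, data.frame(a = 1:3, b = letters[2:4]))

db %>%
  filter(a > 1) %>%
  select(b)

# Write a CSV file to disk so that tbl_file() has something to read
path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = letters[2:4]), path, row.names = FALSE)

db_csv <- tbl_file(con, path)
db_csv %>%
  summarize(sum_a = sum(a))

# Same file, accessed through a table-generating function
db_csv_fun <- tbl_function(con, paste0("read_csv_auto('", path, "')"))
db_csv_fun %>%
  count()

DBI::dbDisconnect(con, shutdown = TRUE)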
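
The note on simulate_duckdb() in the Description can be illustrated with a minimal sketch. It assumes the dbplyr and dplyr packages are installed; the column names and values in the lazy frame are made up for illustration, and the exact SQL text produced may differ between dbplyr versions.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
library(duckdb)

# A lazy frame backed by a simulated DuckDB connection: no database is opened
lf <- lazy_frame(a = 1:3, b = c("x", "y", "z"), con = simulate_duckdb())

query <- lf %>%
  filter(a > 1) %>%
  summarize(total = sum(a, na.rm = TRUE))

# Print the SQL that dbplyr would send to DuckDB
show_query(query)

# The same SQL as a plain character string, e.g. for logging
sql_text <- as.character(sql_render(query))
```

Because nothing is executed, this is a cheap way to check how a dplyr pipeline is translated before running it against a real connection.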

Get the default connection

Description

[Experimental]

default_conn() returns a default, built-in connection.

Usage

default_conn()

Details

Currently, the connection is established with duckdb(environment_scan = TRUE) and dbConnect(timezone_out = "", array = "matrix") so that data frames are automatically available as tables, timestamps are returned in the local timezone, and DuckDB's array type is returned as an R matrix. The details of how the connection is established are subject to change. In particular, returning the output as a tibble or other object may be supported in the future.

This connection is intended for interactive use. There is no way for this or other packages to comprehensively track the state of this connection, so scripts and packages should manage their own connections.

Value

A DuckDB connection object

Examples

conn <- default_conn()
sql_query("SELECT 42", conn = conn)

Connect to a DuckDB database instance

Description

duckdb() creates or reuses a database instance.

duckdb_shutdown() shuts down a database instance.

duckdb_adbc() returns an adbcdrivermanager::adbc_driver() for use with Arrow Database Connectivity via the adbcdrivermanager package.

dbConnect() connects to a database instance.

dbDisconnect() closes a DuckDB database connection. The associated DuckDB database instance is shut down automatically, so it is no longer necessary to set shutdown = TRUE or to call duckdb_shutdown().

Usage

duckdb(
  dbdir = DBDIR_MEMORY,
  read_only = FALSE,
  bigint = "numeric",
  config = list(),
  ...,
  environment_scan = FALSE
)

duckdb_shutdown(drv)

duckdb_adbc()

## S4 method for signature 'duckdb_driver'
dbConnect(
  drv,
  dbdir = DBDIR_MEMORY,
  ...,
  debug = getOption("duckdb.debug", FALSE),
  read_only = FALSE,
  timezone_out = "UTC",
  tz_out_convert = c("with", "force"),
  config = list(),
  bigint = "numeric",
  array = "none"
)

## S4 method for signature 'duckdb_connection'
dbDisconnect(conn, ..., shutdown = TRUE)

Arguments

dbdir

Location for database files. Should be a path to an existing directory in the file system. With the default (or ""), all data is kept in RAM.

read_only

Set to TRUE for read-only operation. For file-based databases, this is only applied when the database file is opened for the first time. Subsequent connections (via the same drv object or a drv object pointing to the same path) will silently ignore this flag.

bigint

How 64-bit integers should be returned. There are two options: "numeric" and "integer64". If "numeric" is selected, bigint integers will be treated as double/numeric. If "integer64" is selected, bigint integers will be set to bit64 encoding.

config

Named list with DuckDB configuration flags, see https://duckdb.org/docs/configuration/overview#configuration-reference for the possible options. These flags are only applied when the database object is instantiated. Subsequent connections will silently ignore these flags.

...

Reserved for future extensions, must be empty.

environment_scan

Set to TRUE to treat data frames from the calling environment as tables. If a database table with the same name exists, it takes precedence. The default of this setting may change in a future version.

drv

Object returned by duckdb()

debug

Print additional debug information, such as queries.

timezone_out

The time zone returned to R, defaults to "UTC", which is currently the only timezone supported by duckdb. If you want to display datetime values in the local timezone, set to Sys.timezone() or "".

tz_out_convert

How to convert timestamp columns to the timezone specified in timezone_out. There are two options: "with" and "force". If "with" is chosen, the timestamp will be returned as it would appear in the specified time zone. If "force" is chosen, the timestamp will have the same clock time as the timestamp in the database, but with the new time zone.

array

How arrays should be returned. There are two options: "none" and "matrix". If "none" is selected, arrays are not returned. Instead, an error is generated. If "matrix" is selected, arrays are returned as a column matrix. Each array is one row in the matrix.

conn

A duckdb_connection object

shutdown

Unused. The database instance is shut down automatically.

Details

The behavior of tz_out_convert = "force" at DST transitions depends on how R handles translation from the underlying time representation to a human-readable format. If the timestamp is invalid in the target timezone, the resulting value may be NA or an adjusted time.

Value

duckdb() returns an object of class duckdb_driver.

dbDisconnect() and duckdb_shutdown() are called for their side effect.

duckdb_adbc() returns an object of class "adbc_driver".

dbConnect() returns an object of class duckdb_connection.

Examples

library(adbcdrivermanager)
with_adbc(db <- adbc_database_init(duckdb_adbc()), {
  as.data.frame(read_adbc(db, "SELECT 1 as one;"))
})

drv <- duckdb()
con <- dbConnect(drv)

dbGetQuery(con, "SELECT 'Hello, world!'")

dbDisconnect(con)
duckdb_shutdown(drv)

# Shorter:
con <- dbConnect(duckdb())
dbGetQuery(con, "SELECT 'Hello, world!'")
dbDisconnect(con, shutdown = TRUE)
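
The difference between the two tz_out_convert modes is easier to see side by side. The following is a sketch, not part of the package examples: the time zone "America/New_York" and the literal timestamp are arbitrary choices for illustration, and a January date is used to stay away from DST transitions.

```r
library(DBI)
library(duckdb)

# The same stored value is read back under both conversion modes
query <- "SELECT TIMESTAMP '2024-01-15 12:00:00' AS ts"

# "with": same instant, displayed in the requested time zone
con_with <- dbConnect(
  duckdb(),
  timezone_out = "America/New_York",
  tz_out_convert = "with"
)
ts_with <- dbGetQuery(con_with, query)$ts
dbDisconnect(con_with)

# "force": same clock time, reinterpreted in the requested time zone
con_force <- dbConnect(
  duckdb(),
  timezone_out = "America/New_York",
  tz_out_convert = "force"
)
ts_force <- dbGetQuery(con_force, query)$ts
dbDisconnect(con_force)

format(ts_with, "%Y-%m-%d %H:%M:%S %Z")
format(ts_force, "%Y-%m-%d %H:%M:%S %Z")

# The two results denote different instants, separated by the UTC offset
hours_apart <- as.numeric(difftime(ts_force, ts_with, units = "hours"))
```

With "with", the instant is preserved and only its display changes. With "force", the clock time is preserved and the instant shifts by the offset of the target time zone.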

DuckDB connection class

Description

Implements DBI::DBIConnection.

Usage

## S4 method for signature 'duckdb_connection'
dbAppendTable(conn, name, value, ..., row.names = NULL)

## S4 method for signature 'duckdb_connection'
dbBegin(conn, ...)

## S4 method for signature 'duckdb_connection'
dbCommit(conn, ...)

## S4 method for signature 'duckdb_connection'
dbDataType(dbObj, obj, ...)

## S4 method for signature 'duckdb_connection,ANY'
dbExistsTable(conn, name, ...)

## S4 method for signature 'duckdb_connection'
dbGetInfo(dbObj, ...)

## S4 method for signature 'duckdb_connection'
dbIsValid(dbObj, ...)

## S4 method for signature 'duckdb_connection,character'
dbListFields(conn, name, ...)

## S4 method for signature 'duckdb_connection'
dbListTables(conn, ...)

## S4 method for signature 'duckdb_connection,ANY'
dbQuoteIdentifier(conn, x, ...)

## S4 method for signature 'duckdb_connection'
dbQuoteLiteral(conn, x, ...)

## S4 method for signature 'duckdb_connection,character'
dbRemoveTable(conn, name, ..., fail_if_missing = TRUE)

## S4 method for signature 'duckdb_connection'
dbRollback(conn, ...)

## S4 method for signature 'duckdb_connection,character'
dbSendQuery(conn, statement, params = NULL, ..., arrow = FALSE)

## S4 method for signature 'duckdb_connection,character,data.frame'
dbWriteTable(
  conn,
  name,
  value,
  ...,
  row.names = FALSE,
  overwrite = FALSE,
  append = FALSE,
  field.types = NULL,
  temporary = FALSE
)

## S4 method for signature 'duckdb_connection'
show(object)

Arguments

conn

A duckdb_connection object as returned by DBI::dbConnect()

name

The table name, passed on to dbQuoteIdentifier(). Options are:

  • a character string with the unquoted DBMS table name, e.g. "table_name",

  • a call to Id() with components to the fully qualified table name, e.g. Id(schema = "my_schema", table = "table_name")

  • a call to SQL() with the quoted and fully qualified table name given verbatim, e.g. SQL('"my_schema"."table_name"')

value

A data.frame (or coercible to data.frame).

...

Other parameters passed on to methods.

row.names

Whether the row.names of the data.frame should be preserved

dbObj

An object inheriting from class duckdb_connection.

obj

An R object whose SQL type we want to determine.

statement

a character string containing SQL.

params

For dbBind(), a list of values, named or unnamed, or a data frame, with one element/column per query parameter. For dbBindArrow(), values as a nanoarrow stream, with one column per query parameter.

arrow

Whether the query should be returned as an Arrow Table

overwrite

If a table with the given name already exists, should it be overwritten?

append

If a table with the given name already exists, just try to append the passed data to it

field.types

Override the auto-generated SQL types

temporary

Should the created table be temporary?

object

Any R object
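
This help page has no Examples section, so here is a brief sketch of how several of the connection methods above fit together. The table name "tbl" and its contents are made up for illustration.

```r
library(DBI)
library(duckdb)

con <- dbConnect(duckdb())

# Create a table from a data frame, then append one more row
dbWriteTable(con, "tbl", data.frame(id = 1:2, label = c("a", "b")))
dbAppendTable(con, "tbl", data.frame(id = 3L, label = "c"))

dbExistsTable(con, "tbl")
dbListFields(con, "tbl")

# Changes made inside a transaction can be undone with dbRollback()
dbBegin(con)
dbExecute(con, "DELETE FROM tbl")
dbRollback(con)

# The DELETE was rolled back, so all rows are still present
n_rows <- dbGetQuery(con, "SELECT count(*) AS n FROM tbl")$n

dbRemoveTable(con, "tbl")
tables_left <- dbListTables(con)

dbDisconnect(con)
```

Replacing dbRollback() with dbCommit() would make the deletion permanent instead.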


DuckDB driver class

Description

Implements DBI::DBIDriver.

Usage

## S4 method for signature 'duckdb_driver'
dbDataType(dbObj, obj, ...)

## S4 method for signature 'duckdb_driver'
dbGetInfo(dbObj, ...)

## S4 method for signature 'duckdb_driver'
dbIsValid(dbObj, ...)

## S4 method for signature 'duckdb_driver'
show(object)

Arguments

dbObj

An object inheriting from class duckdb_driver.

...

Other arguments to methods.

object

Any R object


DuckDB EXPLAIN query tree

Description

DuckDB EXPLAIN query tree


Reads a CSV file into DuckDB

Description

Reads a CSV file directly into DuckDB and tries to detect and create the correct schema for it. This is usually much faster than reading the data into R and then writing it to DuckDB.

Usage

duckdb_read_csv(
  conn,
  name,
  files,
  ...,
  header = TRUE,
  na.strings = "",
  nrow.check = 500,
  delim = ",",
  quote = "\"",
  col.names = NULL,
  col.types = NULL,
  lower.case.names = FALSE,
  sep = delim,
  transaction = TRUE,
  temporary = FALSE
)

Arguments

conn

A DuckDB connection, created by dbConnect().

name

The name of the table to create or append to

files

One or more CSV file names. All files should have the same structure.

...

Reserved for future extensions, must be empty.

header

Whether or not the CSV files have a separate header in the first line

na.strings

Which strings in the CSV files should be considered to be NULL

nrow.check

How many rows should be read from the CSV file to figure out data types

delim

Which field separator should be used

quote

Which quote character is used for columns in the CSV file

col.names

Override the detected or generated column names

col.types

Character vector of column types in the same order as col.names, or a named character vector where names are column names and types pairs. Valid types are DuckDB data types, e.g. VARCHAR, DOUBLE, DATE, BIGINT, BOOLEAN, etc.

lower.case.names

Transform column names to lower case

sep

Alias for delim for compatibility

transaction

Should a transaction be used for the entire operation

temporary

Set to TRUE to create a temporary table

Details

If the table already exists in the database, the CSV is appended to it. Otherwise the table is created.

Value

The number of rows in the resulting table, invisibly.

Examples

con <- dbConnect(duckdb())

data <- data.frame(a = 1:3, b = letters[1:3])
path <- tempfile(fileext = ".csv")

write.csv(data, path, row.names = FALSE)

duckdb_read_csv(con, "data", path)
dbReadTable(con, "data")

dbDisconnect(con)

# Providing data types for columns
path <- tempfile(fileext = ".csv")
write.csv(iris, path, row.names = FALSE)

con <- dbConnect(duckdb())
duckdb_read_csv(con, "iris", path,
  col.types = c(
    Sepal.Length = "DOUBLE",
    Sepal.Width = "DOUBLE",
    Petal.Length = "DOUBLE",
    Petal.Width = "DOUBLE",
    Species = "VARCHAR"
  )
)
dbReadTable(con, "iris")
dbDisconnect(con)
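
The append behavior described under Details can be checked with a short sketch: reading the same file twice into the same table name should double the row count. The table name and file contents here are made up for illustration.

```r
library(DBI)
library(duckdb)

con <- dbConnect(duckdb())

path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = letters[1:3]), path, row.names = FALSE)

# First call creates the table, second call appends to it
duckdb_read_csv(con, "data", path)
duckdb_read_csv(con, "data", path)

n_rows <- dbGetQuery(con, "SELECT count(*) AS n FROM data")$n

dbDisconnect(con)
unlink(path)
```

If a fresh table is wanted on each run, one option is to call dbRemoveTable() first, guarded by dbExistsTable().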

Register a data frame as a virtual table

Description

duckdb_register() registers a data frame as a virtual table (view) in a DuckDB connection. No data is copied.

Usage

duckdb_register(conn, name, df, overwrite = FALSE, experimental = FALSE)

duckdb_unregister(conn, name)

Arguments

conn

A DuckDB connection, created by dbConnect().

name

The name for the virtual table that is registered or unregistered

df

A data.frame with the data for the virtual table

overwrite

Should an existing registration be overwritten?

experimental

Enable experimental optimizations

Details

duckdb_unregister() unregisters a previously registered data frame.

Value

These functions are called for their side effect.

Examples

con <- dbConnect(duckdb())

data <- data.frame(a = 1:3, b = letters[1:3])

duckdb_register(con, "data", data)
dbReadTable(con, "data")

duckdb_unregister(con, "data")

dbDisconnect(con)

Register an Arrow data source as a virtual table

Description

duckdb_register_arrow() registers an Arrow data source as a virtual table (view) in a DuckDB connection. No data is copied.

Usage

duckdb_register_arrow(conn, name, arrow_scannable, use_async = NULL)

duckdb_unregister_arrow(conn, name)

duckdb_list_arrow(conn)

Arguments

conn

A DuckDB connection, created by dbConnect().

name

The name for the virtual table that is registered or unregistered

arrow_scannable

A scannable Arrow-object

use_async

Switch to the asynchronous scanner (deprecated).

Details

duckdb_unregister_arrow() unregisters a previously registered Arrow data source.

Value

These functions are called for their side effect.
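
This help page has no examples. The sketch below assumes the arrow package is installed and uses an in-memory Arrow table as the scannable object. The names "arrow_view", id and value are made up for illustration, and the exact return value of duckdb_list_arrow() is not documented here, so it is only printed.

```r
library(DBI)
library(duckdb)

con <- dbConnect(duckdb())

# An in-memory Arrow table; other scannable Arrow objects should work similarly
arrow_tbl <- arrow::arrow_table(id = 1:3, value = c(1.5, 2.5, 3.5))

# Register the Arrow table as a view, without copying the data
duckdb_register_arrow(con, "arrow_view", arrow_tbl)

# Inspect what is currently registered
print(duckdb_list_arrow(con))

# Query the Arrow data with SQL
total <- dbGetQuery(con, "SELECT sum(value) AS total FROM arrow_view")$total

duckdb_unregister_arrow(con, "arrow_view")
dbDisconnect(con)
```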


DuckDB Result Set

Description

Methods for accessing result sets for queries on DuckDB connections. Implements DBI::DBIResult.

Usage

duckdb_fetch_arrow(res, chunk_size = 1e+06)

duckdb_fetch_record_batch(res, chunk_size = 1e+06)

## S4 method for signature 'duckdb_result'
dbBind(res, params, ...)

## S4 method for signature 'duckdb_result'
dbClearResult(res, ...)

## S4 method for signature 'duckdb_result'
dbColumnInfo(res, ...)

## S4 method for signature 'duckdb_result'
dbFetch(res, n = -1, ...)

## S4 method for signature 'duckdb_result'
dbGetInfo(dbObj, ...)

## S4 method for signature 'duckdb_result'
dbGetRowCount(res, ...)

## S4 method for signature 'duckdb_result'
dbGetRowsAffected(res, ...)

## S4 method for signature 'duckdb_result'
dbGetStatement(res, ...)

## S4 method for signature 'duckdb_result'
dbHasCompleted(res, ...)

## S4 method for signature 'duckdb_result'
dbIsValid(dbObj, ...)

## S4 method for signature 'duckdb_result'
show(object)

Arguments

res

Query result to be converted to a Record Batch Reader

chunk_size

The chunk size

params

For dbBind(), a list of values, named or unnamed, or a data frame, with one element/column per query parameter. For dbBindArrow(), values as a nanoarrow stream, with one column per query parameter.

...

Other arguments passed on to methods.

n

Maximum number of records to retrieve per fetch. Use n = -1 or n = Inf to retrieve all pending records. Some implementations may recognize other special values.

dbObj

An object inheriting from class duckdb_result.

object

Any R object
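
A typical life cycle of a result set is to send a parameterized query, bind values, fetch, and clear. The sketch below uses the built-in mtcars data set; the table name "cars" is made up for illustration.

```r
library(DBI)
library(duckdb)

con <- dbConnect(duckdb())
dbWriteTable(con, "cars", mtcars)

# Prepare a query with one positional parameter
res <- dbSendQuery(con, "SELECT count(*) AS n FROM cars WHERE cyl = ?")

# Bind a value and fetch all rows
dbBind(res, list(4))
four_cyl <- dbFetch(res)

# The same result object can be re-bound with a different value
dbBind(res, list(8))
eight_cyl <- dbFetch(res)

dbGetStatement(res)

# Always release the result set when done
dbClearResult(res)
dbDisconnect(con)
```

For one-off queries, DBI::dbGetQuery() with its params argument wraps these steps in a single call.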


Deprecated functions

Description

read_csv_duckdb() has been superseded by duckdb_read_csv(). The order of the arguments has changed.

Usage

read_csv_duckdb(conn, files, tablename, ...)

Run an SQL query or statement

Description

[Experimental]

sql_query() runs an arbitrary SQL query using DBI::dbGetQuery() and returns a data.frame with the query results. sql_exec() runs an arbitrary SQL statement using DBI::dbExecute() and returns the number of affected rows.

These functions are intended as an easy way to interactively run DuckDB without having to manage connections. By default, data frame objects are available as views.

Scripts and packages should manage their own connections and prefer the DBI methods for more control.

Usage

sql_query(sql, conn = default_conn())

sql_exec(sql, conn = default_conn())

Arguments

sql

A SQL string

conn

An optional connection, defaults to default_conn()

Value

sql_query() returns a data frame with the query result. sql_exec() returns the number of affected rows.

Examples

# Queries
sql_query("SELECT 42")

# Statements with side effects
sql_exec("CREATE TABLE test (a INTEGER, b VARCHAR)")
sql_exec("INSERT INTO test VALUES (1, 'one'), (2, 'two')")
sql_query("FROM test")

# Data frames available as views
sql_query("FROM mtcars")
