CRAN Task View: Databases with R
| Maintainer: | Yuan Tang, James Joseph Balamuta |
| Contact: | terrytangyuan at gmail.com |
| Version: | 2023-02-23 |
| URL: | https://CRAN.R-project.org/view=Databases |
| Source: | https://github.com/cran-task-views/Databases/ |
| Contributions: | Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide. |
| Citation: | Yuan Tang, James Joseph Balamuta (2023). CRAN Task View: Databases with R. Version 2023-02-23. URL https://CRAN.R-project.org/view=Databases. |
| Installation: | The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("Databases", coreOnly = TRUE) installs all the core packages or ctv::update.views("Databases") installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details. |
This CRAN task view lists packages for accessing different databases from R. It does not cover data import/export or data management. The task views on HighPerformanceComputing and MachineLearning may also provide useful information.
As datasets grow larger, it becomes impractical to store them in traditional file formats such as spreadsheets and raw text files, which may not fit on devices with limited storage and cannot be easily shared among collaborators. Instead, people now tend to store data in databases for more scalable and reliable data management.
Database systems are often classified by the database models they support. Relational databases became dominant in the 1980s. Data in a relational database is modeled as rows and columns in a series of tables, with SQL used to express the logic for writing and querying data. The tables are related to one another: for example, a user uses your software, and that software has creators and contributors. Non-relational databases, often referred to as NoSQL databases, have become popular in recent years because of the strong demand for storing unstructured data. Users generally don’t need to define the data schema up front, so when application requirements change, non-relational databases can be much easier to use and manage.
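As a rough illustration of the relational model described above, the following sketch stores users and software in separate tables, links them through a third table, and answers a question with a SQL join. It assumes the DBI and RSQLite packages are installed; the table and column names are hypothetical and chosen only for this example.

```r
library(DBI)

# In-memory SQLite database; nothing is written to disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical tables: users, software, and a link table relating them.
dbWriteTable(con, "users", data.frame(user_id = 1:2, name = c("ana", "ben")))
dbWriteTable(con, "software", data.frame(sw_id = 1:2, title = c("editor", "plotter")))
dbWriteTable(con, "user_software",
             data.frame(user_id = c(1L, 1L, 2L), sw_id = c(1L, 2L, 2L)))

# SQL expresses the relationship: which user uses which software?
uses <- dbGetQuery(con, "
  SELECT u.name, s.title
  FROM user_software AS us
  JOIN users AS u ON u.user_id = us.user_id
  JOIN software AS s ON s.sw_id = us.sw_id
  ORDER BY u.name, s.title
")
print(uses)

dbDisconnect(con)
```

The link table is what makes the many-to-many relationship explicit: each row pairs one user with one piece of software, and the join reassembles the readable names.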
The area covered by this task view is changing rapidly in both industry and academia. Please send any suggestions to the maintainer via e-mail or submit an issue or pull request in the GitHub repository linked above. All suggestions and corrections by others are gratefully acknowledged.
Relational databases
This section includes packages that provide access to relational databases within R.
- The DBI package provides a database interface definition for communication between R and relational database management systems. It’s worth noting that some packages try to follow this interface definition (DBI-compliant) but many existing packages don’t.
- The RODBC package provides access to databases through an ODBC interface. This package is maintained by the R Core Team and depends only on base R. See the alternative odbc package below.
- The odbc package provides a DBI-compliant interface to ODBC drivers. This package is maintained by RStudio and has a number of package dependencies. See the alternative RODBC package above.
- The RMariaDB package provides a DBI-compliant interface to MariaDB and MySQL.
- The RMySQL package provides an interface to MySQL. Note that this is the legacy DBI interface to MySQL and MariaDB, based on old code ported from S-PLUS. A modern MySQL client based on Rcpp is available from the RMariaDB package listed above.
- Packages for PostgreSQL, an open-source relational database:
  - The RPostgreSQL package and RPostgres package both provide fully DBI-compliant Rcpp-backed interfaces to PostgreSQL.
  - The rpostgis package provides the interface to its spatial extension PostGIS.
  - The RGreenplum package provides a fully DBI-compliant interface to Greenplum, an open-source parallel database on top of PostgreSQL.
- The ROracle package is a DBI-compliant Oracle database driver based on the OCI.
- Packages for SQLite, a self-contained, high-reliability, embedded, full-featured, public-domain, SQL database engine:
  - The RSQLite package embeds the SQLite database engine in R and provides an interface compliant with the DBI package.
  - The filehashSQLite package is a simple key-value database using SQLite as the backend.
  - The liteq package provides temporary and permanent message queues for R, built on top of SQLite.
- The duckdb package provides a DBI interface to DuckDB, an in-process SQL OLAP database management system.
- The bigrquery package provides the interface to Google BigQuery, Google’s fully managed, petabyte scale, low cost analytics data warehouse.
- The RDruid package on GitHub provides the interface to Apache Druid, a high performance analytics data store for event-driven data.
- The RH2 package provides the interface to H2 Database Engine, the Java SQL database.
- The influxdbr package provides the interface to InfluxDB, a time series database designed to handle high write and query loads.
- The RPresto package implements a DBI-compliant interface to Presto, an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.
- The RJDBC package is an implementation of R’s DBI interface using JDBC as a back-end. This allows R to connect to any DBMS that has a JDBC driver.
- The implyr package provides the back-end for Apache Impala, which enables low-latency SQL queries on data stored in the Hadoop Distributed File System (HDFS), Apache HBase, Apache Kudu, Amazon Simple Storage Service (S3), Microsoft Azure Data Lake Store (ADLS), and Dell EMC Isilon.
- The dbx package provides intuitive functions for high performance batch operations and safe inserts/updates/deletes without writing SQL on top of DBI. It is designed for both research and production environments and supports multiple database backends such as Postgres, MySQL, MariaDB, and SQLite.
- The sparklyr package provides a dplyr interface to Apache Spark DataFrames as well as an R interface to Spark’s distributed machine learning pipelines.
- The Hmisc package provides a wrapper function Hmisc::mdb.get() that uses the mdbtools utility to read from Microsoft Access databases on Unix-alike systems.
- The DatabaseConnector package provides a DBI-compatible interface to various database platforms using either JDBC or DBI drivers.
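Because DBI-compliant packages share one set of generics, code written against one back end tends to carry over to another by changing only the driver passed to dbConnect(). The following is a minimal sketch, assuming DBI and RSQLite are installed. The readings table and its columns are made up for illustration, and other drivers may use a placeholder syntax other than "?".

```r
library(DBI)

# Swap RSQLite::SQLite() for another DBI driver to target a different
# database; the remaining calls are DBI generics.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Create a table and insert rows with plain SQL statements.
dbExecute(con, "CREATE TABLE readings (sensor TEXT, value REAL)")
n_inserted <- dbExecute(
  con,
  "INSERT INTO readings (sensor, value) VALUES ('a', 1.5), ('a', 2.5), ('b', 4.0)"
)

# Parameterised query: the value is bound by the driver rather than
# pasted into the SQL string.
res <- dbGetQuery(
  con,
  "SELECT AVG(value) AS mean_value FROM readings WHERE sensor = ?",
  params = list("a")
)
print(res)

tables <- dbListTables(con)
dbDisconnect(con)
```

Binding values through params rather than building the SQL string by hand is generally the safer habit, since it avoids quoting mistakes and SQL injection.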
Non-relational databases
This section includes packages that provide access to non-relational databases within R.
- Packages for Redis, an open-source, in-memory data structure store that can be used as a database, cache and message broker:
  - The RcppRedis package provides an interface to Redis using hiredis.
  - The redux package provides a low-level interface to Redis, allowing execution of arbitrary Redis commands with almost no interface, and a high-level generated interface to more than 200 Redis commands.
- Packages for Elasticsearch, an open-source, RESTful, distributed search and analytics engine:
  - The elastic package provides a general purpose interface to Elasticsearch.
  - The uptasticsearch package is an Elasticsearch client tailored to data science workflows.
- The mongolite package provides a high-level, high-performance MongoDB client based on mongo-c-driver, including support for aggregation, indexing, map-reduce, streaming, SSL encryption and SASL authentication.
- The R4CouchDB package provides a collection of functions for basic database and document management operations in CouchDB.
- Packages for Amazon DynamoDB, a fast, flexible NoSQL database:
  - The aws.dynamodb package on GitHub, from the cloudyr development team, provides access to DynamoDB.
  - The paws.database package provides an interface using the paws suite of tools.
- The rrocksdb package on GitHub provides access to RocksDB.
Database tools
This section includes packages that provide tools for working with and testing databases, manipulating database tables, etc.
- The MSSQL package extends the functionality of the RODBC package to work with Microsoft SQL Server databases. It makes it easier to browse the database and examine individual tables and views.
- The pool package enables the creation of object pools, which make it less computationally expensive to fetch a new object.
- The DBItest package is a helper that tests DBI back ends for conformity to the interface.
- The dbplyr package is a dplyr back-end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a DBI back-end; more advanced features require SQL translation to be provided by the package author.
- The sqldf package provides functionality to manipulate R data frames using SQL.
- The pointblank package provides tools to validate data tables in databases such as PostgreSQL and MySQL.
- The dittodb package provides functionality to test database interactions with any DBI-compliant database backend. It includes functionality to use fixtures instead of direct database calls during testing as well as functionality to record those fixtures when interacting with a real database for later use in tests.
- The tfio package provides the ability to use Apache Ignite, which handles distributed database management for high-performance computing with in-memory speed.
- The dbr package on GitHub provides convenient database connections and queries from R using YAML configuration files and templates.
- The rocker package provides an R6 class interface for handling relational database connections using DBI as backend. The purpose is having an intuitive object allowing straightforward handling of SQL databases.
- The SQRL package streamlines exploratory and interactive sessions on ODBC databases, and allows R code within SQL scripts.
- The octopus package provides an interactive shiny application for database management to view tables and schemas, upload files, send queries, and more.
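To illustrate the dbplyr entry in the list above, here is a small sketch of treating a database table like a data frame. It assumes dplyr, dbplyr, DBI and RSQLite are installed, and it copies R's built-in mtcars data into a throwaway in-memory SQLite database. The exact SQL printed by show_query() may differ between dbplyr versions.

```r
library(dplyr)

con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "mtcars", mtcars)

# A lazy reference to the remote table; no rows are fetched yet.
cars_db <- tbl(con, "mtcars")

# dplyr verbs are translated to SQL by dbplyr.
query <- cars_db %>%
  filter(cyl == 4) %>%
  summarise(n = n(), mean_mpg = mean(mpg, na.rm = TRUE))

show_query(query)          # inspect the generated SQL
result <- collect(query)   # run it and pull the result into R
print(result)

DBI::dbDisconnect(con)
```

Nothing is executed on the database until collect() is called, so filters and summaries run inside the database and only the small result travels back to R.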
CRAN packages
| Core: | DBI, odbc, RODBC. |
| Regular: | bigrquery, DatabaseConnector, DBItest, dbplyr, dbx, dittodb, dplyr, duckdb, elastic, filehashSQLite, Hmisc, implyr, influxdbr, liteq, mongolite, MSSQL, octopus, paws, paws.database, pointblank, pool, R4CouchDB, R6, RcppRedis, redux, RGreenplum, RH2, RJDBC, RMariaDB, RMySQL, rocker, ROracle, rpostgis, RPostgres, RPostgreSQL, RPresto, RSQLite, sparklyr, sqldf, SQRL, tfio, uptasticsearch. |