
Presto (SQL query engine)

From Wikipedia, the free encyclopedia

Distributed query engine

Presto
Original authors	Martin Traverso, Dain Sundstrom, David Phillips, Eric Hwang
Initial release	10 November 2013

Written in	Java
Operating system	Cross-platform
Standard	SQL
Type	Data warehouse
License	Apache License 2.0
Website	prestodb.io

Presto (including PrestoDB, and PrestoSQL, which was rebranded as Trino) is a distributed query engine for big data using the SQL query language. Its architecture allows users to query data sources such as Hadoop, Cassandra, Kafka, AWS S3, Alluxio, MySQL, MongoDB and Teradata,^[1] and allows use of multiple data sources within a query. Presto is community-driven open-source software released under the Apache License.

History


Presto was originally designed and developed at Facebook, Inc. (later renamed Meta) so that its data analysts could run interactive queries on its large data warehouse in Apache Hadoop. The first four developers were Martin Traverso, Dain Sundstrom, David Phillips, and Eric Hwang. Before Presto, the data analysts at Facebook relied on Apache Hive for running SQL analytics on their multi-petabyte data warehouse.^[2] Hive was deemed too slow for Facebook's scale, and Presto was created to fill the gap by running fast queries.^[3] Development started in 2012, and Presto was deployed at Facebook later that year. In November 2013, Facebook announced its open source release.^[3]^[4]

In 2014, Netflix disclosed that it used Presto on 10 petabytes of data stored in the Amazon Simple Storage Service (S3).^[5] In November 2016, Amazon announced a service called Athena that was based on Presto.^[6] In 2017, Teradata spun out a company called Starburst Data to commercially support Presto, which included staff acquired from Hadapt in 2014.^[7] Teradata's QueryGrid software allowed Presto to access a Teradata relational database.^[8]

In January 2019, the Presto Software Foundation was announced. The foundation is a not-for-profit organization for the advancement of the Presto open source distributed SQL query engine.^[9]^[10] At the same time, Presto development forked into PrestoDB, maintained by Facebook, and PrestoSQL, maintained by the Presto Software Foundation, with some cross-pollination of code.

In September 2019, Facebook donated PrestoDB to the Linux Foundation, establishing the Presto Foundation.^[11] Neither the creators of Presto, nor the top contributors and committers, were invited to join this foundation.^[12]

By 2020, all four of the original Presto developers had joined Starburst.^[13] In December 2020, PrestoSQL was rebranded as Trino, since Facebook had obtained a trademark on the name "Presto" (also donated to the Linux Foundation).^[14]

Another company, Ahana, was announced in 2020 to commercialize the PrestoDB fork as a cloud service; it was acquired by IBM in 2023.^[15]

Architecture


Presto's architecture is very similar to that of other database management systems using cluster computing, sometimes called massively parallel processing (MPP). One coordinator works with multiple workers. Clients submit SQL statements, which are parsed and planned, after which parallel tasks are scheduled to workers. Workers jointly process rows from the data sources and produce results that are returned to the client. Compared to the original Apache Hive execution model, which used the Hadoop MapReduce mechanism on each query, Presto does not write intermediate results to disk, resulting in a significant speed improvement. Presto is written in Java.
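The coordinator and worker flow described above can be illustrated with a toy sketch. This is not Presto's implementation: the function names `coordinator` and `worker_task`, the round-robin splitting, and the use of threads in place of separate worker nodes are all assumptions made purely for illustration. It shows a coordinator dividing rows into splits, workers aggregating their splits in memory, and the partial results being merged without any intermediate write to disk.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def worker_task(split):
    """Hypothetical worker: aggregate one split of rows entirely in memory."""
    partial = Counter()
    for country, amount in split:
        partial[country] += amount
    return partial


def coordinator(rows, num_workers=3):
    """Hypothetical coordinator: split the input, schedule tasks, merge results."""
    # Divide the rows round-robin into one split per worker.
    splits = [rows[i::num_workers] for i in range(num_workers)]

    # Threads stand in for worker nodes; each runs its task in parallel.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(worker_task, splits))

    # Merge the in-memory partial aggregates into the final result.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


# Roughly equivalent to: SELECT country, sum(amount) FROM rows GROUP BY country
rows = [("US", 10), ("DE", 5), ("US", 7), ("JP", 3), ("DE", 1)]
result = coordinator(rows)
print(result)
```

In a real cluster the splits would come from the underlying data source and the partial results would be streamed between nodes over the network; the sketch only mirrors the overall shape of the plan, schedule, and merge steps.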

A Presto query can combine data from multiple sources. Presto offers connectors to data sources including files in Alluxio, Hadoop Distributed File System (often called a data lake), Amazon S3, MySQL, PostgreSQL, Microsoft SQL Server, Amazon Redshift, Apache Kudu, Apache Phoenix, Apache Kafka, Apache Cassandra, Apache Accumulo, MongoDB and Redis. Unlike other Hadoop distribution-specific tools, such as Apache Impala, Presto can work with any variant of Hadoop or without it. Presto supports separation of compute and storage and may be deployed on-premises or using cloud computing.
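As a minimal sketch of what combining sources looks like in practice, the snippet below builds a single SQL statement that joins tables from two different connectors. Presto addresses tables by fully qualified `catalog.schema.table` names; the specific catalogs, schemas, tables, and columns used here (`hive.web.page_views`, `mysql.crm.customers`, `customer_id`, `region`) are hypothetical, as are the helper functions. The code only assembles the query text and does not connect to a cluster.

```python
def qualified(catalog, schema, table):
    """Return a fully qualified table name in catalog.schema.table form."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query(fact_table, dim_table):
    """Build a join across two tables that may live in different catalogs.

    The column names (customer_id, id, region) are illustrative assumptions.
    """
    return (
        "SELECT c.region, count(*) AS views\n"
        f"FROM {fact_table} AS v\n"
        f"JOIN {dim_table} AS c ON v.customer_id = c.id\n"
        "GROUP BY c.region"
    )


# Hypothetical catalogs: page views stored in Hadoop, customers in MySQL.
query = build_federated_query(
    qualified("hive", "web", "page_views"),
    qualified("mysql", "crm", "customers"),
)
print(query)
```

Because each table name carries its catalog, the engine can route each side of the join to the matching connector while the client submits one ordinary SQL statement.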

References


1. "Teradata Distribution of Presto". Teradata Distribution of Presto 0.167-t.0.2 Documentation.
2. Mike Volpi (November 20, 2019). "Starburst and Presto: with Stellar Velocity". Index Ventures Blog. Retrieved January 27, 2022.
3. Joab Jackson (November 6, 2013). "Facebook goes open source with query engine for big data". Computer World. Retrieved April 26, 2017.
4. Jordan Novet (June 6, 2013). "Facebook unveils Presto engine for querying 250 PB data warehouse". Giga Om. Archived from the original on June 8, 2013. Retrieved April 26, 2017.
5. Eva Tse; Zhenxiao Luo; Nezih Yigitbasi (October 7, 2014). "Using Presto in our Big Data Platform on AWS". Netflix technical blog. Retrieved April 26, 2017.
6. Jeff Barr (November 30, 2016). "Amazon Athena – Interactive SQL Queries for Data in Amazon S3". AWS News Blog. Retrieved January 27, 2022.
7. Philip Howard (December 21, 2017). "Teradata spins off Starburst". Bloor. Retrieved January 26, 2022.
8. Lindsay Clark (December 17, 2020). "Hey Presto! Teradata admits its vision is dead by hooking QueryGrid analytics platform up to rival data warehouses". The Register. Retrieved January 26, 2022.
9. "Presto Software Foundation Launches to Advance Presto Open Source Community". Press release. January 31, 2019. Retrieved January 2, 2022.
10. "Presto's New Foundation Signals Growth for the Big Data SQL Engine". The New Stack. January 31, 2019. Retrieved February 1, 2019.
11. "Facebook, Uber, Twitter and Alibaba form Presto Foundation to Tackle Distributed Data Processing at Scale". September 23, 2019. Retrieved November 12, 2019.
12. Piotr Findeisen (November 22, 2019). "What's the relationship between prestosql and prestodb?". Comment on issue #38 of Trino Github. Retrieved January 27, 2022.
13. "Original Presto Co-Creators Reunite on the Starburst Technical Leadership Team". Press release. September 22, 2020. Retrieved January 26, 2022.
14. Martin Traverso; Dain Sundstrom; David Phillips (December 27, 2020). "We're rebranding PrestoSQL as Trino". Trino blog. Retrieved January 26, 2022.
15. Paul Gillin (April 14, 2023). "IBM acquires Ahana, joins the Presto Foundation". SiliconANGLE. Retrieved April 20, 2023.


Retrieved from "https://en.wikipedia.org/w/index.php?title=Presto_(SQL_query_engine)&oldid=1323979762"
