
Apache Impala

From Wikipedia, the free encyclopedia

Open-source SQL query engine

Apache Impala

Developer	Apache Software Foundation
Initial release	April 28, 2013
Stable release	4.5.0 / March 4, 2025
Written in	C++, Java
Operating system	Cross-platform
Type	Relational Hadoop-analytics
License	Apache License 2.0
Website	impala.apache.org
Repository	Impala Repository

Apache Impala is an open source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop.^[1] Impala has been described as the open-source equivalent of Google F1, which inspired its development in 2012.^[2]

Description

Apache Impala is a query engine that runs on Apache Hadoop. The project was announced in October 2012 with a public beta test distribution^[3]^[4] and became generally available in May 2013.^[5]

Impala brings scalable parallel database technology to Hadoop, enabling users to issue low-latency SQL queries to data stored in HDFS and Apache HBase without requiring data movement or transformation. Impala is integrated with Hadoop to use the same file and data formats, metadata, security and resource management frameworks used by MapReduce, Apache Hive, Apache Pig and other Hadoop software.

Impala is promoted for analysts and data scientists to perform analytics on data stored in Hadoop via SQL or business intelligence tools. As a result, large-scale data processing (via MapReduce) and interactive queries can run on the same system using the same data and metadata, removing the need to migrate data sets into specialized systems or proprietary formats simply to perform analysis.

Features include:

Supports HDFS, S3, Microsoft Azure Blob Storage, Apache HBase and Apache Kudu storage
Reads Hadoop file formats, including text, LZO, SequenceFile, Avro, RCFile, Parquet and ORC
Supports Hadoop security (Kerberos authentication, LDAP)
Fine-grained, role-based authorization with Apache Ranger
Uses metadata, ODBC driver, and SQL syntax from Apache Hive
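
Querying files in place, without moving or transforming them, is usually done by declaring an external table over an existing storage location. The following is a minimal sketch, not taken from the Impala documentation, of a helper that builds such a statement for several of the file formats listed above. The function name `external_table_ddl`, the table `web_logs`, its columns, and the path `/data/web_logs` are all hypothetical; LZO-compressed text is left out because it needs additional setup beyond a `STORED AS` clause.

```python
# Illustrative helper: builds an Impala-style CREATE EXTERNAL TABLE statement.
# All table names, columns, and paths used here are hypothetical examples.

# STORED AS keywords for the file formats named in the feature list.
STORED_AS = {
    "text": "TEXTFILE",
    "sequencefile": "SEQUENCEFILE",
    "avro": "AVRO",
    "rcfile": "RCFILE",
    "parquet": "PARQUET",
    "orc": "ORC",
}


def external_table_ddl(table, columns, file_format, location):
    """Return DDL that exposes existing files at `location` as a table.

    `columns` is a list of (name, sql_type) pairs. Identifiers and the
    location are inserted as-is, so they must come from trusted input.
    """
    try:
        stored_as = STORED_AS[file_format.lower()]
    except KeyError:
        raise ValueError(f"unsupported file format: {file_format!r}") from None
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({column_list}) "
        f"STORED AS {stored_as} "
        f"LOCATION '{location}'"
    )


ddl = external_table_ddl(
    "web_logs",
    [("ts", "TIMESTAMP"), ("url", "STRING"), ("bytes", "BIGINT")],
    "parquet",
    "/data/web_logs",
)
print(ddl)
```

Because the table is declared as external, dropping it later removes only the metadata entry and leaves the underlying files untouched, which fits the goal of sharing one copy of the data among several Hadoop tools.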

In early 2013, a column-oriented file format called Parquet was announced for architectures including Impala.^[6] In December 2013, Amazon Web Services announced support for Impala.^[7] In early 2014, MapR added support for Impala.^[8] In 2015, another format called Kudu was announced, which Cloudera proposed to donate to the Apache Software Foundation along with Impala.^[9] Impala graduated to an Apache Top-Level Project (TLP) on 28 November 2017.^[10]

References

[1] "Apache Impala". Retrieved September 15, 2017.
[2] Cade Metz (October 24, 2012). "Man Busts Out of Google, Rebuilds Top-Secret Query Machine". Wired Magazine. Retrieved October 10, 2016.
[3] Larry Dignan (October 24, 2012). "Cloudera aims to bring real-time queries to Hadoop, big data". Between the Lines blog. ZDNet. Retrieved January 20, 2014.
[4] Andrew Brust (October 25, 2012). "Cloudera's Impala brings Hadoop to SQL and BI". ZDNet. Retrieved January 20, 2014.
[5] Marcel Kornacker, Justin Erickson (May 1, 2013). "Cloudera Impala 1.0: It's Here, It's Real, It's Already the Standard for SQL on Hadoop". Archived from the original on April 13, 2014. Retrieved April 10, 2014.
[6] "Parquet: Columnar Storage for Hadoop". Project web site. 2013. Retrieved January 20, 2014.
[7] "Announcing Support for Impala with Amazon Elastic MapReduce". Amazon.com. December 12, 2013. Retrieved January 20, 2014.
[8] "Impala for MapR". MapR.com. February 2, 2014. Retrieved April 10, 2014.
[9] David Ramel (November 18, 2015). "Cloudera to Donate Impala and Kudu Big Data Projects to Apache". Application Development Trends. Retrieved October 10, 2016.
[10] "The Apache Software Foundation Announces Apache Impala as a Top-Level Project". November 28, 2017. Retrieved November 30, 2017.

External links

Apache Impala project website
Impala GitHub project source code

