Aggregate functions

Functions for computing asingle result from a set of input values. Elasticsearch SQL supports aggregate functions only alongside grouping (implicit or explicit).

`AVG`

AVG(numeric_field)

Input:

numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description: Returns theAverage (arithmetic mean) of input values.

SELECT AVG(salary) AS avg FROM emp;      avg---------------48248.55

SELECT AVG(salary / 12.0) AS avg FROM emp;      avg---------------4020.7125

`COUNT`

COUNT(expression)

Input:

a field name, wildcard (*) or any numeric value. ForCOUNT(*) orCOUNT(<literal>), all values are considered, includingnull or missing ones. ForCOUNT(<field_name>),null values are not considered.

Output: numeric value

Description: Returns the total number (count) of input values.

SELECT COUNT(*) AS count FROM emp;     count---------------100

`COUNT(ALL)`

COUNT(ALL field_name)

Input:

a field name. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output: numeric value

Description: Returns the total number (count) of allnon-null input values.COUNT(<field_name>) andCOUNT(ALL <field_name>) are equivalent.

SELECT COUNT(ALL last_name) AS count_all, COUNT(DISTINCT last_name) count_distinct FROM emp;   count_all   |  count_distinct---------------+------------------100            |96

SELECT COUNT(ALL CASE WHEN languages IS NULL THEN -1 ELSE languages END) AS count_all, COUNT(DISTINCT CASE WHEN languages IS NULL THEN -1 ELSE languages END) count_distinct FROM emp;   count_all   |  count_distinct---------------+---------------100            |6

`COUNT(DISTINCT)`

COUNT(DISTINCT field_name)

Input:

a field name

Output: numeric value. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Description: Returns the total number ofdistinct non-null values in input values.

SELECT COUNT(DISTINCT hire_date) unique_hires, COUNT(hire_date) AS hires FROM emp;  unique_hires  |     hires----------------+---------------99              |100

SELECT COUNT(DISTINCT DATE_TRUNC('YEAR', hire_date)) unique_hires, COUNT(DATE_TRUNC('YEAR', hire_date)) AS hires FROM emp; unique_hires  |     hires---------------+---------------14             |100

`FIRST/FIRST_VALUE`

FIRST(    field_name    [, ordering_field_name])

Input:

target field for the aggregation
optional field used for ordering

Output: same type as the input

Description: Returns the first non-null value (if such exists) of thefield_name input column sorted by theordering_field_name column. Ifordering_field_name is not provided, only thefield_name column is used for the sorting. E.g.:

a	b
100	1
200	1
1	2
2	2
10	null
20	null
null	null

SELECT FIRST(a) FROM t

will result in:

FIRST(a)
1

and

SELECT FIRST(a, b) FROM t

will result in:

FIRST(a, b)
100

SELECT FIRST(first_name) FROM emp;   FIRST(first_name)--------------------Alejandro

SELECT gender, FIRST(first_name) FROM emp GROUP BY gender ORDER BY gender;   gender   |   FIRST(first_name)------------+--------------------null        |   BerniF           |   AlejandroM           |   Amabile

SELECT FIRST(first_name, birth_date) FROM emp;   FIRST(first_name, birth_date)--------------------------------Remzi

SELECT gender, FIRST(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;    gender    |   FIRST(first_name, birth_date)--------------+--------------------------------null          |   LillianF             |   SumantM             |   Remzi

FIRST_VALUE is a name alias and can be used instead ofFIRST, e.g.:

SELECT gender, FIRST_VALUE(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;    gender    |   FIRST_VALUE(first_name, birth_date)--------------+--------------------------------------null          |   LillianF             |   SumantM             |   Remzi

SELECT gender, FIRST_VALUE(SUBSTRING(first_name, 2, 6), birth_date) AS "first" FROM emp GROUP BY gender ORDER BY gender;    gender     |     first---------------+---------------null           |illianF              |umantM              |emzi

Note

FIRST cannot be used in a HAVING clause.

Note

FIRST cannot be used with columns of typetext unless the field is alsosaved as a keyword.

`LAST/LAST_VALUE`

LAST(    field_name    [, ordering_field_name])

Input:

target field for the aggregation
optional field used for ordering

Output: same type as the input

Description: It’s the inverse ofFIRST/FIRST_VALUE. Returns the last non-null value (if such exists) of thefield_name input column sorted descending by theordering_field_name column. Ifordering_field_name is not provided, only thefield_name column is used for the sorting. E.g.:

a	b
10	1
20	1
1	2
2	2
100	null
200	null
null	null

SELECT LAST(a) FROM t

will result in:

LAST(a)
200

and

SELECT LAST(a, b) FROM t

will result in:

LAST(a, b)
2

SELECT LAST(first_name) FROM emp;   LAST(first_name)-------------------Zvonko

SELECT gender, LAST(first_name) FROM emp GROUP BY gender ORDER BY gender;   gender   |   LAST(first_name)------------+-------------------null        |   PatricioF           |   XinglinM           |   Zvonko

SELECT LAST(first_name, birth_date) FROM emp;   LAST(first_name, birth_date)-------------------------------Hilari

SELECT gender, LAST(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;   gender  |   LAST(first_name, birth_date)-----------+-------------------------------null       |   EberhardtF          |   ValdiodioM          |   Hilari

LAST_VALUE is a name alias and can be used instead ofLAST, e.g.:

SELECT gender, LAST_VALUE(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;   gender  |   LAST_VALUE(first_name, birth_date)-----------+-------------------------------------null       |   EberhardtF          |   ValdiodioM          |   Hilari

SELECT gender, LAST_VALUE(SUBSTRING(first_name, 3, 8), birth_date) AS "last" FROM emp GROUP BY gender ORDER BY gender;    gender     |     last---------------+---------------null           |erhardtF              |ldiodioM              |lari

Note

LAST cannot be used inHAVING clause.

Note

LAST cannot be used with columns of typetext unless the field is alsosaved as a keyword.

`MAX`

MAX(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output: same type as the input

Description: Returns the maximum value across input values in the fieldfield_name.

SELECT MAX(salary) AS max FROM emp;      max---------------74999

SELECT MAX(ABS(salary / -12.0)) AS max FROM emp;       max-----------------6249.916666666667

Note

MAX on a field of typetext orkeyword is translated intoLAST/LAST_VALUE and therefore, it cannot be used inHAVING clause.

`MIN`

MIN(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output: same type as the input

Description: Returns the minimum value across input values in the fieldfield_name.

SELECT MIN(salary) AS min FROM emp;      min---------------25324

Note

MIN on a field of typetext orkeyword is translated intoFIRST/FIRST_VALUE and therefore, it cannot be used inHAVING clause.

`SUM`

SUM(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:bigint for integer input,double for floating points

Description: Returns the sum of input values in the fieldfield_name.

SELECT SUM(salary) AS sum FROM emp;      sum---------------4824855

SELECT ROUND(SUM(salary / 12.0), 1) AS sum FROM emp;      sum---------------402071.3

Statistics

`KURTOSIS`

KURTOSIS(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description:

Quantify the shape of the distribution of input values in the fieldfield_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, KURTOSIS(salary) AS k FROM emp;      min      |      max      |        k---------------+---------------+------------------25324          |74999          |2.0444718929142986

Note

KURTOSIS cannot be used on top of scalar functions or operators but only directly on a field. So, for example, the following is not allowed and an error is returned:

SELECT KURTOSIS(salary / 12.0), gender FROM emp GROUP BY gender

`MAD`

MAD(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description:

Measure the variability of the input values in the fieldfield_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, AVG(salary) AS avg, MAD(salary) AS mad FROM emp;      min      |      max      |      avg      |      mad---------------+---------------+---------------+---------------25324          |74999          |48248.55       |10096.5

SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, AVG(salary/ 12.0) AS avg, MAD(salary / 12.0) AS mad FROM emp;       min        |       max       |      avg      |       mad------------------+-----------------+---------------+-----------------2110.3333333333335|6249.916666666667|4020.7125      |841.3750000000002

`PERCENTILE`

PERCENTILE(    field_name,    percentile[,    method[,    method_parameter]])

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.
a numeric expression (must be a constant and not based on a field). Ifnull, the function returnsnull.
optional string literal for thepercentile algorithm. Possible values:tdigest orhdr. Defaults totdigest.
optional numeric literal that configures thepercentile algorithm. Configurescompression fortdigest ornumber_of_significant_value_digits forhdr. The default is the same as that of the backing algorithm.

Output:double numeric value

Description:

Returns the nthpercentile (represented bynumeric_exp parameter) of input values in the fieldfield_name.

SELECT languages, PERCENTILE(salary, 95) AS "95th" FROM emp       GROUP BY languages;   languages   |      95th---------------+-----------------null           |74482.41              |71122.82              |70271.43              |71926.04              |69352.155              |56371.0

SELECT languages, PERCENTILE(salary / 12.0, 95) AS "95th" FROM emp       GROUP BY languages;   languages   |       95th---------------+------------------null           |6206.8666666666671              |5926.92              |5855.9499999999993              |5993.8333333333334              |5779.3458333333335              |4697.583333333333

SELECT    languages,    PERCENTILE(salary, 97.3, 'tdigest', 100.0) AS "97.3_TDigest",    PERCENTILE(salary, 97.3, 'hdr', 3) AS "97.3_HDR"FROM empGROUP BY languages;   languages   | 97.3_TDigest    |   97.3_HDR---------------+-----------------+---------------null           |74720.036        |74992.01              |72316.132        |73712.02              |71792.436        |69936.03              |73326.23999999999|74992.04              |71753.281        |74608.05              |61176.16000000001|56368.0

`PERCENTILE_RANK`

PERCENTILE_RANK(    field_name,    value[,    method[,    method_parameter]])

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.
a numeric expression (must be a constant and not based on a field). Ifnull, the function returnsnull.
optional string literal for thepercentile algorithm. Possible values:tdigest orhdr. Defaults totdigest.
optional numeric literal that configures thepercentile algorithm. Configurescompression fortdigest ornumber_of_significant_value_digits forhdr. The default is the same as that of the backing algorithm.

Output:double numeric value

Description:

Returns the nthpercentile rank (represented bynumeric_exp parameter) of input values in the fieldfield_name.

SELECT languages, PERCENTILE_RANK(salary, 65000) AS rank FROM emp GROUP BY languages;   languages   |      rank---------------+-----------------null           |73.657665699620621              |73.72916251577342              |88.880056070106433              |79.436626232958294              |85.704463896434935              |96.79075152940749

SELECT languages, PERCENTILE_RANK(salary/12, 5000) AS rank FROM emp GROUP BY languages;   languages   |       rank---------------+------------------null           |66.912408759124091              |66.707667076670762              |84.132668950482713              |61.0529926256216844              |76.556464439900015              |94.00696864111498

SELECT    languages,    ROUND(PERCENTILE_RANK(salary, 65000, 'tdigest', 100.0), 2) AS "rank_TDigest",    ROUND(PERCENTILE_RANK(salary, 65000, 'hdr', 3), 2) AS "rank_HDR"FROM empGROUP BY languages;   languages   | rank_TDigest  |   rank_HDR---------------+---------------+---------------null           |73.66          |80.01              |73.73          |73.332              |88.88          |89.473              |79.44          |76.474              |85.7           |83.335              |96.79          |95.24

`SKEWNESS`

SKEWNESS(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description:

Quantify the asymmetric distribution of input values in the fieldfield_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, SKEWNESS(salary) AS s FROM emp;      min      |      max      |        s---------------+---------------+------------------25324          |74999          |0.2707722118423227

Note

SKEWNESS cannot be used on top of scalar functions but only directly on a field. So, for example, the following is not allowed and an error is returned:

SELECT SKEWNESS(ROUND(salary / 12.0, 2), gender FROM emp GROUP BY gender

`STDDEV_POP`

STDDEV_POP(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description:

Returns thepopulation standard deviation of input values in the fieldfield_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, STDDEV_POP(salary) AS stddev FROM emp;      min      |      max      |      stddev---------------+---------------+------------------25324          |74999          |13765.125502787832

SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, STDDEV_POP(salary / 12.0) AS stddev FROM emp;       min        |       max       |     stddev------------------+-----------------+-----------------2110.3333333333335|6249.916666666667|1147.093791898986

`STDDEV_SAMP`

STDDEV_SAMP(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description:

Returns thesample standard deviation of input values in the fieldfield_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, STDDEV_SAMP(salary) AS stddev FROM emp;      min      |      max      |      stddev---------------+---------------+------------------25324          |74999          |13834.471662090747

SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, STDDEV_SAMP(salary / 12.0) AS stddev FROM emp;       min        |       max       |     stddev------------------+-----------------+-----------------2110.3333333333335|6249.916666666667|1152.872638507562

`SUM_OF_SQUARES`

SUM_OF_SQUARES(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description:

Returns the sum of squares of input values in the fieldfield_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, SUM_OF_SQUARES(salary) AS sumsq       FROM emp;      min      |      max      |     sumsq---------------+---------------+----------------25324          |74999          |2.51740125721E11

SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, SUM_OF_SQUARES(salary / 24.0) AS sumsq FROM emp;       min        |       max        |       sumsq------------------+------------------+-------------------1055.1666666666667|3124.9583333333335|4.370488293767361E8

`VAR_POP`

VAR_POP(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description:

Returns thepopulation variance of input values in the fieldfield_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, VAR_POP(salary) AS varpop FROM emp;      min      |      max      |     varpop---------------+---------------+----------------25324          |74999          |1.894786801075E8

SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, VAR_POP(salary / 24.0) AS varpop FROM emp;       min        |       max        |      varpop------------------+------------------+------------------1055.1666666666667|3124.9583333333335|328956.04185329855

`VAR_SAMP`

VAR_SAMP(field_name)

Input:

a numeric field. If this field contains onlynull values, the function returnsnull. Otherwise, the function ignoresnull values in this field.

Output:double numeric value

Description:

Returns thesample variance of input values in the fieldfield_name.

SELECT MIN(salary) AS min, MAX(salary) AS max, VAR_SAMP(salary) AS varsamp FROM emp;      min      |      max      |     varsamp---------------+---------------+----------------25324          |74999          |1.913926061691E8

SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, VAR_SAMP(salary / 24.0) AS varsamp FROM emp;       min        |       max        |     varsamp------------------+------------------+----------------1055.1666666666667|3124.9583333333335|332278.830154847

Movatterモバイル変換

Aggregate functions

General purpose

`AVG`

`COUNT`

`COUNT(ALL)`

`COUNT(DISTINCT)`

`FIRST/FIRST_VALUE`

`LAST/LAST_VALUE`

`MAX`

`MIN`

`SUM`

Statistics

`KURTOSIS`

`MAD`

`PERCENTILE`

`PERCENTILE_RANK`

`SKEWNESS`

`STDDEV_POP`

`STDDEV_SAMP`

`SUM_OF_SQUARES`

`VAR_POP`

`VAR_SAMP`