PostgreSQL 9.3.25 Documentation | ||||
---|---|---|---|---|
Prev | Up | Chapter 9. Functionsand Operators | Next |
Aggregate functions compute a singleresult from a set of input values. The built-in aggregate functionsare listed inTable 9-47andTable9-48. The special syntax considerations for aggregate functionsare explained inSection 4.2.7. ConsultSection 2.7 for additionalintroductory information.
Table 9-47. General-Purpose Aggregate Functions
Function | Argument Type(s) | Return Type | Description |
---|---|---|---|
array_agg(expression) | any | array of the argument type | input values, including nulls, concatenated into an array |
avg(expression) | smallint,int,bigint,real,double precision,numeric, orinterval | numeric for any integer-type argument,double precision for a floating-pointargument, otherwise the same as the argument data type | the average (arithmetic mean) of all input values |
bit_and(expression) | smallint,int,bigint, orbit | same as argument data type | the bitwise AND of all non-null input values, or null ifnone |
bit_or(expression) | smallint,int,bigint, orbit | same as argument data type | the bitwise OR of all non-null input values, or null ifnone |
bool_and(expression) | bool | bool | true if all input values are true, otherwise false |
bool_or(expression) | bool | bool | true if at least one input value is true, otherwise false |
count(*) | bigint | number of input rows | |
count(expression) | any | bigint | number of input rows for which the value ofexpression is not null |
every(expression) | bool | bool | equivalent tobool_and |
json_agg(expression) | any | json | aggregates values as a JSON array |
max(expression) | any array, numeric, string, or date/time type | same as argument type | maximum value ofexpressionacross all input values |
min(expression) | any array, numeric, string, or date/time type | same as argument type | minimum value ofexpressionacross all input values |
string_agg(expression,delimiter) | (text,text) or(bytea,bytea) | same as argument types | input values concatenated into a string, separated bydelimiter |
sum(expression) | smallint,int,bigint,real,double precision,numeric,interval, ormoney | bigint forsmallintorint arguments,numeric forbigint arguments,otherwise the same as the argument data type | sum ofexpression across allinput values |
xmlagg(expression) | xml | xml | concatenation of XML values (see alsoSection9.14.1.7) |
It should be noted that except forcount
, these functions return a null value whenno rows are selected. In particular,sum
of no rows returns null, not zero as onemight expect, andarray_agg
returnsnull rather than an empty array when there are no input rows. Thecoalesce
function can be used tosubstitute zero or an empty array for null when necessary.
Note: Boolean aggregates
bool_and
andbool_or
correspond to standard SQL aggregatesevery
andany
orsome
. As forany
andsome
, it seems that there is an ambiguity builtinto the standard syntax:SELECT b1 = ANY((SELECT b2 FROM t2 ...)) FROM t1 ...;Here
ANY
can be considered eitheras introducing a subquery, or as being an aggregate function, ifthe subquery returns one row with a Boolean value. Thus thestandard name cannot be given to these aggregates.
Note: Users accustomed to working with other SQL databasemanagement systems might be disappointed by the performance of the
count
aggregate when it is applied tothe entire table. A query like:SELECT count(*) FROM sometable;will require effort proportional to the size of the table:PostgreSQL will need to scaneither the entire table or the entirety of an index which includesall rows in the table.
The aggregate functionsarray_agg
,json_agg
,string_agg
, andxmlagg
, as well as similar user-defined aggregatefunctions, produce meaningfully different result values dependingon the order of the input values. This ordering is unspecified bydefault, but can be controlled by writing anORDER BY clause within the aggregate call, as showninSection4.2.7. Alternatively, supplying the input values from a sortedsubquery will usually work. For example:
SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
But this syntax is not allowed in the SQL standard, and is notportable to other database systems.
Table9-48 shows aggregate functions typically used in statisticalanalysis. (These are separated out merely to avoid cluttering thelisting of more-commonly-used aggregates.) Where the descriptionmentionsN, it means the number ofinput rows for which all the input expressions are non-null. In allcases, null is returned if the computation is meaningless, forexample whenN is zero.
Table 9-48. Aggregate Functions for Statistics
Function | Argument Type | Return Type | Description |
---|---|---|---|
corr(Y,X) | double precision | double precision | correlation coefficient |
covar_pop(Y,X) | double precision | double precision | population covariance |
covar_samp(Y,X) | double precision | double precision | sample covariance |
regr_avgx(Y,X) | double precision | double precision | average of the independent variable (sum(X)/N) |
regr_avgy(Y,X) | double precision | double precision | average of the dependent variable (sum(Y)/N) |
regr_count(Y,X) | double precision | bigint | number of input rows in which both expressions are nonnull |
regr_intercept(Y,X) | double precision | double precision | y-intercept of the least-squares-fit linear equation determinedby the (X,Y) pairs |
regr_r2(Y,X) | double precision | double precision | square of the correlation coefficient |
regr_slope(Y,X) | double precision | double precision | slope of the least-squares-fit linear equation determined bythe (X,Y) pairs |
regr_sxx(Y,X) | double precision | double precision | sum(X^2) -sum(X)^2/N ("sum ofsquares" of the independent variable) |
regr_sxy(Y,X) | double precision | double precision | sum(X*Y) -sum(X) * sum(Y)/N("sum of products" of independent timesdependent variable) |
regr_syy(Y,X) | double precision | double precision | sum(Y^2) -sum(Y)^2/N ("sum ofsquares" of the dependent variable) |
stddev(expression) | smallint,int,bigint,real,double precision, ornumeric | double precision for floating-pointarguments, otherwisenumeric | historical alias forstddev_samp |
stddev_pop(expression) | smallint,int,bigint,real,double precision, ornumeric | double precision for floating-pointarguments, otherwisenumeric | population standard deviation of the input values |
stddev_samp(expression) | smallint,int,bigint,real,double precision, ornumeric | double precision for floating-pointarguments, otherwisenumeric | sample standard deviation of the input values |
variance (expression) | smallint,int,bigint,real,double precision, ornumeric | double precision for floating-pointarguments, otherwisenumeric | historical alias forvar_samp |
var_pop (expression) | smallint,int,bigint,real,double precision, ornumeric | double precision for floating-pointarguments, otherwisenumeric | population variance of the input values (square of thepopulation standard deviation) |
var_samp (expression) | smallint,int,bigint,real,double precision, ornumeric | double precision for floating-pointarguments, otherwisenumeric | sample variance of the input values (square of the samplestandard deviation) |