Movatterモバイル変換


[0]ホーム

URL:


You don’t need a vector database - just use Postgres for everything. Read the case study on switching from Pinecone to Neon
PostgreSQL Tutorial
PostgreSQL Tutorial
/Window Functions

PostgreSQL Window Functions

Summary: in this tutorial, you will learn how to use the PostgreSQL window functions to perform the calculation across a set of rows related to the current row.

Setting up sample tables

First,create two tables namedproducts andproduct_groups for the demonstration:

CREATE TABLE product_groups (group_idserial PRIMARY KEY,group_nameVARCHAR (255)NOT NULL);CREATE TABLE products (product_idserial PRIMARY KEY,product_nameVARCHAR (255)NOT NULL,priceDECIMAL (11,2),group_idINT NOT NULL,FOREIGN KEY (group_id)REFERENCES product_groups (group_id));

Second,insert some rows into these tables:

INSERT INTO product_groups (group_name)VALUES('Smartphone'),('Laptop'),('Tablet');INSERT INTO products (product_name, group_id,price)VALUES('Microsoft Lumia',1,200),('HTC One',1,400),('Nexus',1,500),('iPhone',1,900),('HP Elite',2,1200),('Lenovo Thinkpad',2,700),('Sony VAIO',2,700),('Dell Vostro',2,800),('iPad',3,700),('Kindle Fire',3,150),('Samsung Galaxy Tab',3,200);

Introduction to PostgreSQL window functions

The easiest way to understand the window functions is to start by reviewing theaggregate functions. An aggregate function aggregates data from a set of rows into a single row.

The following example uses theAVG() aggregate function to calculate the average price of all products in theproducts table.

SELECTAVG (price)FROMproducts;

PostgreSQL Window Function - AVG functionTo apply the aggregate function to subsets of rows, you use theGROUP BY clause. The following example returns the average price for every product group.

SELECTgroup_name,AVG (price)FROMproductsINNER JOIN product_groupsUSING (group_id)GROUP BYgroup_name;

PostgreSQL Window Function - AVG function with GROUP BYAs you see clearly from the output, theAVG() function reduces the number of rows returned by the queries in both examples.

Similar to an aggregate function, a window function operates on a set of rows. However, it does not reduce the number of rows returned by the query.

The termwindow describes the set of rows on which the window function operates. A window function returns values from the rows in a window.

For instance, the following query returns the product name, the price, product group name, along with the average prices of each product group.

SELECTproduct_name,price,group_name,AVG (price)OVER (   PARTITION BY group_name)FROMproductsINNER JOINproduct_groupsUSING (group_id);

In this query, theAVG() function works as awindow function that operates on a set of rows specified by theOVER clause. Each set of rows is called a window.

The new syntax for this query is theOVER clause:

AVG(price)OVER (PARTITION BY group_name)

In this syntax, thePARTITION BY distributes the rows of the result set into groups and theAVG() function is applied to each group to return the average price for each.

Note that a window function always performs the calculation on the result set after theJOIN,WHERE,GROUP BY andHAVING clause and before the finalORDER BY clause in the evaluation order.

PostgreSQL Window Function Syntax

PostgreSQL has a sophisticatedsyntax for window function call. The following illustrates the simplified version:

window_function(arg1, arg2,..)OVER (   [PARTITION BY partition_expression]   [ORDER BY sort_expression [ASC | DESC] [NULLS {FIRST | LAST }])

In this syntax:

window_function(arg1,arg2,...)

Thewindow_function is the name of the window function. Some window functions do not accept any argument.

PARTITION BY clause

ThePARTITION BY clause divides rows into multiple groups or partitions to which the window function is applied. Like the example above, we used the product group to divide the products into groups (or partitions).

ThePARTITION BY clause is optional. If you skip thePARTITION BY clause, the window function will treat the whole result set as a single partition.

ORDER BY clause

TheORDER BY clause specifies the order of rows in each partition to which the window function is applied.

TheORDER BY clause uses theNULLS FIRST orNULLS LAST option to specify whether nullable values should be first or last in the result set. The default isNULLS LAST option.

frame_clause

Theframe_clause defines a subset of rows in the current partition to which the window function is applied. This subset of rows is called a frame.

If you use multiple window functions in a query:

SELECT    wf1()OVER(PARTITION BY c1ORDER BY c2),    wf2()OVER(PARTITION BY c1ORDER BY c2)FROM table_name;

you can use theWINDOW clause to shorten the query as shown in the following query:

SELECT   wf1()OVER w,   wf2()OVER w,FROM table_nameWINDOW wAS (PARTITION BY c1ORDER BY c2);

It is also possible to use theWINDOW clause even though you call one window function in a query:

SELECT wf1()OVER wFROM table_nameWINDOW wAS (PARTITION BY c1ORDER BY c2);

PostgreSQL window function List

The following table lists all window functions provided by PostgreSQL. Note that some aggregate functions such asAVG(),MIN(),MAX(),SUM(), andCOUNT() can be also used as window functions.

NameDescription
CUME_DISTReturn the relative rank of the current row.
DENSE_RANKRank the current row within its partition without gaps.
FIRST_VALUEReturn a value evaluated against the first row within its partition.
LAGReturn a value evaluated at the row that is at a specified physical offset row before the current row within the partition.
LAST_VALUEReturn a value evaluated against the last row within its partition.
LEADReturn a value evaluated at the row that is offset rows after the current row within the partition.
NTILEDivide rows in a partition as equally as possible and assign each row an integer starting from 1 to the argument value.
NTH_VALUEReturn a value evaluated against the nth row in an ordered partition.
PERCENT_RANKReturn the relative rank of the current row (rank-1) / (total rows – 1)
RANKRank the current row within its partition with gaps.
ROW_NUMBERNumber the current row within its partition starting from 1.

The ROW_NUMBER(), RANK(), and DENSE_RANK() functions

TheROW_NUMBER(),RANK(), andDENSE_RANK() functions assign an integer to each row based on its order in its result set.

TheROW_NUMBER() function assigns a sequential number to each row in each partition. See the following query:

SELECTproduct_name,group_name,price,ROW_NUMBER ()OVER (PARTITION BY group_nameORDER BYprice)FROMproductsINNER JOIN product_groupsUSING (group_id);

PostgreSQL Window Function - ROW_NUMBER functionTheRANK() function assigns ranking within an ordered partition. If rows have the same values, the RANK() function assigns the same rank, with the next ranking(s) skipped.

See the following query:

SELECTproduct_name,group_name,  price,RANK ()OVER (PARTITION BY group_nameORDER BYprice)FROMproductsINNER JOIN product_groupsUSING (group_id);

PostgreSQL Window Function - RANK functionIn the laptop product group, bothSony VAIO andLenovo Thinkpad products have the same price, therefore, they receive the same rank 1. The next row in the group isDell Vostro that receives the rank 3 because the rank 2 is skipped.

Similar to theRANK() function, theDENSE_RANK() function assigns a rank to each row within an ordered partition, but the ranks have no gap. In other words, the same ranks are assigned to multiple rows and no ranks are skipped.

SELECTproduct_name,group_name,price,DENSE_RANK ()OVER (PARTITION BY group_nameORDER BYprice)FROMproductsINNER JOIN product_groupsUSING (group_id);

PostgreSQL Window Function - DENSE_RANK functionWithin the laptop product group, rank 1 is assigned twice toSony VAIO andLenovo Thinkpad. The next rank is 2 assigned toDell Vostro.

The FIRST_VALUE and LAST_VALUE functions

TheFIRST_VALUE() function returns a value evaluated against the first row within its partition, whereas theLAST_VALUE() function returns a value evaluated against the last row in its partition.

The following statement uses theFIRST_VALUE() to return the lowest price for every product group.

SELECTproduct_name,group_name,price,FIRST_VALUE (price)OVER (PARTITION BY group_nameORDER BYprice)AS lowest_price_per_groupFROMproductsINNER JOIN product_groupsUSING (group_id);

PostgreSQL Window Function - FIRST_VALUE functionThe following statement uses theLAST_VALUE() function to return the highest price for every product group.

SELECTproduct_name,group_name,price,LAST_VALUE (price)OVER (PARTITION BY group_nameORDER BYpriceRANGE BETWEEN UNBOUNDED PRECEDINGAND UNBOUNDED FOLLOWING)AS highest_price_per_groupFROMproductsINNER JOIN product_groupsUSING (group_id);

PostgreSQL Window Function - LAST_VALUE functionNotice that we added the frame clauseRANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING because by default the frame clause isRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

The LAG and LEAD functions

TheLAG() function has the ability to access data from the previous row, while theLEAD() function can access data from the next row.

BothLAG() andLEAD() functions have the same syntax as follows:

LAG  (expression [,offset] [,default]) over_clause;LEAD (expression [,offset] [,default]) over_clause;

In this syntax:

  • expression – a column or expression to compute the returned value.
  • offset – the number of rows preceding (LAG)/ following (LEAD) the current row. It defaults to 1.
  • default – the default returned value if theoffset goes beyond the scope of the window. Thedefault isNULL if you skip it.

The following statement uses theLAG() function to return the prices from the previous row and calculates the difference between the price of the current row and the previous row.

SELECTproduct_name,group_name,price,LAG (price,1)OVER (PARTITION BY group_nameORDER BYprice)AS prev_price,price- LAG (price,1)OVER (PARTITION BY group_nameORDER BYprice)AS cur_prev_diffFROMproductsINNER JOIN product_groupsUSING (group_id);

PostgreSQL Window Function - LAG functionThe following statement uses theLEAD() function to return the prices from the next row and calculates the difference between the price of the current row and the next row.

SELECTproduct_name,group_name,price,LEAD (price,1)OVER (PARTITION BY group_nameORDER BYprice)AS next_price,price- LEAD (price,1)OVER (PARTITION BY group_nameORDER BYprice)AS cur_next_diffFROMproductsINNER JOIN product_groupsUSING (group_id);

PostgreSQL Window Function - LEAD functionIn this tutorial, we have introduced you to the PostgreSQL window functions and shown you some examples of using them to query data.

Last updated on

Was this page helpful?
Thank you for your feedback!

[8]ページ先頭

©2009-2025 Movatter.jp