Perform efficient top-k retrieval Stay organized with collections Save and categorize content based on your preferences.
Note: This feature is available with the Spanner Enterprise edition and Enterprise Plus edition. For more information, see theSpanner editions overview.
Many applications query a database to populate a single page in theirapplications. In such applications, the application doesn't need all of thematches, but only the top-k matches based on index sort order. Search indexescan implement this type of search very efficiently. This page describes how tocreate and search an index that has top-k matching.
Create search indexes for top-k matches
To configure a search index for top-k matching, useORDER BY to order thesearch index by a specific column. Queries need to have anORDER BY clausethat exactly matches the search index sort order (including ascending versusdescending direction) and aLIMIT clause that requests the query to stop afterfinding k-matching rows.
You can also implement pagination using these clauses. For more information, seePaginate search queries.
For some use cases, it might make sense to maintain multiple search indexessorted by different columns. Likepartitioning,it's a trade-off between storage and write cost versus query latency.
For example, consider a table that uses the following schema:
GoogleSQL
CREATETABLEAlbums(AlbumIdSTRING(MAX)NOTNULL,RecordTimestampINT64NOTNULL,ReleaseTimestampINT64NOTNULL,ListenTimestampINT64NOTNULL,AlbumTitleSTRING(MAX),AlbumTitle_TokensTOKENLISTAS(TOKENIZE_FULLTEXT(AlbumTitle))HIDDEN)PRIMARYKEY(AlbumId);CREATESEARCHINDEXAlbumsRecordTimestampIndexONAlbums(AlbumTitle_Tokens,SingerId_Tokens)STORING(ListenTimestamp)ORDERBYRecordTimestampDESCCREATESEARCHINDEXAlbumsReleaseTimestampIndexONAlbums(AlbumTitle_Tokens)STORING(ListenTimestamp)ORDERBYReleaseTimestampDESCPostgreSQL
CREATETABLEalbums(albumidcharactervaryingNOTNULL,recordtimestampbigintNOTNULL,releasetimestampbigintNOTNULL,listentimestampbigintNOTNULL,albumtitlecharactervarying,albumtitle_tokensspanner.tokenlistGENERATEDALWAYSAS(spanner.tokenize_fulltext(albumtitle))VIRTUALHIDDEN,PRIMARYKEY(albumid));CREATESEARCHINDEXalbumsrecordtimestampindexONAlbums(albumtitle_tokens,singerid_tokens)INCLUDE(listentimestamp)ORDERBYrecordtimestampDESCCREATESEARCHINDEXalbumsreleasetimestampindexONAlbums(albumtitle_tokens)INCLUDE(listentimestamp)ORDERBYreleasetimestampDESCQuery search indexes for top-k matches
As stated previously, queries need to have anORDER BY clausethat exactly matches the search index sort order (including ascending versusdescending direction) and aLIMIT clause that requests the query to stop afterfinding k-matching rows.
The following list analyzes the efficiency of some common queries.
This query is very efficient. It selects the
AlbumsRecordTimestampIndexindex. Even if there are many albums with the word "happy", the query onlyscans a small number of rows:GoogleSQL
SELECTAlbumIdFROMAlbumsWHERESEARCH(AlbumTitle_Tokens,'happy')ORDERBYRecordTimestampDESCLIMIT10PostgreSQL
SELECTalbumidFROMalbumsWHEREspanner.search(albumtitle_tokens,'happy')ORDERBYrecordtimestampDESCLIMIT10The same query, requesting sort order by
ReleaseTimestampin descendingorder, uses theAlbumsReleaseTimestampIndexindex and is equallyefficient:GoogleSQL
SELECTAlbumIdFROMAlbumsWHERESEARCH(AlbumTitle_Tokens,'happy')ORDERBYReleaseTimestampDESCLIMIT10PostgreSQL
SELECTalbumidFROMalbumsWHEREspanner.search(albumtitle_tokens,'happy')ORDERBYreleasetimestampDESCLIMIT10A query that requests sort order by
ListenTimestampdoesn't execute atop-k query efficiently. It has to fetch all matching albums, sort them byListenTimestamp,and return the top 10. Such a query uses more resourcesif there's a large number of documents that contain the term "happy".GoogleSQL
SELECTAlbumIdFROMAlbumsWHERESEARCH(AlbumTitle_Tokens,'happy')ORDERBYListenTimestampDESCLIMIT10PostgreSQL
SELECTalbumidFROMalbumsWHEREspanner.search(albumtitle_tokens,'happy')ORDERBYlistentimestampDESCLIMIT10Similarly, a query doesn't run efficiently if it requests that results areordered using the
RecordTimestampcolumn in ascending order. It scans allrows with the word "happy", despite having aLIMIT.GoogleSQL
SELECTAlbumIdFROMAlbumsWHERESEARCH(AlbumTitle_Tokens,'happy')ORDERBYRecordTimestampASCLIMIT10PostgreSQL
SELECTalbumidFROMalbumsWHEREspanner.search(albumtitle_tokens,'happy')ORDERBYrecordtimestampASCLIMIT10
What's next
- Learn aboutfull-text search queries.
- Learn how torank search results.
- Learn how topaginate search results.
- Learn how tomix full-text and non-text queries.
- Learn how tosearch multiple columns.
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.