Reads
This page describes the types of read requests you can send to Bigtable, discusses performance implications, and presents a few recommendations for specific types of queries. Before you read this page, you should be familiar with the overview of Bigtable.
Overview
Read requests to Bigtable stream back the contents of the requested rows in key order, meaning they are returned in the order in which they are stored. You can read any writes that have returned a response.
The queries that your table supports should help determine the type of read that is best for your use case. Bigtable read requests fall into two general categories:
- Reading a single row
- Scans, or reading multiple rows
Reads are atomic at the row level. This means that when you send a read request for a row, Bigtable returns either the entire row or, in the event the request fails, none of the row. A partial row is never returned unless you specifically request one.
We strongly recommend that you use our Cloud Bigtable client libraries to read data from a table instead of calling the API directly. Code samples showing how to send read requests are available in multiple languages. All read requests make the ReadRows API call.
Reading data with Data Boost serverless compute
Bigtable Data Boost lets you run batch read jobs and queries without affecting daily application traffic. Data Boost is a serverless compute service that you can use to read your Bigtable data while your core application uses your cluster's nodes for compute.
Data Boost is ideal for scans and is not recommended for single-row reads. You can't use Data Boost for reverse scans. For more information and eligibility criteria, see the Data Boost overview.
Single-row reads
You can request a single row based on the row key. Single-row reads, also known as point reads, are not compatible with Data Boost. Code samples are available for the following variations:
Scans
Scans are the most common way to read Bigtable data. You can read a range of contiguous rows or multiple ranges of rows from Bigtable by specifying a row key prefix or by specifying beginning and ending row keys. Code samples are available for the following variations:
Reverse scans
Reverse scans let you read a range of rows backwards, either by specifying a row key prefix or by specifying a range of rows. The row key prefix is used as the initiating point of the scan to read backwards. If you specify a range of rows, the end row key is used as the initiating point of the scan.
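Whether you scan forward or in reverse, a row key prefix is equivalent to a half-open row range whose end key is the prefix with its last byte incremented. A stdlib sketch of that conversion (the helper name is ours; the conversion rule is the standard one for lexicographic key ranges):

```python
def prefix_to_range(prefix: bytes):
    """Convert a row key prefix into the half-open row range [start, end)
    that a prefix scan covers: increment the last byte that is not 0xFF."""
    start = prefix
    p = bytearray(prefix)
    while p and p[-1] == 0xFF:
        p.pop()
    if not p:
        return start, b""  # a prefix of all 0xFF bytes: range is unbounded
    p[-1] += 1
    return start, bytes(p)

# '#' is 0x23, so the prefix "123ABC#" covers ["123ABC#", "123ABC$").
print(prefix_to_range(b"123ABC#"))  # (b'123ABC#', b'123ABC$')
```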
Note: If you are using the HBase client libraries and scanning a range of rows in reverse, specify the start row key as the initiating point of the scan.
Scanning in reverse order can be useful in the following scenarios:
- You want to find an event (row) and then read the previous N events.
- You want to find the highest value prior to a given value. This can be helpful when you store time series data using a timestamp as a row key suffix.
Reverse scans are less efficient than forward scans. In general, design your row keys so that most scans are forward. Use reverse scans for short scans, such as 50 rows or fewer, to maintain low-latency response times.
To scan in reverse, set the ReadRowsRequest field `reversed` to `true`. The default is `false`.
Reverse scans are available when you use the following client libraries:
- Bigtable client library for C++ version 2.18.0 or later
- Bigtable client library for Go version 1.21.0 or later
- Bigtable client library for Java version 2.24.1 or later
- Bigtable HBase client for Java version 2.10.0 or later
For code samples demonstrating how to use reverse scans, see Scan in reverse.
Use case examples
The following examples show how reverse scans can be used to find the last time a customer changed their password, and to find price fluctuations for a product around a particular day.
Password resets
Suppose that your row keys each contain a customer ID and a date, in the format `123ABC#2022-05-02`, and that one of the columns is `password_reset`, which stores the hour when the password was reset. Bigtable automatically stores the data lexicographically, like the following. Note that the column does not exist for rows (days) when the password was not reset.
`123ABC#2022-02-12,password_reset:03`
`123ABC#2022-04-02,password_reset:11`
`123ABC#2022-04-14`
`123ABC#2022-05-02`
`223ABC#2022-05-22`
If you want to find the last time that customer `123ABC` reset their password, you can scan in reverse over the range `123ABC#` to `123ABC#<DATE>`, using today's date or a date in the future, for all rows that contain the column `password_reset`, with a row limit of 1.
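The logic of that reverse scan can be modeled in plain Python over the sample keys above. The in-memory dict stands in for the table, and the future end date `2099-12-31` is an illustrative choice; in practice you would pass the range, filter, and limit to a client library:

```python
from bisect import bisect_left

# Sample rows: row key -> columns (mirrors the keys shown above).
rows = {
    "123ABC#2022-02-12": {"password_reset": "03"},
    "123ABC#2022-04-02": {"password_reset": "11"},
    "123ABC#2022-04-14": {},
    "123ABC#2022-05-02": {},
    "223ABC#2022-05-22": {},
}

def reverse_scan(rows, start, end, has_column, limit):
    """Simulate a reverse scan over the half-open range [start, end):
    walk keys backwards, keep rows containing the given column,
    and stop once the row limit is reached."""
    keys = sorted(rows)
    lo = bisect_left(keys, start)
    hi = bisect_left(keys, end)
    out = []
    for key in reversed(keys[lo:hi]):
        if has_column in rows[key]:
            out.append(key)
            if len(out) == limit:
                break
    return out

# Scan 123ABC# .. 123ABC#<future date> with a row limit of 1.
last = reverse_scan(rows, "123ABC#", "123ABC#2099-12-31", "password_reset", 1)
print(last)  # ['123ABC#2022-04-02']
```

The result is the most recent row that actually has a `password_reset` column, which is exactly what a row limit of 1 on a reverse scan gives you.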
Price changes
In this example, your row keys contain values for product, model, and timestamp, and one of the columns contains the price for the product and model at a given time.
`productA#model2#1675604471,price:82.63`
`productA#model2#1676219411,price:82.97`
`productA#model2#1677681011,price:83.15`
`productA#model2#1680786011,price:83.99`
`productA#model2#1682452238,price:83.12`
If you want to find price fluctuations surrounding the price on February 14, 2023, even though a row key for that particular date doesn't exist in the table, you can do a forward scan starting from row key `productA#model2#1676376000` for N rows, and then do a reverse scan for the same number of rows from the same starting row. The two scans give you the prices before and after the given time.
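The paired forward and reverse scans can be sketched with the standard library over the sample keys above. Note that lexicographic order matches numeric order here only because every timestamp has the same number of digits, which is itself a useful row key design constraint:

```python
from bisect import bisect_left

# Sample row keys from above, in stored (lexicographic) order.
keys = [
    "productA#model2#1675604471",
    "productA#model2#1676219411",
    "productA#model2#1677681011",
    "productA#model2#1680786011",
    "productA#model2#1682452238",
]

def window_around(keys, start_key, n):
    """Forward scan of n rows starting at start_key, plus a reverse scan
    of n rows from the same starting point. start_key itself does not
    need to exist in the table."""
    i = bisect_left(keys, start_key)
    after = keys[i:i + n]                 # forward scan
    before = keys[max(0, i - n):i][::-1]  # reverse scan
    return before, after

before, after = window_around(keys, "productA#model2#1676376000", 2)
print(before)  # ['productA#model2#1676219411', 'productA#model2#1675604471']
print(after)   # ['productA#model2#1677681011', 'productA#model2#1680786011']
```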
Filtered reads
If you only need rows that contain specific values, or partial rows, you can use a filter with your read request. Filters let you be highly selective about the data that you want.
Filters also let you make sure that reads match the garbage collection policies that your table is using. This is particularly useful if you frequently write new timestamped cells to existing columns. Because garbage collection can take up to a week to remove expired data, using a timestamp range filter to read data can ensure that you don't read more data than you need.
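A minimal sketch of the timestamp-range filtering semantics, in pure Python. The cell list, timestamps, and seven-day cutoff are illustrative; in practice you would attach a timestamp range filter to the read request:

```python
# Hypothetical cells in one column: (timestamp_micros, value),
# newest first, as Bigtable returns them.
now_us = 1_700_000_000 * 1_000_000
day_us = 86_400 * 1_000_000
cells = [
    (now_us - 1 * day_us, b"v3"),
    (now_us - 3 * day_us, b"v2"),
    (now_us - 10 * day_us, b"v1"),  # expired, but possibly not yet collected
]

def timestamp_range_filter(cells, start_us, end_us):
    """Keep cells whose timestamp falls in [start_us, end_us),
    mirroring the semantics of a timestamp range filter."""
    return [c for c in cells if start_us <= c[0] < end_us]

# Read only the last 7 days, skipping expired-but-uncollected cells.
recent = timestamp_range_filter(cells, now_us - 7 * day_us, now_us)
print([v for _, v in recent])  # [b'v3', b'v2']
```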
The overview of filters provides detailed explanations of the types of filters that you can use. Using filters shows examples in multiple languages.
Read data from an authorized view
To read data from an authorized view, you must use one of the following:
- gcloud CLI
- Bigtable client for Java
The other Bigtable client libraries don't yet support view access.
Any method that calls the ReadRows or SampleRowKeys method of the Bigtable Data API is supported. You provide the authorized view ID in addition to the table ID when you create your client.
Read data from a continuous materialized view
Preview
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of theService Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see thelaunch stage descriptions.
You can read data from a continuous materialized view using SQL or the ReadRows Data API call. Continuous materialized views are read-only. Data in a materialized view is typed based on the query that defines it.
SQL
To read data from a continuous materialized view using SQL, you can use either the Bigtable Studio query editor or one of the client libraries that support SQL queries.
SQL automatically exposes query results as typed columns, so there's no needto handle encoding in your query.
When you create a continuous materialized view, Bigtable automatically creates a row key schema for the table that defines the structured row keys for the view. For more information about querying structured row keys with SQL, see Structured row key queries.
Data API
If you plan to read from a continuous materialized view with a ReadRows call from one of the client libraries for Bigtable, you should review the SQL query used to define the view. Take note of whether the view has a defined `_key` column, which is recommended for views that are meant to be read using ReadRows, and whether it has a `_timestamp` column.
You must also know each column's type and decode the column data in your application code.
Aggregated values in a continuous materialized view are stored using the encodings described in the following table, based on the output type of the column from the view definition.
| Type | Encoding |
|---|---|
| BOOL | 1 byte value, 1 = true, 0 = false |
| BYTES | No encoding |
| INT64 (or INT, SMALLINT, INTEGER, BIGINT, TINYINT, BYTEINT) | 64-bit big-endian |
| FLOAT64 | 64-bit IEEE 754, excluding NaN and +/-inf |
| STRING | UTF-8 |
| TIME/TIMESTAMP | 64-bit integer representing the number of microseconds since the Unix epoch (consistent with GoogleSQL) |
In addition to knowing the type for each column in the view, you need to know the column family and column qualifier. The default column family is called `default`, and the column qualifier is the alias specified in the defining query. For example, consider a continuous materialized view defined with this query:
SELECT _key, SUM(clicks) AS sum_clicks
FROM mytable
GROUP BY _key
When you query the view with ReadRows, you provide the column family `default` and the column qualifier `sum_clicks`.
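Decoding these values in application code can be sketched with the Python standard library, following the encodings in the table above. The helper names and sample bytes are ours, and big-endian byte order for FLOAT64 is an assumption based on the INT64 encoding:

```python
import struct
from datetime import datetime, timezone

def decode_int64(raw: bytes) -> int:
    # INT64 and its aliases: 64-bit big-endian signed integer.
    return struct.unpack(">q", raw)[0]

def decode_float64(raw: bytes) -> float:
    # FLOAT64: 64-bit IEEE 754 (big-endian assumed here).
    return struct.unpack(">d", raw)[0]

def decode_timestamp(raw: bytes) -> datetime:
    # TIME/TIMESTAMP: 64-bit count of microseconds since the Unix epoch.
    micros = struct.unpack(">q", raw)[0]
    return datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc)

# A hypothetical sum_clicks value of 42, as it would be stored in the view.
raw = struct.pack(">q", 42)
print(decode_int64(raw))  # 42
```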
Reads and performance
Reads that use filters are slower than reads without filters, and they increase CPU utilization. On the other hand, they can significantly reduce the amount of network bandwidth that you use by limiting the amount of data that is returned. In general, use filters to control throughput efficiency, not latency.
If you want to optimize your read performance, consider the followingstrategies:
Restrict the rowset as much as possible. Limiting the number of rows that your nodes have to scan is the first step toward improving time to first byte and overall query latency. If you don't restrict the rowset, Bigtable will almost certainly have to scan your entire table. This is why we recommend that you design your schema in a way that allows your most common queries to work this way.
For additional performance tuning after you've restricted the rowset, try adding a basic filter. Restricting the set of columns or the number of versions returned generally doesn't increase latency and can sometimes help Bigtable seek more efficiently past irrelevant data in each row.
If you want to fine-tune your read performance even more after the first two strategies, consider using a more complicated filter. You might try this for a few reasons:
- You're still getting back a lot of data you don't want.
- You want to simplify your application code by pushing the query down into Bigtable.
Be aware, however, that filters requiring conditions, interleaves, or regular expression matching on large values tend to do more harm than good if they allow most of the scanned data through. This harm comes in the form of increased CPU utilization in your cluster without large savings client-side.
In addition to these strategies, avoid reading a large number of non-contiguous row keys or row ranges in a single read request. When you request hundreds of row keys or row ranges in a single request, Bigtable scans the table and reads the requested rows sequentially. This lack of parallelism affects the overall latency, and any reads that hit a hot node can increase the tail latency. The more row ranges requested, the longer the read takes to complete. If this latency is unacceptable, you should instead send multiple concurrent requests that each retrieve fewer row ranges.
Tip: We recommend that a ReadRows read request contain no more than 100 row keys. In general, reading more row ranges in a single request optimizes throughput but not latency, while reading fewer row ranges in multiple concurrent requests optimizes latency but not throughput. Finding the right balance between latency and throughput depends on your application's requirements, and you can find it by adjusting the number of concurrent read requests and the number of row ranges in each request.
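Applying the tip above, splitting a large key list into batches of at most 100 keys, each sent as its own concurrent request, is plain list slicing. The helper name and the batch-size default are ours; the sending of each batch is whatever your client library provides:

```python
def batch_row_keys(row_keys, max_per_request=100):
    """Split row keys into chunks no larger than max_per_request,
    so each chunk can be sent as one concurrent ReadRows request."""
    return [
        row_keys[i:i + max_per_request]
        for i in range(0, len(row_keys), max_per_request)
    ]

keys = [f"user#{i:06d}" for i in range(250)]
batches = batch_row_keys(keys)
print([len(b) for b in batches])  # [100, 100, 50]
```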
Large rows
Bigtable enforces the following limits that apply to large rows:
- 256 MB is the maximum size of a row. If you need to read a row that has grown larger than this limit, you can paginate your request and use a cells per row limit filter and a cells per row offset filter. Be aware that if a write arrives for the row between the paginated read requests, the read might not be atomic.
- 512 KB is the maximum size of a ReadRows API call. If you exceed the limit, Bigtable returns an INVALID_ARGUMENT error.
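A minimal sketch of the pagination idea for oversized rows, using an in-memory list of cells in place of the row and plain slicing in place of the cells per row limit and cells per row offset filters:

```python
def read_row_paginated(cells, page_size):
    """Yield the cells of one large row in pages, mimicking a
    cells-per-row-limit filter combined with an increasing
    cells-per-row-offset filter on each successive request."""
    offset = 0
    while True:
        page = cells[offset:offset + page_size]  # offset filter, then limit
        if not page:
            return
        yield page
        offset += page_size

cells = [f"cell{i}" for i in range(7)]
pages = list(read_row_paginated(cells, 3))
print(pages)  # [['cell0', 'cell1', 'cell2'], ['cell3', 'cell4', 'cell5'], ['cell6']]
```

Because each page is a separate request against live data, a write that lands between pages can change later pages, which is why the paginated read might not be atomic.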
What's next
- Implement counters by using aggregate cells.
- Read an overview of filters.
- Look at code samples showing how to use filters.
- Read about the types of write requests you can send to Bigtable.
- Use the Bigtable emulator.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.