Introduction to optimizing query performance
This document provides an overview of optimization techniques that can improvequery performance in BigQuery. In general, queries that do less workperform better. They run faster and consume fewer resources, which can result inlower costs and fewer failures.
Query performance
Evaluating query performance in BigQuery involves several factors:
- Input data and data sources (I/O):How many bytes does your query read?
- Communication between nodes (shuffling):How many bytes does your query pass to the next stage? How many bytes does yourquery pass to each slot?
- Computation:How much CPU work does your query require?
- Outputs (materialization):How many bytes does your query write?
- Capacity and concurrency:How many slots are available and how many other queries are running at the same time?
- Query patterns:Are your queries following SQL best practices?
To evaluate specific queries or whether you are experiencing resource contention,you can useCloud Monitoring or theBigQuery administrative resource charts to monitor how your BigQuery jobs consumeresources over time. You can also use Gemini Cloud Assist toanalyze your jobs.If you identify a slow or resource-intensive query, you canfocus your performance optimizations on that query.
Some query patterns, especially those generated by business intelligence tools,can be accelerated by usingBigQuery BI Engine.BI Engine is a fast, in-memory analysis service that accelerates manySQL queries in BigQuery by intelligently caching the data you usemost frequently. BI Engine is built into BigQuery,which means you can often get better performance without any query modifications.
As with any systems, optimizing for performance sometimes involves tradeoffs.For example, using advanced SQL syntax can sometimes introduce complexity andreduce the queries' understandability for people who aren't SQL experts.Spending time on micro-optimizations for noncritical workloads could also divertresources away from building new features for your applications or from identifyingmore important optimizations. To help you achieve the highest possible return oninvestment, we recommend that you focus your optimizations on the workloads thatmatter most to your data analytics pipelines.
Optimization for capacity and concurrency
BigQuery offers two pricing models for queries:on-demand pricing andcapacity-based pricing. The on-demand model provides a shared pool of capacity, and pricing isbased on the amount of data that is processed by each query you run.
Thecapacity-based model isrecommended if you want to budget a consistent, monthly expenditureor if you need more capacity than is available with the on-demand model.When you use capacity-based pricing, you allocate dedicated query processingcapacity that is measured inslots.The cost of all bytes processed is included in the capacity-based price.In addition to fixedslot commitments,you can useautoscaling slots,which provide dynamic capacity based on your query workload.
The performance of queries that are run repeatedly on the same data can vary,and the variation is generally larger for queries using on-demand slots thanit is for queries using slot reservations.
During SQL query processing, BigQuery breaks down thecomputational capacity required to execute each stage of a query intoslots. BigQuery automatically determines thenumber of queries that can run concurrently as follows:
- On-demand model: number of slots available in the project
- Capacity-based model: number of slots available in the reservation
Queries that require more slots than are available arequeueduntil processing resources become available. After a query begins execution,BigQuery calculates how many slots each query stageuses based on the stage size and complexity and the number of slotsavailable. BigQuery uses a technique calledfair schedulingto ensure that each query has enough capacity to progress.
Access to more slots doesn't always result in faster performance for a query. However, alarger pool of slots can improve the performance of large or complexqueries, and the performance of highly concurrent workloads. Toimprove query performance, you canmodify your slot reservationsor set a higher limit forslots autoscaling.
Query plan and execution graph
BigQuery generates aquery plan each time that you run a query.Understanding this plan is critical for effective query optimization.The query plan includes execution statistics such as bytes read and slot timeconsumed. The query plan also includes details about the different stages of execution, which can help youdiagnose and improve query performance. Thequery execution graph provides a graphical interface for viewing the query plan and diagnosing query performance issues.
You can also use thejobs.get API methodor theINFORMATION_SCHEMA.JOBS viewto retrieve the query plan and timeline information. This information is used byBigQuery Visualizer,an open source tool that visually represents the flow of execution stages in aBigQuery job.
When BigQuery executes a query job, it converts the declarativeSQL statement into a graph of execution. This graph is broken up into a seriesof query stages, which themselves are composed of more granular sets ofexecution steps. BigQuery uses a heavily distributed parallelarchitecture to run these queries. The BigQuery stages model theunits of work thatmany potential workers might execute in parallel. Stages communicate with oneanother through afast, distributed shuffle architecture.
To estimate how computationally expensive a query is, you can look at thetotal number of slot seconds the query consumes. The lower the number of slotseconds, the better, because it means more resources are available to otherqueries running in the same project at the same time.
The query execution graph can help you understand howBigQuery executes queries and if certain stages dominate resourceutilization. For example, aJOIN stage that generates far more output rows thaninput rows might indicate an opportunity to filter earlier in the query.However, the managed nature of the service limits whether some details aredirectly actionable. For best practices and techniques to improve queryexecution and performance, seeOptimize query computation.
What's next
- Learn how to troubleshoot query execution issues using theBigQuery audit logs.
- Learn othercost-controlling techniquesfor BigQuery.
- View near real-time metadata about BigQuery jobs usingthe
INFORMATION_SCHEMA.JOBSview. - Learn how to monitor your BigQuery usage using theBigQuery System Tables Reports.
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.