Analyze data using pipe syntax
This tutorial shows you how to write queries using pipe syntax to analyze data.
Pipe syntax is an extension to GoogleSQL that supports a linear querystructure designed to make your queries easier to read, write, and maintain.Pipe syntax consists of the pipe symbol|>, apipe operatorname, and any arguments. For more information, see the following resources:
- For an introduction to pipe syntax, seeWork with pipe query syntax.
- For full syntax details, see thePipe query syntaxreference documentation.
In this tutorial, you build a complex query in pipe syntax using the publiclyavailablebigquery-public-data.austin_bikeshare.bikeshare_trips table,which contains data about bicycle trips.
Objectives
- View table data by starting a query with a
FROMclause. - Add columns by using the
EXTENDpipe operator. - Aggregate data by day and week by using the
AGGREGATEpipe operator. - Aggregate data over a sliding window by using the
CROSS JOINpipe operator. - Filter data by using the
WHEREpipe operator. - Compare the linear query structure of pipe syntax to the nested querystructure of standard syntax when performing multi-level aggregations.
Before you begin
To get started using a BigQuery public dataset, you must create or select a project. The first terabyte of data processed per month is free, so you can start querying public datasets without enabling billing. If you intend to go beyond thefree tier, you must also enable billing.
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (
roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.createpermission.Learn how to grant roles.
Verify that billing is enabled for your Google Cloud project.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (
roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.createpermission.Learn how to grant roles.
Verify that billing is enabled for your Google Cloud project.
- BigQuery is automatically enabled in new projects. To activate BigQuery in a preexisting project,
Enable the BigQuery API.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (
roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enablepermission.Learn how to grant roles.
For more information about the different ways to run queries, seeRun a query.
View table data
To retrieve all the data from thebikeshare_trips table,run the following query:
Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`;Standard syntax
SELECT*FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`;In pipe syntax, the query can start with aFROM clausewithout aSELECT clause to return table results.
The result is similar to the following:
+----------+-----------------+---------+-----------+-------------------------+-----+| trip_id | subscriber_type | bike_id | bike_type | start_time | ... |+----------+-----------------+---------+-----------+-------------------------+-----+| 28875008 | Pay-as-you-ride | 18181 | electric | 2023-02-12 12:46:32 UTC | ... || 28735401 | Explorer | 214 | classic | 2023-01-13 12:01:45 UTC | ... || 29381980 | Local365 | 21803 | electric | 2023-04-20 08:43:46 UTC | ... || ... | ... | ... | ... | ... | ... |+----------+-----------------+---------+-----------+-------------------------+-----+
Add columns
In thebikeshare_trips table, thestart_time column is a timestamp, but youmight want to add a column that only shows the date of the trip. To add a columnin pipe syntax, use theEXTEND pipe operator:
Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate;Standard syntax
SELECT*,CAST(start_timeASDATE)ASdateFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`;The result is similar to the following:
+----------+-----------------+---------+-----------+-------------------------+------------+-----+| trip_id | subscriber_type | bike_id | bike_type | start_time | date | ... |+----------+-----------------+---------+-----------+-------------------------+------------+-----+| 28875008 | Pay-as-you-ride | 18181 | electric | 2023-02-12 12:46:32 UTC | 2023-02-12 | ... || 28735401 | Explorer | 214 | classic | 2023-01-13 12:01:45 UTC | 2023-01-13 | ... || 29381980 | Local365 | 21803 | electric | 2023-04-20 08:43:46 UTC | 2023-04-20 | ... || ... | ... | ... | ... | ... | ... | ... |+----------+-----------------+---------+-----------+-------------------------+------------+-----+
Aggregate daily data
You can group by date to find the total number of trips taken and the bikes usedper day.
- Use the
AGGREGATEpipe operatorwith theCOUNTfunction to find the total number of trips taken and bikesused. Use the
GROUP BYclause to group the results by date.
Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>AGGREGATECOUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikesGROUPBYdate;Standard syntax
SELECTCAST(start_timeASDATE)ASdate,COUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikesFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`GROUPBYdate;The result is similar to the following:
+------------+-------+----------------+| date | trips | distinct_bikes |+------------+-------+----------------+| 2023-04-20 | 841 | 197 || 2023-01-27 | 763 | 148 || 2023-06-12 | 562 | 202 || ... | ... | ... |+------------+-------+----------------+Order results
To sort the results in descending order by thedate column, add theDESCsuffix to theGROUP BY clause:
Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>AGGREGATECOUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikesGROUPBYdateDESC;Standard syntax
SELECTCAST(start_timeASDATE)ASdate,COUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikesFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`GROUPBYdateORDERBYdateDESC;The result is similar to the following:
+------------+-------+----------------+| date | trips | distinct_bikes |+------------+-------+----------------+| 2024-06-30 | 331 | 90 || 2024-06-29 | 395 | 123 || 2024-06-28 | 437 | 137 || ... | ... | ... |+------------+-------+----------------+In pipe syntax, you can add the sorting suffix directly to theGROUP BY clausewithout using theORDER BY pipe operator.Adding the suffix to theGROUP BY clause is one of several optionalshorthand ordering features withAGGREGATEthat pipe syntax supports. In standard syntax, this isn't possible and youmust use theORDER BY clause for sorting.
Aggregate weekly data
Now that you have data on the number of bikes used each day, you can buildon your query to find the number of distinct bikes used over each seven-daywindow.
To update the rows in your table to display weeks instead of days, use theDATE_TRUNC functionin theGROUP BY clause and set the granularity toWEEK:
Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>AGGREGATECOUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikes,GROUPBYDATE_TRUNC(date,WEEK)ASdateDESC;Standard syntax
SELECTDATE_TRUNC(CAST(start_timeASDATE),WEEK)ASdate,COUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikes,FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`GROUPBYdateORDERBYdateDESC;The result is similar to the following:
+------------+-------+----------------+| date | trips | distinct_bikes |+------------+-------+----------------+| 2024-06-30 | 331 | 90 || 2024-06-23 | 3206 | 213 || 2024-06-16 | 3441 | 212 || ... | ... | ... |+------------+-------+----------------+Aggregate over a sliding window
The results in the preceding section show trips in afixed window betweenstart and end dates, such as2024-06-23 through2024-06-29. Instead, you might want to seetrips in asliding window, over a seven-day period thatmoves forward in time with each new day. In other words, for any given date youmight want to know about the number of trips taken and bikes used over thefollowing week.
To apply a sliding window to your data, first copy each trip forward sixadditionalactive days from its start date. Then, compute the dates of theactive days by using theDATE_ADD function. Finally, aggregate the trips andbike IDs for each active day.
To copy your data forward, use the
GENERATE_ARRAYfunction and across join:Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days;Standard syntax
SELECT*,CAST(start_timeASDATE)ASdateFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days;The
GENERATE_ARRAYfunction creates an array with seven elements,0to6. TheCROSS JOIN UNNESToperation creates seven copies of each row, with anewdiff_dayscolumn that contains one of the array element values from0to6for each row. You can use thediff_daysvalues as theadjustment to the original date to slide the window forward by that manydays, up to seven days past the original date.To see the calculated active dates for trips, use the
EXTENDpipe operatorwith theDATE_ADDfunction to create a column calledactive_datethat contains the start date plus the value in thediff_dayscolumn:Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days|>EXTENDDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date;Standard syntax
SELECT*,DATE_ADD(date,INTERVALdiff_daysDAY)ASactive_dateFROM(SELECT*,CAST(start_timeASDATE)ASdateFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days)For example, a trip that starts on
2024-05-20is alsoconsidered active on each day through2024-05-26.Finally, aggregate trips IDs and bike IDs and group by
active_date:Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days|>EXTENDDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date|>AGGREGATECOUNT(DISTINCTbike_id)ASactive_7d_bikes,COUNT(trip_id)ASactive_7d_tripsGROUPBYactive_dateDESC;Standard syntax
SELECTDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date,COUNT(DISTINCTbike_id)ASactive_7d_bikes,COUNT(trip_id)ASactive_7d_tripsFROM(SELECT*,CAST(start_timeASDATE)ASdateFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days)GROUPBYactive_dateORDERBYactive_dateDESC;The result is similar to the following:
+-------------+-----------------+-----------------+| active_date | active_7d_bikes | active_7d_trips |+-------------+-----------------+-----------------+| 2024-07-06 | 90 | 331 || 2024-07-05 | 142 | 726 || 2024-07-04 | 186 | 1163 || ... | ... | ... |+-------------+-----------------+-----------------+
Filter future dates
In the preceding query, the dates extend into the future up to six days beyondthe last date in your data. Tofilter out dates that extend beyond the end of your data, set amaximum date in your query:
- Add another
EXTENDpipe operator that uses awindow function with anOVERclause to compute the maximum date in thetable. - Use the
WHEREpipe operator to filter out thegenerated rows that are past the maximum date.
Pipe syntax
FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>EXTENDMAX(date)OVER()ASmax_date|>CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days|>EXTENDDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date|>WHEREactive_date<=max_date|>AGGREGATECOUNT(DISTINCTbike_id)ASactive_7d_bikes,COUNT(trip_id)ASactive_7d_tripsGROUPBYactive_dateDESC;Standard syntax
SELECTDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date,COUNT(DISTINCTbike_id)ASactive_7d_bikes,COUNT(trip_id)ASactive_7d_tripsFROM(SELECT*FROM(SELECT*,DATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date,MAX(date)OVER()ASmax_dateFROM(SELECT*,CAST(start_timeASDATE)ASdate,FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days))WHEREactive_date<=max_date)GROUPBYactive_dateORDERBYactive_dateDESC;The result is similar to the following:
+-------------+-----------------+-----------------+| active_date | active_7d_bikes | active_7d_trips |+-------------+-----------------+-----------------+| 2024-06-30 | 212 | 3031 || 2024-06-29 | 213 | 3206 || 2024-06-28 | 219 | 3476 || ... | ... | ... |+-------------+-----------------+-----------------+What's next
- For more information about how pipe syntax works, seeWork with pipe query syntax.
- For more technical information, see thePipe query syntaxreference documentation.
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.