Analyze data using pipe syntax

This tutorial shows you how to write queries using pipe syntax to analyze data.

Pipe syntax is an extension to GoogleSQL that supports a linear querystructure designed to make your queries easier to read, write, and maintain.Pipe syntax consists of the pipe symbol|>, apipe operatorname, and any arguments. For more information, see the following resources:

In this tutorial, you build a complex query in pipe syntax using the publiclyavailablebigquery-public-data.austin_bikeshare.bikeshare_trips table,which contains data about bicycle trips.

Objectives

Before you begin

To get started using a BigQuery public dataset, you must create or select a project. The first terabyte of data processed per month is free, so you can start querying public datasets without enabling billing. If you intend to go beyond thefree tier, you must also enable billing.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.create permission.Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.create permission.Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  5. Verify that billing is enabled for your Google Cloud project.

  6. BigQuery is automatically enabled in new projects. To activate BigQuery in a preexisting project,

    Enable the BigQuery API.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enable permission.Learn how to grant roles.

    Enable the API

For more information about the different ways to run queries, seeRun a query.

View table data

To retrieve all the data from thebikeshare_trips table,run the following query:

Pipe syntax

FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`;

Standard syntax

SELECT*FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`;

In pipe syntax, the query can start with aFROM clausewithout aSELECT clause to return table results.

The result is similar to the following:

+----------+-----------------+---------+-----------+-------------------------+-----+| trip_id  | subscriber_type | bike_id | bike_type | start_time              | ... |+----------+-----------------+---------+-----------+-------------------------+-----+| 28875008 | Pay-as-you-ride | 18181   | electric  | 2023-02-12 12:46:32 UTC | ... || 28735401 | Explorer        | 214     | classic   | 2023-01-13 12:01:45 UTC | ... || 29381980 | Local365        | 21803   | electric  | 2023-04-20 08:43:46 UTC | ... || ...      | ...             | ...     | ...       | ...                     | ... |+----------+-----------------+---------+-----------+-------------------------+-----+

Add columns

In thebikeshare_trips table, thestart_time column is a timestamp, but youmight want to add a column that only shows the date of the trip. To add a columnin pipe syntax, use theEXTEND pipe operator:

Pipe syntax

FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate;

Standard syntax

SELECT*,CAST(start_timeASDATE)ASdateFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`;

The result is similar to the following:

+----------+-----------------+---------+-----------+-------------------------+------------+-----+| trip_id  | subscriber_type | bike_id | bike_type | start_time              | date       | ... |+----------+-----------------+---------+-----------+-------------------------+------------+-----+| 28875008 | Pay-as-you-ride | 18181   | electric  | 2023-02-12 12:46:32 UTC | 2023-02-12 | ... || 28735401 | Explorer        | 214     | classic   | 2023-01-13 12:01:45 UTC | 2023-01-13 | ... || 29381980 | Local365        | 21803   | electric  | 2023-04-20 08:43:46 UTC | 2023-04-20 | ... || ...      | ...             | ...     | ...       | ...                     | ...        | ... |+----------+-----------------+---------+-----------+-------------------------+------------+-----+

Aggregate daily data

You can group by date to find the total number of trips taken and the bikes usedper day.

  • Use theAGGREGATE pipe operatorwith theCOUNT function to find the total number of trips taken and bikesused.
  • Use theGROUP BY clause to group the results by date.

Pipe syntax

FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>AGGREGATECOUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikesGROUPBYdate;

Standard syntax

SELECTCAST(start_timeASDATE)ASdate,COUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikesFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`GROUPBYdate;

The result is similar to the following:

+------------+-------+----------------+| date       | trips | distinct_bikes |+------------+-------+----------------+| 2023-04-20 | 841   | 197            || 2023-01-27 | 763   | 148            || 2023-06-12 | 562   | 202            || ...        | ...   | ...            |+------------+-------+----------------+

Order results

To sort the results in descending order by thedate column, add theDESCsuffix to theGROUP BY clause:

Pipe syntax

FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>AGGREGATECOUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikesGROUPBYdateDESC;

Standard syntax

SELECTCAST(start_timeASDATE)ASdate,COUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikesFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`GROUPBYdateORDERBYdateDESC;

The result is similar to the following:

+------------+-------+----------------+| date       | trips | distinct_bikes |+------------+-------+----------------+| 2024-06-30 | 331   | 90             || 2024-06-29 | 395   | 123            || 2024-06-28 | 437   | 137            || ...        | ...   | ...            |+------------+-------+----------------+

In pipe syntax, you can add the sorting suffix directly to theGROUP BY clausewithout using theORDER BY pipe operator.Adding the suffix to theGROUP BY clause is one of several optionalshorthand ordering features withAGGREGATEthat pipe syntax supports. In standard syntax, this isn't possible and youmust use theORDER BY clause for sorting.

Aggregate weekly data

Now that you have data on the number of bikes used each day, you can buildon your query to find the number of distinct bikes used over each seven-daywindow.

To update the rows in your table to display weeks instead of days, use theDATE_TRUNC functionin theGROUP BY clause and set the granularity toWEEK:

Pipe syntax

FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>AGGREGATECOUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikes,GROUPBYDATE_TRUNC(date,WEEK)ASdateDESC;

Standard syntax

SELECTDATE_TRUNC(CAST(start_timeASDATE),WEEK)ASdate,COUNT(*)AStrips,COUNT(DISTINCTbike_id)ASdistinct_bikes,FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`GROUPBYdateORDERBYdateDESC;

The result is similar to the following:

+------------+-------+----------------+| date       | trips | distinct_bikes |+------------+-------+----------------+| 2024-06-30 | 331   | 90             || 2024-06-23 | 3206  | 213            || 2024-06-16 | 3441  | 212            || ...        | ...   | ...            |+------------+-------+----------------+

Aggregate over a sliding window

The results in the preceding section show trips in afixed window betweenstart and end dates, such as2024-06-23 through2024-06-29. Instead, you might want to seetrips in asliding window, over a seven-day period thatmoves forward in time with each new day. In other words, for any given date youmight want to know about the number of trips taken and bikes used over thefollowing week.

To apply a sliding window to your data, first copy each trip forward sixadditionalactive days from its start date. Then, compute the dates of theactive days by using theDATE_ADD function. Finally, aggregate the trips andbike IDs for each active day.

  1. To copy your data forward, use theGENERATE_ARRAY function and across join:

    Pipe syntax

    FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days;

    Standard syntax

    SELECT*,CAST(start_timeASDATE)ASdateFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days;

    TheGENERATE_ARRAY function creates an array with seven elements,0 to6. TheCROSS JOIN UNNEST operation creates seven copies of each row, with anewdiff_days column that contains one of the array element values from0 to6 for each row. You can use thediff_days values as theadjustment to the original date to slide the window forward by that manydays, up to seven days past the original date.

  2. To see the calculated active dates for trips, use theEXTEND pipe operatorwith theDATE_ADD function to create a column calledactive_datethat contains the start date plus the value in thediff_days column:

    Pipe syntax

    FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days|>EXTENDDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date;

    Standard syntax

    SELECT*,DATE_ADD(date,INTERVALdiff_daysDAY)ASactive_dateFROM(SELECT*,CAST(start_timeASDATE)ASdateFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days)

    For example, a trip that starts on2024-05-20 is alsoconsidered active on each day through2024-05-26.

  3. Finally, aggregate trips IDs and bike IDs and group byactive_date:

    Pipe syntax

    FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days|>EXTENDDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date|>AGGREGATECOUNT(DISTINCTbike_id)ASactive_7d_bikes,COUNT(trip_id)ASactive_7d_tripsGROUPBYactive_dateDESC;

    Standard syntax

    SELECTDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date,COUNT(DISTINCTbike_id)ASactive_7d_bikes,COUNT(trip_id)ASactive_7d_tripsFROM(SELECT*,CAST(start_timeASDATE)ASdateFROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days)GROUPBYactive_dateORDERBYactive_dateDESC;

    The result is similar to the following:

    +-------------+-----------------+-----------------+| active_date | active_7d_bikes | active_7d_trips |+-------------+-----------------+-----------------+| 2024-07-06  | 90              | 331             || 2024-07-05  | 142             | 726             || 2024-07-04  | 186             | 1163            || ...         | ...             | ...             |+-------------+-----------------+-----------------+

Filter future dates

In the preceding query, the dates extend into the future up to six days beyondthe last date in your data. Tofilter out dates that extend beyond the end of your data, set amaximum date in your query:

  1. Add anotherEXTEND pipe operator that uses awindow function with anOVER clause to compute the maximum date in thetable.
  2. Use theWHERE pipe operator to filter out thegenerated rows that are past the maximum date.

Pipe syntax

FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`|>EXTENDCAST(start_timeASDATE)ASdate|>EXTENDMAX(date)OVER()ASmax_date|>CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days|>EXTENDDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date|>WHEREactive_date<=max_date|>AGGREGATECOUNT(DISTINCTbike_id)ASactive_7d_bikes,COUNT(trip_id)ASactive_7d_tripsGROUPBYactive_dateDESC;

Standard syntax

SELECTDATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date,COUNT(DISTINCTbike_id)ASactive_7d_bikes,COUNT(trip_id)ASactive_7d_tripsFROM(SELECT*FROM(SELECT*,DATE_ADD(date,INTERVALdiff_daysDAY)ASactive_date,MAX(date)OVER()ASmax_dateFROM(SELECT*,CAST(start_timeASDATE)ASdate,FROM`bigquery-public-data.austin_bikeshare.bikeshare_trips`CROSSJOINUNNEST(GENERATE_ARRAY(0,6))ASdiff_days))WHEREactive_date<=max_date)GROUPBYactive_dateORDERBYactive_dateDESC;

The result is similar to the following:

+-------------+-----------------+-----------------+| active_date | active_7d_bikes | active_7d_trips |+-------------+-----------------+-----------------+| 2024-06-30  | 212             | 3031            || 2024-06-29  | 213             | 3206            || 2024-06-28  | 219             | 3476            || ...         | ...             | ...             |+-------------+-----------------+-----------------+

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.