Using Sensitive Data Protection to scan BigQuery data

Knowing where your sensitive data exists is often the first step in ensuring that it is properly secured and managed. This knowledge can help reduce the risk of exposing sensitive details such as credit card numbers, medical information, Social Security numbers, driver's license numbers, addresses, full names, and company-specific secrets. Periodic scanning of your data can also help with compliance requirements and ensure best practices are followed as your data grows and changes with use. To help meet compliance requirements, use Sensitive Data Protection to inspect your BigQuery tables and to help protect your sensitive data.

There are two ways to scan your BigQuery data:

  • Sensitive data profiling. Sensitive Data Protection can generate profiles about BigQuery data across an organization, folder, or project. Data profiles contain metrics and metadata about your tables and help you determine where sensitive and high-risk data reside. Sensitive Data Protection reports these metrics at the project, table, and column levels. For more information, see Data profiles for BigQuery data.

  • On-demand inspection. Sensitive Data Protection can perform a deep inspection on a single table or a subset of columns and report its findings down to the cell level. This kind of inspection can help you identify individual instances of specific data types, such as the precise location of a credit card number inside a table cell. You can do an on-demand inspection through the Sensitive Data Protection page in the Google Cloud console, the BigQuery page in the Google Cloud console, or programmatically through the DLP API.

This page describes how to do an on-demand inspection through the BigQuery page in the Google Cloud console.

Sensitive Data Protection is a fully managed service that lets Google Cloud customers identify and protect sensitive data at scale. Sensitive Data Protection uses more than 150 predefined detectors to identify patterns, formats, and checksums. Sensitive Data Protection also provides a set of tools to de-identify your data, including masking, tokenization, pseudonymization, date shifting, and more, all without replicating customer data.

To learn more about Sensitive Data Protection, see the Sensitive Data Protection documentation.

Before you begin

  1. Get familiar with Sensitive Data Protection pricing and how to keep Sensitive Data Protection costs under control.
  2. Enable the DLP API.

  3. Ensure that the user creating your Sensitive Data Protection jobs is granted an appropriate predefined Sensitive Data Protection IAM role or sufficient permissions to run Sensitive Data Protection jobs.

Note: When you enable the DLP API, a service account is created with a name similar to service-project_number@dlp-api.iam.gserviceaccount.com. This service account is granted the DLP API Service Agent role, which lets the service account authenticate with the BigQuery API. For more information, see Service account on the Sensitive Data Protection IAM permissions page.
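As a small illustration of the naming pattern in the note above, the following sketch builds the expected service agent address from a project number. The helper name `dlp_service_agent_email` is hypothetical, and because the note only says the name is "similar to" this pattern, treat the result as a value to compare against what you see in IAM, not as a guarantee.

```python
def dlp_service_agent_email(project_number: int) -> str:
    """Return the service agent address implied by the pattern in the note.

    Hypothetical helper: the pattern is taken from the note above, which
    describes the name only as "similar to" this form.
    """
    if project_number <= 0:
        raise ValueError("project_number must be a positive integer")
    return f"service-{project_number}@dlp-api.iam.gserviceaccount.com"


if __name__ == "__main__":
    # 123456789012 is a placeholder project number, not a real project.
    print(dlp_service_agent_email(123456789012))
```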

Scanning BigQuery data using the Google Cloud console

To scan BigQuery data, you create a Sensitive Data Protection job that analyzes a table. You can scan a BigQuery table quickly by using the Scan with Sensitive Data Protection option on the BigQuery page in the Google Cloud console.
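For readers who prefer the programmatic route through the DLP API, the following is a minimal sketch of what such an inspection job can look like as a request for the `google-cloud-dlp` Python client. It is not the request the console generates. The function names, the default row limit, and the `global` location are assumptions made for illustration, and the field names follow the DLP API v2 request shape as we understand it, so check them against the API reference before relying on them.

```python
from typing import Any, Dict, List, Optional


def build_inspect_job_request(
    project_id: str,
    dataset_id: str,
    table_id: str,
    info_types: List[str],
    rows_limit: int = 1000,
    include_quote: bool = False,
    findings_table: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build a create-job request for scanning one BigQuery table.

    All parameter names and defaults here are illustrative assumptions.
    """
    inspect_job: Dict[str, Any] = {
        # What to scan: a single table, limited to a sample of rows.
        "storage_config": {
            "big_query_options": {
                "table_reference": {
                    "project_id": project_id,
                    "dataset_id": dataset_id,
                    "table_id": table_id,
                },
                "rows_limit": rows_limit,
            }
        },
        # What to look for: the infoTypes to detect.
        "inspect_config": {
            "info_types": [{"name": name} for name in info_types],
            "include_quote": include_quote,
        },
        # What to do with findings: empty means summary statistics only.
        "actions": [],
    }
    if findings_table is not None:
        # Save detailed findings to a BigQuery table.
        inspect_job["actions"].append(
            {"save_findings": {"output_config": {"table": findings_table}}}
        )
    return {
        "parent": f"projects/{project_id}/locations/global",
        "inspect_job": inspect_job,
    }


def submit_inspect_job(request: Dict[str, Any]) -> Any:
    """Send the request. Requires google-cloud-dlp and credentials."""
    from google.cloud import dlp_v2  # imported lazily; optional dependency

    client = dlp_v2.DlpServiceClient()
    return client.create_dlp_job(request=request)


if __name__ == "__main__":
    # Placeholder identifiers; nothing is sent to the API here.
    request = build_inspect_job_request(
        "my-project", "my_dataset", "my_table", ["CREDIT_CARD_NUMBER"]
    )
    print(request["parent"])
```

Building the request as a plain dictionary keeps it inspectable without credentials; only `submit_inspect_job` contacts the service.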

To scan a BigQuery table using Sensitive Data Protection:

  1. In the Google Cloud console, go to the BigQuery page.

  2. In the left pane, click Explorer.

    If you don't see the left pane, click Expand left pane to open the pane.

  3. In the Explorer pane, expand your project, click Datasets, and then click your dataset.

  4. Click Overview > Tables, and then select your table.

  5. Click Open > Scan with Sensitive Data Protection. The Sensitive Data Protection job creation page opens in a new tab.

  6. For Step 1: Choose input data, enter a job ID. The values in the Location section are automatically generated. Also, the Sampling section is automatically configured to run a sample scan against your data, but you can adjust the settings as needed.

  7. Click Continue.

  8. Optional: For Step 2: Configure detection, you can configure what types of data to look for, called infoTypes.

    Do one of the following:

    • To select from the list of predefined infoTypes, click Manage infoTypes. Then, select the infoTypes you want to search for.
    • To use an existing inspection template, in the Template name field, enter the template's full resource name.

    For more information on infoTypes, see InfoTypes and infoType detectors in the Sensitive Data Protection documentation.

  9. Click Continue.

  10. Optional: For Step 3: Add actions, turn on Save to BigQuery to publish your Sensitive Data Protection findings to a BigQuery table. If you don't store findings, the completed job contains only statistics about the number of findings and their infoTypes. Saving findings to BigQuery saves details about the precise location and confidence of each individual finding.

  11. Optional: If you turned on Save to BigQuery, in the Save to BigQuery section, enter the following information:

    • Project ID: the project ID where your results are stored.
    • Dataset ID: the name of the dataset that stores your results.
    • Optional: Table ID: the name of the table that stores your results. If no table ID is specified, a default name is assigned to a new table similar to the following: dlp_googleapis_date_1234567890. If you specify an existing table, findings are appended to it.

    To include the actual content that was detected, turn on Include quote.

  12. Click Continue.

  13. Optional: For Step 4: Schedule, configure a time span or schedule by selecting either Specify time span or Create a trigger to run the job on a periodic schedule.

  14. Click Continue.

  15. Optional: On the Review page, examine the details of your job. If needed, adjust the previous settings.

  16. Click Create.

  17. After the Sensitive Data Protection job completes, you are redirected to the job details page, and you're notified by email. You can view the results of the scan on the job details page, or you can click the link to the Sensitive Data Protection job details page in the job completion email.

  18. If you chose to publish Sensitive Data Protection findings to BigQuery, on the Job details page, click View Findings in BigQuery to open the table in the Google Cloud console. You can then query the table and analyze your findings. For more information on querying your results in BigQuery, see Querying Sensitive Data Protection findings in BigQuery in the Sensitive Data Protection documentation.
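As one example of the kind of analysis the last step mentions, the sketch below builds a query that counts saved findings per infoType. It assumes the findings table exposes the infoType as a nested `info_type.name` column, which is our understanding of the saved-findings schema, so confirm the column names in your own table first. The function name and the identifier check are hypothetical and deliberately strict.

```python
import re

# Conservative pattern for project, dataset, and table identifiers.
# An assumption for this sketch; real naming rules are somewhat broader.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_\-]+$")


def findings_by_info_type_sql(project_id: str, dataset_id: str, table_id: str) -> str:
    """Return a BigQuery query that counts findings per infoType."""
    for part in (project_id, dataset_id, table_id):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"unexpected characters in identifier: {part!r}")
    return (
        "SELECT info_type.name AS info_type_name, COUNT(*) AS finding_count\n"
        f"FROM `{project_id}.{dataset_id}.{table_id}`\n"
        "GROUP BY info_type_name\n"
        "ORDER BY finding_count DESC"
    )


if __name__ == "__main__":
    # Placeholder names; paste the printed query into the BigQuery editor.
    print(findings_by_info_type_sql("my-project", "dlp_results", "findings"))
```

The identifier check exists only because the table name is interpolated into the query text; it rejects anything outside letters, digits, underscores, and hyphens.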

What's next

If you want to redact or otherwise de-identify the sensitive data that the Sensitive Data Protection scan found, see the following:

Last updated 2025-12-15 UTC.