Profile Cloud Storage data in an organization or folder

This page describes how to configure Cloud Storage data discovery at the level of an organization or folder. If you want to profile a project, see Profile Cloud Storage data in a single project.

For more information about the discovery service, see Data profiles.

Before you begin

  1. Confirm that you have the IAM permissions that are required to configure data profiles at the organization level.

    If you don't have the Organization Administrator (roles/resourcemanager.organizationAdmin) or Security Admin (roles/iam.securityAdmin) role, you can still create a scan configuration. However, after you create the scan configuration, someone with either of those roles must grant data profiling access to your service agent.

  2. You must have an inspection template in each region where you have data to be profiled. If you want to use a single template for multiple regions, you can use a template that is stored in the global region. If organizational policies prevent you from creating an inspection template in the global region, then you must set a dedicated inspection template for each region. For more information, see Data residency considerations.

    This task lets you create an inspection template in the global region only. If you need dedicated inspection templates for one or more regions, you must create those templates before performing this task.

  3. To send Pub/Sub notifications to a topic when certain events occur (such as when Sensitive Data Protection profiles a new bucket), create a Pub/Sub topic before performing this task.

  4. You can configure Sensitive Data Protection to automatically attach tags to your resources. This feature lets you conditionally grant access to those resources based on their calculated sensitivity levels. If you want to use this feature, you must first complete the tasks in Control IAM access to resources based on data sensitivity.

To generate data profiles, you need a service agent container and a service agent within it. This task lets you create them automatically.

Create a scan configuration

  1. Go to the Create scan configuration page.

  2. Go to your organization. On the toolbar, click the project selector and select your organization.

The following sections provide more information about the steps in the Create scan configuration page. At the end of each section, click Continue.

Select a discovery type

Select Cloud Storage.

Select scope

Do one of the following:

  • To configure profiling at the organization level, select Scan entire organization.
  • To configure profiling at the level of a folder, select Scan selected folder. Click Browse and select the folder.

Manage schedules

If the default profiling frequency suits your needs, you can skip this section of the Create scan configuration page.

Configure this section for the following reasons:

  • To make fine-grained adjustments to the profiling frequency of all your data or certain subsets of your data.
  • To specify the buckets that you don't want to profile.
  • To specify the buckets that you don't want profiled more than once.

To make fine-grained adjustments to profiling frequency, follow these steps:

  1. Click Add schedule.
  2. Specify whether the discovery service should profile the buckets that you selected or exclude them from profiling.

    • If you never want the buckets that match the filters to be profiled, turn off Do profile this data.

    • If you want the buckets that match the filters and conditions to be profiled at least once, leave Do profile this data on.

  3. In the Filters section, add regular expressions and tags to select the buckets that you want to include in the schedule. To be included in the schedule, a bucket must meet the following requirements:

    • If you add regular expression filters, the bucket must match at least one of the regular expressions that you specify.
    • If you add tag filters, the bucket must have all the tags that you specify.

    For example, suppose that you added two regular expression filters and three tag filters in a schedule. To be included in this schedule's scope, a bucket must match at least one of the regular expression filters and all of the tag filters.

    To add filters, follow these steps:

    1. To add a regular expression filter, specify at least one of the following:

      • A project ID or a regular expression that specifies one or more projects
      • A bucket name or a regular expression that specifies one or more buckets

      Regular expressions must follow RE2 syntax.

      To add more regular expression filters, click Add filter and repeat this step.

    2. To add a tag filter, select Tag Filtering. Specify one or more existing tags that you want to include in the schedule's scope.

  4. Click Frequency.

  5. In the Frequency section, you specify whether the discovery service should reprofile your data and what events should trigger a reprofile operation. For more information, see Frequency of data profile generation.

    1. For On a schedule, specify how often you want the buckets to be reprofiled. The buckets are reprofiled regardless of whether they underwent any changes.
    2. For When inspect template changes, specify whether you want your data to be reprofiled when the associated inspection template is updated, and if so, how often.

      Note: You specify the inspection templates to use in the Select inspection template step on this page.

      An inspection template change is detected when either of the following occurs:

      • The name of an inspection template changes in your scan configuration.
      • The updateTime of an inspection template changes.

      For example, if you set an inspection template for the us-west1 region and you update that inspection template, then only data in the us-west1 region will be reprofiled.

  6. Optional: Click Conditions.

    In the Conditions section, you specify any conditions that the buckets (defined in your filters) must meet before Sensitive Data Protection profiles them.

    If needed, set the following:

    Example conditions

    Suppose that you have the following configuration:

    In this case, Sensitive Data Protection excludes any bucket that was created on or before May 4, 2022, 11:59 PM. Among the buckets that were created after that date and time, Sensitive Data Protection profiles only the buckets that are at least 24 hours old and have Autoclass disabled. Within those buckets, Sensitive Data Protection profiles only the objects that are in the Standard and Nearline storage classes.

  7. Click Done.

  8. Optional: To add more schedules, click Add schedule and repeat the previous steps.

  9. To specify precedence between schedules, reorder them using the

    The order of the schedules specifies how conflicts between schedules are resolved. If a bucket matches the filters of two different schedules, the schedule higher in the schedules list dictates the profiling frequency for that bucket.

    Note: If your discovery pricing mode is subscription mode, the rate at which Sensitive Data Protection profiles your data is affected by how much capacity you purchased. To determine your daily profiling capacity, see Monitoring utilization. If you have under-provisioned capacity, then the profiling frequencies that you set in your schedules might not be followed. If there is a backlog of data to be profiled, the schedule order doesn't dictate the order in which Sensitive Data Protection profiles the data in the backlog. Rather, all data resources in scope get a randomly assigned slot in the queue.

  10. Optional: Edit or turn off Catch-all schedule.

    The last schedule in the list is the catch-all schedule. This schedule covers the buckets in your selected scope that don't match any of the schedules that you created. The catch-all schedule follows the system default profiling frequency.

    • To adjust the catch-all schedule, click Edit schedule, and then adjust the settings as needed.
    • To prevent Sensitive Data Protection from profiling any resource that is covered by the catch-all schedule, turn off Profile the resources that don't match any custom schedule.
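
The filter and precedence rules in this section can be sketched in a few lines of Python. This is only an illustration of the documented logic, not how the service is implemented. The `Schedule` class, the tag strings, and the schedule names are hypothetical. The sketch also assumes that a pattern must match the whole project ID or bucket name, that both patterns in one filter must match when both are given, and it uses Python's `re` module as a stand-in for RE2, whose syntax differs in some features.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Schedule:
    """Hypothetical model of one schedule's filters (not an API type)."""
    name: str
    # Each regex filter is a (project_pattern, bucket_pattern) pair.
    # None means that part of the filter was left empty.
    regex_filters: list = field(default_factory=list)
    required_tags: set = field(default_factory=set)
    profile: bool = True  # mirrors the "Do profile this data" toggle


def matches(schedule, project_id, bucket_name, bucket_tags):
    """True if the bucket matches at least one regex filter AND has all tags."""
    if schedule.regex_filters:
        regex_ok = any(
            (proj is None or re.fullmatch(proj, project_id))
            and (bkt is None or re.fullmatch(bkt, bucket_name))
            for proj, bkt in schedule.regex_filters
        )
        if not regex_ok:
            return False
    return schedule.required_tags <= set(bucket_tags)


def resolve(schedules, project_id, bucket_name, bucket_tags):
    """Return the name of the first matching schedule, else the catch-all."""
    for schedule in schedules:  # list order is the precedence order
        if matches(schedule, project_id, bucket_name, bucket_tags):
            return schedule.name
    return "catch-all"


schedules = [
    Schedule("exclude-logs", regex_filters=[(None, r".*-logs")], profile=False),
    Schedule(
        "prod-weekly",
        regex_filters=[(r"prod-.*", None), (None, r"pii-.*")],
        required_tags={"env/prod"},
    ),
]
print(resolve(schedules, "prod-app", "orders-logs", ["env/prod"]))      # exclude-logs
print(resolve(schedules, "prod-app", "orders", ["env/prod", "team/a"]))  # prod-weekly
print(resolve(schedules, "dev-app", "orders", ["env/dev"]))              # catch-all
```

The first bucket matches both schedules, so the one that is higher in the list decides its treatment. The last bucket matches neither, so it falls through to the catch-all schedule.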

Select inspection template

Depending on how you want to provide an inspection configuration, choose one of the following options. Regardless of which option you choose, Sensitive Data Protection scans your data in the region where that data is stored. That is, your data doesn't leave its region of origin.

Option 1: Create an inspection template

Choose this option if you want to create a new inspection template in the global region.

  1. Click Create new inspection template.
  2. Optional: To modify the default selection of infoTypes, click Manage infoTypes.

    For more information about how to manage built-in and custom infoTypes, see Manage infoTypes through the Google Cloud console.

    You must have at least one infoType selected to continue.

  3. Optional: Configure the inspection template further by adding rulesets and setting a confidence threshold. For more information, see Configure detection.

When Sensitive Data Protection creates the scan configuration, it stores this new inspection template in the global region.

Option 2: Use an existing inspection template

Choose this option if you have existing inspection templates that you want to use.

  1. Click Select existing inspection template.
  2. Enter the full resource name of the inspection template that you want to use. The Region field is automatically populated with the name of the region where your inspection template is stored.

    The inspection template that you enter must be in the same region as the data to be profiled.

    To respect data residency, Sensitive Data Protection doesn't use an inspection template outside the region where that template is stored.

    To find the full resource name of an inspection template, follow these steps:

    1. Go to your inspection templates list. This page opens on a separate tab.

    2. Select the project that contains the inspection template that you want to use.
    3. Select Configuration > Templates > Inspect, and then click the template ID of the template that you want to use.
    4. On the page that opens, copy the full resource name of the template. The full resource name follows this format:
      projects/PROJECT_ID/locations/REGION/inspectTemplates/TEMPLATE_ID
    5. On the Create scan configuration page, in the Template name field, paste the full resource name of the template.
  3. To add an inspection template for another region, click Add inspection template and enter the template's full resource name. Repeat this for each region where you have a dedicated inspection template.
  4. Optional: Add an inspection template that's stored in the global region. Sensitive Data Protection automatically uses that template for data in regions where you don't have a dedicated inspection template.

    Caution: If you don't include an inspection template that's stored in the global region, Sensitive Data Protection can't profile data in regions that don't have a dedicated inspection template. For more information, see Data residency considerations.
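
As a rough illustration of how the template list covers regions, the following Python sketch parses the resource name format shown earlier and picks the template that would apply to data in a given region. The helper names `region_of` and `pick_template` and the example project and template IDs are hypothetical. The selection rule (dedicated regional template first, then the global template, otherwise nothing) is a reading of the steps in this section, not the service's actual code.

```python
import re

# Format from this page:
# projects/PROJECT_ID/locations/REGION/inspectTemplates/TEMPLATE_ID
TEMPLATE_NAME = re.compile(
    r"projects/(?P<project>[^/]+)/locations/(?P<region>[^/]+)"
    r"/inspectTemplates/(?P<template>[^/]+)"
)


def region_of(template_name):
    """Extract the REGION segment from a full inspection template name."""
    match = TEMPLATE_NAME.fullmatch(template_name)
    if match is None:
        raise ValueError(f"not a full inspection template name: {template_name!r}")
    return match.group("region")


def pick_template(data_region, template_names):
    """Return the template for data_region, the global one, or None."""
    by_region = {region_of(name): name for name in template_names}
    # A dedicated regional template wins; otherwise fall back to global.
    return by_region.get(data_region) or by_region.get("global")


templates = [
    "projects/example-project/locations/us-west1/inspectTemplates/west-template",
    "projects/example-project/locations/global/inspectTemplates/default-template",
]
print(pick_template("us-west1", templates))          # the us-west1 template
print(pick_template("europe-west1", templates))      # the global template
print(pick_template("europe-west1", templates[:1]))  # None: region can't be profiled
```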

Add actions

This section describes how to specify actions that you want Sensitive Data Protection to take after profiling a bucket. These actions are useful if you want to send insights gathered from data profiles to other Google Cloud services.

Note: For information about how other Google Cloud services may charge you for configuring actions, see Pricing for exporting data profiles.

Publish to Google Security Operations

Metrics gathered from data profiles can add context to your Google Security Operations findings. The added context can help you determine the most important security issues to address.

For example, if you're investigating a particular service agent, Google Security Operations can determine what resources the service agent accessed and whether any of those resources have high-sensitivity data.

To send your data profiles to your Google Security Operations instance, turn on Publish to Google Security Operations.

If you don't have a Google Security Operations instance enabled for your organization (through the standalone product or through Security Command Center Enterprise), turning on this option has no effect.

Publish to Security Command Center

Findings from data profiles provide context when you triage and develop response plans for your vulnerability and threat findings in Security Command Center.

Note: You can also configure Security Command Center to automatically prioritize resources for the attack path simulation feature according to the calculated sensitivity of the data that the resources contain. For more information, see Set resource priority values automatically by data sensitivity.

Before you can use this action, Security Command Center must be activated at the organization level. Turning on Security Command Center at the organization level enables the flow of findings from integrated services like Sensitive Data Protection. Sensitive Data Protection works with Security Command Center in all service tiers.

If Security Command Center isn't activated at the organization level, Sensitive Data Protection findings won't appear in Security Command Center. For more information, see Check the activation level of Security Command Center.

To send the results of your data profiles to Security Command Center, make sure the Publish to Security Command Center option is turned on.

For more information, see Publish data profiles to Security Command Center.

Save data profile copies to BigQuery

Sensitive Data Protection saves a copy of each generated data profile in a BigQuery table. If you don't provide the details of your preferred table, Sensitive Data Protection creates a dataset and table in the service agent container. By default, the dataset is named sensitive_data_protection_discovery and the table is named discovery_profiles.

Important: The output table uses DataProfileBigQueryRowSchema as its schema. This schema can change as Sensitive Data Protection adds features. Make sure that your workflows can handle schema changes, for example, by ignoring unknown fields.
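
One way to tolerate schema changes is to read only the fields that a workflow needs and ignore everything else. The following Python sketch assumes that an exported row has already been loaded into a plain dictionary. The helper `read_profile` is hypothetical, and the two field names are taken from the example query on this page rather than from the full schema.

```python
# Fields this hypothetical workflow relies on. Anything else in the row,
# including fields added to the schema later, is ignored.
KNOWN_FIELDS = ("file_store_path", "sensitivity_score")


def read_profile(row):
    """Pick known fields from a row dict; missing fields become None."""
    profile = row.get("file_store_profile") or {}
    return {name: profile.get(name) for name in KNOWN_FIELDS}


row = {
    "file_store_profile": {
        "file_store_path": "gs://example-bucket",
        "sensitivity_score": "HIGH",          # treated as an opaque value here
        "field_added_in_a_later_release": 1,  # unknown field, skipped
    },
    "column_added_in_a_later_release": True,  # unknown column, skipped
}
print(read_profile(row))
```

Because the function never iterates over the row's own keys, new columns or nested fields don't break it, and a missing field shows up as `None` instead of raising an error.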

This action lets you keep a history of all of your generated profiles. This history can be useful for creating audit reports and visualizing data profiles. You can also load this information into other systems.

Also, this option lets you see all of your data profiles in a single view, regardless of which region your data resides in. Although you can also view the data profiles through the Google Cloud console, the console displays the profiles in only one region at a time.

When Sensitive Data Protection fails to profile a bucket, it periodically retries. To minimize noise in the exported data, Sensitive Data Protection exports only the successfully generated profiles to BigQuery.

Sensitive Data Protection starts exporting profiles from the time you turn on this option. Profiles that were generated before you turned on exporting aren't saved to BigQuery.

Note: Your service agent must have write access on the table where the profile copies will be saved. If you don't have a service agent yet, Sensitive Data Protection lets you create one later in the Create scan configuration page.

For example queries that you can use when analyzing data profiles, see Analyze data profiles.

Save sample discovery findings to BigQuery

Sensitive Data Protection can add sample findings to a BigQuery table of your choice. Sample findings represent a subset of all findings and might not represent all infoTypes that were discovered. Normally, the system generates around 10 sample findings per bucket, but this number can vary for each discovery run.

Each finding includes the actual string (also called quote) that was detected and its exact location.

This action is useful if you want to evaluate whether your inspection configuration is correctly matching the type of information that you want to flag as sensitive. Using the exported data profiles and the exported sample findings, you can run queries to get more information about the specific items that were flagged, the infoTypes they matched, their exact locations, their calculated sensitivity levels, and other details.

Important: The output table uses DataProfileFinding as its schema. This schema can change as Sensitive Data Protection adds features. Make sure that your workflows can handle schema changes, for example, by ignoring unknown fields.

Example query: Show sample findings related to file store data profiles

This example requires both Save data profile copies to BigQuery and Save sample discovery findings to BigQuery to be enabled.

The following query uses an INNER JOIN operation on both the table of exported data profiles and the table of exported sample findings. In the resulting table, each record shows the finding's quote, the infoType that it matched, the resource that contains the finding, and the calculated sensitivity level of the resource.

SELECT
  findings_table.quote,
  findings_table.infotype.name,
  findings_table.location.container_name,
  profiles_table.file_store_profile.file_store_path as bucket_name,
  profiles_table.file_store_profile.sensitivity_score as bucket_sensitivity_score
FROM
  `FINDINGS_TABLE_PROJECT_ID.FINDINGS_TABLE_DATASET_ID.FINDINGS_TABLE_ID_latest_v1` AS findings_table
INNER JOIN
  `PROFILES_TABLE_PROJECT_ID.PROFILES_TABLE_DATASET_ID.PROFILES_TABLE_ID_latest_v1` AS profiles_table
ON
  findings_table.data_profile_resource_name = profiles_table.file_store_profile.name

To save sample findings to a BigQuery table, follow these steps:

  1. Turn on Save sample discovery findings to BigQuery.

  2. Enter the details of the BigQuery table where you want to save the sample findings.

    The table that you specify for this action must be different from the table used for the Save data profile copies to BigQuery action.

    • For Project ID, enter the ID of an existing project where you want to export the findings.

    • For Dataset ID, enter the name of an existing dataset in the project.

    • For Table ID, enter the name of the BigQuery table where you want to save the findings. If this table doesn't exist, Sensitive Data Protection automatically creates it for you using the name that you provide.

Note: Your service agent must have write access on the table. If you don't have a service agent yet, Sensitive Data Protection lets you create one later in the Create scan configuration page.

For information about the contents of each finding that is saved in the BigQuery table, see DataProfileFinding.

Attach tags to resources

Turning on Attach tags to resources instructs Sensitive Data Protection to automatically tag your data according to its calculated sensitivity level. This section requires you to first complete the tasks in Control IAM access to resources based on data sensitivity.

To automatically tag a resource according to its calculated sensitivity level, follow these steps:

  1. Turn on the Tag resources option.
  2. For each sensitivity level (high, moderate, low, and unknown), enter the path of the tag value that you created for the given sensitivity level.

    If you skip a sensitivity level, no tag is attached for it.

  3. To automatically lower the data risk level of a resource when the sensitivity level tag is present, select When a tag is applied to a resource, lower the data risk of its profile to LOW. This option helps you measure the improvement in your data security and privacy posture.

    Important: This option overrides the calculated data risk level of the profiled resource.
  4. Select one or both of the following options:

    • Tag a resource when it is profiled for the first time.
    • Tag a resource when its profile is updated. Select this option if you want Sensitive Data Protection to overwrite the sensitivity level tag value on succeeding discovery runs. Consequently, a principal's access to a resource changes automatically as the calculated data sensitivity level for that resource increases or decreases.

      Don't select this option if you plan to manually update the sensitivity level tag values that the discovery service attached to your resources. If you select this option, Sensitive Data Protection can overwrite your manual updates.
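
The interaction between the per-level tag values and the two tagging options can be pictured with a small Python sketch. It is an interpretation of the steps above, not the service's implementation. The function name, its parameters, and the placeholder tag value paths are all hypothetical.

```python
def tag_to_attach(sensitivity, tag_values, is_first_profile,
                  tag_on_first_profile, tag_on_update):
    """Return the tag value path to attach, or None if no tag applies.

    tag_values maps a sensitivity level to the tag value path entered for
    it. A level that was skipped is simply absent from the mapping.
    """
    enabled = tag_on_first_profile if is_first_profile else tag_on_update
    if not enabled:
        return None
    return tag_values.get(sensitivity)


# Placeholder paths; the "unknown" level was skipped on purpose.
tag_values = {
    "high": "HIGH_TAG_VALUE_PATH",
    "moderate": "MODERATE_TAG_VALUE_PATH",
    "low": "LOW_TAG_VALUE_PATH",
}

# First profile of a bucket, with only "first time" tagging selected.
print(tag_to_attach("high", tag_values, True, True, False))     # HIGH_TAG_VALUE_PATH
# A later reprofile: the tag is left alone, so manual edits survive.
print(tag_to_attach("low", tag_values, False, True, False))     # None
# A skipped level never produces a tag.
print(tag_to_attach("unknown", tag_values, True, True, True))   # None
```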

Publish to Pub/Sub

Turning on Publish to Pub/Sub lets you take programmatic actions based on profiling results. You can use Pub/Sub notifications to develop a workflow for catching and remediating findings with significant data risk or sensitivity.

To send notifications to a Pub/Sub topic, follow these steps:

  1. Turn on Publish to Pub/Sub.

    A list of options appears. Each option describes an event that causes Sensitive Data Protection to send a notification to Pub/Sub.

  2. Select the events that should trigger a Pub/Sub notification.

    If you select Send a Pub/Sub notification each time a profile is updated, Sensitive Data Protection sends a notification when there's a change in the sensitivity level, data risk level, detected infoTypes, public access, and other important metrics in the profile.

  3. For each event you select, follow these steps:

    1. Enter the name of the topic. The name must be in the following format:

      projects/PROJECT_ID/topics/TOPIC_ID

      Replace the following:

      • PROJECT_ID: the ID of the project associated with the Pub/Sub topic.
      • TOPIC_ID: the ID of the Pub/Sub topic.
    2. Specify whether to include the full bucket profile in the notification, or just the full resource name of the bucket that was profiled.

    3. Set the minimum data risk and sensitivity levels that must be met for Sensitive Data Protection to send a notification.

    4. Specify whether only one or both of the data risk and sensitivity conditions must be met. For example, if you choose AND, then both the data risk and the sensitivity conditions must be met before Sensitive Data Protection sends a notification.

Note: Your service agent must have publishing access on the Pub/Sub topic. An example of a role that has publishing access is the Pub/Sub Publisher role (roles/pubsub.publisher). If you don't have a service agent yet, Sensitive Data Protection lets you create one later in the Create scan configuration page.

If there are configuration or permission issues with the Pub/Sub topic, Sensitive Data Protection retries sending the Pub/Sub notification for up to two weeks. After two weeks, the notification is discarded.
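
The notification conditions in this section combine two thresholds with a logical operator. The following Python sketch shows one way to reason about that combination and to check the topic name format before you paste it into the form. The function names are hypothetical, the ordering LOW < MODERATE < HIGH is an assumption, and "OR" is assumed to be the label of the alternative to AND.

```python
import re

# Assumed ordering of levels, lowest to highest.
LEVELS = {"LOW": 1, "MODERATE": 2, "HIGH": 3}

# Format from this page: projects/PROJECT_ID/topics/TOPIC_ID
TOPIC_NAME = re.compile(r"projects/[^/]+/topics/[^/]+")


def is_valid_topic_name(name):
    """Check only the shape of the name, not whether the topic exists."""
    return TOPIC_NAME.fullmatch(name) is not None


def should_notify(risk, sensitivity, min_risk, min_sensitivity, operator):
    """Apply the minimum data risk and sensitivity levels with AND or OR."""
    risk_ok = LEVELS[risk] >= LEVELS[min_risk]
    sensitivity_ok = LEVELS[sensitivity] >= LEVELS[min_sensitivity]
    if operator == "AND":
        return risk_ok and sensitivity_ok
    if operator == "OR":
        return risk_ok or sensitivity_ok
    raise ValueError(f"unsupported operator: {operator!r}")


print(is_valid_topic_name("projects/example-project/topics/profile-events"))  # True
print(should_notify("HIGH", "LOW", "MODERATE", "MODERATE", "AND"))  # False
print(should_notify("HIGH", "LOW", "MODERATE", "MODERATE", "OR"))   # True
```

With AND, a bucket that has high data risk but low sensitivity doesn't trigger a notification when both minimums are set to moderate. With OR, the same bucket does.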

Manage service agent container and billing

In this section, you specify the project to use as a service agent container. You can have Sensitive Data Protection automatically create a new project, or you can choose an existing project.

Regardless of whether you're using a newly created service agent or reusing an existing one, make sure it has read access to the data to be profiled.

Automatically create a project

If you don't have the permissions needed to create a project in the organization, you need to select an existing project instead or obtain the required permissions. For information about the required permissions, see Roles required to work with data profiles at the organization or folder level.

To automatically create a project to use as your service agent container, follow these steps:

  1. In the Service agent container field, review the suggested project ID and edit it as needed.
  2. Click Create.
  3. Optional: Update the default project name.
  4. Select the account to bill for all billable operations related to this new project, including operations that aren't related to discovery.

    Note: If you already have an organization-level discovery subscription, this billing account is still required to create the project. However, for all discovery operations, you are billed through the project associated with your subscription.
  5. Click Create.

Sensitive Data Protection creates the new project. The service agent within this project will be used to authenticate to Sensitive Data Protection and other APIs.

Select an existing project

To select an existing project as your service agent container, click the Service agent container field and select the project.

Set fallback processing locations for images

In general, Sensitive Data Protection processes your data in the location where the data is stored. However, images can only be processed in a multi-region or in the global region. If you set a fallback location, then Sensitive Data Protection uses your fallback location to process images that aren't in a multi-region or in the global region. If you skip this section, then those images aren't processed.

To set fallback locations for image processing, select one or both of the following:

  • Fall back to the multi-region: If an image can't be processed in its original location, then the image is processed in the multi-region that corresponds to the image's original location. If the image's original location has no corresponding multi-region, then the image is skipped.
  • Fall back to global: If an image can't be processed in its original location, then the image is processed in the global region.

If you select both options, Sensitive Data Protection chooses which location to use as a fallback location.
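
The fallback rules can be summarized as a function that lists the locations where an image may end up being processed. This Python sketch is only a model of the rules described above. The region and multi-region names are made-up placeholders, and the mapping between them is hypothetical. When both options are selected, the sketch returns both candidates because this page states that the service chooses between them.

```python
# Hypothetical mapping from a single region to its multi-region.
MULTI_REGION_OF = {"region-a1": "multi-a", "region-a2": "multi-a"}
MULTI_REGIONS = set(MULTI_REGION_OF.values())


def image_processing_candidates(location, fall_back_to_multi_region,
                                fall_back_to_global):
    """Return the locations where an image stored in `location` may be processed.

    An empty list means that the image is skipped.
    """
    # Images already in a multi-region or in global need no fallback.
    if location == "global" or location in MULTI_REGIONS:
        return [location]
    candidates = []
    if fall_back_to_multi_region and location in MULTI_REGION_OF:
        candidates.append(MULTI_REGION_OF[location])
    if fall_back_to_global:
        candidates.append("global")
    return candidates


print(image_processing_candidates("region-a1", True, False))   # ['multi-a']
print(image_processing_candidates("region-b1", True, False))   # [] (skipped)
print(image_processing_candidates("region-a1", True, True))    # ['multi-a', 'global']
print(image_processing_candidates("region-a1", False, False))  # [] (skipped)
```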

Set location to store configuration

Click the Resource location list, and select the region where you want to store this scan configuration. All scan configurations that you later create will also be stored in this location.

Where you choose to store your scan configuration doesn't affect the data to be scanned. Your data is scanned in the same region where that data is stored. For more information, see Data residency considerations.

Note: If you already have an existing scan configuration, you can't change the value set in this field. All scan configurations are stored in the same location. If you want to change the location of all your scan configurations, you must delete them, recreate them, and store them in the new location.

Review and create

  1. If you want to make sure that profiling doesn't start automatically after you create the scan configuration, select Create scan in paused mode.

    This option is useful in the following cases:

    • Your Google Cloud administrator still needs to grant data profiling access to the service agent.
    • You want to create multiple scan configurations and you want some configurations to override others.
    • You opted to save data profiles to BigQuery and you want to make sure the service agent has write access to the BigQuery table where the data profile copies will be saved.
    • You opted to save sample discovery findings to BigQuery and you want to make sure that the service agent has write access to the BigQuery table where the sample findings will be saved.
    • You configured Pub/Sub notifications and you want to grant publishing access to the service agent.
    • You enabled the Attach tags to resources action and you need to grant the service agent access to the sensitivity level tag.
  2. Review your settings and click Create.

    Sensitive Data Protection creates the scan configuration and adds it to the discovery scan configurations list.

To view or manage your scan configurations, see Manage scan configurations.

Note: We regularly improve our detection algorithm. If we find that your organization or project would benefit from a new improvement that we implement, we might automatically regenerate your data profiles and redo the actions in your scan configuration. You won't incur Sensitive Data Protection charges for this operation. However, because we will redo the actions, you might incur charges for your use of other Google Cloud services. For example, if you configured Sensitive Data Protection to save the data profiles to BigQuery, you might incur BigQuery charges.

Last updated 2025-12-15 UTC.