Metrics reference

This page lists and describes all metrics that are gathered in data profiles.

There are four types of data profiles: project data profiles, table data profiles, column data profiles, and file store data profiles.

Project data profiles

Each project data profile has the following fields. The values for these fields are aggregated based on the resources profiled within the project.

Insights

Project data profiles provide the following insights:

Data risk
Level of risk associated with the data at its current state. For more information, see Sensitivity and data risk levels.
Sensitivity
Score indicating the sensitivity level for this project. For more information, see Sensitivity and data risk levels.

Metadata

Project data profiles provide the following metadata:

Last profile generated
Date and time the profile was last generated.
Project ID
ID of the project that was profiled.
Resource name
Fully qualified name of the data profile.
Status
Icon that indicates the status of the profiling operation.

Table data profiles

Each table data profile has the following fields:

Insights

Table data profiles provide the following insights:

Data risk
Level of risk associated with the data at its current state. For more information, see Sensitivity and data risk levels.
Sensitivity
Score indicating the sensitivity level for this table. For more information, see Sensitivity and data risk levels.

Metadata

Table data profiles provide the following metadata:

Database
The database containing the table that was profiled. This field applies only to Cloud SQL discovery.
Dataset ID
ID of the dataset that contains this table.
Encryption
Whether encryption for this table is managed by Google or by your organization.
Expiration time
Optional. The time when this table expires.
Failed column count
The number of columns skipped in this table because of an error.
Inspect config snapshot
Snapshot of the inspection template that was used when the profile was generated. For more information, see Data profile snapshots.
Instance
The instance containing the table that was profiled. This field applies only to Cloud SQL discovery.
Last profile generated
Date and time the profile was last generated.
Latest update in BigQuery
Date and time this table was last modified.
Project ID
ID of the project that contains this table.
Public

Whether this table is available to all users or restricted to certain users.

Note: See the known issue related to this field.
Resource labels

Labels that the table had at the time the profile was generated.

Resource tags

Tags that the table had at the time the profile was generated.

Resource name

Fully qualified name of the data profile.

Row count

Number of rows in this table when the profile was generated.

Scanned column count

The number of columns profiled in this table.

Service account

Number of service accounts with IAM permissions to access this table.

Status

Indicates whether the profile was generated successfully.

Table ID

ID of this table.

Table creation time

Date and time the table was created.

Table size

The size of this table when the profile was generated.

Type

The type of discovery performed.

Column data profiles

Each column data profile has the following fields:

Insights

Column data profiles provide the following insights:

Data risk
Level of risk associated with the data at its current state. For more information, see Sensitivity and data risk levels.
Sensitivity
Score indicating the sensitivity level for this column. For more information, see Sensitivity and data risk levels.
Predicted infoType

If a single built-in or custom infoType clearly predominates over others in the column, Sensitive Data Protection sets this field to that infoType. Otherwise, this field has no value.

To view a list of all infoTypes detected in the column, refer to the Other infoTypes field.

Sensitive Data Protection scans for only the infoTypes that you specified in the inspection template. Thus, only those infoTypes can appear in the Predicted infoType field. For example, if the column has email addresses, but you didn't include the EMAIL_ADDRESS infoType detector in your inspection template, then this field doesn't contain EMAIL_ADDRESS.

If the column data predominantly matches several closely related infoTypes that belong to the same general category, Sensitive Data Protection sets this field to the more general infoType. For example, if the column predominantly has a mix of PASSPORT, AUSTRALIA_PASSPORT, and CANADA_PASSPORT infoTypes, the Predicted infoType field is set to PASSPORT. The Other infoTypes field shows the more specific infoTypes and their estimated prevalence.
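
The generalization rule described above can be illustrated with a minimal sketch. This is not the service's actual algorithm: the `GENERAL_INFOTYPE` mapping, the `predicted_info_type` function, and the 0.5 cutoff are all assumptions made for this example only.

```python
from typing import Dict, Optional

# Hypothetical mapping from specific infoTypes to their general category.
GENERAL_INFOTYPE: Dict[str, str] = {
    "AUSTRALIA_PASSPORT": "PASSPORT",
    "CANADA_PASSPORT": "PASSPORT",
}


def predicted_info_type(
    prevalence: Dict[str, float], threshold: float = 0.5
) -> Optional[str]:
    """Return the predominant general infoType, or None if none predominates.

    `prevalence` maps each detected infoType to the fraction of non-null rows
    in which it was found. `threshold` is an assumed cutoff, not a documented
    value. Summing shares assumes each row matches at most one related type.
    """
    grouped: Dict[str, float] = {}
    for info_type, share in prevalence.items():
        # Roll specific infoTypes up into their general category.
        general = GENERAL_INFOTYPE.get(info_type, info_type)
        grouped[general] = grouped.get(general, 0.0) + share
    if not grouped:
        return None
    best = max(grouped, key=lambda name: grouped[name])
    return best if grouped[best] >= threshold else None
```

With a column that is roughly evenly split among PASSPORT, AUSTRALIA_PASSPORT, and CANADA_PASSPORT, this sketch returns PASSPORT. With no predominant category, it returns `None`, which corresponds to the field having no value.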

Other infoTypes

InfoTypes detected in the column that don't have a strong enough signal to be considered that column's predicted infoType. In this document, see Predicted infoType.

For data profiles generated after October 13, 2022, each infoType listed in this field has an estimated prevalence. The estimated prevalence is an approximate percentage of non-null rows in which the infoType was detected.

For example, suppose you have a column that has the following metrics:

  • Predicted infoType: FDA_CODE
  • Other infoTypes: PERSON_NAME (2%), STREET_ADDRESS (1%)

In this example, there is a strong indication that the column contains FDA codes. Sensitive Data Protection also determined that approximately 2% of non-null rows in the column might contain person names and 1% might contain street addresses.

Sensitive Data Protection scans for only the infoTypes that you specified in the inspection template. Thus, only those infoTypes can appear in the Other infoTypes field. For example, if the column has email addresses, but you didn't include the EMAIL_ADDRESS infoType detector in your inspection template, then this field doesn't contain EMAIL_ADDRESS.
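
To make the definition of estimated prevalence concrete, the following minimal sketch computes an exact percentage of non-null rows per infoType. The service reports an approximation, and its method is not described here. The `estimated_prevalence` function and its input shape (one set of detected infoTypes per row, or `None` for a null value) are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Set


def estimated_prevalence(rows: Iterable[Optional[Set[str]]]) -> Dict[str, float]:
    """Return, per infoType, the percentage of non-null rows where it appears.

    Each item in `rows` is the set of infoTypes detected in one row,
    or None when the row's value is null (null rows are excluded).
    """
    counts: Counter = Counter()
    non_null = 0
    for detected in rows:
        if detected is None:
            continue  # Null values don't count toward the denominator.
        non_null += 1
        counts.update(detected)
    if non_null == 0:
        return {}
    return {name: 100.0 * n / non_null for name, n in counts.items()}
```

For a hypothetical column with 100 non-null rows, of which 2 contain a person name and 1 contains a street address, this yields 2.0 for PERSON_NAME and 1.0 for STREET_ADDRESS, regardless of how many null rows the column has.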

Estimated null proportion

Approximate proportion of null values in this column, categorized as high, medium, low, or very low. This value is high if a large proportion of entries in this column is null.

Estimated uniqueness

An estimate of how much of the data in this column is unique, categorized as high, medium, or low. A high uniqueness level suggests that the column contains distinct values. A high presence of unique values can indicate that the column contains identifiers.

A low uniqueness level suggests that the column contains many common values such as enums or boolean values.

If Sensitive Data Protection determines that there aren't enough rows in the table for it to calculate this metric, this value is blank.

Free text score

The probability that this column contains freeform text. A value close to 1 indicates the column is likely to contain freeform or natural-language text. Possible values range from 0 through 1.

A high free text score can increase a column's data risk and sensitivity levels.

Metadata

Column data profiles provide the following metadata:

Database
The database containing the table column that was profiled. This field applies only to Cloud SQL discovery.
Data type
The data type of the contents of this column.
Dataset ID
ID of the dataset that contains this table column.
Field ID
Name of the column.
Instance
The instance containing the table column that was profiled. This field applies only to Cloud SQL discovery.
Instance location
Location of the instance containing the table column that was profiled. This field applies only to Cloud SQL discovery.
Last profile generated
Date and time the profile was last generated.
Policy tags
Indicates whether a policy tag is applied to the column. For information on best practices for using policy tags, see Using policy tags in BigQuery.
Project ID
ID of the project that contains this table column.
Resource name
Fully qualified name of the data profile.
Status
Icon that indicates the status of the profiling operation.
Table ID
ID of the table that contains this column.

File store data profiles

Sensitive Data Protection uses the term file store to refer to a file storage bucket or container.

Each file store data profile has the following fields.

Insights

File store data profiles provide the following insights:

Data risk
Level of risk associated with the data at its current state. For more information, see Sensitivity and data risk levels.
File clusters
Provides a summary for each file cluster that was detected when this file store was profiled. For more information about each summary, see File cluster summaries on this page.
Sensitivity
Score indicating the sensitivity level for this file store. For more information, see Sensitivity and data risk levels.

Metadata

File store data profiles provide the following metadata:

Data storage locations

If you profiled a dual-region Cloud Storage bucket, then this field lists the two regions.

If you profiled a file store from another cloud provider, then this value is the region where the cloud provider stores the file store.

Encryption

Whether encryption for this file store is managed by Google or by your organization.

File store type

The source of the data that was profiled: Cloud Storage, Amazon S3, or Azure Blob Storage.

File store path

The name of the file store.

Inspect config snapshot

Snapshot of the inspection template that was used when the profile was generated. For more information, see Data profile snapshots.

Location type

Type of location where the file store is stored: region, dual-region, or multi-region.

Profile first created

Date and time the profile was created for the first time.

Profile last generated

Date and time the profile was last generated.

Parent ID

The resource that owns the data that was profiled.

  • If the data profile is for a Google Cloud resource, then this is the ID of the project that contains the data.
  • If the data profile is for an Amazon S3 bucket, then this is the ID of the AWS account that contains the bucket.
  • If the data profile is for an Azure Blob Storage container, then this is the ID of the Azure subscription that contains the container.
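
The cases above amount to a lookup keyed on the file store type. A minimal sketch follows, assuming the file store type is available as one of the strings listed under File store type. The `PARENT_ID_MEANING` table and the `describe_parent` helper are hypothetical names used only for illustration.

```python
# Hypothetical lookup: what the Parent ID represents for each file store type.
PARENT_ID_MEANING = {
    "Cloud Storage": "Google Cloud project ID",
    "Amazon S3": "AWS account ID",
    "Azure Blob Storage": "Azure subscription ID",
}


def describe_parent(file_store_type: str, parent_id: str) -> str:
    """Label a Parent ID value according to the file store type."""
    meaning = PARENT_ID_MEANING.get(file_store_type, "unknown parent type")
    return f"{meaning}: {parent_id}"
```
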
Public

Whether this file store is available to all users or restricted to certain users.

Resource labels

Labels that the file store had at the time the profile was generated.

Resource tags

Tags that the file store had at the time the profile was generated.

Resource location

Region or multi-region that contains the file store.

If you profiled a dual-region Cloud Storage bucket, then this value depends on whether the bucket is stored in a predefined dual region:

  • For predefined dual regions, Sensitive Data Protection sets this value to the predefined dual-region name.
  • For standard dual regions, Sensitive Data Protection sets this value to the multi-region that contains the dual regions. For information about how regions map to multi-regions, see Dual regions.
Resource name

Fully qualified name of the data profile.

Status

Indicates whether the profile was generated successfully.

File cluster summaries

When a file store data profile is generated, the files are grouped into file clusters. Sensitive Data Protection provides a summary for each file cluster.

Each file cluster summary has the following fields:

Data risk
Level of risk associated with the data in this file cluster. For more information, see Sensitivity and data risk levels.
Errors
Any errors detected when the file store data profile was generated.
File extensions scanned
List of file types detected and scanned to generate the file store data profile.
File extensions seen
List of file types detected but not necessarily scanned.
InfoTypes
List of built-in and custom infoTypes that were detected in this file cluster.
Sensitivity
Score indicating the sensitivity level for this file cluster. For more information, see Sensitivity and data risk levels.
Type

Indicates the category of files in this cluster. For more information about all supported file clusters, see Supported file clusters in discovery operations.

Note: If Sensitive Data Protection scans a file within an archive file, the value of this field is ARCHIVE_FILE_EXTENSION/SCANNED_FILE_EXTENSION, for example, zip/csv.
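
If you process this field programmatically, the archive form described in the note can be split on the slash. The following is a minimal sketch. The `split_cluster_type` helper is a hypothetical name, and treating a value without a slash as a non-archive type is an assumption.

```python
from typing import Optional, Tuple


def split_cluster_type(value: str) -> Tuple[Optional[str], str]:
    """Split an archive-style Type value into (archive, scanned) parts.

    "zip/csv" becomes ("zip", "csv"). A value without a slash is returned
    as (None, value), on the assumption that it isn't an archive entry.
    """
    archive, separator, scanned = value.partition("/")
    if separator:
        return archive, scanned
    return None, value
```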

Last updated 2025-12-15 UTC.