Manage Storage Insights dataset configurations
This page shows you how to manage Storage Insights dataset configurations to control the source, scope, and retention of your data. You'll learn how to view, list, update, and delete configurations, as well as how to view, query, and unlink your linked datasets.
Get the required roles
To get the permissions that you need to manage dataset configurations, ask your administrator to grant you the following IAM roles on your source projects:
- To list, update, delete, and view dataset configurations: Storage Insights Admin (roles/storageinsights.admin)
- To view and unlink datasets:
  - Storage Insights Analyst (roles/storageinsights.analyst)
  - BigQuery Admin (roles/bigquery.admin)
- To delete linked datasets: BigQuery Admin (roles/bigquery.admin)
- To view and query datasets in BigQuery:
  - Storage Insights Viewer (roles/storageinsights.viewer)
  - BigQuery Job User (roles/bigquery.jobUser)
  - BigQuery Data Viewer (roles/bigquery.dataViewer)
For more information about granting roles, see Manage access to projects, folders, and organizations.
These predefined roles contain the permissions required to manage dataset configurations. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to manage dataset configurations:
- View and list dataset configurations:
  - storageinsights.datasetConfigs.get
  - storageinsights.datasetConfigs.list
  - storage.buckets.getObjectInsights
- Update and delete dataset configurations:
  - storageinsights.datasetConfigs.update
  - storageinsights.datasetConfigs.delete
  - storage.buckets.getObjectInsights
- Unlink a BigQuery linked dataset:
  - storageinsights.datasetConfigs.unlinkDataset
- Query BigQuery linked datasets:
  - bigquery.jobs.create or bigquery.jobs.*
You might also be able to get these permissions with custom roles or other predefined roles.
View and query linked datasets
To view and query linked datasets, follow these steps:
- In the Google Cloud console, go to the Cloud Storage Storage Insights page.
  Your project shows a list of created dataset configurations.
- Click the BigQuery linked dataset for the dataset configuration you want to view.
  The Google Cloud console displays the BigQuery linked dataset. For information about the dataset schema of metadata, see Dataset schema of metadata.
You can query tables and views in your linked datasets in the same way you would query any other BigQuery table.
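As a minimal sketch of querying a linked dataset from the command line, the following helper prints a bq query command instead of running it, so you can inspect it first. The function name and the project and dataset values are hypothetical placeholders. The view name object_attributes_latest_snapshot_view is an assumption made for illustration; check the dataset schema for the tables and views that your linked dataset actually contains.

```shell
# Print (do not run) a bq command that counts rows in one view of a
# linked dataset. All three arguments are placeholders, and the view
# name passed below is an assumption, not confirmed by this page.
build_linked_dataset_query() {
  project_id=$1
  linked_dataset=$2
  view_name=$3
  # Backticks quote the fully qualified name, which is needed when the
  # project ID contains hyphens.
  sql="SELECT COUNT(*) AS row_count FROM \`${project_id}.${linked_dataset}.${view_name}\`"
  printf 'bq query --use_legacy_sql=false %s\n' "'${sql}'"
}

build_linked_dataset_query my-project my_linked_dataset object_attributes_latest_snapshot_view
```

Copy the printed command into a shell where the bq tool is installed and authenticated to run the query.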
Unlink a dataset
To stop the dataset configuration from publishing to the BigQuery dataset, unlink the dataset. To unlink a dataset, complete the following steps:
Console
- In the Google Cloud console, go to the Cloud Storage Storage Insights page.
- Click the name of the dataset configuration that generated the dataset you want to unlink.
- In the BigQuery linked dataset section, click Unlink dataset.
Command line
To unlink the dataset, run the gcloud storage insights dataset-configs delete-link command:
gcloud storage insights dataset-configs delete-link DATASET_CONFIG_ID --location=LOCATION
Replace:
- DATASET_CONFIG_ID with the name of the dataset configuration that generated the dataset you want to unlink.
- LOCATION with the location of your dataset and dataset configuration. For example, us-central1.
You can also specify a full dataset configuration path. For example:
gcloud storage insights dataset-configs delete-link projects/DESTINATION_PROJECT_ID/locations/LOCATION/datasetConfigs/DATASET_CONFIG_ID
Replace:
- DESTINATION_PROJECT_ID with the ID of the project that contains the dataset configuration. For more information about project IDs, see Creating and managing projects.
- DATASET_CONFIG_ID with the name of the dataset configuration that generated the dataset you want to unlink.
- LOCATION with the location of your dataset and dataset configuration. For example, us-central1.
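The full path form follows a fixed pattern, so it can be assembled from its three parts. The sketch below uses a hypothetical helper name, dataset_config_path, and placeholder values; it only prints the resulting command and does not call gcloud.

```shell
# Build the full dataset configuration path from a project ID,
# a location, and a dataset configuration ID.
# dataset_config_path is a hypothetical helper, not part of gcloud.
dataset_config_path() {
  printf 'projects/%s/locations/%s/datasetConfigs/%s\n' "$1" "$2" "$3"
}

# Placeholder values for illustration only.
path=$(dataset_config_path my-project us-central1 my-dataset-config)
echo "gcloud storage insights dataset-configs delete-link ${path}"
```

The same path form is accepted by the describe and delete commands shown on this page.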
JSON API
- Have gcloud CLI installed and initialized, which lets you generate an access token for the Authorization header.
- Create a JSON file that contains the following information:
  {"name":"DATASET_NAME"}
  Replace:
  - DATASET_NAME with the name of the dataset you want to unlink. For example, my_project.my_dataset276daa7e_2991_4f4f_b9d4_e354b48426a2.
- Use cURL to call the JSON API with an unlinkDataset DatasetConfig request:
  curl --request POST --data-binary @JSON_FILE_NAME \
    "https://storageinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasetConfigs/DATASET_CONFIG_ID:unlinkDataset?" \
    --header "Authorization: Bearer $(gcloud auth print-access-token --impersonate-service-account=SERVICE_ACCOUNT)" \
    --header "Accept: application/json" \
    --header "Content-Type: application/json"
  Replace:
  - JSON_FILE_NAME with the path to the JSON file you created in the previous step.
  - PROJECT_ID with the ID of the project that the dataset configuration belongs to.
  - LOCATION with the location of the dataset and dataset configuration. For example, us-central1.
  - DATASET_CONFIG_ID with the name of the dataset configuration that generated the dataset you want to unlink.
  - SERVICE_ACCOUNT with the service account. For example, test-service-account@test-project.iam.gserviceaccount.com.
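One way to create the request body file is sketched below. The helper name write_unlink_body and the dataset name are hypothetical; the sketch assumes the dataset name contains no characters that need JSON escaping, such as double quotes or backslashes.

```shell
# Write the unlinkDataset request body to a file.
# write_unlink_body is a hypothetical helper; it does no JSON escaping,
# so it assumes the dataset name is plain (no quotes or backslashes).
write_unlink_body() {
  dataset_name=$1
  json_file=$2
  printf '{"name":"%s"}\n' "$dataset_name" > "$json_file"
}

json_file=$(mktemp)
write_unlink_body "my_project.my_dataset" "$json_file"
cat "$json_file"   # show what would be sent with --data-binary
rm -f "$json_file"
```

In practice you would keep the file and pass its path as JSON_FILE_NAME in the cURL request instead of deleting it.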
View a dataset configuration
To view a dataset configuration, complete the following steps:
Console
- In the Google Cloud console, go to the Cloud Storage Storage Insights page.
- Click the name of the dataset configuration you want to view.
  The dataset configuration details are displayed.
Command line
To describe a dataset configuration, run the gcloud storage insights dataset-configs describe command:
gcloud storage insights dataset-configs describe DATASET_CONFIG_ID \
  --location=LOCATION
Replace:
- DATASET_CONFIG_ID with the name of the dataset configuration.
- LOCATION with the location of the dataset and dataset configuration.
You can also specify a full dataset configuration path. For example:
gcloud storage insights dataset-configs describe projects/DESTINATION_PROJECT_ID/locations/LOCATION/datasetConfigs/DATASET_CONFIG_ID
Replace:
- DESTINATION_PROJECT_ID with the ID of the project that contains the dataset configuration. For more information about project IDs, see Creating and managing projects.
- DATASET_CONFIG_ID with the name of the dataset configuration that generated the dataset you want to view.
- LOCATION with the location of your dataset and dataset configuration. For example, us-central1.
JSON API
- Have gcloud CLI installed and initialized, which lets you generate an access token for the Authorization header.
- Use cURL to call the JSON API with a GetDatasetConfig request:
  curl -X GET \
    "https://storageinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasetConfigs/DATASET_CONFIG_ID" \
    --header "Authorization: Bearer $(gcloud auth print-access-token --impersonate-service-account=SERVICE_ACCOUNT)" \
    --header "Accept: application/json" \
    --header "Content-Type: application/json"
  Replace:
  - PROJECT_ID with the ID of the project that the dataset configuration belongs to.
  - LOCATION with the location of the dataset and dataset configuration. For example, us-central1.
  - DATASET_CONFIG_ID with the name of the dataset configuration.
  - SERVICE_ACCOUNT with the service account. For example, test-service-account@test-project..
List dataset configurations
To list the dataset configurations in a project, complete the following steps:
Console
- In the Google Cloud console, go to the Cloud Storage Storage Insights page.
  The list of dataset configurations is displayed.
Command line
To list dataset configurations in a project, run the gcloud storage insights dataset-configs list command:
gcloud storage insights dataset-configs list --location=LOCATION
Replace:
- LOCATION with the location of the dataset and dataset configuration. For example, us-central1.
You can use the following optional flags to specify the behavior of the listing call:
- Use --page-size to specify the maximum number of results to return per page.
- Use --filter=FILTER to filter results. For more information on how to use the --filter flag, run gcloud topic filters and refer to the documentation.
- Use --sort-by=SORT_BY_VALUE to specify a comma-separated list of resource field key names to sort by. For example, --sort-by=DATASET_CONFIG_ID.
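The optional flags can be combined in a single listing call. The sketch below uses a hypothetical helper, build_list_command, that appends --page-size and --sort-by only when values are supplied and prints the command rather than running it. The page size of 25 is an arbitrary example value.

```shell
# Assemble a list command with optional paging and sorting flags.
# build_list_command is a hypothetical helper that only prints the command.
build_list_command() {
  location=$1
  page_size=${2:-}
  sort_by=${3:-}
  cmd="gcloud storage insights dataset-configs list --location=${location}"
  # Add each optional flag only when a value was provided.
  if [ -n "$page_size" ]; then
    cmd="${cmd} --page-size=${page_size}"
  fi
  if [ -n "$sort_by" ]; then
    cmd="${cmd} --sort-by=${sort_by}"
  fi
  printf '%s\n' "$cmd"
}

build_list_command us-central1 25 DATASET_CONFIG_ID
build_list_command us-central1
```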
JSON API
- Have gcloud CLI installed and initialized, which lets you generate an access token for the Authorization header.
- Use cURL to call the JSON API with a ListDatasetConfigs request:
  curl -X GET \
    "https://storageinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasetConfigs" \
    --header "Authorization: Bearer $(gcloud auth print-access-token --impersonate-service-account=SERVICE_ACCOUNT)" \
    --header "Accept: application/json" \
    --header "Content-Type: application/json"
  Replace:
  - PROJECT_ID with the ID of the project that the dataset configurations belong to.
  - LOCATION with the location of the dataset and dataset configuration. For example, us-central1.
  - SERVICE_ACCOUNT with the service account. For example, test-service-account@test-project.iam.gserviceaccount.com.
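The request URLs for viewing one dataset configuration and for listing all of them differ only in whether a dataset configuration ID is appended. A minimal sketch, using a hypothetical helper name and placeholder values, makes that relationship explicit:

```shell
# Build the JSON API URL for the datasetConfigs collection, or for a
# single dataset configuration when a third argument is given.
# dataset_configs_url is a hypothetical helper; it sends no request.
dataset_configs_url() {
  base="https://storageinsights.googleapis.com/v1/projects/$1/locations/$2/datasetConfigs"
  if [ -n "${3:-}" ]; then
    printf '%s/%s\n' "$base" "$3"
  else
    printf '%s\n' "$base"
  fi
}

dataset_configs_url my-project us-central1                    # list URL
dataset_configs_url my-project us-central1 my-dataset-config  # get URL
```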
Update a dataset configuration
To update a dataset configuration, complete the following steps:
Console
- In the Google Cloud console, go to the Cloud Storage Storage Insights page.
- Click the name of the dataset configuration you want to update.
- In the Dataset configuration tab, click Edit to update the fields.
Command line
To update a dataset configuration, run the gcloud storage insights dataset-configs update command:
gcloud storage insights dataset-configs update DATASET_CONFIG_ID \
  --location=LOCATION
Replace:
- DATASET_CONFIG_ID with the name of the dataset configuration.
- LOCATION with the location of the dataset and dataset configuration.
Use the following flags to update properties of the dataset configuration:
- Use --skip-verification to skip checks and failures from the verification process, which includes checks for required IAM permissions. If used, some or all buckets might be excluded from the dataset.
- Use --retention-period-days=DAYS to specify the moving number of days of data to capture in the dataset snapshot. For example, 90.
- Use --activity-data-retention-period-days=ACTIVITY_RETENTION_PERIOD_DAYS to specify the retention period for the activity data in the dataset. By default, activity data is included in the dataset and inherits the retention period of the dataset. To override the dataset retention period, specify the number of days to retain activity data for. To exclude activity data, set ACTIVITY_RETENTION_PERIOD_DAYS to 0.
- Use --description=DESCRIPTION to write a description for the dataset configuration.
- Use --organization=ORGANIZATION_ID to specify the organization ID of the source project. If unspecified, defaults to the source project's organization ID.
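As an illustration of combining these flags, the sketch below prints an update command that keeps 90 days of data and excludes activity data by setting its retention to 0. The helper names are hypothetical, and the integer check is an extra safeguard added for this example rather than a documented requirement of the command.

```shell
# Return success only for a non-empty string of digits.
is_non_negative_int() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Print an update command that sets both retention periods.
# build_update_command is a hypothetical helper; it does not call gcloud.
build_update_command() {
  config_id=$1
  location=$2
  retention_days=$3
  activity_days=$4
  if ! is_non_negative_int "$retention_days" || ! is_non_negative_int "$activity_days"; then
    echo "retention values must be non-negative integers" >&2
    return 1
  fi
  printf 'gcloud storage insights dataset-configs update %s --location=%s --retention-period-days=%s --activity-data-retention-period-days=%s\n' \
    "$config_id" "$location" "$retention_days" "$activity_days"
}

# 90 days of data, activity data excluded (retention set to 0).
build_update_command my-dataset-config us-central1 90 0
```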
JSON API
- Have gcloud CLI installed and initialized, which lets you generate an access token for the Authorization header.
- Create a JSON file that contains the following optional information:
  {"organization_number":"ORGANIZATION_ID","source_projects":{"project_numbers":"PROJECT_NUMBERS"},"retention_period_days":"RETENTION_PERIOD","activityDataRetentionPeriodDays":"ACTIVITY_DATA_RETENTION_PERIOD_DAYS"}
  Replace:
  - ORGANIZATION_ID with the resource ID of the organization to which the source projects belong. If unspecified, defaults to the source project's organization ID.
  - PROJECT_NUMBERS with the project numbers to include in the dataset. You can specify one or more projects in a list format.
  - RETENTION_PERIOD with the moving number of days of data to capture in the dataset snapshot. For example, 90.
  - ACTIVITY_DATA_RETENTION_PERIOD_DAYS with the number of days of activity data to capture in the dataset snapshot. By default, activity data is included in the dataset and inherits the retention period of the dataset. To override the dataset retention period, specify the number of days to retain activity data for. To exclude activity data, set ACTIVITY_DATA_RETENTION_PERIOD_DAYS to 0.
- To update the dataset configuration, use cURL to call the JSON API with a PatchDatasetConfig request:
  curl -X PATCH --data-binary @JSON_FILE_NAME \
    "https://storageinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasetConfigs/DATASET_CONFIG_ID?updateMask=UPDATE_MASK" \
    --header "Authorization: Bearer $(gcloud auth print-access-token --impersonate-service-account=SERVICE_ACCOUNT)" \
    --header "Accept: application/json" \
    --header "Content-Type: application/json"
  Replace:
  - JSON_FILE_NAME with the path to the JSON file you created in the previous step.
  - PROJECT_ID with the ID of the project that the dataset configuration belongs to.
  - LOCATION with the location of the dataset and dataset configuration. For example, us-central1.
  - DATASET_CONFIG_ID with the name of the dataset configuration you want to update.
  - UPDATE_MASK with the comma-separated list of field names that this request updates. The fields use the FieldMask format and are part of the DatasetConfig resource.
  - SERVICE_ACCOUNT with the service account. For example, test-service-account@test-project.iam.gserviceaccount.com.
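Because UPDATE_MASK is a comma-separated list, it can be assembled from the names of the fields present in the request body. The sketch below uses a hypothetical helper, build_update_mask, with field names copied from the JSON body shown above; whether the API expects these exact spellings in the mask is an assumption to verify against the DatasetConfig resource reference.

```shell
# Join field names into a comma-separated update mask.
# build_update_mask is a hypothetical helper, not part of any CLI.
build_update_mask() {
  mask=""
  for field in "$@"; do
    if [ -z "$mask" ]; then
      mask=$field
    else
      mask="${mask},${field}"
    fi
  done
  printf '%s\n' "$mask"
}

# Field names taken from the request body above; all other values
# in the URL are placeholders.
mask=$(build_update_mask retention_period_days organization_number)
echo "https://storageinsights.googleapis.com/v1/projects/my-project/locations/us-central1/datasetConfigs/my-dataset-config?updateMask=${mask}"
```

Listing only the fields you intend to change keeps the request from touching other properties of the dataset configuration.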
Delete a dataset configuration
To delete a dataset configuration, complete the following steps:
Console
- In the Google Cloud console, go to the Cloud Storage Storage Insights page.
- Click the name of the dataset configuration you want to delete.
- Click Delete.
Command line
To delete a dataset configuration, run the gcloud storage insights dataset-configs delete command:
gcloud storage insights dataset-configs delete DATASET_CONFIG_ID \
  --location=LOCATION
Replace:
- DATASET_CONFIG_ID with the name of the dataset configuration you want to delete.
- LOCATION with the location of the dataset and dataset configuration. For example, us-central1.
Use the following flags to delete a dataset configuration:
- Use --auto-delete-link to unlink the dataset that was generated from the dataset configuration you want to delete. You must unlink a dataset before you can delete the dataset configuration that generated the dataset.
You can also specify a full dataset configuration path. For example:
gcloud storage insights dataset-configs delete projects/DESTINATION_PROJECT_ID/locations/LOCATION/datasetConfigs/DATASET_CONFIG_ID
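Since a dataset must be unlinked before its dataset configuration can be deleted, there are two ways to order the commands: unlink first and then delete, or delete with --auto-delete-link in one step. The sketch below prints both sequences using a hypothetical helper and placeholder values; it does not run gcloud.

```shell
# Print the commands needed to delete a dataset configuration.
# mode "manual": unlink with delete-link, then delete.
# any other mode: a single delete with --auto-delete-link.
# print_delete_commands is a hypothetical helper.
print_delete_commands() {
  config_id=$1
  location=$2
  mode=${3:-auto}
  base="gcloud storage insights dataset-configs"
  if [ "$mode" = "manual" ]; then
    printf '%s delete-link %s --location=%s\n' "$base" "$config_id" "$location"
    printf '%s delete %s --location=%s\n' "$base" "$config_id" "$location"
  else
    printf '%s delete %s --location=%s --auto-delete-link\n' "$base" "$config_id" "$location"
  fi
}

print_delete_commands my-dataset-config us-central1 manual
print_delete_commands my-dataset-config us-central1 auto
```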
JSON API
- Have gcloud CLI installed and initialized, which lets you generate an access token for the Authorization header.
- Use cURL to call the JSON API with a DeleteDatasetConfig request:
  curl -X DELETE \
    "https://storageinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/datasetConfigs/DATASET_CONFIG_ID" \
    --header "Authorization: Bearer $(gcloud auth print-access-token --impersonate-service-account=SERVICE_ACCOUNT)" \
    --header "Accept: application/json" \
    --header "Content-Type: application/json"
  Replace:
  - PROJECT_ID with the ID of the project that the dataset configuration belongs to.
  - LOCATION with the location of the dataset and dataset configuration. For example, us-central1.
  - DATASET_CONFIG_ID with the name of the dataset configuration you want to delete.
  - SERVICE_ACCOUNT with the service account. For example, test-service-account@test-project.iam.gserviceaccount.com.
Last updated 2025-12-17 UTC.