gcloud alpha dataplex datascans create data-profile Stay organized with collections Save and categorize content based on your preferences.
- NAME
- gcloud alpha dataplex datascans create data-profile - create a Dataplex data profile scan job
- SYNOPSIS
gcloud alpha dataplex datascans create data-profile(DATASCAN:--location=LOCATION)(--data-source-entity=DATA_SOURCE_ENTITY|--data-source-resource=DATA_SOURCE_RESOURCE)[--description=DESCRIPTION][--display-name=DISPLAY_NAME][--labels=[KEY=VALUE,…]][--async|--validate-only][--data-profile-spec-file=DATA_PROFILE_SPEC_FILE|--enable-catalog-publishing--exclude-field-names=EXCLUDE_FIELD_NAMES--export-results-table=EXPORT_RESULTS_TABLE--include-field-names=INCLUDE_FIELD_NAMES--row-filter=ROW_FILTER--sampling-percent=SAMPLING_PERCENT][--incremental-field=INCREMENTAL_FIELD--on-demand=ON_DEMAND|--schedule=SCHEDULE|--one-time--ttl-after-scan-completion=TTL_AFTER_SCAN_COMPLETION][GCLOUD_WIDE_FLAG …]
- DESCRIPTION
(ALPHA)Represents a user-visible job which provides the insightsfor the related data source about the structure, content and relationships (suchas null percent, cardinality, min/max/mean, etc).- EXAMPLES
- To create a data profile scan
data-profile-datascanin projecttest-projectlocated inus-central1on bigqueryresource tabletest-tablein datasettest-dataset,run:gcloudalphadataplexdatascanscreatedata-profiledata-profile-datascan--project=test-project--location=us-central1--data-source-resource="//bigquery.googleapis.com/projects/test-project/datasets/test-dataset/tables/test-table" - POSITIONAL ARGUMENTS
- Datascan resource - Arguments and flags that define the Dataplex datascan youwant to create a data profile scan for. The arguments in this group can be usedto specify the attributes of this resource. (NOTE) Some attributes are not givenarguments in this group but can be set in other ways.
To set the
projectattribute:- provide the argument
datascanon the command line with a fullyspecified name; - provide the argument
--projecton the command line; - set the property
core/project.
This must be specified.
DATASCAN- ID of the datascan or fully qualified identifier for the datascan.
To set the
dataScansattribute:- provide the argument
datascanon the command line.
This positional argument must be specified if any of the other arguments in thisgroup are specified.
- provide the argument
--location=LOCATION- The location of the Dataplex resource.
To set the
locationattribute:- provide the argument
datascanon the command line with a fullyspecified name; - provide the argument
--locationon the command line; - set the property
dataplex/location.
- provide the argument
- provide the argument
- Datascan resource - Arguments and flags that define the Dataplex datascan youwant to create a data profile scan for. The arguments in this group can be usedto specify the attributes of this resource. (NOTE) Some attributes are not givenarguments in this group but can be set in other ways.
- REQUIRED FLAGS
- Data source for the data profile scan.
Exactly one of these must be specified:
--data-source-entity=DATA_SOURCE_ENTITY- Dataplex entity that contains the data for the data profile scan, of the form:
projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}. --data-source-resource=DATA_SOURCE_RESOURCE- Fully-qualified service resource name of the cloud resource that contains thedata for the data profile scan, of the form:
//bigquery.googleapis.com/projects/{project_number}/datasets/{dataset_id}/tables/{table_id}.
- Data source for the data profile scan.
- OPTIONAL FLAGS
--description=DESCRIPTION- Description of the data profile scan.
--display-name=DISPLAY_NAME- Display name of the data profile scan.
--labels=[KEY=VALUE,…]- List of label KEY=VALUE pairs to add.
Keys must start with a lowercase character and contain only hyphens(
-), underscores (_), lowercase characters, andnumbers. Values must contain only hyphens (-), underscores(_), lowercase characters, and numbers. - At most one of --async | --validate-only can be specified.
At most one of these can be specified:
--async- Return immediately, without waiting for the operation in progress to complete.
--validate-only- Validate the create action, but don't actually perform it.
- Data spec for the data profile scan.
At most one of these can be specified:
--data-profile-spec-file=DATA_PROFILE_SPEC_FILE- path to the JSON/YAML file containing the spec for the data profile scan. TheJSON representation reference:https://cloud.google.com/dataplex/docs/reference/rest/v1/DataProfileSpec
- Or at least one of these can be specified:
- Command line spec arguments for the data profile scan.
--enable-catalog-publishing- Publish data profile results to Dataplex catalog.
--exclude-field-names=EXCLUDE_FIELD_NAMES- Names of the fields to exclude from data profile. If specified, the respectivefields will be excluded from data profile, regardless of the fields specified inthe
--include-field-namesflag. --export-results-table=EXPORT_RESULTS_TABLE- path to the resource table to export data profile scan results, of the form:
//bigquery.googleapis.com/projects/{project_number}/datasets/{dataset_id}/tables/{table_id}.The table will be created if not present. --include-field-names=INCLUDE_FIELD_NAMES- Names of the fields to include in data profile. If not specified, all fields atthe time of profile scan job execution are included. The fields listed in the
--exclude-field-namesflag are excluded. --row-filter=ROW_FILTER- A filter applied to all rows in a single data profile scan job.
--sampling-percent=SAMPLING_PERCENT- The percentage of the records to be selected from the dataset for data profilescan.
- Data profile scan execution settings.
--incremental-field=INCREMENTAL_FIELD- Field that contains values that monotonically increase over time (e.g.timestamp).
- Data profile scan scheduling and trigger settings.
At most one of these can be specified:
--on-demand=ON_DEMAND- If set, the scan runs one-time shortly after data profile scan creation.
--schedule=SCHEDULE- Cron schedule (https://en.wikipedia.org/wiki/Cron) for running scansperiodically. To explicitly set a timezone to the cron tab, apply a prefix inthe cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The${IANA_TIME_ZONE} may only be a valid string from IANA time zone database. Forexample,
CRON_TZ=America/New_York 1 * * * *orTZ=America/New_York 1 * * * *. This field is required for RECURRINGscans. - Or at least one of these can be specified:
- Data profile scan one-time trigger settings.
--one-time- If set, the data profile scan runs once, and auto deleted once thettl_after_scan_completion expires.
--ttl-after-scan-completion=TTL_AFTER_SCAN_COMPLETION- The time to live for one-time scans. Default value is 24 hours, minimum value is0 seconds, and maximum value is 365 days. The time is calculated from the datascan job completion time. If value is set as 0 seconds, the scan will beimmediately deleted upon job completion, regardless of whether the job succeededor failed. The value should be a number followed by a unit suffix "s". Example:"100s" for 100 seconds.The argument is only valid when --one-time is set.
- GCLOUD WIDE FLAGS
- These flags are available to all commands:
--access-token-file,--account,--billing-project,--configuration,--flags-file,--flatten,--format,--help,--impersonate-service-account,--log-http,--project,--quiet,--trace-token,--user-output-enabled,--verbosity.Run
$gcloud helpfor details. - NOTES
- This command is currently in alpha and might change without notice. If thiscommand fails with API permission errors despite specifying the correct project,you might be trying to access an API with an invitation-only early accessallowlist. This variant is also available:
gclouddataplexdatascanscreatedata-profile
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-01-21 UTC.