Create tables
In Dataform, a table is one of the types of objects that make up a workflow. You can create tables that reference data from the data sources declared for your workflow or from other tables in your workflow. Dataform compiles your table definitions into SQL in real time. When you trigger execution, Dataform runs the SQL code and creates your defined tables in BigQuery.
You can create the following table types in a SQLX file:

- table: a regular table.
- incremental: an incremental table.
- view: a table view. For more information, see Introduction to views.
- materialized: a materialized table view. For more information, see Introduction to materialized views.
You can also define table partitions and clusters.

To keep a record of the purpose of a table or its relation to other tables in your workflow, you can add documentation to the table or its selected columns.

To test the data in a table against specific conditions, you can create data quality test queries called assertions. Dataform runs assertions every time it updates your workflow and alerts you if any assertions fail.
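For instance, the following minimal sketch declares built-in assertions in a table's config block; the column names and conditions are illustrative assumptions rather than part of the samples used later in this guide:

```
config {
  type: "table",
  assertions: {
    uniqueKey: ["id"],
    nonNull: ["id", "created_at"],
    rowConditions: ["revenue >= 0"]
  }
}

-- Table query; the assertions above run each time this table is updated.
SELECT ...
```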
To override the default schema, database, and name of a selected table, you can override table settings.

To disable table creation or run a SQL statement before or after table creation, you can configure additional actions.

To organize your tables in BigQuery after you run them, you can add BigQuery labels. To learn more, see Introduction to labels.

To restrict data access at the table column level, you can add BigQuery policy tags. To learn more, see Introduction to column-level access control.

In addition to defining tables in a type: "table" SQLX file, you can create empty tables by defining a custom SQL query in a type: "operations" SQLX file. You might want to create an empty table so that a different service can populate it with data.
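As a hedged sketch, a file such as definitions/empty_ingest_target.sqlx (a hypothetical name) could create such an empty table; the column names and types are placeholder assumptions:

```
config { type: "operations", hasOutput: true }

-- Create an empty table that another service can later populate with data.
-- self() resolves to this action's fully qualified target in BigQuery.
CREATE TABLE IF NOT EXISTS ${self()} (
  event_id STRING,
  event_timestamp TIMESTAMP,
  payload JSON
)
```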
Before you begin
In the Google Cloud console, go to the Dataform page.
Create and initialize a development workspace in your repository.
Optional: Declare a data source.
Required roles
To get the permissions that you need to complete the tasks in this document, ask your administrator to grant you the Dataform Editor (roles/dataform.editor) IAM role on workspaces. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.
Create a table
This section shows you how to create tables with Dataform core in Dataform.
About table definitions
To define a table, you define the table type and write a SELECT statement in a type: "table" SQLX file. Dataform then compiles your Dataform core code into SQL, runs the SQL code, and creates your defined tables in BigQuery.

In a Dataform core SELECT statement, you define the table structure and reference other objects of your workflow.

In addition to defining tables in a type: "table" SQLX file, you can create empty tables by defining a custom SQL query in a type: "operations" SQLX file. For more information, see Create an empty table.
Reference dependencies with ref

To reference a workflow action in a SELECT statement and automatically add it as a dependency, use the ref function. Dataform runs dependencies before the tables that depend on them so that your pipeline runs in the correct order.

The ref function is a built-in Dataform core function that is critical to dependency management in Dataform. The ref function lets you reference and automatically depend on the following objects defined in your Dataform workflow, instead of hard-coding the schema and table names:
- Tables of all supported table types.
- Data source declarations.
- Custom SQL operations with the hasOutput property set to true.
Dataform uses the ref function to build a dependency tree of all the tables to be created or updated.

After compiling, Dataform adds boilerplate statements to the SQL statement, such as CREATE, REPLACE, INSERT, or MERGE.
The following code sample shows a table definition with the use of the ref function:

```
config { type: "table" }

SELECT
  order_date AS date,
  order_id AS order_id,
  order_status AS order_status,
  SUM(item_count) AS item_count,
  SUM(amount) AS revenue

FROM ${ref("store_clean")}

GROUP BY 1, 2
```

In the ref function, you provide the name of the table or data source declaration that you want to depend on. This is typically the filename of the SQLX file in which that table or data source declaration is defined.
If a table name is overridden, use the overridden name in the ref function. For example, reference a table with config { name: "overridden_name" } as ref("overridden_name"). For more information, see Override table settings and Reference a table with an overridden table name.
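As a hedged illustration of this pattern, suppose a file named definitions/my_sales.sqlx (hypothetical) overrides its table name:

```
config { type: "table", name: "overridden_name" }

SELECT 1 AS value
```

A downstream file (also hypothetical) would then depend on the table through its overridden name rather than the filename:

```
config { type: "table" }

-- Use the overridden name, not the filename, in ref().
SELECT * FROM ${ref("overridden_name")}
```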
When you have multiple tables of the same name in different schemas, you can reference a specific table by providing two arguments to the ref function: schema name and table name.

The following code sample shows the ref function with two arguments to specify a table within a specific schema:

```
config { type: "table" }

SELECT * FROM ${ref("schema", "store_clean")}
```

You can also add table dependencies manually to the config block for tables, assertions, data source declarations, or custom SQL operations that are not referenced in a ref function in the SELECT statement. Dataform runs these dependencies before dependent tables.
The following code sample shows a table dependency in the config block:

```
config { dependencies: [ "unreferenced_table" ] }

SELECT * FROM ...
```

For more information on dependency management in your workflow, see Set dependencies.
Reference other tables with resolve

The resolve function lets you reference a table or data source declaration in a SELECT statement like the ref function, but it doesn't add the reference as a dependency. This means that the object referenced using the resolve function does not affect the execution of the table that uses the resolve function.
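For example, a minimal sketch of the resolve function; the referenced name source_data is an assumption used only for illustration:

```
config { type: "table" }

-- resolve() interpolates the fully qualified name of "source_data"
-- without adding it to this table's dependency tree.
SELECT * FROM ${resolve("source_data")}
```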
For more information on built-in Dataform core functions, see Dataform core reference.
Create a SQLX file for a table definition
Store table definition SQLX files in the definitions/ directory. To create a new SQLX file in the definitions/ directory, follow these steps:

In the Google Cloud console, go to the Dataform page.
To open a repository, click the repository name.
To open a development workspace, click the workspace name.
In the Files pane, next to definitions/, click More.

Click Create file.

In the Add a file path field, enter the name of the file followed by .sqlx after definitions/. For example, definitions/my-table.sqlx.

Filenames can only include numbers, letters, hyphens, and underscores.

Click Create file.
Define the table type
To create a new table type definition, follow these steps:
- In your development workspace, in the Files pane, expand the definitions/ directory.
- Select the table definition SQLX file that you want to edit.
In the file, enter the following code snippet:
config { type: "TABLE_TYPE" }ReplaceTABLE_TYPE with one of the following table types:
tableincrementalview
Optional: To define a materialized view, enter the
materializedpropertyundertype: "view"in the following format:config { type: "view", materialized: true}For more information, seeITableConfig.
Optional: ClickFormat.
Define table structure and dependencies
To write a table definition SELECT statement and define the table structure and dependencies, follow these steps:
- In your development workspace, in the Files pane, expand the definitions/ directory.
- Select the table definition SQLX file that you want to edit.
- Below the config block, write a SELECT statement.
- Optional: Click Format.
The following code sample shows a table definition with a SELECT statement and the ref function:

```
config { type: "table" }

SELECT
  customers.id AS id,
  customers.first_name AS first_name,
  customers.last_name AS last_name,
  customers.email AS email,
  customers.country AS country,
  COUNT(orders.id) AS order_count,
  SUM(orders.amount) AS total_spent

FROM dataform-samples.dataform_sample.crm_customers AS customers
  LEFT JOIN ${ref('order_stats')} orders
    ON customers.id = orders.customer_id

WHERE customers.id IS NOT NULL
  AND customers.first_name <> 'Internal account'
  AND country IN ('UK', 'US', 'FR', 'ES', 'NG', 'JP')

GROUP BY 1, 2, 3, 4, 5
```

Add manual table dependencies
To add table dependencies that are not referenced in the SELECT statement but that need to be run before the current table, follow these steps:

- In your development workspace, in the Files pane, expand the definitions/ directory.
- Select the table definition SQLX file that you want to edit.

In the config block of the table, enter the following code snippet:

```
dependencies: [ "DEPENDENCY_TABLE", ]
```

Replace DEPENDENCY_TABLE with the filename of the table you want to add as a dependency. You can enter multiple filenames.

Optional: Click Format.
The following code sample shows two tables added as manual table dependencies to the config block of a table definition file:

```
config { dependencies: [ "some_table", "some_other_table" ] }
```

Override table settings
You can override the default schema, database, and name of a selected table.
By default, a table follows the schema and database configuration you set in workflow_settings.yaml. The name of a table is the same as the name of the table definition SQLX file.
To override the schema and name of a selected table, follow these steps:
Go to your development workspace.
In the Files pane, expand definitions/.

Open a SQLX table definition file.

In the config block, enter the following code snippet:

```
{
  schema: "OVERRIDDEN_SCHEMA",
  database: "OVERRIDDEN_DATABASE",
  name: "OVERRIDDEN_NAME"
}
```

Replace the following:

- OVERRIDDEN_SCHEMA: the BigQuery dataset in which you want to create the table.
- OVERRIDDEN_DATABASE: the ID of the BigQuery project in which you want to create the table.
- OVERRIDDEN_NAME: the name for the table, which is different from the SQLX table definition filename.

Optional: Click Format.
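The following code sample is a minimal sketch of a table definition with all three settings overridden; the specific dataset, project ID, and table name are placeholder assumptions:

```
config {
  type: "table",
  schema: "reporting",
  database: "my-analytics-project",
  name: "daily_revenue"
}

-- The table is created as my-analytics-project.reporting.daily_revenue,
-- regardless of the SQLX filename.
SELECT 1 AS placeholder_column
```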
For more information, see Reference a table with an overridden table name.
Create BigLake tables for Apache Iceberg in BigQuery
Use Dataform to create BigQuery tables in Iceberg table format. These tables are known as BigLake Iceberg tables in BigQuery. For more information about required roles and other setup tasks, see the Before you begin steps for BigLake Iceberg table in BigQuery workflows.
Create the Iceberg table definition
To create the tables, define the Iceberg configuration in the bigquery block of a table definition file.
To define the tables, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Open a table definition SQLX file.

In the config block, add a bigquery block that contains an iceberg block in the following format:

```
config {
  type: "table",
  name: "my_table",
  uniqueKey: ["uniquekey"],
  bigquery: {
    iceberg: {
      bucket_name: "BUCKET_NAME"
    }
  }
}
```

Replace BUCKET_NAME with the name of the Cloud Storage bucket where the table data is stored.

Other properties are optional and have default values if not specified. For more information, see the iceberg properties.

You can provide the values for each table or set workflow-level defaults in workflow_settings.yaml. You can override workflow-level defaults by explicitly setting the properties in the iceberg block.
Create table partitions and clusters
This section shows you how to use Dataform core to create table partitions and clusters. BigQuery supports partitioned tables and table clustering. For more information, see Introduction to partitioned tables and Creating and using clustered tables.
Create a table partition
To create a table partition, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Open a table definition SQLX file.

In the config block, add the bigquery block below the table type declaration in the following format:

```
config {
  type: "table",
  bigquery: {
  }
}
```

In the bigquery block, enter the following code snippet:

```
partitionBy: "PARTITION_EXPRESSION"
```

Replace PARTITION_EXPRESSION with an expression for partitioning the table.

Optional: Click Format.
The following code sample shows partitioning a table by hour in a table definition SQLX file:

```
config {
  type: "table",
  bigquery: {
    partitionBy: "DATETIME_TRUNC(<timestamp_column>, HOUR)"
  }
}
```

The following code sample shows partitioning a table by an integer value in a table definition SQLX file:

```
config {
  type: "table",
  bigquery: {
    partitionBy: "RANGE_BUCKET(<integer_column>, GENERATE_ARRAY(0, 1000000, 1000))"
  }
}
```

Set a partition filter
To set a partition filter, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Open a partitioned table definition SQLX file.

In the bigquery block, enter the following code snippet:

```
requirePartitionFilter : true
```

Optional: Click Format.
The following code sample shows a partition filter set in the bigquery block of a partitioned table SQLX file:

```
config {
  type: "table",
  bigquery: {
    partitionBy: "DATE(ts)",
    requirePartitionFilter : true
  }
}

SELECT CURRENT_TIMESTAMP() AS ts
```

For more information on the partition filter in BigQuery, see Setting the require partition filter attribute on a partitioned table.
Set a retention period for partitions
To control the retention of all partitions in a partitioned table, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Open a partitioned table definition SQLX file.

In the bigquery block, enter the following code snippet:

```
partitionExpirationDays: NUMBER_OF_DAYS
```

Replace NUMBER_OF_DAYS with the number of days that you want to retain the partitions for.

Optional: Click Format.
The following code sample shows a retention period for partitions set to 14 days in the bigquery block of a partitioned table SQLX file:

```
config {
  type: "table",
  bigquery: {
    partitionBy: "DATE(ts)",
    partitionExpirationDays: 14,
  }
}

SELECT CURRENT_TIMESTAMP() AS ts
```

Create a table cluster
To create a table cluster, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Open a table definition SQLX file.

In the bigquery block, enter the following code snippet:

```
clusterBy: ["CLUSTER_COLUMN"]
```

Replace CLUSTER_COLUMN with the name of the column by which you want to cluster the table. For more information, see clustering_column_list.

Optional: Click Format.
The following code sample shows a partitioned table clustered by the name and revenue columns:

```
config {
  type: "table",
  bigquery: {
    partitionBy: "DATE(ts)",
    clusterBy: ["name", "revenue"]
  }
}

SELECT CURRENT_TIMESTAMP() AS ts, name, revenue
```

Configure an incremental table
This section shows you how to use Dataform core to configure an incremental table.
About incremental tables
Dataform updates tables differently based on the table type. During each execution of a table or a view, Dataform rebuilds the whole table or view from scratch.

When you define an incremental table, Dataform builds the incremental table from scratch only for the first time. During subsequent executions, Dataform only inserts or merges new rows into the incremental table according to the conditions that you configure.

Dataform inserts new rows only into columns that already exist in the incremental table. If you make changes to the incremental table definition query, for example, by adding a new column, you must decide if you should rebuild the table from scratch. To rebuild the table, the next time you trigger a table run, select the Run with full refresh option. For other options, see Change an incremental table schema without a full refresh.
Here are some common use cases for incremental tables:
- Performance optimization: For some kinds of data, such as web logs or analytics data, you might want to only process new records instead of reprocessing the entire table.
- Latency reduction: You can use incremental tables to run workflows quickly but frequently, reducing the downstream latency of the output tables.
- Daily snapshots: You can configure an incremental table to create daily snapshots of the table data, for example, for longitudinal analysis of user settings stored in a production database.
Process a subset of rows in an incremental table
To determine a subset of rows for Dataform to process during each execution, add a conditional WHERE clause to the incremental table SQLX definition file. In the WHERE clause, you can specify an incremental condition and a non-incremental condition. Dataform applies the incremental condition during table execution without a full refresh, and the non-incremental condition during execution with a full refresh.
To configure an incremental table, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Open an incremental table definition SQLX file.

Enter a WHERE clause in the following format:

```
config { type: "incremental" }

SELECT_STATEMENT

${when(incremental(), `WHERE INCREMENTAL_CONDITION`, `WHERE NON_INCREMENTAL_CONDITION`) }
```

Replace the following:

- SELECT_STATEMENT: the SELECT statement that defines your table.
- INCREMENTAL_CONDITION: the condition you specify in the WHERE clause to select rows for Dataform to process during table execution without a full refresh.
- NON_INCREMENTAL_CONDITION: the condition you specify in the WHERE clause to select rows for Dataform to process during table execution with a full refresh.

Optional: Click Format.
The following code sample shows an incremental table that incrementally processes rows of the productiondb.logs table:

```
config { type: "incremental" }

SELECT timestamp, message FROM ${ref("productiondb", "logs")}

${when(incremental(), `WHERE date > (SELECT MAX(date) FROM ${self()}) AND country = "UK"`, `WHERE country = "UK"`)}
```

The following code sample shows an incremental table that creates a snapshot of the productiondb.customers table:

```
config { type: "incremental" }

SELECT CURRENT_DATE() AS snapshot_date, customer_id, name, account_settings FROM ${ref("productiondb", "customers")}

${when(incremental(), `WHERE snapshot_date > (SELECT MAX(snapshot_date) FROM ${self()})`) }
```

Merge rows in an incremental table
To verify that an incremental table contains only one row corresponding to a selected combination of columns, set the selected columns as uniqueKey to merge rows that have the same uniqueKey value. When updating the table, Dataform merges rows with the same uniqueKey value instead of appending them.
To configure merging in an incremental table, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Select an incremental table definition SQLX file.

In the config block, set the selected columns as uniqueKey in the following format:

```
uniqueKey: ["COLUMN_NAME"]
```

Replace COLUMN_NAME with the name of a selected column.

Optional: Click Format.
The following code sample shows an incremental table with the transaction_id column set as uniqueKey to verify that the table always contains only one row for each transaction_id value:

```
config {
  type: "incremental",
  uniqueKey: ["transaction_id"]
}

SELECT timestamp, action FROM weblogs.user_actions

${ when(incremental(), `WHERE timestamp > (SELECT MAX(timestamp) FROM ${self()})`) }
```

Filter rows in an incremental table
In an incremental partitioned table, to avoid Dataform scanning the whole table to find matching rows, set updatePartitionFilter to only consider a subset of records.

The following code sample shows an incremental partitioned table with merging configured by setting the uniqueKey and updatePartitionFilter properties:

```
config {
  type: "incremental",
  uniqueKey: ["transaction_id"],
  bigquery: {
    partitionBy: "DATE(timestamp)",
    updatePartitionFilter: "timestamp >= timestamp_sub(current_timestamp(), interval 24 hour)"
  }
}

SELECT timestamp, action FROM weblogs.user_actions

${ when(incremental(), `WHERE timestamp > (SELECT MAX(timestamp) FROM ${self()})`) }
```

Avoid full table scans when ingesting from a partitioned table
When you create an incremental table that references a partitioned table, we recommend that you build your table query to avoid full table scans of the partitioned table during each incremental update.

You can limit the number of partitions that BigQuery scans to update the incremental table by using a constant expression in your table query. To turn a value from the partitioned table into a constant expression, use BigQuery scripting to declare the value as a variable in the pre_operations block. Then, use the variable as a constant expression in a WHERE clause in the SELECT query.

With this configuration, Dataform updates the incremental table based on the most recent partitions of the referenced partitioned table, without scanning the entire table.

To configure an incremental table that references a partitioned table and avoids full table scans, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Select an incremental table definition SQLX file.
- In the pre_operations block, declare a variable with BigQuery scripting.
- Filter the SELECT statement that defines the table with a WHERE clause that references the declared variable.
- Optional: Click Format.
The following code sample shows an incremental table in which the referenced raw_events table is partitioned by event_timestamp:

```
config {
  type: "incremental",
}

pre_operations {
  DECLARE event_timestamp_checkpoint DEFAULT (
    ${when(incremental(),
      `SELECT max(event_timestamp) FROM ${self()}`,
      `SELECT timestamp("2000-01-01")`)}
  )
}

SELECT
  *
FROM
  ${ref("raw_events")}
WHERE event_timestamp > event_timestamp_checkpoint
```

In the preceding code sample, the event_timestamp_checkpoint variable is defined in the pre_operations block. The event_timestamp_checkpoint variable is then used as a constant expression in the WHERE clause.
Rebuild an incremental table from scratch with full refresh
You can force an incremental table to be rebuilt from scratch using either the command-line interface with the --full-refresh option or the Run with full refresh option when triggering a workflow execution.

When you select the full refresh option, in your development workspace or by using the Dataform CLI, Dataform ignores the ${when(incremental(), ... } parameter during execution and recreates the table with a CREATE OR REPLACE statement.
Protect an incremental table from full refresh
To protect an incremental table from being rebuilt from scratch and potential data loss, you can set the incremental table as protected. You might want to prevent an incremental table from being rebuilt if your data source is temporary.

To mark an incremental table as protected, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Select an incremental table definition SQLX file.
- In the config block, enter protected: true.
- Optional: Click Format.
The following code sample shows an incremental table marked as protected:

```
config {
  type: "incremental",
  protected: true
}

SELECT ...
```

Change an incremental table schema without a full refresh
Preview
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Note: To provide feedback or request support, contact dataform-preview-support@google.com.

You can modify an incremental table schema in the SELECT query without a full refresh of the table by using the onSchemaChange configuration property. This property lets you add new columns to or remove existing columns from a table while preserving historical data. This approach helps to prevent data loss and avoids manual updates in BigQuery.
To use this feature, you must install Dataform core 3.0.11 or later.

Note: This feature isn't available with the Dataform CLI.

After you set the onSchemaChange property, Dataform checks the SELECT query to determine if columns were added or removed. Dataform then performs the action specified by the onSchemaChange property's value.
To set the onSchemaChange property, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Select an incremental table definition SQLX file.

In the config block, enter the following code snippet:

```
onSchemaChange: "ON_CHANGE_ACTION"
```

Replace ON_CHANGE_ACTION with one of the following actions:

- IGNORE (default): ignores added columns and shows an error for missing columns. If onSchemaChange is unset, this is the default behavior when the schema changes.
- FAIL: stops the action if Dataform detects any schema change, which helps to keep the schema consistent.
- EXTEND: adds new columns from the query to the incremental table, and adds NULL values for previous records. Shows an error if a column is removed or missing from the original schema in the query. You can use this setting to add new columns to your incremental tables during runtime.
- SYNCHRONIZE: adds new columns from the query to the incremental table, and adds NULL values for previous records. Removes columns that were in the original schema but that are now missing from the current query.

Caution: The SYNCHRONIZE action cannot be undone, and you might lose data when you remove columns with this option.

Note: Deleting partitioned and clustered columns isn't supported, as deleting these columns can have negative consequences. If a partitioned or clustered column is still deleted, it's handled with the ALTER TABLE DROP COLUMN statement. Dataform doesn't perform checks or validations to prevent the running of ALTER TABLE DROP COLUMN statements on partitioned or clustered columns. Therefore, if you drop such columns, proceed cautiously and be aware of potential negative consequences.

Optional: Click Format.
The following code sample shows an incremental table with the EXTEND action set for the onSchemaChange property:

```
config {
  type: "incremental",
  onSchemaChange: "EXTEND",
}

SELECT ...
```

Add table documentation
This section shows you how to add descriptions of a table and its columns and records to a Dataform core SQLX file.
You can add table, column, and record descriptions to all table types inDataform: tables, incremental tables, and views.
You might want to document the following:
- The purpose of the table.
- The content or role of columns or records in the table.
- The relationship of the table and other actions of your workflow, for example, the tables or views that depend on the current table.
- The assertions applied to the table.
- The pre-operations or post-operations applied to the table.
- The owner of the table, that is, the user who created it. This information might be useful if multiple team members work on a workflow.
Add a table description
To add a description to a table in a SQLX file, follow these steps:
In the Google Cloud console, go to the Dataform page.
Select a repository.
Select a development workspace.
In the Files pane, click the table definition SQLX file that you want to edit.

In the config block of the file, enter the table description in the following format:

```
description: "Description of the table",
```

Optional: Click Format.
The following code sample shows a table description added to the config block of a SQLX table definition file:

```
config {
  type: "table",
  description: "Description of the table",
}
```

Add column and record descriptions
To add descriptions of individual columns and records to a SQLX file, follow these steps:

- In the config block of your table definition file, enter columns: {}.
- Inside columns: {}, enter column descriptions in the following format:

```
column_name: "Description of the column",
```

- Inside columns: {}, enter record descriptions in the following format:

```
record_name: {
  description: "Description of the record",
  columns: {
    record_column_name: "Description of the record column"
  }
}
```

- Optional: Click Format.
The following code sample shows table, column, and record descriptions in the config block of a SQLX table definition file:

```
config {
  type: "table",
  description: "Description of the table.",
  columns: {
    column1_name: "Description of the first column",
    column2_name: "Description of the second column",
    column3_name: "Description of the third column",
    record_name: {
      description: "Description of the record.",
      columns: {
        record_column1_name: "Description of the first record column",
        record_column2_name: "Description of the second record column",
      }
    }
  }
}

SELECT
  "first_column_value" AS column1_name,
  "second_column_value" AS column2_name,
  "third_column_value" AS column3_name,
  STRUCT("first" AS record_column1_name, "second" AS record_column2_name) AS record_name
```

Reuse column documentation with includes
You can reuse descriptions of columns in Dataform across your SQL workflow with JavaScript includes. You might want to reuse column documentation if you have multiple columns with the same name and description in your SQL workflow.

To create a reusable column description, define a JavaScript include constant with the name of the column and its description.

You can define a constant with a description of a single column, or a constant with a set of column descriptions to reuse the descriptions of all columns in a table. For more information on creating and using includes in Dataform, see Reuse code across a single repository with includes.
The following code sample shows multiple constants with descriptions of individual columns defined in the includes/docs.js JavaScript file:

```
// filename is includes/docs.js
const user_id = `A unique identifier for a user`;
const age = `The age of a user`;
const creation_date = `The date this user signed up`;
const user_tenure = `The number of years since the user's creation date`;
const badge_count = `The all-time number of badges the user has received`;
const questions_and_answer_count = `The all-time number of questions and answers the user has created`;
const question_count = `The all-time number of questions the user has created`;
const answer_count = `The all-time number of answers the user has created`;
const last_badge_received_at = `The time the user received their most recent badge`;
const last_posted_at = `The time the user last posted a question or answer`;
const last_question_posted_at = `The time the user last posted a question`;
const last_answer_posted_at = `The time the user last posted an answer`;

module.exports = {
  user_id,
  age,
  creation_date,
  user_tenure,
  badge_count,
  questions_and_answer_count,
  question_count,
  answer_count,
  last_badge_received_at,
  last_posted_at,
  last_question_posted_at,
  last_answer_posted_at,
};
```

The following code sample shows the user_id and age constants, defined in includes/docs.js, used in the definitions/my_table.sqlx SQLX table definition file to generate documentation for selected columns in the table:
config { type: "table", description: "Table description.", columns: { user_id: docs.user_id, column2_name: "Description of the second column", column3_name: "Description of the third column", age: docs.age, }}SELECT ...The following code sample shows a constant with a set of column descriptionsdefined in theincludes/docs.js JavaScript file:
```
// filename is includes/docs.js
const columns = {
  user_id: `A unique identifier for a user`,
  age: `The age of a user`,
  creation_date: `The date this user signed up`,
  user_tenure: `The number of years since the user's creation date`,
  badge_count: `The all-time number of badges the user has received`,
  questions_and_answer_count: `The all-time number of questions and answers the user has created`,
  question_count: `The all-time number of questions the user has created`,
  answer_count: `The all-time number of answers the user has created`,
  last_badge_received_at: `The time the user received their most recent badge`,
  last_posted_at: `The time the user last posted a question or answer`,
  last_question_posted_at: `The time the user last posted a question`,
  last_answer_posted_at: `The time the user last posted an answer`,
};

module.exports = { columns };
```

The following code sample shows the columns constant, which is defined in includes/docs.js and used in the definitions/my_table.sqlx SQLX table definition file to generate documentation for all columns in the table:

```
config {
  type: "table",
  description: "My table description",
  columns: docs.columns
}

SELECT 1 AS one
```

Add BigQuery labels
This section shows you how to add labels to tables in Dataform.
BigQuery supports adding labels to resources. For more information on labels in BigQuery, see Introduction to labels.

To add a BigQuery label to a table in Dataform, add the label to the bigquery block in the config block of the table definition SQLX file.
To add a BigQuery label to a table definition file, follow these steps:
- Go to your development workspace.
- In the Files pane, expand definitions/.
- Select a SQLX table definition file.

In the config block, add a label in the following format:

```
bigquery: {
  labels: {
    LABEL1: "VALUE_OF_LABEL1"
  }
}
```

Replace the following:
- LABEL1: the name of your label
- VALUE_OF_LABEL1: the value of your label
Optional: To add a label with a name that contains special characters, enter the label name in quotes ("").

Optional: Click Format.
The following code sample shows the department:shipping and cost-center:logistics labels added to the bigquery block in a partitioned table definition SQLX file:

```
config {
  type: "table",
  bigquery: {
    partitionBy: "DATE(ts)",
    labels: {
      department: "shipping",
      "cost-center": "logistics"
    }
  }
}

SELECT CURRENT_TIMESTAMP() AS ts
```

What's next
- To learn how to test table data with assertions, see Test data quality.
- To learn how to define tables with JavaScript, see Create workflows exclusively with JavaScript.
- To learn how to reuse code with includes, see Reuse code across a single repository with includes.
- To learn how to use the Dataform command-line interface, see Use the Dataform CLI.