Wrangler overview
Wrangler is a visual data preparation tool within the Cloud Data Fusion Studio interface. It lets you clean and transform data before using it in Extract, Transform, Load (ETL) pipelines. Wrangler applies transformations on a sample of your data in one place (called a Preview) before running the logic on the entire dataset. This preview helps you apply transformations and gain an understanding of how they affect the entire dataset.
Wrangler directives
A directive is a single instruction used within Wrangler. Directives specify how to manipulate your data, such as transforming, filtering, or pivoting individual records.
The following concepts are related to directives:
- Recipe: a set of one or more directives.
- Transformation step: an implementation of a data transformation directive, operating on a single record or set of records. A transformation step can generate zero or more records from applying a directive. Wrangler applies the transformation steps in the order listed in the recipe.
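The relationship between these concepts can be sketched in Python. This is a conceptual model only, not Wrangler's implementation: the names `Step`, `uppercase`, `drop_if_empty`, and `apply_recipe` are hypothetical, and the sample records are made up. It shows a recipe as an ordered list of steps, where each step turns one record into zero or more records.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
# In this model, a step takes one record and returns zero or more records.
Step = Callable[[Record], List[Record]]


def uppercase(column: str) -> Step:
    """Hypothetical step: upper-case one column, emitting exactly one record."""
    def step(record: Record) -> List[Record]:
        out = dict(record)
        out[column] = out[column].upper()
        return [out]
    return step


def drop_if_empty(column: str) -> Step:
    """Hypothetical step: emit zero records when the column is blank."""
    def step(record: Record) -> List[Record]:
        return [record] if record.get(column) else []
    return step


def apply_recipe(recipe: List[Step], records: Iterable[Record]) -> List[Record]:
    """Apply each step in the order it appears in the recipe."""
    current = list(records)
    for step in recipe:
        next_records: List[Record] = []
        for record in current:
            next_records.extend(step(record))
        current = next_records
    return current


sample = [{"name": "ada", "city": "london"}, {"name": "", "city": "paris"}]
recipe = [drop_if_empty("name"), uppercase("name")]
result = apply_recipe(recipe, sample)
```

Because the steps run in recipe order, the record with a blank `name` is removed before the upper-casing step sees it.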
Wrangler components
The following sections explain components of Wrangler in the Cloud Data Fusion Studio.
Wrangler workspace
The Wrangler workspace is a page in the Cloud Data Fusion Studio interface where you parse, blend, cleanse, and transform datasets. On the Workspace page, you can do the following:
- Add transformation steps to a recipe using the drop-down menu in each column.
- View or delete steps in a recipe by selecting the Transformation steps tab.
- Discover columns with blank fields and other information by checking the Data quality bar.
- View the schema for the dataset by clicking More.
- Create a data pipeline with a source plugin for the dataset, and the Wrangler transformation with the recipe containing the transformation steps, which are executed when the pipeline runs.
Wrangler Power Mode (CLI)
To specify directives using declarative syntax, use the Power Mode (CLI). It's useful for the following tasks:
- Using directives that aren't available in the Studio interface
- Adding user-defined directives
- Applying a directive to multiple columns
To use Wrangler Power Mode, enter directives in the black bar at the bottom of the Wrangler Data tab.
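As an illustration of applying one directive to several columns, the following Python sketch generates one directive line per column that you could then enter in Power Mode. The column names are hypothetical, and the `uppercase :column` form is assumed from the Wrangler directive reference; confirm the exact syntax there before using it.

```python
# Hypothetical column names; replace them with columns from your own dataset.
columns = ["first_name", "last_name", "city"]

# Assumed directive form: "uppercase :<column>", one directive per line.
directive_lines = [f"uppercase :{column}" for column in columns]
recipe_text = "\n".join(directive_lines)

print(recipe_text)
```

Generating the lines from a list keeps the directive identical across columns, which is harder to guarantee when repeating the same menu action column by column.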
Wrangler Insights tab
You can use the Insights tab on the Wrangler page to perform data discovery on a dataset.
Limitations
- Wrangler is only supported for batch ETL pipelines.
- Wrangler applies transformations only to the sample data, which is limited to the first 1000 records.
- Wrangler requires connections to be created with the source. For more information, see Create and manage connections.
- Wrangler always requires at least one Wrangler workspace to be open.
- Clicking the Wrangle button in the Wrangler transformation isn't supported.
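One practical consequence of the sample limit is that a data problem outside the first 1000 records doesn't appear in the workspace. The following Python sketch illustrates this with made-up data. It isn't Wrangler code: `take_sample` and the dataset are hypothetical, and only the 1000-record limit comes from the list above.

```python
from itertools import islice

SAMPLE_LIMIT = 1000  # sample size stated in the limitations above


def take_sample(records, limit=SAMPLE_LIMIT):
    """Return the first `limit` records, mimicking a head-of-dataset sample."""
    return list(islice(records, limit))


# Hypothetical dataset: 1500 records with a single blank email at index 1200.
dataset = [
    {"id": i, "email": "" if i == 1200 else f"user{i}@example.com"}
    for i in range(1500)
]

sample = take_sample(dataset)
blank_in_sample = sum(1 for record in sample if not record["email"])
blank_in_full = sum(1 for record in dataset if not record["email"])

print(len(sample), blank_in_sample, blank_in_full)
```

In this made-up case the sample shows no blank values while the full dataset has one, so a recipe that looks complete in the preview might still need a step that handles blanks when the pipeline runs.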
Navigate to Wrangler in Cloud Data Fusion
You can access Wrangler in two ways from the Cloud Data Fusion Studio interface:
- To open the Cloud Data Fusion Wrangler workspace, go to the Cloud Data Fusion Studio and click Wrangler.
- To configure Wrangler properties, go to the Cloud Data Fusion Studio, and click Studio > Transformations > Wrangler.
Connect to a data source
Wrangler supports various data sources, such as BigQuery, Cloud Storage, and external databases (with additional configuration). To use Wrangler, you must create a connection with the source.
To create the connection, go to the Connections list and select the connection to your data source. For more information, see Create and manage connections.
Note: To open the Wrangler workspace from the plugin palette, open the Wrangler plugin properties and click Wrangle.
Explore and preview data
Wrangler displays a sample of your data (typically 1000 rows) for inspection. You can get an overview of the data schema, including data types and basic statistics.
Apply directives
Wrangler offers a variety of built-in directives for common data wrangling tasks.
- Drag the chosen directive onto a specific column or the data preview window.
- Each directive has configuration options to customize its behavior.
For more information, see Wrangler command-line directives.
Preview transformation results
As you apply directives, the data preview window dynamically updates to reflect the changes. This lets you see the immediate impact of each transformation on your data.
Refine and iterate
To refine your data wrangling process, continue adding directives, modifying configurations, and reviewing the preview.
Wrangler's visual interface helps you experiment and ensure that your transformations produce the expected outcome.
Add transformations to a pipeline
While Wrangler itself isn't a persistent storage solution, Cloud Data Fusion offers ways to capture your wrangling logic:
Create a pipeline. From the Wrangler workspace, convert your Wrangler transformations into a Cloud Data Fusion pipeline by following these steps:
- Click Create pipeline.
- Select Batch pipeline. The Pipeline Studio page opens with a pipeline that has a source and a Wrangler transformation.
Apply transformations. If you're using the Wrangler plugin on the Studio page, convert your Wrangler transformations into a Cloud Data Fusion pipeline by clicking Apply.
Edit recipes
When you use the Wrangler workspace to create a Wrangler transformation, after you add the Wrangler transformation to a pipeline, it's recommended that you use the Wrangler interface to add or edit recipes.
In the Wrangler transformation, if you manually edit the recipe or add new steps to the recipe and the changes affect the output schema, you must manually update the output schema in the Wrangler transformation to match the changes in the recipe. Only recipes created or edited in the Wrangler workspace will auto-create and auto-update the output schema in the Wrangler transformation.
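To see why a manual recipe edit can leave the output schema out of date, consider the following Python sketch. It is a simplified model, not part of Cloud Data Fusion: `expected_columns` is a hypothetical helper that understands only two directive forms, assumed here to be `rename :old :new` and `drop :column`, and it tracks column names only, not data types.

```python
def expected_columns(input_columns, recipe):
    """Derive output column names from a recipe (rename and drop only)."""
    columns = list(input_columns)
    for line in recipe:
        name, *args = line.split()
        args = [arg.lstrip(":") for arg in args]
        if name == "rename":
            old, new = args
            columns[columns.index(old)] = new
        elif name == "drop":
            columns = [column for column in columns if column not in args]
        # Assumption: any other directive leaves column names unchanged.
    return columns


# Hypothetical input columns and a manually edited recipe.
input_columns = ["id", "fname", "notes"]
recipe = ["rename :fname :first_name", "drop :notes"]

# Output schema as it was configured before the manual edit.
configured_output_schema = ["id", "fname", "notes"]

expected = expected_columns(input_columns, recipe)
schema_is_stale = expected != configured_output_schema

print(expected, schema_is_stale)
```

In this example the recipe now produces `id` and `first_name`, while the configured schema still lists the old columns, which is the mismatch you would need to correct by hand.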
To edit a recipe in the Wrangler transformation that was created in the Wrangler web interface, follow these steps:
- Go to the Wrangler node in your pipeline and click Properties.
- Click Wrangle.
- Edit or add a new recipe.
- Click Apply.
What's next
- Learn more about Wrangler CLI directives.
Last updated 2025-12-15 UTC.