Dataproc optional Pig component
You can install additional components like Apache Pig when you create a Dataproc cluster using the Optional components feature. This page describes the Pig component, an open source platform for analyzing large data sets.
Install the component
Install the component when you create a Dataproc cluster.
Apache Pig is an optional component in Dataproc 2.3 and later image versions.
It is included by default in 2.2 and earlier image versions. See Supported Dataproc versions for the component versions included in the latest Dataproc image releases.
gcloud
To create a Dataproc cluster that includes the Pig component, use the gcloud dataproc clusters create CLUSTER_NAME command with the --optional-components flag (using image version 2.3 or later).
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --optional-components=PIG \
    --image-version=2.3 \
    ... other flags
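After the cluster is created, one way to confirm that the component was requested is to read the cluster's software configuration back with gcloud dataproc clusters describe. The following is a minimal sketch, not an official procedure: the function name pig_component_enabled and its two arguments are hypothetical, and it assumes the component is reported as PIG under config.softwareConfig.optionalComponents, mirroring the flag value above.

```shell
#!/usr/bin/env bash
# Sketch: succeed (exit 0) if PIG is listed among a cluster's optional components.
# pig_component_enabled is a hypothetical helper name, not part of gcloud.
pig_component_enabled() {
  local cluster_name="$1"
  local region="$2"
  # Print only the optional components field, then look for PIG as a whole word.
  gcloud dataproc clusters describe "${cluster_name}" \
    --region="${region}" \
    --format="value(config.softwareConfig.optionalComponents)" \
    | grep -qw "PIG"
}

# Usage (requires an authenticated gcloud CLI):
#   pig_component_enabled CLUSTER_NAME REGION && echo "Pig component enabled"
```

The helper only inspects the cluster configuration; it does not check that Pig itself runs on the cluster.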
REST API
The Pig component can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
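As an illustration, a clusters.create request body that asks for the Pig component might look like the sketch below. The project, region, and cluster values are placeholders, the SEND_REQUEST switch is a hypothetical guard added so that the script only prints the body by default, and the enum value PIG is assumed to match the gcloud flag value shown earlier.

```shell
#!/usr/bin/env bash
# Placeholder values; replace them with your own project, region, and cluster name.
PROJECT_ID="${PROJECT_ID:-my-project}"
REGION="${REGION:-us-central1}"
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"

# Minimal request body: image version 2.3 with PIG as an optional component.
REQUEST_BODY=$(cat <<EOF
{
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "softwareConfig": {
      "imageVersion": "2.3",
      "optionalComponents": ["PIG"]
    }
  }
}
EOF
)

# Print the body so it can be reviewed before anything is sent.
echo "${REQUEST_BODY}"

# SEND_REQUEST is a hypothetical switch: the request is sent only when it is set to 1.
if [ "${SEND_REQUEST:-0}" = "1" ]; then
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "${REQUEST_BODY}" \
    "https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"
fi
```

A real request usually carries more configuration (for example, machine types and worker counts); this body shows only the fields relevant to the Pig component.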
Console
Enable the component:
- In the Google Cloud console, open the Dataproc Create a cluster page. The Set up cluster panel is selected.
- In the Components section, under Optional components, select Pig and other optional components to install on your cluster.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.