Dataproc optional Solr component

You can install additional components like Solr when you create a Dataproc cluster using the Optional components feature. This page describes the Solr component.

Apache Solr is an open source enterprise search platform. The Solr server and Web UI are available on port 8983 on the cluster's master node(s).
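
As a rough illustration of the port statement above, the sketch below builds the base URL at which the Solr server would be expected to respond. The `solr_url` helper and the `example-solr-cluster-m` host name are hypothetical names introduced for this example. The commented-out `curl` call assumes you are in an SSH session on a master node and that Solr's standard `admin/info/system` endpoint is reachable there.

```shell
#!/bin/sh
# Port on which the Solr server and Web UI listen on the master node(s).
SOLR_PORT=8983

# Hypothetical helper: print the Solr base URL for a host.
# $1 = host name (defaults to localhost, for use on the master node itself)
solr_url() {
  printf 'http://%s:%s/solr/\n' "${1:-localhost}" "${SOLR_PORT}"
}

# Print the URL for the local node.
solr_url

# From an SSH session on a master node, a request like this one could be
# used to check that the server responds (not run here):
# curl -s "$(solr_url)admin/info/system?wt=json"
```

Passing a host name, for example `solr_url example-solr-cluster-m`, prints the same URL for that host instead of `localhost`.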

Persisting Solr files: By default, Solr writes and reads the index and transaction log files in HDFS. To persist Solr files, use a Cloud Storage path as the Solr home directory by setting the dataproc:solr.gcs.path cluster property when you install the component.

Install the component

Install the component when you create a Dataproc cluster. Components can be added to clusters created with Dataproc version 1.3 and later.

See Supported Dataproc versions for the component version included in each Dataproc image release.

gcloud command

To create a Dataproc cluster that includes the Solr component, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag. You can also pass the optional --properties flag to set a Cloud Storage path as the Solr home directory.

When creating the cluster, use the gcloud dataproc clusters create command with the --enable-component-gateway flag, as shown below, to enable connecting to the Solr Web UI using the Component Gateway.

gcloud dataproc clusters create cluster-name \
    --region=region \
    --optional-components=SOLR \
    --enable-component-gateway \
    ... other flags
Add the --properties="dataproc:solr.gcs.path=gs://bucket-name/" flag to the gcloud dataproc clusters create command to set the Cloud Storage bucket where Solr documents will be stored (the Solr home directory).
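
Putting the flags described above together, a command might look like the following sketch. The cluster name, region, and bucket name are hypothetical placeholders, and `create_solr_cluster` is a helper introduced only for this example. The helper accepts an optional command prefix, so passing `echo` prints the command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-solr-cluster"
REGION="us-central1"
BUCKET_NAME="example-solr-bucket"

# Cluster property that points the Solr home directory at Cloud Storage.
SOLR_PROPERTY="dataproc:solr.gcs.path=gs://${BUCKET_NAME}/"

create_solr_cluster() {
  # "$@" is an optional prefix: with `echo` the command is only printed
  # (a simple dry run); with no arguments gcloud is invoked directly.
  "$@" gcloud dataproc clusters create "${CLUSTER_NAME}" \
    --region="${REGION}" \
    --optional-components=SOLR \
    --enable-component-gateway \
    --properties="${SOLR_PROPERTY}"
}

# Dry run: print the command for review.
create_solr_cluster echo

# To create the cluster for real, call the function with no arguments:
# create_solr_cluster
```

Keeping the property in its own variable makes it easier to confirm that the bucket path is correct before the cluster is created.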

REST API

The Solr component can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.

As part of your clusters.create request, you can:
  1. Set the EndpointConfig.enableHttpPortAccess property to true to enable connecting to the Solr Web UI using the Component Gateway.
  2. Set the "dataproc:solr.gcs.path=gs://bucket-name" cluster property in the SoftwareConfig.properties field to set the Cloud Storage bucket where Solr documents will be stored (the Solr home directory).
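
The two settings above can be sketched as a clusters.create request body. This is a minimal illustration, not a complete cluster configuration. The project, region, cluster, and bucket names are hypothetical placeholders, and the helper functions are introduced only for this example. The body assumes the v1 Cluster resource layout, in which the properties map sits under config.softwareConfig.properties and the component list under config.softwareConfig.optionalComponents. The `submit_solr_cluster` function shows one way the body could be sent with `curl`. It is defined but not called, because it requires gcloud credentials.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="example-project"
REGION="us-central1"
CLUSTER_NAME="example-solr-cluster"
BUCKET_NAME="example-solr-bucket"

# Print a minimal JSON body for a clusters.create request.
solr_cluster_body() {
  cat <<EOF
{
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "softwareConfig": {
      "optionalComponents": ["SOLR"],
      "properties": {
        "dataproc:solr.gcs.path": "gs://${BUCKET_NAME}"
      }
    },
    "endpointConfig": {
      "enableHttpPortAccess": true
    }
  }
}
EOF
}

# Send the request (defined only; calling it needs gcloud credentials).
submit_solr_cluster() {
  solr_cluster_body | curl -sS -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    --data @- \
    "https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"
}

# Print the body for review.
solr_cluster_body
```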

Console

  1. Enable the component and component gateway.
    • In the Google Cloud console, open the Dataproc Create a cluster page. The Set up cluster panel is selected.
    • In the Components section:

Last updated 2025-12-15 UTC.