Dataproc optional Solr component
You can install additional components like Solr when you create a Dataproc cluster using the Optional components feature. This page describes the Solr component.
Apache Solr is an open source enterprise search platform. The Solr server and Web UI are available on port 8983 on the cluster's master node(s).
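As a quick reachability check, the port above can be combined with a master node hostname to form a Solr URL. The following is a minimal Python sketch, not an official sample. It assumes the caller can reach the master node on port 8983 (for example, from the cluster itself or through an SSH tunnel). The hostname example-cluster-m and the function name solr_system_info_url are hypothetical, and /solr/admin/info/system is Solr's own system information endpoint rather than anything Dataproc-specific.

```python
from urllib.parse import urlencode

SOLR_PORT = 8983  # port on which the Solr server and Web UI listen


def solr_system_info_url(master_host: str) -> str:
    """Return the URL of Solr's system information endpoint on a master node."""
    query = urlencode({"wt": "json"})  # ask Solr for a JSON response
    return f"http://{master_host}:{SOLR_PORT}/solr/admin/info/system?{query}"


if __name__ == "__main__":
    # "example-cluster-m" is a placeholder hostname, not a real cluster.
    print(solr_system_info_url("example-cluster-m"))
```

The URL can then be fetched with any HTTP client to confirm that Solr is responding.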
Persisting Solr files: By default, Solr writes and reads the index and transaction log files in HDFS, which is deleted along with the cluster. To persist Solr files, use a Cloud Storage path as the Solr home directory by setting the dataproc:solr.gcs.path cluster property when you install the component.
Install the component
Install the component when you create a Dataproc cluster. Components can be added to clusters created with Dataproc version 1.3 and later.
See Supported Dataproc versions for the component version included in each Dataproc image release.
gcloud command
To create a Dataproc cluster that includes the Solr component, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag. You can also use the optional --properties flag to set a Cloud Storage path as the Solr home directory.
Add the --enable-component-gateway flag, as shown below, to enable connecting to the Solr Web UI using the Component Gateway.

    gcloud dataproc clusters create cluster-name \
        --region=region \
        --optional-components=SOLR \
        --enable-component-gateway \
        ... other flags

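When cluster creation is scripted, the same gcloud invocation can be assembled programmatically. The following is a minimal Python sketch under stated assumptions: the function name build_create_command and the values example-cluster, us-central1, and gs://example-bucket/ are hypothetical placeholders, and the script only prints the command rather than running it. The flags themselves are the ones described on this page.

```python
import shlex
from typing import List, Optional


def build_create_command(
    cluster_name: str,
    region: str,
    solr_gcs_path: Optional[str] = None,
) -> List[str]:
    """Assemble the gcloud argument list for a cluster with the Solr component."""
    cmd = [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        f"--region={region}",
        "--optional-components=SOLR",
        "--enable-component-gateway",
    ]
    if solr_gcs_path is not None:
        # The Solr home directory must be a Cloud Storage path.
        if not solr_gcs_path.startswith("gs://"):
            raise ValueError("solr_gcs_path must be a gs:// path")
        cmd.append(f"--properties=dataproc:solr.gcs.path={solr_gcs_path}")
    return cmd


if __name__ == "__main__":
    # Placeholder values; the command is printed, so nothing is created.
    argv = build_create_command("example-cluster", "us-central1", "gs://example-bucket/")
    print(shlex.join(argv))
    # To run it for real, pass argv to subprocess.run(argv, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to subprocess.run.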
Add the --properties="dataproc:solr.gcs.path=gs://bucket-name/" cluster property to the gcloud dataproc clusters create command to set a Cloud Storage bucket where Solr documents will be stored (Solr home directory).
REST API
The Solr component can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
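A clusters.create request body that selects the Solr component can be sketched as follows. This is a minimal Python illustration, not an official sample. The function name build_create_request and the values example-cluster and gs://example-bucket are hypothetical. The camelCase field names reflect the Dataproc v1 REST JSON representation as understood here and should be checked against the API reference. The project and region belong in the request URL, which is not shown.

```python
import json
from typing import Any, Dict, Optional


def build_create_request(
    cluster_name: str,
    solr_gcs_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a clusters.create JSON body that enables the Solr component."""
    # SOLR is listed as an optional component in the software configuration.
    software_config: Dict[str, Any] = {"optionalComponents": ["SOLR"]}
    if solr_gcs_path is not None:
        # Optional: use a Cloud Storage path as the Solr home directory.
        software_config["properties"] = {"dataproc:solr.gcs.path": solr_gcs_path}
    return {
        "clusterName": cluster_name,
        "config": {
            "softwareConfig": software_config,
            # Enables Component Gateway access to the Solr Web UI.
            "endpointConfig": {"enableHttpPortAccess": True},
        },
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    body = build_create_request("example-cluster", "gs://example-bucket")
    print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of the request with whichever HTTP client or client library you prefer.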
As part of your clusters.create request, you can:
- Set the EndpointConfig.enableHttpPortAccess property to true to enable connecting to the Solr Web UI using the Component Gateway.
- Set the "dataproc:solr.gcs.path=gs://bucket-name" cluster property in the SoftwareConfig.properties field to set a Cloud Storage bucket where Solr documents will be stored (Solr home directory).
Console
- Enable the component and component gateway.
  - In the Google Cloud console, open the Dataproc Create a cluster page. The Set up cluster panel is selected.
  - In the Components section:
    - Under Optional components, select Solr and other optional components to install on your cluster.
    - Under Component Gateway, select Enable component gateway (see Viewing and Accessing Component Gateway URLs).
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.