Cluster web interfaces

Avoid Security Vulnerabilities.

Open ports and improperly configured firewall rules on a public network can allow unauthorized users to execute arbitrary code.

Review "Specify a source IP range for subnet firewall rules".

Create an SSH tunnel to establish a secure connection to your cluster's master instance.

Apache Hadoop YARN provides REST APIs that share the same ports as the YARN web interfaces (default port 8088). By default, users who can reach the YARN web interface can create applications, submit jobs, and may be able to perform Cloud Storage operations. See Allowed YARN Resource Manager REST APIs for information on setting allowed YARN Resource Manager REST API methods.

Some of the core open source components included with Dataproc clusters, such as Apache Hadoop and Apache Spark, provide web interfaces. These interfaces can be used to manage and monitor cluster resources and facilities, such as the YARN resource manager, the Hadoop Distributed File System (HDFS), MapReduce, and Spark. Other components or applications that you install on your cluster may also provide web interfaces (see, for example, "Install and run a Jupyter notebook on a Dataproc cluster").

Objective: The steps and examples below show you how to securely connect to web interfaces running on your Dataproc cluster using an SSH tunnel from your local network or Cloud Shell to your cluster's Compute Engine network.

Use the Component Gateway to connect to core and optional component web interfaces. Clusters created with Dataproc image version 1.3.29 and later can install and enable access to component web interfaces, including YARN, HDFS, Jupyter, and Zeppelin UIs, without relying on SSH tunnels or modifying firewall rules to allow inbound traffic. See Dataproc Component Gateway for more information.

See "SSH into a Dataproc cluster" to open an SSH session on a Dataproc cluster's master node.

Available interfaces

The following interfaces are available on a Dataproc cluster master node (replace master-host-name with the name of your master node).

The cluster master-host-name is the name of your Dataproc cluster followed by an -m suffix. For example, if your cluster is named "my-cluster", the master-host-name is "my-cluster-m".

Web UI                 Port          URL
YARN ResourceManager   8088 [1]      http://master-host-name:8088
HDFS NameNode          9870 [2][3]   http://master-host-name:9870

[1] On Kerberos-enabled clusters, the YARN ResourceManager web UI port is 8090, and it runs on HTTPS.

[2] On Kerberos-enabled clusters, the HDFS NameNode web UI port is 9871, and it runs on HTTPS.

[3] In earlier Dataproc releases (pre-1.2), the HDFS NameNode web UI port was 50070.

The YARN ResourceManager has links to the web interfaces of all currently running and completed MapReduce and Spark applications under the "Tracking UI" column.
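The host name and port conventions in this section can be captured in a small shell helper. This is a minimal sketch, not part of gcloud: web_ui_url is a hypothetical function name, and it covers only the two interfaces and the Kerberos ports listed above.

```shell
# Print the web UI URL for a Dataproc master node.
# Usage: web_ui_url CLUSTER_NAME UI [kerberos]
#   UI is "yarn" or "hdfs"; pass "kerberos" as the third argument for
#   Kerberos-enabled clusters, which serve these UIs over HTTPS.
# web_ui_url is a hypothetical helper name, not a gcloud command.
web_ui_url() {
  cluster="$1"; ui="$2"; mode="${3:-plain}"
  host="${cluster}-m"   # master host name = cluster name + "-m"
  case "${ui}:${mode}" in
    yarn:plain)    echo "http://${host}:8088" ;;
    yarn:kerberos) echo "https://${host}:8090" ;;
    hdfs:plain)    echo "http://${host}:9870" ;;
    hdfs:kerberos) echo "https://${host}:9871" ;;
    *) echo "unknown UI or mode: ${ui} ${mode}" >&2; return 1 ;;
  esac
}
```

For example, web_ui_url my-cluster yarn prints the YARN ResourceManager URL for a cluster named "my-cluster". The URL resolves only from a machine or proxy that can reach the cluster's network.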

Allowed YARN ResourceManager REST APIs

When you create a cluster, Dataproc sets the yarn-site.xml yarn.resourcemanager.webapp.methods-allowed property to "GET,HEAD", which restricts the HTTP methods that can be called on the YARN Resource Manager web UI and REST APIs to the GET and HEAD methods. This default setting also disables job submission and modifications via the YARN REST API.

You can override the default values to enable specific HTTP methods on port 8088 by setting this property to one or more comma-separated HTTP method names. An ALL value allows all HTTP methods on the port.

Example:

gcloud dataproc clusters create cluster-name \
    --properties=^#^yarn:yarn.resourcemanager.webapp.methods-allowed=GET,POST,DELETE \
    --region=region

Recommendation: If you set this property to allow non-default HTTP methods, make sure to configure firewall rules and other security settings to restrict access to port 8088.
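The property value in the example contains commas, which is why it starts with the ^#^ prefix: that prefix switches the gcloud list delimiter from "," to "#". The following sketch builds the value from a list of methods. yarn_methods_property is a hypothetical helper name, and the set of method names it accepts is an assumption made for illustration, not a list taken from the YARN documentation.

```shell
# Build the --properties value that sets the allowed YARN REST methods.
# Usage: yarn_methods_property METHOD [METHOD...]
# yarn_methods_property is a hypothetical helper name.
yarn_methods_property() {
  methods=""
  for m in "$@"; do
    case "$m" in
      # Assumed set of acceptable names; adjust it to your needs.
      GET|HEAD|POST|PUT|DELETE|OPTIONS|ALL) ;;
      *) echo "unexpected HTTP method: $m" >&2; return 1 ;;
    esac
    methods="${methods:+${methods},}${m}"
  done
  [ -n "$methods" ] || { echo "no methods given" >&2; return 1; }
  # The ^#^ prefix changes the gcloud list delimiter from "," to "#",
  # so the commas inside the method list stay part of the value.
  echo "^#^yarn:yarn.resourcemanager.webapp.methods-allowed=${methods}"
}
```

You could then pass the result to the create command as --properties="$(yarn_methods_property GET POST DELETE)".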

Connecting to web interfaces

You can connect to web interfaces running on a Dataproc cluster using the Dataproc Component Gateway, your project's Cloud Shell, or the Google Cloud CLI gcloud command-line tool:

  • Component Gateway: Connect with one click to Hadoop, Spark, and other component web UIs from the Google Cloud console. You enable the Component Gateway when you create your cluster.

  • Cloud Shell: The Cloud Shell in the Google Cloud console has the gcloud CLI commands and utilities pre-installed, and it provides a Web Preview feature that allows you to quickly connect through an SSH tunnel to a web interface port on a cluster. However, a connection to the cluster from Cloud Shell uses local port forwarding, which opens a connection to only one port on a cluster web interface, so multiple commands are needed to connect to multiple ports. Also, Cloud Shell sessions automatically exit after a period of inactivity (30 minutes).

  • Google Cloud CLI: The gcloud compute ssh command with dynamic port forwarding allows you to establish an SSH tunnel and run a SOCKS proxy server on top of the tunnel. After issuing this command, you must configure your local browser to use the SOCKS proxy. This connection method allows you to connect to multiple ports on a cluster web interface. See "Can I use local port forwarding instead of a SOCKS proxy?" for more information.

Set commonly used command variables

To make copying and running command-line examples on your local machine or in Cloud Shell easier, set gcloud dataproc command variables. Additional variables may need to be set for some of the command examples shown on this page.

Linux/macOS/Cloud Shell

export PROJECT=project; export HOSTNAME=hostname; export ZONE=zone

Windows

set "PROJECT=project" && set "HOSTNAME=hostname" && set "ZONE=zone"

  • Set PROJECT to your Google Cloud project ID
  • Set HOSTNAME to the name of the master node in your Dataproc cluster (the master name ends with a -m suffix)
  • Set ZONE to the zone of the VMs in your Dataproc cluster (for example, "us-central1-b")
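Because every later command depends on these three variables, a quick check before running them can save a confusing failure. The sketch below is one way to do that in a POSIX shell; check_tunnel_vars is a hypothetical helper name. Note that bash presets HOSTNAME to the local machine's host name, so the "-m" suffix test is the more useful of the two checks for that variable.

```shell
# Verify that the variables used by the commands on this page are set.
# check_tunnel_vars is a hypothetical helper name.
check_tunnel_vars() {
  status=0
  for name in PROJECT HOSTNAME ZONE; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      status=1
    fi
  done
  # The master node name is the cluster name followed by "-m".
  case "${HOSTNAME:-}" in
    *-m) ;;
    *) echo "HOSTNAME should end with -m" >&2; status=1 ;;
  esac
  return "$status"
}
```

Running check_tunnel_vars returns a non-zero status and prints a message for each problem it finds.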

Create an SSH tunnel

gcloud Command

Run the following gcloud command on your local machine to set up an SSH tunnel from an open port on your local machine to the master instance of your cluster, and run a local SOCKS proxy server listening on the port.

Before running the command, on your local machine:

  1. Set commonly used command variables
  2. Set a PORT variable to an open port on your local machine. Port 1080 is an arbitrary but typical choice since it is likely to be open.
    PORT=number
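If you are unsure whether port 1080 is open, the following sketch probes a small range and prints the first port that nothing on localhost is listening on. It assumes bash, because it relies on bash's /dev/tcp redirection, and pick_free_port is a hypothetical helper name. The check is only a hint: another process could still claim the port before the tunnel starts.

```shell
# Print the first port in [START, END] that nothing on localhost accepts
# connections on. Relies on bash's /dev/tcp redirection (bash-specific).
# Usage: pick_free_port START END
# pick_free_port is a hypothetical helper name.
pick_free_port() {
  port="$1"
  while [ "$port" -le "$2" ]; do
    # The subshell tries to open a TCP connection; a failure suggests
    # that no process is listening on the port.
    if ! (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; then
      echo "$port"
      return 0
    fi
    port=$((port + 1))
  done
  echo "no free port between $1 and $2" >&2
  return 1
}
```

For example, PORT=$(pick_free_port 1080 1090) sets PORT to the first apparently unused port in that range.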

Linux/macOS

gcloud compute ssh ${HOSTNAME} \
    --project=${PROJECT} --zone=${ZONE}  -- \
    -D ${PORT} -N

Windows

gcloud compute ssh %HOSTNAME% ^
    --project=%PROJECT% --zone=%ZONE%  -- ^
    -D %PORT% -N

The -- separator allows you to add SSH arguments to the gcloud compute ssh command, as follows:

  • -D specifies dynamic application-level port forwarding.
  • -N instructs gcloud not to open a remote shell.

This gcloud command creates an SSH tunnel that operates independently from other SSH shell sessions, keeps tunnel-related errors out of the shell output, and helps prevent inadvertent closures of the tunnel.

If the ssh command fails with the error message "bind: Cannot assign requested address", a likely cause is that the requested port is in use. Try running the command with a different PORT variable value.

The above command runs in the foreground, and must continue running to keep the tunnel active. The command should exit automatically if and when you delete the cluster.

Optional: Run the command in the background. To run the command as a background process, add the -n flag (-- -D ${PORT} -N -n), which redirects stdin from /dev/null. Note that the -n flag may not be supported on all operating systems.
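Independently of the -n flag, a POSIX shell can start the tunnel as a background job and stop it later. The sketch below shows one way to do that; start_tunnel, stop_tunnel, and the TUNNEL_PID variable are hypothetical names introduced for this example. The helper redirects stdin from /dev/null itself, so the background process never waits for terminal input.

```shell
# Start a long-running tunnel command in the background and remember
# its process ID in TUNNEL_PID.
# Usage: start_tunnel COMMAND [ARG...]
# start_tunnel and stop_tunnel are hypothetical helper names.
start_tunnel() {
  # Redirect stdin from /dev/null so the background process does not
  # try to read from the terminal.
  "$@" </dev/null &
  TUNNEL_PID=$!
}

# Stop the tunnel started by start_tunnel, if it is still running.
stop_tunnel() {
  if [ -n "${TUNNEL_PID:-}" ] && kill -0 "$TUNNEL_PID" 2>/dev/null; then
    kill "$TUNNEL_PID"
    # Reap the process; ignore the non-zero status caused by the signal.
    wait "$TUNNEL_PID" 2>/dev/null || true
  fi
  TUNNEL_PID=""
}
```

A typical call would be start_tunnel gcloud compute ssh "${HOSTNAME}" --project="${PROJECT}" --zone="${ZONE}" -- -D "${PORT}" -N, followed by stop_tunnel when you are finished with the web interfaces.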

Cloud Shell

  1. Open Cloud Shell.
    Cloud Shell Session Timeout: Cloud Shell sessions automatically exit after a period of inactivity (30 minutes).
  2. Run the gcloud command, below, in Cloud Shell to set up an SSH tunnel from a Cloud Shell preview port to a web interface port on the master node on your cluster. Before running the command, in Cloud Shell:

    1. Set commonly used command variables
    2. Set a PORT1 variable to a Cloud Shell port in the port range 8080 - 8084, and set a PORT2 variable to the web interface port on the master node on your Dataproc cluster.
      PORT1=number
      PORT2=number

    gcloud compute ssh ${HOSTNAME} \
        --project=${PROJECT} --zone=${ZONE}  -- \
        -4 -N -L ${PORT1}:${HOSTNAME}:${PORT2}

    The -- separator allows you to add SSH arguments to the gcloud compute ssh command, as follows:

    • -4 instructs ssh to only use IPv4.
    • -N instructs gcloud not to open a remote shell.
    • -L ${PORT1}:${HOSTNAME}:${PORT2} specifies local port forwarding from the specified Cloud Shell PORT1 to cluster HOSTNAME:PORT2.

    This gcloud command creates an SSH tunnel that operates independently from other SSH shell sessions, keeps tunnel-related errors out of the shell output, and helps prevent inadvertent closures of the tunnel.
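The -L argument used in Cloud Shell can be assembled and checked before the tunnel is opened. This is a minimal sketch; preview_forward_spec is a hypothetical helper name, and the 8080 - 8084 range check simply mirrors the port range given in the steps above.

```shell
# Build the -L argument for Cloud Shell local port forwarding.
# Usage: preview_forward_spec PORT1 HOSTNAME PORT2
# preview_forward_spec is a hypothetical helper name.
preview_forward_spec() {
  # Cloud Shell preview ports used on this page: 8080 through 8084.
  if [ "$1" -lt 8080 ] || [ "$1" -gt 8084 ]; then
    echo "PORT1 must be between 8080 and 8084" >&2
    return 1
  fi
  # Format: local-port:remote-host:remote-port
  echo "$1:$2:$3"
}
```

For example, preview_forward_spec 8080 my-cluster-m 8088 prints the value to place after -L when forwarding Cloud Shell port 8080 to the YARN ResourceManager UI.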

Configure your browser

gcloud Command

Your SSH tunnel supports traffic proxying using the SOCKS protocol. To configure your browser to use the proxy, start a new browser session with proxy server parameters. Here's an example that uses the Google Chrome browser. HOSTNAME is the name of the cluster's master node (see Set commonly used command variables).

Linux

/usr/bin/google-chrome \
    --proxy-server="socks5://localhost:${PORT}" \
    --user-data-dir=/tmp/${HOSTNAME}

macOS

"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
    --proxy-server="socks5://localhost:${PORT}" \
    --user-data-dir=/tmp/${HOSTNAME}

Windows

"%ProgramFiles(x86)%\Google\Chrome\Application\chrome.exe" ^
    --proxy-server="socks5://localhost:%PORT%" ^
    --user-data-dir="%Temp%\%HOSTNAME%"

Windows 32 bit Users: Change %ProgramFiles(x86)% to %ProgramFiles% in the command above.

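This command uses the following Chrome browser flags:

  • --proxy-server="socks5://localhost:${PORT}" tells Chrome to send requests through the SOCKS version 5 proxy listening on local port PORT.
  • --user-data-dir=/tmp/${HOSTNAME} tells Chrome to use a separate profile directory, which starts a new browser session that is not tied to an existing Chrome session.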

Proxy extensions: Proxy management extensions that simplify management and use of the proxy in your browser are available for Chrome, Firefox, and other web browsers.
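If you switch between Linux and macOS, a small helper can select the Chrome path used in the commands above. This is only a sketch: chrome_path is a hypothetical helper name, the paths are the default install locations shown on this page, and your installation may differ.

```shell
# Print the default Chrome executable path for the given OS name.
# Usage: chrome_path [OS_NAME]   (OS_NAME defaults to `uname -s`)
# chrome_path is a hypothetical helper name.
chrome_path() {
  case "${1:-$(uname -s)}" in
    Linux)  echo "/usr/bin/google-chrome" ;;
    Darwin) echo "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" ;;
    *) echo "unsupported OS" >&2; return 1 ;;
  esac
}
```

You could then start the browser with "$(chrome_path)" --proxy-server="socks5://localhost:${PORT}" --user-data-dir="/tmp/${HOSTNAME}".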

Cloud Shell

You do not need to configure your local browser when using Cloud Shell. After creating an SSH tunnel, use Cloud Shell web preview to connect to the cluster interface.

Connect to the cluster interface

gcloud Command

Once your local browser is configured to use the proxy, you can navigate to the web interface URL on your Dataproc cluster (see Available interfaces). The browser URL has the following format and content: http://cluster-name-m:port (cluster interface port)

Cloud Shell

Click the Cloud Shell Web Preview button, and then select either:

  • "Preview on port 8080", or
  • "Change port" and insert the port number in the dialog
according to the Cloud Shell PORT1 number (port 8080 - 8084) you passed to the gcloud compute ssh command in Create an SSH tunnel.

A browser window opens that connects to the web interface port on the cluster master node.

Chrome Browser Messages: When using a Chrome browser, you may see messages of the following type in the terminal window or Cloud Shell that you used to create an SSH tunnel.

channel 15: open failed: administratively prohibited: open failed
channel 16: open failed: administratively prohibited: open failed
channel 17: open failed: administratively prohibited: open failed

These are not fatal error messages. The Chrome browser issues these messages when it is unable to load a page, and you may see these messages even when you can successfully connect to the application interface on your cluster.
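If these messages make it hard to spot real errors, you can filter them out of the tunnel's output. The following sketch assumes that you pipe the tunnel command's combined output through the filter; filter_tunnel_noise is a hypothetical helper name.

```shell
# Drop the non-fatal "administratively prohibited" channel messages
# and pass every other line through unchanged.
# filter_tunnel_noise is a hypothetical helper name.
filter_tunnel_noise() {
  # grep exits with status 1 when it prints nothing; "|| true" keeps
  # that from being treated as a failure.
  grep -v 'open failed: administratively prohibited' || true
}
```

For example, append 2>&1 | filter_tunnel_noise to the gcloud compute ssh command that creates the tunnel.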

FAQ and debugging tips

What if I don't see the UI in my browser?

If you don't see the UIs in your browser, the two most common reasons are:

  1. You have a network connectivity issue, possibly due to a firewall. Run the following command (after setting local variables) to see if you can SSH to the master instance. If you can't, it signals a connectivity issue.

    Linux/macOS

    gcloud compute ssh ${HOSTNAME} \
        --project=${PROJECT} --zone=${ZONE}

    Windows

    gcloud compute ssh %HOSTNAME% ^
        --project=%PROJECT% --zone=%ZONE%

  2. Another proxy is interfering with the SOCKS proxy. To check the proxy, run the following curl command:

    Linux/macOS

    curl -Is --socks5-hostname localhost:1080 http://cluster-name-m:8088

    Windows

    curl.exe -Is --socks5-hostname localhost:1080 http://cluster-name-m:8088
    If you see an HTTP response, the SOCKS proxy is working, so it's possible that the browser's connection is being interrupted by another proxy or browser extension.
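To script this check, you can extract the status code from the header output that curl -Is prints. The sketch below reads that output on stdin; http_status is a hypothetical helper name, and it assumes the first line of the output is the HTTP status line.

```shell
# Read `curl -Is` output on stdin and print the HTTP status code from
# the first line. Returns non-zero when no status line is found.
# http_status is a hypothetical helper name.
http_status() {
  # Strip the trailing carriage return, then take field 2 of a status
  # line such as "HTTP/1.1 200 OK".
  awk '{ sub(/\r$/, "") }
       NR == 1 && $1 ~ /^HTTP\// { print $2; found = 1 }
       END { exit !found }'
}
```

For example, curl -Is --socks5-hostname localhost:1080 http://cluster-name-m:8088 | http_status prints the status code when the proxy returns a response, and prints nothing when it does not.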

Can I use local port forwarding instead of a SOCKS proxy?

Instead of the SOCKS proxy, it's possible to access web application UIs running on your master instance with SSH local port forwarding, which forwards the master's port to a local port. For example, the following command lets you access localhost:1080 to reach cluster-name-m:8088 without SOCKS (see Set commonly used command variables):

Linux/macOS

gcloud compute ssh ${HOSTNAME} \
    --project=${PROJECT} --zone=${ZONE} -- \
    -L 1080:${HOSTNAME}:8088 -N -n

Windows

gcloud compute ssh %HOSTNAME% ^
    --project=%PROJECT% --zone=%ZONE% -- ^
    -L 1080:%HOSTNAME%:8088 -N -n

With a SOCKS proxy, you access the remote UI running on the master instance with http://cluster-name-m:port, but with local port forwarding you access the master instance's web application UI at http://localhost:port.

Using a SOCKS proxy may be preferable to using local port forwarding since the proxy:

  • allows you to access all web application ports without having to set up a port forward tunnel for each UI port
  • allows the Spark and Hadoop web UIs to correctly resolve DNS hosts
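To see what "a port forward tunnel for each UI port" means in practice, the following sketch generates one -L flag per port. forward_flags is a hypothetical helper name, and mapping each local port to the same port number on the master node is a choice made for this example.

```shell
# Print one -L flag per UI port, forwarding each local port to the
# same port number on the master node.
# Usage: forward_flags MASTER_HOST PORT [PORT...]
# forward_flags is a hypothetical helper name.
forward_flags() {
  host="$1"; shift
  flags=""
  for port in "$@"; do
    flags="${flags:+${flags} }-L ${port}:${host}:${port}"
  done
  echo "$flags"
}
```

For example, forward_flags my-cluster-m 8088 9870 prints two -L flags that you could place after the -- separator of gcloud compute ssh. Links inside the UIs that point to cluster host names would still fail to resolve in your browser, which is the second advantage of the SOCKS proxy listed above.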

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.