Configuring Network Access for Dataproc Metastore

This page provides detailed guidance on configuring network access for your Dataproc Metastore instances. Correct network setup is essential for Dataproc clusters and Google Cloud Serverless for Apache Spark workloads to communicate securely and privately with your managed Dataproc Metastore service.

For a more general overview of networking concepts, see Networking Overview.

Key Networking Concepts

Dataproc Metastore instances typically reside within a Google-managed service producer network and communicate with your Virtual Private Cloud (VPC) network using private connectivity. Understanding the following concepts is crucial for a successful setup:

  • Shared Virtual Private Cloud: If your Dataproc clusters or Serverless for Apache Spark workloads are in a service project that uses a Shared VPC network from a host project, verify that the appropriate network configurations are made in the host project. For more information, see Shared VPC overview.
  • Private Google Access: Dataproc Metastore instances often rely on Private Google Access for private communication with your VPC network. This allows virtual machine (VM) instances in your VPC to connect to Google APIs and services using internal IP addresses. For more information, see Private Google Access.
  • VPC Network Peering: This mechanism enables private IP connectivity between two VPC networks, allowing resources in one network to communicate with resources in the other using internal IP addresses. Dataproc Metastore establishes a managed VPC Network Peering connection to your VPC network as part of its setup. For more information, see VPC Network Peering.
  • Firewall Rules: Proper firewall rules are necessary to permit traffic between your Dataproc workloads and the Dataproc Metastore instance.
  • Cloud DNS Resolution: Verify that DNS resolution is correctly configured within your VPC network to resolve the Dataproc Metastore endpoint URI to its private IP address.

Configuration Steps

To verify proper network access for your Dataproc Metastore instance, follow these steps:

1. Configure Private Service Access

Dataproc Metastore uses Private Service Access to establish a private connection between your VPC network and the Google-managed service producer network where your Dataproc Metastore instance resides.

  • Verify Private Service Access Connection:
    1. In the Google Cloud console, go to Virtual Private Cloud network > VPC Network Peering.
    2. Verify that a peering connection named servicenetworking-googleapis-com exists and its state is ACTIVE.
    3. If this connection is missing or not active, follow the instructions in Configuring Private Service Access. This includes allocating an IP address range for the service producer network.
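
The console check above can also be scripted. The following is a minimal sketch, not an official tool: it assumes you already have a description of your VPC network as a Python dictionary (for example, parsed from JSON returned by the Compute Engine API or the gcloud CLI) that contains a `peerings` list whose entries carry `name` and `state` fields. The function names and the sample network `my-vpc` are hypothetical.

```python
from typing import Optional

# Peering name that Private Service Access creates, as stated in the steps above.
PSA_PEERING_NAME = "servicenetworking-googleapis-com"


def find_psa_peering(network: dict) -> Optional[dict]:
    """Return the Private Service Access peering entry, or None if it is absent."""
    for peering in network.get("peerings", []):
        if peering.get("name") == PSA_PEERING_NAME:
            return peering
    return None


def psa_peering_is_active(network: dict) -> bool:
    """Return True only if the peering exists and its state is ACTIVE."""
    peering = find_psa_peering(network)
    return peering is not None and peering.get("state") == "ACTIVE"


# Hypothetical sample data; the assumed shape is a network with a "peerings" list.
sample_network = {
    "name": "my-vpc",
    "peerings": [
        {"name": "servicenetworking-googleapis-com", "state": "ACTIVE"},
    ],
}

if psa_peering_is_active(sample_network):
    print("Private Service Access peering is ACTIVE")
else:
    print("Peering missing or not active; configure Private Service Access")
```

If the check fails, the remedy is the one described in step 3: configure Private Service Access, including the allocated IP address range.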

2. Configure Firewall Rules

Verify that firewall rules in your VPC network (or the Shared VPC host project, if applicable) allow necessary traffic.

  • Egress Rule from Workload to Metastore:
    • Verify that an egress firewall rule allows outbound TCP traffic from your Dataproc cluster or Serverless for Apache Spark workloads to the IP address range of your Dataproc Metastore instance on port 9083. This is the default port for Hive Metastore.
    • If using Private Service Access, this traffic will be routed privately.
  • Ingress Rules (less common for client-to-Metastore):
    • Generally, you don't need to configure ingress rules on your VPC for traffic from the Dataproc Metastore instance to your workload, as communication typically originates from the workload. However, verify that no overly restrictive ingress rules are inadvertently blocking necessary responses.
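
One way to confirm that the firewall rules above actually permit the traffic is to attempt a TCP connection to port 9083 from a machine in the same network as the workload. The following is a minimal sketch using only the Python standard library. The function name `can_connect` is hypothetical, and a successful connection shows only that the port is reachable at the network level, not that the Hive Metastore service itself is healthy.

```python
import socket


def can_connect(host: str, port: int = 9083, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Port 9083 is the default Hive Metastore port mentioned above.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, calling `can_connect("10.100.0.5")` from a VM in the workload's subnet would probe a Metastore instance at that address; the IP address here is a placeholder, not a real endpoint. A `False` result can indicate a blocking firewall rule, a missing peering route, or a DNS failure, so it narrows the problem rather than pinpointing it.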

3. Verify DNS Resolution

Your Dataproc workloads need to resolve the Dataproc Metastore endpoint URI to its private IP address.

  • DNS Peering or Private Zones: If you are using custom DNS servers or private Cloud DNS zones, verify that DNS queries for the Dataproc Metastore endpoint (for example, your-metastore-endpoint.us-central1.dataproc.cloud.google.com) are correctly forwarded or resolved to the private IP range used by Private Service Access.
  • Testing DNS Resolution: From a VM within the same subnet as your Dataproc workload, use nslookup or dig to verify that the Dataproc Metastore endpoint resolves to a private IP address.
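
As an alternative to nslookup or dig, the same check can be sketched in Python: resolve the endpoint, then test whether every returned address falls inside the range allocated for Private Service Access. This is an illustrative sketch only. The function names are hypothetical, and the CIDR `10.100.0.0/16` used in the usage note is a placeholder for whatever range you allocated.

```python
import ipaddress
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Resolve a hostname to its unique IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def addresses_in_range(addresses: list[str], cidr: str) -> bool:
    """Return True if there is at least one address and all of them lie inside cidr."""
    network = ipaddress.ip_network(cidr)
    return bool(addresses) and all(
        ipaddress.ip_address(addr) in network for addr in addresses
    )
```

Run from a VM in the workload's subnet, `addresses_in_range(resolve_ipv4("your-metastore-endpoint.us-central1.dataproc.cloud.google.com"), "10.100.0.0/16")` returns `True` only when the endpoint resolves entirely within the assumed range. Note that `resolve_ipv4` raises `socket.gaierror` if the name does not resolve at all, which is itself a useful signal that DNS forwarding or the private zone is misconfigured.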

Troubleshooting Network Connectivity

If you encounter connectivity issues after configuring network access, consider the following troubleshooting steps:


Last updated 2026-02-19 UTC.