Dataproc Metastore networking overview

This document provides an overview of the networking settings you can use to set up a Dataproc Metastore service.

Note: For Dataproc Metastore services that use the Thrift endpoint protocol, Private Service Connect is the default networking option. Most other networking concepts on this page also apply to Thrift endpoint services. If you are using the gRPC endpoint protocol, the only network configuration you need to consider in this document is VPC Service Controls.

Quick reference for networking topics

| Networking settings | Notes |
| --- | --- |
| Default network settings | |
| VPC subnetworks | By default, Dataproc Metastore services that use the Thrift endpoint protocol use a VPC subnetwork with Private Service Connect. |
| Virtual Private Cloud (VPC) networks | You can optionally choose to use VPC networks for Dataproc Metastore services that use the Thrift endpoint protocol. This is an alternative to using VPC subnetworks with Private Service Connect. After the VPC network is created, Dataproc Metastore also automatically configures VPC Network Peering for your service. |
| Additional network settings | |
| Shared VPC networks | You can optionally choose to create Dataproc Metastore services in a Shared VPC network. |
| On-premises networking | You can connect to a Dataproc Metastore service with an on-premises environment by using Cloud VPN or Cloud Interconnect. |
| VPC Service Controls | You can optionally choose to create Dataproc Metastore services with VPC Service Controls. |
| Firewall rules | In non-default or private environments with an established security footprint, you might need to create your own firewall rules. |

Default networking settings

The following section describes the default network settings that Dataproc Metastore uses for the Thrift endpoint protocol: VPC subnetworks with Private Service Connect.

VPC subnetworks

For Dataproc Metastore services that use the Thrift endpoint protocol, Private Service Connect (PSC) is the default networking option. PSC lets you set up a private connection to Dataproc Metastore metadata across VPC networks. With PSC, you can create a service without VPC peering. This lets you use your own internal IP addresses to access Dataproc Metastore, without leaving your VPC networks or using external IP addresses.

To set up Private Service Connect when creating a service, see Private Service Connect with Dataproc Metastore.

VPC networks

You can optionally choose to use VPC networks for Dataproc Metastore services that use the Thrift endpoint protocol. This is an alternative to using VPC subnetworks with Private Service Connect. A VPC network is a virtual version of a physical network that is implemented inside of Google's production network. When you create a Dataproc Metastore service, the service automatically creates the VPC network for you.

If you don't change any settings when you create your service, Dataproc Metastore uses the default VPC network. With this setting, the VPC network that you use with your Dataproc Metastore service can belong to the same Google Cloud project or a different project. This setting also lets you expose your service in a single VPC network or make your service accessible from multiple VPC networks (through the use of subnetworks).

Dataproc Metastore requires the following per region for each VPC network:

VPC Network Peering

After the VPC network is created, Dataproc Metastore also automatically configures VPC Network Peering for your service. VPC provides your service with access to the Dataproc Metastore endpoint protocols. After you create your service, you can see its underlying VPC Network Peering on the VPC Network Peering page in the Google Cloud console.

VPC Network Peering is not transitive. This means that only directly peered networks can communicate with each other. For example, consider the following scenario:

You have three VPC networks: N1, N2, and N3.

  • VPC network N1 is peered with N2 and N3.
  • VPC networks N2 and N3 are not directly connected.

What does this mean?

It means that through VPC Network Peering, VPC network N2 can't communicate with VPC network N3. This impacts Dataproc Metastore connections in the following ways:

  • Virtual machines that are in networks peered with your Dataproc Metastore project network can't reach Dataproc Metastore.
  • Only hosts on the VPC network can reach a Dataproc Metastore service.
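The non-transitive behavior can be sketched as a lookup over direct peerings only. This is an illustrative model, not Google Cloud API code: the network names come from the scenario above, and the `PEERINGS` set and `can_communicate` helper are hypothetical names introduced for this example.

```python
# Illustrative model only: each peering is an unordered pair of network names.
PEERINGS = {frozenset({"N1", "N2"}), frozenset({"N1", "N3"})}


def can_communicate(a: str, b: str, peerings=PEERINGS) -> bool:
    """Return True if a and b are the same network or are directly peered.

    VPC Network Peering is not transitive, so multi-hop paths are never followed.
    """
    return a == b or frozenset({a, b}) in peerings


if __name__ == "__main__":
    print(can_communicate("N1", "N2"))  # True: directly peered
    print(can_communicate("N2", "N3"))  # False: both peer with N1, not with each other
```

Because the check never walks through an intermediate network, N2 and N3 stay isolated from each other even though both are peered with N1.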

VPC Network Peering security considerations

  • Traffic over VPC Network Peering is provided with a certain level of encryption. For more information, see Google Cloud virtual network encryption and authentication.

  • Creating one VPC network for each service with an internal IP address provides better network isolation than putting all services in the default VPC network.

IP Addresses

To connect to a network and help protect your metadata, Dataproc Metastore services only use internal IP addresses. This means that public IP addresses aren't exposed or available for networking purposes.

By using an internal IP address, Dataproc Metastore can only connect to virtual machines (VMs) that exist on specified Virtual Private Cloud (VPC) networks or an on-premises environment.

Connections to a Dataproc Metastore service using an internal IP address use RFC 1918 address ranges. Using these ranges means that Dataproc Metastore allocates a /17 range and a /20 range from the address space for each region. For example, placing Dataproc Metastore services in two regions requires that the allocated IP address range contains the following:

  • At least two unused address blocks of size /17.
  • At least two unused address blocks of size /20.
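The arithmetic behind this requirement can be sketched with Python's standard `ipaddress` module. The helper names `addresses_needed` and `can_hold` are hypothetical, and the capacity check assumes the candidate range is a single contiguous CIDR block that is entirely unused; it does not model addresses already taken inside the range.

```python
import ipaddress

# Prefix lengths allocated per region, as described in the text above.
PREFIXES_PER_REGION = (17, 20)


def addresses_needed(region_count: int) -> int:
    """Total IPv4 addresses covered by one /17 plus one /20 for each region."""
    per_region = sum(2 ** (32 - prefix) for prefix in PREFIXES_PER_REGION)
    return region_count * per_region


def can_hold(allocated_cidr: str, region_count: int) -> bool:
    """Rough capacity check for a fully unused, contiguous CIDR block."""
    allocated = ipaddress.ip_network(allocated_cidr)
    return allocated.num_addresses >= addresses_needed(region_count)


if __name__ == "__main__":
    print(addresses_needed(2))           # 73728
    print(can_hold("10.0.0.0/16", 2))    # False: two /17 blocks already fill a /16
    print(can_hold("10.0.0.0/15", 2))    # True
```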

If RFC 1918 address blocks aren't found, then Dataproc Metastore finds suitable non-RFC 1918 address blocks instead. Note that the allocation of non-RFC 1918 blocks doesn't take into account whether or not those addresses are in use in your VPC network or on-premises.
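Because non-RFC 1918 allocations are not checked against addresses you already use, it can help to audit an allocated block yourself. The following is a minimal sketch using Python's standard `ipaddress` module; the function names and the sample ranges are hypothetical, and the list of in-use ranges is something you would have to supply from your own network inventory.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_BLOCKS = tuple(
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)


def is_rfc1918(cidr: str) -> bool:
    """Return True if the IPv4 block lies entirely inside an RFC 1918 range."""
    network = ipaddress.ip_network(cidr)
    return network.version == 4 and any(
        network.subnet_of(block) for block in RFC1918_BLOCKS
    )


def overlapping_ranges(allocated_cidr: str, in_use_cidrs: list[str]) -> list[str]:
    """Return the in-use ranges that overlap the allocated block."""
    allocated = ipaddress.ip_network(allocated_cidr)
    return [
        cidr for cidr in in_use_cidrs
        if allocated.overlaps(ipaddress.ip_network(cidr))
    ]


if __name__ == "__main__":
    print(is_rfc1918("10.100.0.0/17"))   # True
    print(is_rfc1918("172.32.0.0/17"))   # False: 172.16.0.0/12 ends at 172.31.255.255
    print(overlapping_ranges("172.32.0.0/17", ["172.32.64.0/24", "10.0.0.0/8"]))
```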

Additional networking settings

If you require a different networking setting, you can use the following options with your Dataproc Metastore service.

Shared VPC network

You can create Dataproc Metastore services in a Shared VPC network. A Shared VPC lets you connect Dataproc Metastore resources from multiple projects to a common Virtual Private Cloud (VPC) network.

To set up a Shared VPC when creating a service, see Create a Dataproc Metastore Service.

On-premises networking

You can connect to a Dataproc Metastore service with an on-premises environment by using Cloud VPN or Cloud Interconnect.

VPC Service Controls

VPC Service Controls improve your ability to mitigate the risk of data exfiltration. With VPC Service Controls, you create perimeters around the Dataproc Metastore service. VPC Service Controls restrict access to resources within the perimeter from the outside. Only clients and resources within the perimeter can interact with one another.

To use VPC Service Controls with Dataproc Metastore, see VPC Service Controls with Dataproc Metastore. Also review Dataproc Metastore limitations when using VPC Service Controls.

Firewall rules for Dataproc Metastore

In non-default or private environments with an established security footprint, you might need to create your own firewall rules. If you do, don't create a firewall rule that blocks the IP address range or port of your Dataproc Metastore services.

When you create a Dataproc Metastore service, you can accept the default network for the service. The default network ensures full internal IP networking access for your VMs.

For more general information about firewall rules, see VPC firewall rules and Using VPC firewall rules.

Create a firewall rule for a custom network

When you use a custom network, make sure your firewall rule permits traffic coming from and going to the Dataproc Metastore endpoint. To explicitly allow Dataproc Metastore traffic, run the following gcloud commands:

gcloud compute firewall-rules create dpms-allow-egress-DPMS_NETWORK-REGION --allow tcp --destination-ranges DPMS_NET_PREFIX/17 --network DPMS_NETWORK --direction OUT
gcloud compute firewall-rules create dpms-allow-ingress-DPMS_NETWORK-REGION --allow tcp,udp --source-ranges DPMS_NET_PREFIX/17 --network DPMS_NETWORK

For DPMS_NET_PREFIX, apply a /17 subnet mask to your Dataproc Metastore service IP. You can find your Dataproc Metastore IP address information in the endpointUri configuration on the Service detail page.
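Applying the /17 mask by hand is error-prone, so here is a minimal sketch that derives the prefix with Python's standard library. It assumes the endpointUri has the form `thrift://IP_ADDRESS:PORT`; that format, the sample addresses, and the `dpms_net_prefix` helper name are assumptions made for illustration, so check them against the value shown for your own service.

```python
import ipaddress
from urllib.parse import urlsplit


def dpms_net_prefix(endpoint_uri: str) -> str:
    """Return the network address of the /17 block that contains the endpoint IP."""
    host = urlsplit(endpoint_uri).hostname
    if host is None:
        raise ValueError(f"no host found in {endpoint_uri!r}")
    # strict=False masks off the host bits instead of raising an error.
    network = ipaddress.ip_network(f"{host}/17", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    # Hypothetical endpoint values, used only to show the masking.
    print(dpms_net_prefix("thrift://10.100.0.100:9083"))  # 10.100.0.0
    print(dpms_net_prefix("thrift://10.100.200.5:9083"))  # 10.100.128.0
```

The returned value is what you would substitute for DPMS_NET_PREFIX, with the `/17` suffix already present in the commands.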

Considerations

Networks have an implied allow egress rule that normally allows access from your network to Dataproc Metastore. If you create deny egress rules that override the implied allow egress rule, you should create an allow egress rule with a higher priority to permit egress to the Dataproc Metastore IP.

Some features such as Kerberos require Dataproc Metastore to initiate connections to hosts in your project network. All networks have an implied deny ingress rule that blocks these connections and prevents those features from working. You should create a firewall rule that allows TCP and UDP ingress on all ports from the /17 IP block that contains the Dataproc Metastore IP.
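The interaction between a custom deny rule and a higher-priority allow rule can be sketched as a simplified evaluator. This is a rough model, not the actual firewall engine: it only considers egress destination ranges and priorities (in VPC firewall rules a lower number means a higher priority, and the implied rules sit at 65535), and it ignores protocols, ports, and targets. The rule names and the `EgressRule` and `egress_decision` identifiers are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRule:
    name: str
    priority: int     # lower number = higher priority
    action: str       # "allow" or "deny"
    destination: str  # destination range in CIDR notation


# Every network has this rule at the lowest possible priority.
IMPLIED_ALLOW_EGRESS = EgressRule("implied-allow-egress", 65535, "allow", "0.0.0.0/0")


def egress_decision(dest_ip: str, rules: list[EgressRule]) -> str:
    """Return the action of the highest-priority rule matching dest_ip."""
    ip = ipaddress.ip_address(dest_ip)
    matching = [
        rule for rule in (*rules, IMPLIED_ALLOW_EGRESS)
        if ip in ipaddress.ip_network(rule.destination)
    ]
    # Lowest priority number wins; at equal priority, deny beats allow.
    winner = min(matching, key=lambda rule: (rule.priority, rule.action != "deny"))
    return winner.action


if __name__ == "__main__":
    deny_all = EgressRule("deny-all-egress", 1000, "deny", "0.0.0.0/0")
    allow_dpms = EgressRule("dpms-allow-egress", 900, "allow", "10.100.0.0/17")
    print(egress_decision("10.100.0.100", [deny_all]))              # deny
    print(egress_decision("10.100.0.100", [deny_all, allow_dpms]))  # allow
```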

Custom routing

Custom routes are for subnets that use privately used public IP addresses (PUPI). Custom routes allow your VPC network to connect to a peer network. Custom routes can only be received when your VPC network imports them and the peer network explicitly exports them. Custom routes can be either static or dynamic.

Sharing custom routes with peered VPC networks allows networks to "learn" routes directly from their peered networks. This means that when a custom route in a peered network is updated, your VPC network automatically learns and implements the custom route without requiring any additional action from you.

For more information about custom routing, see network config.

Dataproc Metastore networking example

In the following example, Google allocates the 10.100.0.0/17 and 10.200.0.0/20 address ranges in the customer VPC network for Google services and uses the address ranges in a peered VPC network.

Figure 1. Dataproc Metastore VPC network configuration: a diagram of the Google services network and the customer VPC network, illustrating IP address allocation and peering.

Description of the networking example:

  • On the Google services side of the VPC peering, Google creates a project for the customer. The project is isolated, meaning no other customers share it and the customer is billed for only the resources the customer provisions.
  • When creating the first Dataproc Metastore service in a region, Dataproc Metastore allocates a /17 range and a /20 range in the customer's network for all future Dataproc Metastore services usage in that region and network. Dataproc Metastore further subdivides these ranges to create subnetworks and address ranges in the service producer project.
  • VM services in the customer's network can access Dataproc Metastore service resources in any region if the Google Cloud service supports it. Some Google Cloud services might not support cross-region communication.
  • Egress costs for cross-regional traffic, where a VM instance communicates with resources in a different region, still apply.
  • Google assigns the Dataproc Metastore service the IP address 10.100.0.100. In the customer VPC network, requests with a destination of 10.100.0.100 are routed through the VPC peering to the service producer's network. After reaching the service network, the service network contains routes that direct the request to the correct resource.
  • Traffic between VPC networks travels internally within Google's network, not through the public internet.
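The addresses in this example can be checked with a short sketch based on Python's standard `ipaddress` module. The two ranges and the service IP are taken from the example above; the `routed_through_peering` helper is a hypothetical name, and it only tests membership in the allocated ranges rather than modeling real route tables.

```python
import ipaddress

# Ranges allocated in the example above.
ALLOCATED_RANGES = (
    ipaddress.ip_network("10.100.0.0/17"),
    ipaddress.ip_network("10.200.0.0/20"),
)


def routed_through_peering(dest_ip: str) -> bool:
    """Return True if dest_ip falls inside one of the allocated ranges."""
    ip = ipaddress.ip_address(dest_ip)
    return any(ip in network for network in ALLOCATED_RANGES)


if __name__ == "__main__":
    print(routed_through_peering("10.100.0.100"))  # True: the service IP is in the /17
    print(routed_through_peering("10.100.128.1"))  # False: the /17 ends at 10.100.127.255
    print(routed_through_peering("10.200.16.0"))   # False: the /20 ends at 10.200.15.255
```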


Last updated 2026-02-19 UTC.