Troubleshoot internal connectivity between VMs

This document provides troubleshooting steps for connectivity issues between Compute Engine VMs that are in the same Virtual Private Cloud (VPC) network (either Shared VPC or standalone) or in two VPC networks connected with VPC Network Peering. It assumes that the VMs are communicating using the internal IP addresses of their respective virtual network interface controllers (vNICs).

The steps in this guide apply to both Compute Engine VMs and Google Kubernetes Engine nodes.

If you would like to see specific additional troubleshooting scenarios, click the Send feedback link at the bottom of the page and let us know.

The following VM and VPC configurations are applicable to this guide:

VM-to-VM connections using internal IP addresses in a single VPC network.
VM-to-VM connections using internal IP addresses within a Shared VPC network.
VM-to-VM connections using internal IP addresses in different VPC networks peered using VPC Network Peering.

Commands used in this guide are available on all Google-provided OS images. If you are using your own OS image, you might have to install the tools.

Note: When troubleshooting, it's useful to record the commands you run and the results you get in a document. You can use this document to check your own work and to let others know what you've researched. In addition, if you do need to open a support ticket, the document can speed up resolving your issue.

Quantify the problem

If you think you have complete packet loss, go to Troubleshoot complete connection failure.
If you are experiencing latency, only partial packet loss, or timeouts occurring mid-connection, go to Troubleshoot network latency or loss causing throughput issues.

Troubleshoot complete connection failure

The following sections provide steps for troubleshooting complete internal connectivity failure between VMs. If you are instead experiencing increased latency or intermittent connection timeouts, skip to Troubleshoot network latency or loss causing throughput issues.

Determine connection values

First gather the following information:

From the VM instances page, gather the following for both VMs:
- VM names
- VM zones
- Internal IP addresses for the vNICs that are communicating
From the configuration of the destination server software, gather the following information:
- Layer 4 protocol
- Destination port
For example, if your destination is an HTTPS server, the protocol is TCP and the port is usually 443, but your specific configuration might use a different port.

If you're seeing issues with multiple VMs, pick a single source and single destination VM that are experiencing issues and use those values. In general, you shouldn't need the source port of the connection.

Once you have this information, proceed to Investigate issues with the underlying Google network.

Investigate issues with the underlying Google network

If your setup is an existing one that hasn't changed recently, then the issue might be with the underlying Google network. Check the Network Intelligence Center Performance Dashboard for packet loss between the VM zones. If there is an increase in packet loss between the zones during the timeframe when you experienced network timeouts, it might indicate that the problem was with the physical network underlying your virtual network. Check the Google Cloud Status Dashboard for known issues before filing a support case.

If the issue does not seem to be with the underlying Google network, proceed to Check for misconfigured firewall rules in Google Cloud.

Check for misconfigured firewall rules in Google Cloud

Note: This section uses Connectivity Tests, which can incur charges. For pricing details, see Network Intelligence Center pricing.

Connectivity Tests analyzes the VPC network path configuration between two VMs and shows whether the programmed configuration should allow the traffic or not. If the traffic is not allowed, the results show whether a Google Cloud egress or ingress firewall rule is blocking the traffic or if a route isn't available.

Connectivity Tests might also dynamically test the path by sending packets between the hypervisors of the VMs. If these tests are performed, then the results of those tests are displayed.

Connectivity Tests examines the configuration of the VPC network only. It does not test the operating system firewall, the operating system routes, or the server software on the VM.

The following procedure runs Connectivity Tests from the Google Cloud console. For other ways to run tests, see Running Connectivity Tests.

Use the following procedure to create and run a test:

In the Google Cloud console, go to the Connectivity Tests page.
In the project pull-down menu, confirm you are in the correct project or specify the correct one.
Click Create connectivity test.
Give the test a name.
Specify the following:
1. Protocol
2. Source endpoint IP address
3. Source project and VPC network
4. Destination endpoint IP address
5. Destination project and VPC network
6. Destination port
Click Create.

The test runs immediately. To see the result diagram, click View in the Result details column.

If the results say the connection is dropped by a Google Cloud firewall rule, determine if your intended security setup should allow the connection. You might have to ask your security or network administrator for details. If the traffic should be allowed, then check the following:
- Check the Always blocked traffic list. If the traffic is blocked by Google Cloud as described in the always blocked traffic list, then your existing configuration won't work.
- Go to the Firewall policies page and review your firewall rules. If the firewall is misconfigured, create or modify a firewall rule to allow the connection. This rule can be a VPC firewall rule or a hierarchical firewall policy rule.
If there is a correctly configured firewall rule that blocks this traffic, check with your security or network administrator. If the security requirements of your organization mean that the VMs shouldn't reach each other, you might need to redesign your setup.
If the results indicate that there are no issues with the VPC connectivity path, then the issue might be one of the following:
- Issues with the guest OS configuration, such as issues with firewall software.
- Issues with the client or server applications, such as the application being frozen or configured to listen on the wrong port.

Subsequent steps walk you through examining each of these possibilities. Continue with Test TCP connectivity from inside the VM.

Test TCP connectivity from inside the VM

If your VM-to-VM connectivity test did not detect a VPC configuration issue, start testing OS-to-OS connectivity. The following steps help you determine the following:

If a TCP server is listening at the indicated port
If the server-side firewall software is allowing connections to that port from the client VM
If the client-side firewall software is allowing connections to that port on the server
If the server-side route table is correctly configured to forward packets
If the client-side route table is correctly configured to forward packets

You can test the TCP handshake using curl with Linux or Windows 2019, or using the New-Object System.Net.Sockets.TcpClient command with Windows PowerShell. The workflow in this section should result in one of the following outcomes: connection success, connection timeout, or connection reset.

Success: If the TCP handshake completes successfully, then an OS firewall rule is not blocking the connection, the OS is correctly forwarding packets, and a server of some kind is listening on the destination port. If this is the case, then the issue might be with the application itself. To check, see Check server logging for information about server behavior.
Timeout: If your connection times out, it usually means one of the following:
- There's no machine at that IP address
- There's a firewall somewhere silently discarding your packets
- OS packet routing is sending the packets to a destination that can't process them, or asymmetric routing is sending the return packet on an invalid path
Reset: If the connection is being reset, it means that the destination IP is receiving packets, but an OS or an application is rejecting the packets. This can mean one of the following:
- The packets are arriving at the wrong machine and it is not configured to respond to that protocol on that port
- The packets are arriving at the correct machine, but no server is listening on that port
- The packets are arriving at the correct machine and port, but higher-level protocols (such as SSL) aren't completing their handshake
- A firewall is resetting the connection. This is less likely than a firewall silently discarding the packets, but it can happen.
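The three outcomes above can also be distinguished programmatically. The following is a minimal sketch, assuming Python 3 is installed on the client VM (it is not part of the default tooling this guide relies on). The function name `classify_tcp_connect` is hypothetical, and the mapping of exceptions to outcomes is a simplification: a refused connection is reported as a reset.

```python
import socket


def classify_tcp_connect(dest_ip, dest_port, timeout_s=5.0):
    """Attempt one TCP handshake and return 'success', 'timeout', or 'reset'.

    Hypothetical helper for illustration; any other OS error is returned
    as an 'error: ...' string so that it is not silently misclassified.
    """
    try:
        # create_connection completes the TCP three-way handshake or raises.
        with socket.create_connection((dest_ip, dest_port), timeout=timeout_s):
            return "success"
    except (socket.timeout, TimeoutError):
        # No reply before the deadline: nothing at that address, or a
        # firewall or route silently discarding the packets.
        return "timeout"
    except (ConnectionRefusedError, ConnectionResetError):
        # The destination (or a firewall) answered with a TCP RST.
        return "reset"
    except OSError as exc:
        return f"error: {exc}"


# Example call (replace the placeholders with your own values):
# print(classify_tcp_connect("DEST_IP", 443))
```

The result only describes the TCP layer. As with the curl test, a "success" here says nothing about whether TLS or the application itself is healthy.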

Linux

In the Google Cloud console, go to the Firewall policies page.
Ensure that there is a firewall rule that allows SSH connections from IAP to your VM, or create a new one.
In the Google Cloud console, go to the VM instances page.
Find your source VM.
Click SSH in the Connect column for that VM.
From the client machine command line, run the following command. Replace DEST_IP:DEST_PORT with your destination IP address and port.
```
curl -vso /dev/null --connect-timeout 5 DEST_IP:DEST_PORT
```

Windows

In the Google Cloud console, go to the VM instances page.
Find your source VM.
Use one of the methods described in Connecting to Windows VMs to connect to your VM.

From the client machine command line, run the following:

Windows 2019:

curl -vso /dev/null --connect-timeout 5 DEST_IP:DEST_PORT

Windows 2012 or Windows 2016 PowerShell:

PS C:\> New-Object System.Net.Sockets.TcpClient('DEST_IP',DEST_PORT)

Connection success

The following results indicate a successful TCP handshake. If the TCP handshake completes successfully, then the issue is not related to TCP connection timeout or reset. Instead, the timeout issue is occurring within the application layers. If you get a successful connection, proceed to Check server logging for information about server behavior.

Linux and Windows 2019

$ curl -vso /dev/null --connect-timeout 5 192.168.0.4:443

The "Connected to" line indicates a successful TCP handshake.

Expire in 0 ms for 6 (transfer 0x558b3289ffb0)
Expire in 5000 ms for 2 (transfer 0x558b3289ffb0)
  Trying 192.168.0.4...
TCP_NODELAY set
Expire in 200 ms for 4 (transfer 0x558b3289ffb0)
Connected to 192.168.0.4 (192.168.0.4) port 443 (#0)
> GET / HTTP/1.1
> Host: 192.168.0.4:443
> User-Agent: curl/7.64.0
> Accept: */*
>
Empty reply from server
Connection #0 to host 192.168.0.4 left intact

Windows 2012 and 2016

PS C:\> New-Object System.Net.Sockets.TcpClient('DEST_IP_ADDRESS',PORT)

Connection successful result. The "Connected: True" line is relevant.

Available           : 0
Client              : System.Net.Sockets.Socket
Connected           : True
ExclusiveAddressUse : False
ReceiveBufferSize   : 131072
SendBufferSize      : 131072
ReceiveTimeout      : 0
SendTimeout         : 0
LingerState         : System.Net.Sockets.LingerOption
NoDelay             : False

Connection timeout

The following results indicate that the connection has timed out. If your connection is timing out, proceed to Verify server IP address and port.

Linux and Windows 2019

$ curl -vso /dev/null --connect-timeout 5 DEST_IP_ADDRESS:PORT

Connection timeout result:

Trying 192.168.0.4:443...
Connection timed out after 5000 milliseconds
Closing connection 0

Windows 2012 and 2016

PS C:\> New-Object System.Net.Sockets.TcpClient('DEST_IP_ADDRESS',PORT)

Connection timeout result:

New-Object: Exception calling ".ctor" with "2" argument(s): "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. 192.168.0.4:443"

Connection reset

A reset occurs when a device sends a RST packet back to the client, informing the client that the connection has been terminated. The connection might be reset for one of the following reasons:

The receiving server was not configured to accept connections for that protocol on that port. This could be because the packet was sent to the wrong server or the wrong port, or the server software was misconfigured.
Firewall software rejected the connection attempt.

If the connection was reset, proceed to Verify server IP address and port.

Linux and Windows 2019

$ curl -vso /dev/null --connect-timeout 5 DEST_IP_ADDRESS:PORT

Connection reset result:

Trying 192.168.0.4:443...
connect to 192.168.0.4 port 443 failed: Connection refused
Failed to connect to 192.168.0.4 port 443: Connection refused
Closing connection 0

Windows 2012 and 2016

PS C:\> New-Object System.Net.Sockets.TcpClient('DEST_IP_ADDRESS',PORT)

Connection reset result:

New-Object: Exception calling ".ctor" with "2" argument(s): "No connection could be made because the target machine actively refused it. 192.168.0.4:443"

Verify server IP address and port

Run one of the following commands on your server. They indicate if there is a server listening on the necessary port.

Linux

$ sudo netstat -ltuvnp

The output shows that a TCP server is listening to any destination IP address (0.0.0.0) at port 22, accepting connections from any source address (0.0.0.0) and any source port (*). The PID/Program name column specifies the executable bound to the socket.

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      588/sshd
tcp6       0      0 :::22                   :::*                    LISTEN      588/sshd
udp        0      0 0.0.0.0:68              0.0.0.0:*                           334/dhclient
udp        0      0 127.0.0.1:323           0.0.0.0:*                           429/chronyd
udp6       0      0 ::1:323                 :::*                                429/chronyd

Windows

Note: If you are using a UDP-based server, Windows also offers the "Get-NetUdpEndpoint" command.

PS C:\> Get-NetTcpConnection -State "LISTEN" -LocalPort DEST_PORT

The following output shows the results of the command run with DEST_PORT set to 443. This output shows that a TCP server is listening to any address (0.0.0.0) at port 443, accepting connections from any source address (0.0.0.0) and any source port (0). The OwningProcess column indicates the process ID of the process listening to the socket.

LocalAddress LocalPort RemoteAddress RemotePort State  AppliedSetting OwningProcess
------------ --------- ------------- ---------- -----  -------------- -------------
::           443       ::            0          Listen                928
0.0.0.0      443       0.0.0.0       0          Listen                928

If you see that the server is not bound to the correct port or IP, or that the remote prefix does not match your client, consult the server's documentation or vendor to resolve the issue. The server must be bound to the IP address of a particular interface or to 0.0.0.0, and it must accept connections from the correct client IP prefix or 0.0.0.0.
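The binding rule above (a specific interface address or the wildcard 0.0.0.0) can be checked mechanically against saved `netstat -ltun` output. This is a minimal sketch, assuming the Linux column layout shown in this section; the helper names `listener_covers` and `find_listeners` are hypothetical, and IPv4/IPv6 dual-stack behavior is deliberately ignored.

```python
import ipaddress


def listener_covers(local_address, dest_ip, dest_port):
    """Return True if a socket bound to local_address (for example
    '0.0.0.0:22' or ':::22') would accept traffic sent to dest_ip:dest_port."""
    host, _, port = local_address.rpartition(":")
    if not port.isdigit() or int(port) != dest_port:
        return False
    bound = ipaddress.ip_address(host)
    dest = ipaddress.ip_address(dest_ip)
    if bound.is_unspecified:
        # 0.0.0.0 covers every IPv4 address; :: covers every IPv6 address.
        return bound.version == dest.version
    return bound == dest


def find_listeners(netstat_output, dest_ip, dest_port):
    """Return the netstat lines whose Local Address covers dest_ip:dest_port."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].startswith(("tcp", "udp")):
            if listener_covers(fields[3], dest_ip, dest_port):
                matches.append(line.strip())
    return matches


SAMPLE = """\
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      588/sshd
tcp6       0      0 :::22                   :::*                    LISTEN      588/sshd
udp        0      0 127.0.0.1:323           0.0.0.0:*                           429/chronyd
"""
```

With this sample, a lookup for 10.3.0.19 port 22 matches only the IPv4 sshd line, and a lookup for 10.3.0.19 port 323 matches nothing because chronyd is bound to the loopback address.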

If the application server is bound to the correct IP address and port, it might be that the client is accessing the wrong port, that a higher-level protocol (frequently TLS) is actively refusing the connection, or that there is a firewall rejecting the connection.

Check that the client and server support a common TLS version and a common set of cipher suites.

Check that your client is accessing the correct port.

If the preceding steps don't resolve the problem, proceed to Check firewall on client and server for packet discards.

Check firewall on client and server for packet discards

If the server is unreachable from the client VM but is listening on the correct port, one of the VMs might be running firewall software that is discarding packets associated with the connection. Check the firewall on both the client and server VMs using the following commands.

If a rule is blocking your traffic, you can update the firewall software to allow the traffic. If you do update the firewall, proceed cautiously as you prepare and execute the commands because a misconfigured firewall can block unexpected traffic. Consider setting up VM Serial Console access before proceeding.

Linux iptables

Check the packet counts for each installed iptables chain and rule. Determine which DROP rules are being matched by comparing the source and destination IP addresses and ports of your connection with the prefixes and ports specified by the iptables rules.

If a matched rule is showing increasing discards with connection timeouts, consult the iptables documentation to apply the correct allow rule to the appropriate connections.

$ sudo iptables -L -n -v -x

This example INPUT chain shows that packets from any IP address to any IP address using destination TCP port 5000 will be discarded at the firewall. The pkts column indicates that the rule has dropped 10342 packets. As a test, if you create connections that are discarded by this rule, you will see the pkts counter increase, confirming the behavior.

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts   bytes  target prot opt in  out  source      destination
10342 2078513    DROP  tcp  --  *  *    0.0.0.0/0   0.0.0.0/0 tcp dpt:5000
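To compare counters between two runs without reading the table by eye, the listing can be parsed. The following is a rough sketch, assuming the column order shown above (pkts, bytes, target, prot, opt, in, out, source, destination, then match details); `drop_rules_for_port` is a hypothetical helper, and it matches only on the destination port, not on source or destination prefixes.

```python
def drop_rules_for_port(iptables_output, dest_port):
    """Return (pkts, rule_line) for each DROP rule that names dest_port.

    Expects the text printed by 'iptables -L -n -v -x'. Chain headers and
    column headers are skipped because their first field is not a number.
    """
    hits = []
    for line in iptables_output.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        pkts, target = int(fields[0]), fields[2]
        # Match details such as 'tcp dpt:5000' follow the destination column.
        if target == "DROP" and f"dpt:{dest_port}" in fields[9:]:
            hits.append((pkts, line.strip()))
    return hits


SAMPLE = """\
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts   bytes  target prot opt in  out  source      destination
10342 2078513    DROP  tcp  --  *  *    0.0.0.0/0   0.0.0.0/0 tcp dpt:5000
"""
```

Running the helper on output captured before and after a test connection, and comparing the pkts values, shows whether the rule is the one discarding your traffic.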

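You can add an ingress or egress rule to iptables with the following commands. Because -A appends the rule to the end of the chain, an earlier DROP rule that matches the same traffic still takes precedence; in that case, insert the new rule ahead of it with -I instead of -A.

Ingress rule:

$ sudo iptables -A INPUT -p tcp -s SOURCE_IP_PREFIX --dport SERVER_PORT -j ACCEPT

Egress rule:

$ sudo iptables -A OUTPUT -p tcp -d DEST_IP_PREFIX --dport DEST_PORT -j ACCEPT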

Windows Firewall

Check in Windows Firewall that the connection is permitted to egress from the client and ingress to the server. If a rule is blocking your traffic, make the needed corrections in Windows Firewall to allow the connections. You can also enable Windows Firewall Logging.

The default DENY behavior of Windows Firewall is to silently discard denied packets, resulting in timeouts.

This command checks the server. To check the egress rules on the client VM, change the Direction -match value from "Inbound" to "Outbound".

PS C:\> Get-NetFirewallPortFilter | `
>>   Where-Object LocalPort -match "PORT" | `
>>   Get-NetFirewallRule | `
>>   Where-Object {$_.Direction -match "Inbound" -and $_.Profile -match "Any"}

Name                  : {80D79988-C7A5-4391-902D-382369B4E4A3}
DisplayName           : iperf3 udp
Description           :
DisplayGroup          :
Group                 :
Enabled               : True
Profile               : Any
Platform              : {}
Direction             : Inbound
Action                : Allow
EdgeTraversalPolicy   : Block
LooseSourceMapping    : False
LocalOnlyMapping      : False
Owner                 :
PrimaryStatus         : OK
Status                : The rule was parsed successfully from the store. (65536)
EnforcementStatus     : NotApplicable
PolicyStoreSource     : PersistentStore
PolicyStoreSourceType : Local

You can add new firewall rules to Windows with the following commands.

Egress Rule:

PS C:\> netsh advfirewall firewall add rule name="My Firewall Rule" dir=out action=allow protocol=TCP remoteport=DEST_PORT

Ingress Rule:

PS C:\> netsh advfirewall firewall add rule name="My Firewall Rule" dir=in action=allow protocol=TCP localport=PORT

Third-party software

Third-party application firewalls or antivirus software can also drop or reject connections. Consult the documentation provided by your vendor.

If you find a problem with firewall rules and correct it, retest your connectivity. If firewall rules don't seem to be the problem, proceed to Check OS routing configuration.

Check OS routing configuration

Operating system routing issues can come from one of the following situations:

Routing issues are most common on VMs with multiple network interfaces because of the additional routing complexity
On a VM created in Google Cloud with a single network interface, routing issues normally only happen if someone has manually modified the default routing table
On a VM that was migrated from on-premises, the VM might carry over routing or MTU settings that were needed on-premises but that are causing problems in the VPC network

If you are using a VM with multiple network interfaces, routes must be configured to egress to the correct vNIC and subnet. For example, a VM might have routes configured so that traffic intended for internal subnets is sent to one vNIC, but the default gateway (destination 0.0.0.0/0) is configured on another vNIC which has an external IP address or access to Cloud NAT.
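The multi-interface example above comes down to longest-prefix matching: the most specific route that contains the destination decides the egress vNIC. The following is a simplified sketch of that selection using Python's standard `ipaddress` module. The route list and the interface names `ens4` and `ens5` are assumptions for illustration, and real operating systems also consider route metrics and policy routing, which this sketch ignores.

```python
import ipaddress


def select_route(routes, dest_ip):
    """Return (prefix, device) of the most specific route containing dest_ip,
    or None when no route matches (the 'Network is unreachable' case)."""
    dest = ipaddress.ip_address(dest_ip)
    matching = []
    for prefix, device in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matching.append((network, device))
    if not matching:
        return None
    # Longest prefix wins: /24 beats /0 for an address inside the /24.
    network, device = max(matching, key=lambda item: item[0].prefixlen)
    return str(network), device


# Hypothetical two-vNIC VM: internal subnet on ens4, default gateway on ens5.
ROUTES = [
    ("0.0.0.0/0", "ens5"),
    ("10.3.0.0/24", "ens4"),
]
```

With these assumed routes, traffic to 10.3.0.34 leaves through ens4, while traffic to any address outside 10.3.0.0/24 follows the default route through ens5. If the internal route were missing, internal traffic would also follow the default route and leave through the wrong vNIC.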

You can review routes by checking individual routes one at a time or by looking at the entire VM routing table. If either approach reveals issues with the routing table, see Update routing tables for instructions.

Review all routes

List all your routes to understand what routes already exist on your VM.

Linux

$ ip route show table all

default via 10.3.0.1 dev ens4
10.3.0.1 dev ens4 scope link
local 10.3.0.19 dev ens4 table local proto kernel scope host src 10.3.0.19
broadcast 10.3.0.19 dev ens4 table local proto kernel scope link src 10.3.0.19
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1
::1 dev lo proto kernel metric 256 pref medium
fe80::/64 dev ens4 proto kernel metric 256 pref medium
local ::1 dev lo table local proto kernel metric 0 pref medium
local fe80::4001:aff:fe03:13 dev ens4 table local proto kernel metric 0 pref medium
multicast ff00::/8 dev ens4 table local proto kernel metric 256 pref medium

Windows

PS C:\> Get-NetRoute

ifIndex DestinationPrefix             NextHop  RouteMetric ifMetric PolicyStore
------- -----------------             -------  ----------- -------- -----------
4       255.255.255.255/32            0.0.0.0          256 5        ActiveStore
1       255.255.255.255/32            0.0.0.0          256 75       ActiveStore
4       224.0.0.0/4                   0.0.0.0          256 5        ActiveStore
1       224.0.0.0/4                   0.0.0.0          256 75       ActiveStore
4       169.254.169.254/32            0.0.0.0            1 5        ActiveStore
1       127.255.255.255/32            0.0.0.0          256 75       ActiveStore
1       127.0.0.1/32                  0.0.0.0          256 75       ActiveStore
1       127.0.0.0/8                   0.0.0.0          256 75       ActiveStore
4       10.3.0.255/32                 0.0.0.0          256 5        ActiveStore
4       10.3.0.31/32                  0.0.0.0          256 5        ActiveStore
4       10.3.0.1/32                   0.0.0.0            1 5        ActiveStore
4       10.3.0.0/24                   0.0.0.0          256 5        ActiveStore
4       0.0.0.0/0                     10.3.0.1           0 5        ActiveStore
4       ff00::/8                      ::               256 5        ActiveStore
1       ff00::/8                      ::               256 75       ActiveStore
4       fe80::b991:6a71:ca62:f23f/128 ::               256 5        ActiveStore
4       fe80::/64                     ::               256 5        ActiveStore
1       ::1/128                       ::               256 75       ActiveStore

Check individual routes

If a particular IP prefix seems to be the problem, check that proper routes exist for the source and destination IPs within the client and server VMs.

Linux

$ ip route get DEST_IP

Good result:

A valid route is shown. In this case, the packets egress from interface ens4.

10.3.0.34 via 10.3.0.1 dev ens4 src 10.3.0.26 uid 1000
    cache

Bad result:

This result confirms that packets are being discarded because there is no pathway to the destination network. Confirm that your route table contains a path to the correct egress interface.

RTNETLINK answers: Network is unreachable

Windows

PS C:\> Find-NetRoute -RemoteIpAddress "DEST_IP"

Good result:

IPAddress         : 192.168.0.2
InterfaceIndex    : 4
InterfaceAlias    : Ethernet
AddressFamily     : IPv4
Type              : Unicast
PrefixLength      : 24
PrefixOrigin      : Dhcp
SuffixOrigin      : Dhcp
AddressState      : Preferred
ValidLifetime     : 12:53:13
PreferredLifetime : 12:53:13
SkipAsSource      : False
PolicyStore       : ActiveStore

Caption            :
Description        :
ElementName        :
InstanceID         : ;:8=8:8:9<>55>55:8:8:8:55;
AdminDistance      :
DestinationAddress :
IsStatic           :
RouteMetric        : 256
TypeOfRoute        : 3
AddressFamily      : IPv4
CompartmentId      : 1
DestinationPrefix  : 192.168.0.0/24
InterfaceAlias     : Ethernet
InterfaceIndex     : 4
InterfaceMetric    : 5
NextHop            : 0.0.0.0
PreferredLifetime  : 10675199.02:48:05.4775807
Protocol           : Local
Publish            : No
State              : Alive
Store              : ActiveStore
ValidLifetime      : 10675199.02:48:05.4775807
PSComputerName     :
ifIndex            : 4

Bad result:

Find-NetRoute : The network location cannot be reached. For information about network troubleshooting, see Windows Help.
At line:1 char:1
+ Find-NetRoute -RemoteIpAddress "192.168.0.4"
+ ----------------------------------------
    + CategoryInfo          : NotSpecified: (MSFT_NetRoute:ROOT/StandardCimv2/MSFT_NetRoute) [Find-NetRoute], CimException
    + FullyQualifiedErrorId : Windows System Error 1231,Find-NetRoute

This result confirms that packets are being discarded because there is no pathway to the destination IP address. Check that you have a default gateway, and that the gateway is applied to the correct vNIC and network.

Update routing tables

If needed, you can add a route to your operating system's route table. Before running a command to update the VM's routing table, we recommend that you familiarize yourself with the commands and understand the possible implications. Improper use of route update commands might cause unexpected problems or disconnect you from the VM. Consider setting up VM Serial Console access before proceeding.

Consult your operating system documentation for instructions on updating routes.

If you find a problem with routes and correct it, retest your connectivity. If routes don't seem to be the problem, proceed to Check MTU.

Check MTU

A VM's interface MTU should match the MTU of the VPC network it is attached to. Ideally, VMs that are communicating with each other also have matching MTUs. Mismatched MTUs are normally not an issue for TCP, but can be for UDP.
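To see why UDP is more exposed to an MTU mismatch, it helps to work out the largest datagram payload that fits in one packet. The sketch below assumes IPv4 with a 20-byte header (no IP options) and the 8-byte UDP header; the function names are hypothetical. The MTU values 1460 and 1500 are the ones that appear in the sample outputs in this guide.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
UDP_HEADER_BYTES = 8


def max_udp_payload(mtu):
    """Largest UDP payload that fits in a single IPv4 packet of size mtu."""
    return mtu - IPV4_HEADER_BYTES - UDP_HEADER_BYTES


def fits_without_fragmentation(payload_bytes, path_mtu):
    """True if a datagram with this payload fits within path_mtu."""
    return payload_bytes <= max_udp_payload(path_mtu)


# A sender whose interface MTU is 1500 may emit 1472-byte payloads,
# but a network with MTU 1460 carries at most 1432 bytes per datagram.
print(max_udp_payload(1500), max_udp_payload(1460))
```

An application that sizes its datagrams for a 1500-byte interface therefore produces packets that are too large for a 1460-byte network. TCP usually avoids this because both sides advertise a maximum segment size during the handshake, which UDP has no equivalent for.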

Check the MTU of the VPC. If the VMs are in two different networks, check both networks.

gcloud compute networks describe NET_NAME --format="table(name,mtu)"

Check the MTU configuration for your client and server network interfaces.

Linux

$ netstat -i

The lo (loopback) interface always has an MTU of 65536 and can be ignored for this step.

Kernel Interface table
Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
ens4      1460  8720854      0      0 0      18270406      0      0      0 BMRU
lo       65536       53      0      0 0            53      0      0      0 LRU

Windows

PS C:\> Get-NetIpInterface

Loopback Pseudo-Interfaces always have an MTU of 4294967295 and can be ignored for this step.

ifIndex InterfaceAlias              Address NlMtu(Bytes) Interface Dhcp     Connection PolicyStore
                                    Family               Metric             State
------- --------------              ------- ------------ --------- ----     ---------- -----------
4       Ethernet                    IPv6            1500         5 Enabled  Connected  ActiveStore
1       Loopback Pseudo-Interface 1 IPv6      4294967295        75 Disabled Connected  ActiveStore
4       Ethernet                    IPv4            1460         5 Enabled  Connected  ActiveStore
1       Loopback Pseudo-Interface 1 IPv4      4294967295        75 Disabled Connected  Active

If the interface and network MTUs don't match, you can reconfigure the interface MTU. For more information, see VMs and MTU settings. If they do match, and if you have followed the troubleshooting steps this far, then the issue is likely with the server itself. For guidance on troubleshooting server issues, proceed to Check server logging for information about server behavior.

Check server logging for information about server behavior

If the preceding steps don't resolve an issue, the application might be causing the timeouts. Check server and application logs for behavior that would explain what you're seeing.

Log sources to check:

Cloud Logging for the VM
VM Serial Logs
Linux syslog and kern.log, or Windows Event Viewer

If you're still having issues

If you're still having issues, see Getting support for next steps. It's useful to have the output from the preceding troubleshooting steps available to share with other collaborators.

Troubleshoot network latency or loss causing throughput issues

Network latency or loss issues are typically caused by resource exhaustion or bottlenecks within a VM or network path. Occasionally, network loss can cause intermittent connection timeouts. Causes like vCPU exhaustion or vNIC saturation result in increased latency and packet loss, leading to a reduction in network performance.

The following instructions assume that connections are not consistently timing out and you are instead seeing issues of limited capacity or performance. If you are seeing complete packet loss, see Troubleshoot complete connection failure.

Small variations in latency, such as latencies varying by a few milliseconds, are normal. Latencies vary because of network load or queuing inside the VM.

Determine connection values

First gather the following information:

From the VM instances page, gather the following for both VMs:
- VM names
- VM zones
- Internal IP addresses for the vNICs that are communicating
From the configuration of the destination server software, gather the following information:
- Layer 4 protocol
- Destination port

Once you have this information, proceed to Investigate issues with the underlying Google network.

Investigate issues with the underlying Google network

If your setup is an existing one that hasn't changed recently, then the issue might be with the underlying Google network. Check the Network Intelligence Center Performance Dashboard for packet loss between the VM zones. If there is an increase in packet loss between the zones during the timeframe when you experienced network timeouts, it might indicate that the problem is with the physical network underlying your virtual network. Check the Google Cloud Status Dashboard for known issues before filing a support case.

If the issue does not seem to be with the underlying Google network, proceed to Check handshake latency.

Check handshake latency

All connection-based protocols incur some latency while they do their connection setup handshake. Each protocol handshake adds to the overhead. For SSL/TLS connections, for example, the TCP handshake has to complete before the SSL/TLS handshake can start, then the TLS handshake has to complete before data can be transmitted.

Handshake latency in the same Google Cloud zone is usually negligible, but handshakes to globally distant locations might add greater delays at connection initiation. If you have resources in distant regions, you can check to see if the latency you're seeing is due to protocol handshake.

Linux and Windows 2019

$ curl -o /dev/null -Lvs -w 'tcp_handshake: %{time_connect}s, application_handshake: %{time_appconnect}s' DEST_IP:PORT

tcp_handshake: 0.035489s, application_handshake: 0.051321s

tcp_handshake is the duration from when the client sends the initial SYN packet to when the client sends the ACK of the TCP handshake.
application_handshake is the time from the first SYN packet of the TCP handshake to the completion of the TLS (typically) handshake.
additional handshake time = application_handshake - tcp_handshake
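As a worked example of the subtraction above, the following sketch parses the line printed by the curl command and computes the additional handshake time. It assumes the exact `-w` format string used in this section and a connection that performed a TLS handshake; `parse_handshake_times` is a hypothetical helper name.

```python
import re


def parse_handshake_times(curl_line):
    """Return (tcp_s, app_s, additional_s) from the curl -w summary line."""
    pattern = r"tcp_handshake: ([0-9.]+)s, application_handshake: ([0-9.]+)s"
    match = re.search(pattern, curl_line)
    if match is None:
        raise ValueError("line does not match the expected -w format")
    tcp_s = float(match.group(1))
    app_s = float(match.group(2))
    # additional handshake time = application_handshake - tcp_handshake
    return tcp_s, app_s, app_s - tcp_s


line = "tcp_handshake: 0.035489s, application_handshake: 0.051321s"
tcp_s, app_s, additional_s = parse_handshake_times(line)
print(f"additional handshake time: {additional_s:.6f}s")
```

For the sample output in this section, the TLS handshake accounts for roughly 16 ms on top of the 35 ms TCP handshake.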

Windows 2012 and 2016

Not available with default OS tooling. ICMP round-trip time can be used as a reference if firewall rules allow.
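If installing extra software on the VM is acceptable, a short script can stand in for the missing tooling. The following is a minimal sketch, assuming Python 3 has been installed on the Windows VM (it is not present by default). It times only the TCP handshake, not a TLS or application handshake, and `tcp_connect_seconds` is a hypothetical name.

```python
import socket
import time


def tcp_connect_seconds(dest_ip, dest_port, timeout_s=5.0):
    """Return the time, in seconds, taken to complete one TCP handshake.

    Raises OSError (for example a timeout or a refused connection) when
    the handshake does not complete.
    """
    start = time.perf_counter()
    with socket.create_connection((dest_ip, dest_port), timeout=timeout_s):
        return time.perf_counter() - start


# Example call (replace the placeholders with your own values):
# print(f"tcp_handshake: {tcp_connect_seconds('DEST_IP', 443):.6f}s")
```

The measurement includes a small amount of local socket setup overhead, so treat it as an approximation that is comparable to, but not identical with, curl's time_connect value.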

If the latency is more than the handshakes would account for, proceed toDetermine the maximum throughput of your VM type.

Determine the maximum throughput of your VM type

VM network egress throughput is limited by the VM CPU architecture and vCPU count. Determine the potential egress bandwidth of your VM by consulting the Network bandwidth page.

If your VM is not capable of meeting your egress requirements, consider upgrading to a VM with greater capacity. For instructions, see Changing the machine type of an instance.

If your machine type should allow sufficient egress bandwidth, then investigate whether Persistent Disk usage is interfering with your network egress. Persistent Disk operations are allowed to occupy up to 60% of the total network throughput of your VM. To determine if Persistent Disk operations might be interfering with network throughput, see Check Persistent Disk performance.
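
As a rough illustration of the 60% ceiling, the following sketch computes how much egress bandwidth could remain for other traffic in the worst case. The function name pd_headroom and the 10 Gbps figure are assumptions for illustration only; use the egress limit listed for your machine type.

```shell
# Hypothetical helper: $1 is the VM's total egress limit in Gbps.
# Prints the Persistent Disk ceiling (60%) and what is left for other
# egress traffic if Persistent Disk uses its full share.
pd_headroom() {
  awk -v total="$1" 'BEGIN {
    pd = total * 0.60
    printf "pd_ceiling_gbps=%.1f worst_case_remaining_gbps=%.1f\n", pd, total - pd
  }'
}

pd_headroom 10
# prints pd_ceiling_gbps=6.0 worst_case_remaining_gbps=4.0
```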

Network ingress to a VM is not limited by the VPC network or the VM instance type. Instead, it is determined by the packet queuing and processing performance of the VM operating system or application. If your egress bandwidth is adequate but you're seeing ingress issues, see Check server logging for information about server behavior.

Check interface MTU

The MTU of a VPC network is configurable. The MTU of the interface on the VM should match the MTU value for the VPC network it is attached to. In a VPC Network Peering situation, VMs in different networks can have different MTUs. When this scenario occurs, apply the smaller MTU value to the associated interfaces. MTU mismatches are normally not an issue for TCP, but can be for UDP.
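
To see why UDP is more sensitive to a mismatch, it can help to work out the largest UDP payload that fits in a single unfragmented packet. The sketch below assumes IPv4 with a 20-byte header (no options) and the 8-byte UDP header; the function name max_udp_payload is hypothetical.

```shell
# Hypothetical helper: $1 is the interface or network MTU in bytes.
# Prints the largest UDP payload that fits in one IPv4 packet:
# MTU - 20 (IPv4 header without options) - 8 (UDP header).
max_udp_payload() {
  echo $(( $1 - 20 - 8 ))
}

max_udp_payload 1460   # prints 1432
max_udp_payload 1500   # prints 1472
```

Under these assumptions, an application that sizes its datagrams for a 1500-byte MTU sends up to 1472 bytes of payload, which is 40 bytes more than fits within a 1460-byte MTU.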

Check the MTU of the VPC. If the VMs are in two different networks, check both networks.

gcloud compute networks describe NET_NAME --format="table(name,mtu)"

Check the MTU configuration for your network interface.

Linux

The lo (loopback) interface always has an MTU of 65536 and can be ignored for this step.

$ netstat -i

Kernel Interface table
Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
ens4      1460  8720854      0      0 0      18270406      0      0      0 BMRU
lo       65536       53      0      0 0            53      0      0      0 LRU
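
The comparison between the interface MTU and the network MTU can be automated. The following is a minimal sketch, assuming the netstat -i column layout shown above, with the interface name in the first column and the MTU in the second; the function name check_mtu is hypothetical.

```shell
# Hypothetical helper: reads `netstat -i` output on stdin and compares each
# non-loopback interface MTU with the expected VPC network MTU given as $1.
check_mtu() {
  awk -v want="$1" '
    $1 == "Kernel" || $1 == "Iface" || $1 == "lo" { next }  # headers, loopback
    NF >= 2 {
      status = ($2 == want) ? "ok" : "MISMATCH"
      printf "%s mtu=%s expected=%s %s\n", $1, $2, want, status
    }'
}

# Sample input copied from the output above.
printf '%s\n' \
  'Kernel Interface table' \
  'Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg' \
  'ens4      1460  8720854      0      0 0      18270406      0      0      0 BMRU' \
  'lo       65536       53      0      0 0            53      0      0      0 LRU' |
  check_mtu 1460
# prints ens4 mtu=1460 expected=1460 ok
```

On a VM, you could pipe the live netstat -i output into check_mtu and pass the MTU value reported for the VPC network.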

Windows

PS C:\> Get-NetIpInterface

Loopback Pseudo-Interfaces always have an MTU of 4294967295 and can be ignored for this step.

ifIndex InterfaceAlias              Address NlMtu(Bytes) Interface Dhcp     Connection PolicyStore
                                    Family               Metric             State
------- --------------              ------- ------------ --------- ----     ---------- -----------
4       Ethernet                    IPv6            1500         5 Enabled  Connected  ActiveStore
1       Loopback Pseudo-Interface 1 IPv6      4294967295        75 Disabled Connected  ActiveStore
4       Ethernet                    IPv4            1460         5 Enabled  Connected  ActiveStore
1       Loopback Pseudo-Interface 1 IPv4      4294967295        75 Disabled Connected  Active

If the interface and network MTUs don't match, you can reconfigure the interface MTU. For instructions on updating MTU for Windows VMs, see VMs and MTU settings. If they do match, then the issue might be server availability. The next step is to check whether anything happened to your VM during the relevant time; proceed to Check logs to see if a VM was rebooted, stopped, or live migrated.

Check logs to see if a VM was rebooted, stopped, or live migrated

During the lifecycle of a VM, a VM can be user-rebooted, live-migrated for Google Cloud maintenance, or, in rare circumstances, a VM might be lost and recreated if there is a failure within the physical host containing your VM. These events might cause a brief increase in latency or connection timeouts. If any of these things happens to the VM, the event is logged.

To view logs for your VM, do the following:

In the Google Cloud console, go to the Logging page.
Choose the timeframe of when the latency occurred.

Use the following Logging query to determine if a VM event occurred near the timeframe when the latency occurred:

resource.labels.instance_id:"INSTANCE_NAME"
resource.type="gce_instance"
(
  protoPayload.methodName:"compute.instances.hostError" OR
  protoPayload.methodName:"compute.instances.OnHostMaintenance" OR
  protoPayload.methodName:"compute.instances.migrateOnHostMaintenance" OR
  protoPayload.methodName:"compute.instances.terminateOnHostMaintenance" OR
  protoPayload.methodName:"compute.instances.stop" OR
  protoPayload.methodName:"compute.instances.reset" OR
  protoPayload.methodName:"compute.instances.automaticRestart" OR
  protoPayload.methodName:"compute.instances.guestTerminate" OR
  protoPayload.methodName:"compute.instances.instanceManagerHaltForRestart" OR
  protoPayload.methodName:"compute.instances.preempted"
)
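
If you prefer the command line to the console, the same filter can be assembled in a script. The following is a sketch only: the function name vm_event_filter is hypothetical, and the value passed as $1 takes the place of the INSTANCE_NAME placeholder in the query above.

```shell
# Hypothetical helper: prints the Logging filter above on one line for the
# instance identifier given as $1. Terms separated by spaces are ANDed.
vm_event_filter() {
  methods="hostError OnHostMaintenance migrateOnHostMaintenance terminateOnHostMaintenance stop reset automaticRestart guestTerminate instanceManagerHaltForRestart preempted"
  clause=""
  for m in $methods; do
    term="protoPayload.methodName:\"compute.instances.$m\""
    if [ -z "$clause" ]; then clause="$term"; else clause="$clause OR $term"; fi
  done
  printf 'resource.labels.instance_id:"%s" resource.type="gce_instance" (%s)\n' "$1" "$clause"
}
```

One possible use, assuming the gcloud CLI is installed and authenticated, is to pass the result to gcloud logging read, for example gcloud logging read "$(vm_event_filter INSTANCE_NAME)" --freshness=2d --limit=50, adjusting the freshness window to cover the time when the latency occurred.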

If VMs didn't restart or migrate during the relevant time, the issue might be with resource exhaustion. To check, proceed to Check network and OS statistics for packet discards due to resource exhaustion.

Check network and OS statistics for packet discards due to resource exhaustion

Resource exhaustion is a general term that means that some resource on the VM, such as egress bandwidth, is being asked to handle more than it can. Resource exhaustion can result in the periodic discards of packets, which causes connection latency or timeouts. These timeouts might not be visible at client or server startup, but might appear over time as a system exhausts resources.

The following is a list of commands which display packet counters and statistics. Some of these commands duplicate the results of other commands. In such cases, you can use whichever command works better for you. See the notes within each section to better understand the intended outcome of running the command. It can be useful to run the commands at different times to see if discards or errors are occurring at the same time as the issue.

Linux

Use the netstat command to view network statistics.
```
$ netstat -s
```
```
TcpExt:
  341976 packets pruned from receive queue because of socket buffer overrun
  6 ICMP packets dropped because they were out-of-window
  45675 TCP sockets finished time wait in fast timer
  3380 packets rejected in established connections because of timestamp
  50065 delayed acks sent
```
The netstat command outputs network statistics containing values for discarded packets by protocol. Discarded packets might be the result of resource exhaustion by the application or network interface. Read the description next to each counter for an indication of why the counter was incremented.
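
Because these counters are cumulative, comparing two snapshots taken at different times shows which ones grew while the issue was occurring. The following is a minimal sketch, assuming counter lines begin with their numeric value as in the TcpExt sample above; lines in other formats are ignored, identical descriptions in different sections are not distinguished, and the function name netstat_delta is hypothetical.

```shell
# Hypothetical helper: prints counters whose value increased between two
# saved `netstat -s` snapshots ($1 = earlier file, $2 = later file).
netstat_delta() {
  awk '
    $1 ~ /^[0-9]+$/ {
      value = $1 + 0          # force a numeric value
      $1 = ""                 # drop the number, keep the description
      sub(/^ +/, "")
      if (FNR == NR) { before[$0] = value; next }   # first file: remember
      if (($0 in before) && value > before[$0])
        printf "+%d %s\n", value - before[$0], $0   # second file: report growth
    }' "$1" "$2"
}
```

For example, you could save netstat -s output to before.txt, wait until the issue recurs, save it again to after.txt, and then run netstat_delta before.txt after.txt. The file names are placeholders.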
Check kern.log for logs matching nf_conntrack: table full, dropping packet.
Debian: cat /var/log/kern.log | grep "dropping packet"
CentOS: sudo cat /var/log/dmesg | grep "dropping packet"
This log indicates that the connection tracking table for the VM has reached the maximum number of connections that can be tracked. Further connections to and from this VM might time out. If conntrack has been enabled, the maximum connection count can be found with: sudo sysctl net.netfilter.nf_conntrack_max
You can increase the value for maximum tracked connections by modifying sysctl net.netfilter.nf_conntrack_max, or you can spread a VM's workload across multiple VMs to reduce load.
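
To judge how close the VM is to the conntrack limit before packets are dropped, you can compare the current number of tracked connections with the maximum. The following is a sketch; the function name conntrack_usage is hypothetical, and the sample numbers are made up for illustration.

```shell
# Hypothetical helper: $1 = current tracked connections, $2 = maximum.
# Prints the share of the connection tracking table that is in use.
conntrack_usage() {
  awk -v count="$1" -v max="$2" 'BEGIN {
    if (max <= 0) { print "invalid max"; exit 1 }
    printf "%.1f%% of conntrack table in use\n", 100 * count / max
  }'
}

conntrack_usage 196608 262144
# prints 75.0% of conntrack table in use
```

On a VM where conntrack is enabled, the two inputs can be read with sysctl -n net.netfilter.nf_conntrack_count and sysctl -n net.netfilter.nf_conntrack_max. A value that stays near 100% while timeouts occur is consistent with the table full log message described above.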

Windows UI

Perfmon

Using the Windows menu, search for "perfmon" and open the program.
In the left menu, select Performance > Monitoring Tools > Performance Monitor.
In the main view, click the green plus "+" to add performance counters to the monitoring graph. The following counters are of interest:
- Network Adapter
  - Output Queue Length
  - Packets Outbound Discarded
  - Packets Outbound Errors
  - Packets Received Discarded
  - Packets Received Errors
  - Packets Received Unknown
- Network Interface
  - Output Queue Length
  - Packets Outbound Discarded
  - Packets Outbound Errors
  - Packets Received Discarded
  - Packets Received Errors
  - Packets Received Unknown
- Per Processor Network Interface Card Activity
  - Low Resource Receive Indications per sec
  - Low Resource Received Packets per sec
- Processor
  - % Interrupt Time
  - % Privileged Time
  - % Processor Time
  - % User Time

Perfmon lets you plot the preceding counters on a time series graph. This can be beneficial to watch when testing is occurring or a server is impacted. Spikes in CPU-related counters such as Interrupt Time and Privileged Time can indicate saturation issues as the VM reaches CPU throughput limitations. Packet discards and errors can occur when the CPU is saturated, which forces packets to be lost before being processed by the client or server sockets. Finally, Output Queue Length will also grow during CPU saturation as more packets are queued for processing.

Windows PowerShell

PS C:\> netstat -s

IPv4 Statistics
  Packets Received                   = 56183
  Received Header Errors             = 0
  Received Address Errors            = 0
  Datagrams Forwarded                = 0
  Unknown Protocols Received         = 0
  Received Packets Discarded         = 25
  Received Packets Delivered         = 56297
  Output Requests                    = 47994
  Routing Discards                   = 0
  Discarded Output Packets           = 0
  Output Packet No Route             = 0
  Reassembly Required                = 0
  Reassembly Successful              = 0
  Reassembly Failures                = 0
  Datagrams Successfully Fragmented  = 0
  Datagrams Failing Fragmentation    = 0
  Fragments Created                  = 0

The netstat command outputs network statistics containing values for discarded packets by protocol. Discarded packets might be the result of resource exhaustion by the application or network interface.

If you are seeing resource exhaustion, you can try spreading your workload across more instances, upgrading the VM to one with more resources, tuning the OS or application for specific performance needs, entering the error message into a search engine to look for possible solutions, or asking for help using one of the methods described in If you're still having issues.

If resource exhaustion doesn't seem to be the problem, the issue might be with the server software itself. For guidance on troubleshooting server software issues, proceed to Check server logging for information about server behavior.

Check server logging for information about server behavior

If the preceding steps don't reveal an issue, the timeouts might be caused by application behavior such as processing stalls caused by vCPU exhaustion. Check the server and application logs for indications of the behavior you are experiencing.

As an example, a server experiencing increased latency due to an upstream system, such as a database under load, might queue an excessive number of requests, which can cause increased memory usage and CPU wait times. These factors might result in failed connections or socket buffer overrun.

TCP connections occasionally lose a packet, but selective acknowledgement and packet retransmission usually recover lost packets, avoiding connection timeout. Instead, consider that timeouts might have been the result of the application server failing or being redeployed, causing a momentary failure for connections.

If your server application relies on a connection to a database or other service, confirm that coupled services are not performing poorly. Your application might track these metrics.

If you're still having issues

If you're still having issues, see Getting support for next steps. It's useful to have the output from the troubleshooting steps available to share with other collaborators.

What's next

If you are still having trouble, see the Resources page.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.
