Monitor processes on VMs

By default, the Ops Agent and the legacy Monitoring agent are configured to collect metrics that capture information about the processes running on your Compute Engine virtual machines (VMs). You can also collect these metrics on Amazon Elastic Compute Cloud (EC2) VMs by using the Monitoring agent. This set of metrics, called process metrics, is identifiable by the prefix agent.googleapis.com/processes. These metrics are not collected on Google Kubernetes Engine (GKE).

As of August 6, 2021, charges will be introduced for these metrics, as described in the chargeable metrics section of the Google Cloud Observability pricing page. The set of process metrics is classified as chargeable, but charges have never been implemented.

This document describes tools for visualizing process metrics, how to determine the amount of data you are ingesting from these metrics, and how to minimize the related charges.

Working with process metrics

You can visualize your process-metric data with charts created by using Metrics Explorer or custom dashboards. For more information, see Using dashboards and charts. In addition, Cloud Monitoring includes data from process metrics on two predefined dashboards:

  • VM Instances dashboard in Monitoring
  • VM instance Details dashboard in Compute Engine

The following sections describe these dashboards.

Monitoring: View aggregated process metrics

To view aggregated process metrics within a metrics scope, go to the Processes tab on the VM Instances dashboard:

  1. In the Google Cloud console, go to the Dashboards page:

    Go to Dashboards

    If you use the search bar to find this page, then select the result whose subheading is Monitoring.

  2. Select the VM Instances dashboard from the list.

  3. Click Processes.

The following screenshot shows an example of the Monitoring Processes page:

The **Processes** page in Monitoring shows aggregated process metrics.

You can use the charts on the Processes tab to identify the processes in your metrics scope that are consuming the most CPU and memory, and that have the highest disk utilization.

Compute Engine: View performance metrics for top resource-consuming VMs

To view the performance charts showing the five VMs consuming the most of a resource in your Google Cloud project, go to the Observability tab for your VM instances:

  1. In the Google Cloud console, go to the VM instances page:

    Go to VM instances

    If you use the search bar to find this page, then select the result whose subheading is Compute Engine.

  2. Click Observability.

The following screenshot shows an example of the Compute Engine Observability page.

The **Observability** page in Compute Engine shows the top five VMs consuming a given resource.

For information about using these metrics to diagnose problems with your VMs, see Troubleshooting VM performance issues.

Compute Engine: View per-VM process metrics

To view a list of the processes running on a single Compute Engine virtual machine (VM) and charts for the processes with the highest resource consumption, go to the Observability tab for the VM:

  1. In the Google Cloud console, go to the VM instances page:

    Go to VM instances

    If you use the search bar to find this page, then select the result whose subheading is Compute Engine.

  2. On the Instances tab, click the name of a VM to inspect.

  3. Click Observability to view the metrics for this VM.

  4. In the navigation pane on the Observability tab, select Processes.

The following screenshot shows an example of the Compute Engine Processes page:

The **Processes** page in Compute Engine shows per-VM process metrics.

Process metrics are retained for up to 24 hours, so you can use them to look back in time and attribute anomalies in resource consumption to specific processes or identify your most expensive resource consumers. For example, the following chart shows the processes consuming the highest percentages of CPU resources. You can use the time-range selector to change the time range of the chart. The time-range selector offers preset values, like the most recent hour, and also lets you input a custom time range.

You can use process metrics to identify the processes consuming the most of a resource.

The Running Processes table provides a listing of resource consumption analogous to the output of the Linux top command. By default, the table shows a snapshot of the most recent data. However, if you select a range of time on a chart that ends in the past, the table shows the processes running at the end of that range.

For information about using these metrics to diagnose problems with your VMs, see Troubleshooting VM performance issues.

Process metrics collected by the agent

The Linux agents collect all of the metrics listed in the following table from processes running on Compute Engine VMs and, by using the Monitoring agent, Amazon Elastic Compute Cloud (EC2) VMs. You can disable their collection by the Ops Agent (versions 2.0.0 and higher) and by the legacy Monitoring agent.

You can also disable collection of process metrics for the Ops Agent (versions 2.0.0 and higher) running on Windows VMs.

For more information, see Disabling process metrics.

If you want to disable collection of these metrics on Windows, we recommend that you upgrade to the Ops Agent version 2.0.0 or higher. For more information, see Installing the Ops Agent.

Table of process metrics

The "metric type" strings in this table must be prefixed with agent.googleapis.com/processes/. That prefix has been omitted from the entries in the table. When querying a label, use the metric.labels. prefix; for example, metric.labels.LABEL="VALUE".
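The prefix and label conventions described above can be sketched in Python. This is a minimal illustration, not part of any Google client library: the function names `full_metric_type` and `label_filter` are hypothetical, and the output assumes the Monitoring filter syntax in which clauses are joined with `AND`.

```python
# Hypothetical helpers that expand the short names used in the table.
PROCESS_METRIC_PREFIX = "agent.googleapis.com/processes/"


def full_metric_type(short_name: str) -> str:
    """Expand a table entry such as 'cpu_time' into the full metric type."""
    return PROCESS_METRIC_PREFIX + short_name


def label_filter(short_name: str, **labels: str) -> str:
    """Build a filter that selects one process metric and optional label values.

    Each label is referenced with the metric.labels. prefix, as the table
    introduction describes.
    """
    clauses = [f'metric.type="{full_metric_type(short_name)}"']
    for key, value in sorted(labels.items()):
        clauses.append(f'metric.labels.{key}="{value}"')
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(full_metric_type("disk/read_bytes_count"))
    print(label_filter("count_by_state", state="running"))
```

The label value `running` is only an example; check the values that your agent actually reports before filtering on them.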

Each entry lists the metric type, followed by its launch stage (resource hierarchy levels), display name, kind, type, unit, monitored resources, description, and labels.

count_by_state
  Launch stage (resource hierarchy levels): GA (project)
  Display name: Processes
  Kind, Type, Unit: GAUGE, DOUBLE, 1
  Monitored resources: aws_ec2_instance, baremetalsolution.googleapis.com/Instance, gce_instance
  Description: Count of processes in the given state. Linux only. Sampled every 60 seconds.
  Labels:
    • state: Running, sleeping, zombie, etc.

cpu_time
  Launch stage (resource hierarchy levels): GA (project)
  Display name: Process CPU
  Kind, Type, Unit: CUMULATIVE, INT64, us{CPU}
  Monitored resources: aws_ec2_instance, baremetalsolution.googleapis.com/Instance, gce_instance
  Description: CPU time of the given process. Sampled every 60 seconds.
  Labels:
    • process: Process name.
    • user_or_syst: Whether a user or system process.
    • command: Process command.
    • command_line: Process command line, 1024 characters maximum.
    • owner: Process owner.
    • pid: Process ID.

disk/read_bytes_count
  Launch stage (resource hierarchy levels): GA (project)
  Display name: Process disk read I/O
  Kind, Type, Unit: CUMULATIVE, INT64, By
  Monitored resources: aws_ec2_instance, baremetalsolution.googleapis.com/Instance, gce_instance
  Description: Process disk read I/O. Linux only. Sampled every 60 seconds.
  Labels:
    • process: Process name.
    • command: Process command.
    • command_line: Process command line, 1024 characters maximum.
    • owner: Process owner.
    • pid: Process ID.

disk/write_bytes_count
  Launch stage (resource hierarchy levels): GA (project)
  Display name: Process disk write I/O
  Kind, Type, Unit: CUMULATIVE, INT64, By
  Monitored resources: aws_ec2_instance, baremetalsolution.googleapis.com/Instance, gce_instance
  Description: Process disk write I/O. Linux only. Sampled every 60 seconds.
  Labels:
    • process: Process name.
    • command: Process command.
    • command_line: Process command line, 1024 characters maximum.
    • owner: Process owner.
    • pid: Process ID.

fork_count
  Launch stage (resource hierarchy levels): GA (project)
  Display name: Fork count
  Kind, Type, Unit: CUMULATIVE, INT64, 1
  Monitored resources: aws_ec2_instance, baremetalsolution.googleapis.com/Instance, gce_instance
  Description: Total number of processes forked. Linux only. Sampled every 60 seconds.

rss_usage
  Launch stage (resource hierarchy levels): GA (project)
  Display name: Process resident memory
  Kind, Type, Unit: GAUGE, DOUBLE, By
  Monitored resources: aws_ec2_instance, baremetalsolution.googleapis.com/Instance, gce_instance
  Description: Resident memory usage of the given process. Linux only. Sampled every 60 seconds.
  Labels:
    • process: Process name.
    • command: Process command.
    • command_line: Process command line, 1024 characters maximum.
    • owner: Process owner.
    • pid: Process ID.

vm_usage
  Launch stage (resource hierarchy levels): GA (project)
  Display name: Process virtual memory
  Kind, Type, Unit: GAUGE, DOUBLE, By
  Monitored resources: aws_ec2_instance, baremetalsolution.googleapis.com/Instance, gce_instance
  Description: VM Usage of the given process. Sampled every 60 seconds.
  Labels:
    • process: Process name.
    • command: Process command.
    • command_line: Process command line, 1024 characters maximum.
    • owner: Process owner.
    • pid: Process ID.

windows/handles
  Launch stage (resource hierarchy levels): ALPHA (project)
  Display name: Process open handles (Windows)
  Kind, Type, Unit: GAUGE, INT64, 1
  Monitored resources: aws_ec2_instance, baremetalsolution.googleapis.com/Instance, gce_instance
  Description: Open handle count of the given process. Windows only. Sampled every 60 seconds.
  Labels:
    • process: Process name.
    • command: Process command.
    • command_line: Process command line, 1024 characters maximum.
    • owner: Process owner.
    • pid: Process ID.

Table generated at 2026-02-12 22:12:11 UTC.

Determining current ingestion

You can use Metrics Explorer to see how much data you are ingesting for process metrics. Use the following procedure:

  1. In the Google Cloud console, go to the Metrics explorer page:

    Go to Metrics explorer

    If you use the search bar to find this page, then select the result whose subheading is Monitoring.

  2. In the toolbar of the query-builder pane, select the button whose name is either MQL or PromQL.

  3. Verify that PromQL is selected in the Language toggle. The language toggle is in the same toolbar that lets you format your query.

  4. To see the total number of process-metric points for your gce_instance and aws_ec2_instance resources, do the following:

    1. Enter the following query:

      sum_over_time(
        sum by (resource_type) (
          label_replace(
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/cpu_time", monitored_resource="gce_instance"}[1m])),
              "metric_suffix", "cpu_time", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/disk/read_bytes_count", monitored_resource="gce_instance"}[1m])),
              "metric_suffix", "disk_read_bytes_count", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/disk/write_bytes_count", monitored_resource="gce_instance"}[1m])),
              "metric_suffix", "disk_write_bytes_count", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/rss_usage", monitored_resource="gce_instance"}[1m])),
              "metric_suffix", "rss_usage", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/vm_usage", monitored_resource="gce_instance"}[1m])),
              "metric_suffix", "vm_usage", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/count_by_state", monitored_resource="gce_instance"}[1m])),
              "metric_suffix", "count_by_state", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/fork_count", monitored_resource="gce_instance"}[1m])),
              "metric_suffix", "fork_count", "", ""
            ),
            "resource_type", "gce_instance", "", ""
          )
          or
          label_replace(
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/cpu_time", monitored_resource="aws_ec2_instance"}[1m])),
              "metric_suffix", "cpu_time", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/disk/read_bytes_count", monitored_resource="aws_ec2_instance"}[1m])),
              "metric_suffix", "disk_read_bytes_count", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/disk/write_bytes_count", monitored_resource="aws_ec2_instance"}[1m])),
              "metric_suffix", "disk_write_bytes_count", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/rss_usage", monitored_resource="aws_ec2_instance"}[1m])),
              "metric_suffix", "rss_usage", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/vm_usage", monitored_resource="aws_ec2_instance"}[1m])),
              "metric_suffix", "vm_usage", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/count_by_state", monitored_resource="aws_ec2_instance"}[1m])),
              "metric_suffix", "count_by_state", "", ""
            )
            or
            label_replace(
              sum(count_over_time({"agent.googleapis.com/processes/fork_count", monitored_resource="aws_ec2_instance"}[1m])),
              "metric_suffix", "fork_count", "", ""
            ),
            "resource_type", "aws_ec2_instance", "", ""
          )
        )[1d:]
      )
    2. Click Run Query. The resulting chart shows you the values for each resource type.
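Because the ingestion query repeats one pattern for every metric and resource type, it can be easier to generate than to type. The following Python sketch rebuilds an equivalent single-line query from two lists. The function names and the `window` parameter are hypothetical, and the generated text is meant to be pasted into the PromQL editor, not sent through any particular API.

```python
# Metric suffixes and resource types taken from the query in the procedure.
PROCESS_METRICS = [
    "cpu_time",
    "disk/read_bytes_count",
    "disk/write_bytes_count",
    "rss_usage",
    "vm_usage",
    "count_by_state",
    "fork_count",
]
RESOURCE_TYPES = ["gce_instance", "aws_ec2_instance"]


def per_metric_term(metric: str, resource: str) -> str:
    """Count the points written per minute for one metric on one resource type."""
    selector = (
        f'{{"agent.googleapis.com/processes/{metric}", '
        f'monitored_resource="{resource}"}}'
    )
    suffix = metric.replace("/", "_")  # label values in the query use underscores
    return (
        f"label_replace(sum(count_over_time({selector}[1m])), "
        f'"metric_suffix", "{suffix}", "", "")'
    )


def per_resource_term(resource: str) -> str:
    """Union the per-metric terms and tag the result with the resource type."""
    union = " or ".join(per_metric_term(m, resource) for m in PROCESS_METRICS)
    return f'label_replace({union}, "resource_type", "{resource}", "", "")'


def ingestion_query(window: str = "1d") -> str:
    """Sum the per-minute point counts over the given window, per resource type."""
    union = " or ".join(per_resource_term(r) for r in RESOURCE_TYPES)
    return f"sum_over_time(sum by (resource_type) ({union})[{window}:])"


if __name__ == "__main__":
    print(ingestion_query())
```

Generating the text from lists keeps the metric names in one place, so dropping a resource type you don't use is a one-line change.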

Estimating the cost of the metrics

The Monitoring pricing examples illustrate how you can estimate the cost of ingesting metrics. These examples can be applied to process metrics.

  • All of the process metrics are sampled every 60 seconds, and all of them write data points that are counted as eight bytes for pricing purposes.

  • Pricing for the process metrics is being set at 5% of the standard volume cost used in the pricing examples. Therefore, if you assume that all the metrics in the scenarios described in those examples are process metrics, you can then use 5% of the total cost for each scenario as an estimate of the cost of process metrics.
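The arithmetic behind such an estimate can be sketched as follows. The 60-second sampling period, the eight bytes per point, and the 5% factor come from the two points above; everything else is an assumption. In particular, `standard_price_per_mib` is a placeholder that you must take from the pricing page, the 30-day month is a simplification, and the function names are hypothetical.

```python
BYTES_PER_POINT = 8           # each point is counted as eight bytes
SAMPLE_PERIOD_SECONDS = 60    # process metrics are sampled every 60 seconds
PROCESS_METRIC_FACTOR = 0.05  # 5% of the standard volume cost


def monthly_points(time_series_count: int, days: int = 30) -> int:
    """Number of points written in a month by the given number of time series."""
    points_per_series = days * 24 * 60 * 60 // SAMPLE_PERIOD_SECONDS
    return time_series_count * points_per_series


def monthly_mebibytes(time_series_count: int, days: int = 30) -> float:
    """Ingested volume in MiB for pricing purposes."""
    return monthly_points(time_series_count, days) * BYTES_PER_POINT / 2**20


def estimated_cost(
    time_series_count: int, standard_price_per_mib: float, days: int = 30
) -> float:
    """Volume times the standard price (an input you supply) times the 5% factor."""
    volume = monthly_mebibytes(time_series_count, days)
    return volume * standard_price_per_mib * PROCESS_METRIC_FACTOR


if __name__ == "__main__":
    # 1,000 time series is an arbitrary example size.
    print(monthly_points(1000))
    print(round(monthly_mebibytes(1000), 2))
```

For 1,000 time series, this gives 43,200,000 points and about 329.59 MiB per 30-day month before any price is applied.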

Disable collection of process metrics

There are multiple ways you can disable the collection of these metrics by the Ops Agent (versions 2.0.0 and higher) and by the legacy Monitoring agent on Linux.

The agents run only on Compute Engine VMs; these procedures apply only to that platform.

You can't disable collection if you are running an Ops Agent version earlier than 2.0.0 or the legacy Monitoring agent on Windows. If you want to disable collection of these metrics on Windows, we recommend that you upgrade to the Ops Agent version 2.0.0 or higher. For more information, see Installing the Ops Agent.

The general procedure looks like this:

  1. Connect to the VM.

  2. Make a copy of the existing configuration file as a backup. Store the backup copy outside the agent's configuration directory, so the agent doesn't attempt to load both files. For example, the following command makes a copy of the configuration file for the Monitoring agent on Linux:

    cp /etc/stackdriver/collectd.conf BACKUP_DIR/collectd.conf.bak
  3. Change the configuration by using one of the options described in the following sections: Ops Agent on Linux or Windows, or Monitoring agent on Linux.

  4. Restart the agent, to pick up the new configuration:

    • Monitoring agent: sudo service stackdriver-agent restart
    • Ops Agent: sudo service google-cloud-ops-agent restart
  5. Verify that the process metrics are no longer being collected for this VM:

    1. In the Google Cloud console, go to the Metrics explorer page:

      Go to Metrics explorer

      If you use the search bar to find this page, then select the result whose subheading is Monitoring.

    2. In the toolbar of the query-builder pane, select the button whose name is either MQL or PromQL.

    3. Verify that PromQL is selected in the Language toggle. The language toggle is in the same toolbar that lets you format your query.

    4. For a gce_instance resource, enter the following query, replacing VM_NAME with the name of this VM:

      rate({"agent.googleapis.com/processes/cpu_time", monitored_resource="gce_instance", metadata_system_name="VM_NAME"}[1m])
    5. Click Run Query.
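If you repeat this check for several VMs, the VM_NAME substitution can be scripted. The following Python sketch fills in the query text only; it does not run the query. The function name is hypothetical, and the name check assumes the usual Compute Engine naming rule (lowercase letters, digits, and hyphens, up to 63 characters), which you should adjust if your names differ.

```python
import re

# Assumed naming rule for Compute Engine VM names; adjust if yours differ.
VM_NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,61}[a-z0-9])?")


def verification_query(vm_name: str) -> str:
    """Return the cpu_time rate query from the procedure for one VM."""
    if not VM_NAME_PATTERN.fullmatch(vm_name):
        # Rejecting unexpected characters also keeps quotes out of the query.
        raise ValueError(f"unexpected VM name: {vm_name!r}")
    return (
        'rate({"agent.googleapis.com/processes/cpu_time", '
        'monitored_resource="gce_instance", '
        f'metadata_system_name="{vm_name}"}}[1m])'
    )


if __name__ == "__main__":
    # "web-1" and "web-2" are placeholder VM names.
    for name in ["web-1", "web-2"]:
        print(verification_query(name))
```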

Ops Agent on Linux or Windows

Note: This procedure applies only to Ops Agent versions 2.0.0 and higher; you can't disable process metrics on earlier versions.

The location of the configuration file for the Ops Agent depends on the operating system:

  • For Linux: /etc/google-cloud-ops-agent/config.yaml
  • For Windows: C:\Program Files\Google\Cloud Operations\OpsAgent\config\config.yaml

To disable the collection of all process metrics by the Ops Agent, add the following to your config.yaml file:

    metrics:
      processors:
        metrics_filter:
          type: exclude_metrics
          metrics_pattern:
          - agent.googleapis.com/processes/*

This excludes process metrics from collection in the metrics_filter processor that applies to the default pipeline in the metrics service.

For more information about configuration options for the Ops Agent, see Configuring the Ops Agent.

Monitoring agent on Linux

You have the following options for disabling the collection of process metrics with the legacy Monitoring agent:

  • Modify the agent's configuration file
  • Replace the agent's configuration file

The following sections describe each option and list the benefits and risks associated with that option.

Modify the agent's configuration file

With this option, you directly edit the agent's main configuration file, /etc/stackdriver/collectd.conf, to remove the sections that enable the collection of the process metrics.

Procedure

There are three groups of deletions you need to make to the collectd.conf file:

  1. Delete the following LoadPlugin directive and plugin configuration:

    LoadPlugin processes
    <Plugin "processes">
      ProcessMatch "all" ".*"
      Detail "ps_cputime"
      Detail "ps_disk_octets"
      Detail "ps_rss"
      Detail "ps_vm"
    </Plugin>
  2. Delete the following PostCacheChain directive and the configuration of the PostCache chain:

    PostCacheChain "PostCache"
    <Chain "PostCache">
      <Rule "processes">
        <Match "regex">
          Plugin "^processes$"
          Type "^(ps_cputime|disk_octets|ps_rss|ps_vm)$"
        </Match>
        <Target "jump">
          Chain "MaybeThrottleProcesses"
        </Target>
        Target "stop"
      </Rule>
      <Rule "otherwise">
        <Match "throttle_metadata_keys">
          OKToThrottle false
          HighWaterMark 5700000000  # 950M * 6
          LowWaterMark 4800000000  # 800M * 6
        </Match>
        <Target "write">
          Plugin "write_gcm"
        </Target>
      </Rule>
    </Chain>
  3. Delete the MaybeThrottleProcesses chain used by the PostCache chain:

    <Chain "MaybeThrottleProcesses">
      <Rule "default">
        <Match "throttle_metadata_keys">
          OKToThrottle true
          TrackedMetadata "processes:pid"
          TrackedMetadata "processes:command"
          TrackedMetadata "processes:command_line"
          TrackedMetadata "processes:owner"
        </Match>
        <Target "write">
           Plugin "write_gcm"
        </Target>
      </Rule>
    </Chain>
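Before restarting the agent, it can help to confirm that none of the three groups is still present in the edited file. The following Python sketch is one way to do that check. It is a hypothetical helper, not a tool shipped with the agent; it looks only for the opening line of each group listed above and ignores lines that are commented out with `#`.

```python
import re

# One pattern per group of deletions; each matches only an uncommented line.
LEFTOVER_PATTERNS = {
    "LoadPlugin processes": re.compile(
        r'^[ \t]*LoadPlugin[ \t]+"?processes"?[ \t]*$', re.M
    ),
    "Plugin processes block": re.compile(
        r'^[ \t]*<Plugin[ \t]+"?processes"?[ \t]*>', re.M
    ),
    "PostCacheChain directive": re.compile(r"^[ \t]*PostCacheChain\b", re.M),
    "MaybeThrottleProcesses chain": re.compile(
        r'^[ \t]*<Chain[ \t]+"MaybeThrottleProcesses"[ \t]*>', re.M
    ),
}


def leftover_process_config(conf_text: str) -> list[str]:
    """Return the names of process-metric sections still present in the text."""
    return [
        name
        for name, pattern in LEFTOVER_PATTERNS.items()
        if pattern.search(conf_text)
    ]


if __name__ == "__main__":
    # Reading this path may require root privileges.
    with open("/etc/stackdriver/collectd.conf", encoding="utf-8") as conf:
        remaining = leftover_process_config(conf.read())
    print("Still present:", remaining or "nothing")
```

An empty result means only that these opening lines are gone; it doesn't validate the rest of the file.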
Benefits and risks
  • Benefits
    • You reduce the resources consumed by the agent, because the metrics are never collected.
    • If you have made other changes to your collectd.conf file, you might be able to easily preserve those changes.
  • Risks
    • You must use the root account to edit this configuration file.
    • You risk introducing typographical errors into the file.

Replace the agent's configuration file

With this option, you replace the agent's main configuration file with a pre-edited version that has the relevant sections removed for you.

Procedure
  1. Download the pre-edited file, collectd-no-process-metrics.conf, from the GitHub repository to the /tmp directory by running the following command:

    cd /tmp && curl -sSO https://raw.githubusercontent.com/Stackdriver/agent-packaging/master/collectd-no-process-metrics.conf
  2. Replace the existing collectd.conf file with the pre-edited file:

    cp /tmp/collectd-no-process-metrics.conf /etc/stackdriver/collectd.conf
Benefits and risks
  • Benefits
    • You reduce resources consumed by the agent because the metrics are never collected.
    • You don't have to manually edit the file as root.
    • Configuration-management tools can easily replace a file.
  • Risks
    • If you have made other changes to the collectd.conf file, you have to merge those changes into the replacement file.

Troubleshooting

The procedures described in this document are changes to the configuration of the agent, so the following problems are most likely:

  • Insufficient privilege to edit the configuration files. Configuration files must be edited from the root account.
  • Introduction of typographical errors into the configuration file, if you edit it directly.

For information on resolving other problems, see Troubleshooting the Monitoring agent.

Monitoring agent on Windows

You can't disable the collection of process metrics by the legacy Monitoring agent running on Windows VMs. This agent is not configurable. If you want to disable collection of these metrics on Windows, we recommend that you upgrade to the Ops Agent version 2.0.0 or higher. For more information, see Installing the Ops Agent.

If you are running the Ops Agent, see Ops Agent on Linux or Windows.


Last updated 2026-02-19 UTC.