Sample MQL queries

Announcement: Starting on October 22, 2024, Monitoring Query Language (MQL) will no longer be a recommended query language for Cloud Monitoring. Certain usability features will be disabled, but you can still run MQL queries in Metrics Explorer, and dashboards and alerting policies that use MQL will continue to work. For more information, see the deprecation notice for MQL.

This document introduces Monitoring Query Language (MQL) through examples. However, it doesn't attempt to cover all aspects of the language. MQL is comprehensively documented in the Monitoring Query Language reference.

For information on MQL-based alerting policies, see Alerting policies with MQL.

You can write a particular query in many forms; the language is flexible, and there are many shortcuts you can use after you are familiar with the syntax. For more information, see Strict-form queries.

Before you begin

To access the code editor when using Metrics Explorer, do the following:

  1. In the Google Cloud console, go to the Metrics explorer page:

    Go to Metrics explorer

    If you use the search bar to find this page, then select the result whose subheading is Monitoring.

  2. In the toolbar of the query-builder pane, select the button whose name is either MQL or PromQL.
  3. Verify that MQL is selected in the Language toggle. The language toggle is in the same toolbar that lets you format your query.

To run a query, paste the query into the editor and click Run Query. For an introduction to this editor, see Use the code editor for MQL.

Some familiarity with Cloud Monitoring concepts, including metric types, monitored-resource types, and time series, is helpful. For an introduction to these concepts, see Metrics, time series, and resources.

Data model

MQL queries retrieve and manipulate data in the Cloud Monitoring time-series database. This section introduces some of the concepts and terminology related to that database. For detailed information, see the reference topic Data model.

Every time series originates from a single type of monitored resource, and every time series collects data of one metric type. A monitored-resource descriptor defines a monitored-resource type. Similarly, a metric descriptor defines a metric type. For example, the resource type might be gce_instance, a Compute Engine virtual machine (VM), and the metric type might be compute.googleapis.com/instance/cpu/utilization, the CPU utilization of the Compute Engine VM.

These descriptors also specify a set of labels that are used to collect information about other attributes of the metric or resource type. For example, resources typically have a zone label, used to record the geographic location of the resource.

One time series is created for each combination of values for the labels from the pair of a metric descriptor and a monitored-resource descriptor.

You can find the available labels for resource types in the Monitored resource list; for example, gce_instance. To find the labels for metric types, see the Metrics list; for example, see metrics from Compute Engine.

The Cloud Monitoring database stores the time series from a particular metric and resource type in one table. The metric and resource type act as the identifier for the table. This MQL query fetches the table of time series recording CPU utilization for Compute Engine instances:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization

There is one time series in the table for each unique combination of metric and resource label values.

MQL queries retrieve time-series data from these tables and transform it into output tables. These output tables can be passed into other operations. For example, you can isolate the time series written by resources in a particular zone or set of zones by passing the retrieved table as input to a filter operation:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| filter zone =~ 'us-central.*'

The preceding query results in a table that contains only the time series from resources in a zone that begins with us-central.

MQL queries are structured to pass the output of one operation as the input to the next operation. This table-based approach lets you link operations together to manipulate this data by filtering, selection, and other familiar database operations like inner and outer joins. You can also run various functions on the data in the time series as the data is passed from one operation to another.

The operations and functions available in MQL are fully documented in the Monitoring Query Language reference.

Query structure

A query is made up of one or more operations. Operations are linked, or piped, together so that the output of one operation is the input to the next. Therefore, the result of a query depends on the order of the operations. Some of the things you can do include the following:

  • Start a query with a fetch or other selection operation.
  • Build up a query with multiple operations piped together.
  • Select a subset of information with filter operations.
  • Aggregate related information with group_by operations.
  • Look at outliers with top and bottom operations.
  • Combine multiple queries with { ; } and join operations.
  • Use the value operation and functions to compute ratios and other values.

Not all queries use all of these options.
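For example, the following illustrative sketch combines operations that are described individually later in this document: it fetches CPU utilization for Compute Engine instances, keeps only the us-central zones, averages the time series in each zone, and then selects the three zones with the highest values:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| filter zone =~ 'us-central.*'
| group_by [zone], mean(val())
| top 3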

These examples introduce only some of the available operations and functions. For detailed information about the structure of MQL queries, see the reference topic Query Structure.

These examples don't specify two things that you might expect to see: time ranges and alignment. The following sections explain why.

Time ranges

When you use the code editor, the chart settings define the time range for queries. By default, the chart's time range is set to one hour.

To change the time range of the chart, use the time-range selector. For example, if you want to view the data for the past week, then select Last 1 week from the time-range selector. You can also specify a start and end time, or specify a time to view around.

For more information about time ranges in the code editor, see Time ranges, charts, and the code editor.

Alignment

Many of the operations used in these examples, like the join and group_by operations, depend on all the time series points in a table occurring at regular intervals. The act of making all the points line up at regular timestamps is called alignment. Usually, alignment is done implicitly, and none of the examples here show it.

MQL automatically aligns tables for join and group_by operations when needed, but MQL lets you do alignment explicitly as well.

Fetch and filter data

MQL queries start with the retrieval and selection or filtering of data. This section illustrates some basic retrieval and filtering with MQL.

Retrieve time-series data

A query always starts with a fetch operation, which retrieves time series from Cloud Monitoring.

The simplest query consists of a single fetch operation and an argument that identifies the time series to fetch, such as the following:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization

The argument consists of a monitored-resource type, gce_instance, a pair of colon characters, ::, and a metric type, compute.googleapis.com/instance/cpu/utilization.

This query retrieves the time series written by Compute Engine instances for the metric type compute.googleapis.com/instance/cpu/utilization, which records the CPU utilization of those instances.

If you run the query from the code editor in Metrics Explorer, you get a chart showing each of the requested time series:

Chart shows CPU utilization data for Compute Engine instances.

Each of the requested time series is displayed as a line on the chart. Each time series includes a list of time-stamped values from the CPU-utilization metric for one VM instance in this project.

In the backend storage used by Cloud Monitoring, time series are stored in tables. The fetch operation organizes the time series for the specified monitored-resource and metric types into a table, and then it returns the table. The returned data is displayed in the chart.

The fetch operation is described, along with its arguments, on the fetch reference page. For more information on the data produced by operations, see the reference pages for time series and tables.

Filter operations

Queries typically consist of a combination of multiple operations. The simplest combination is to pipe the output of one operation into the input of the next by using the pipe operator, |. The following example illustrates using a pipe to input the table to a filter operation:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| filter instance_name =~ 'gke.*'

This query pipes the table, returned by the fetch operation shown in the previous example, into a filter operation, which takes an expression that evaluates to a boolean value. In this example, the expression means "instance_name starts with gke".

The filter operation takes the input table, removes the time series for which the filter is false, and outputs the resulting table. The following screenshot shows the resulting chart:

Chart shows results filtered for gke.

If you don't have any instance names that start with gke, change the filter before trying this query. For example, if you have VM instances with apache at the beginning of their names, use the following filter:

 | filter instance_name =~ 'apache.*'

The filter expression is evaluated once for each input time series. If the expression evaluates to true, that time series is included in the output. In this example, the filter expression does a regular expression match, =~, on the instance_name label of each time series. If the value of the label matches the regular expression 'gke.*', then the time series is included in the output. If not, the time series is dropped from the output.

For more information on filtering, see the filter reference page. The filter predicate can be any arbitrary expression that returns a boolean value; for more information, see Expressions.
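For example, a predicate can combine multiple tests. The following sketch keeps only the time series whose zone value begins with us-central and whose instance_name value begins with gke:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| filter zone =~ 'us-central.*' && instance_name =~ 'gke.*'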

Group and aggregate

Grouping lets you group time series along specific dimensions. Aggregation combines all the time series in a group into one output time series.

The following query filters the output of the initial fetch operation to retain only those time series from resources in a zone that begins with us-central. It then groups the time series by zone and combines them using mean aggregation.

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| filter zone =~ 'us-central.*'
| group_by [zone], mean(val())

The table resulting from the group_by operation has one time series per zone. The following screenshot shows the resulting chart:

Chart shows a filtered fetch grouped by zone.

The group_by operation takes two arguments, separated by a comma, ,. These arguments determine the precise grouping behavior. In this example, group_by [zone], mean(val()), the arguments act as follows:

  • The first argument, [zone], is a map expression that determines the grouping of the time series. In this example, it specifies the labels to use for grouping. The grouping step collects all input time series that have the same output zone values into one group. In this example, the expression collects the time series from the Compute Engine VMs in one zone.

    The output time series has only a zone label, with the value copied from the input time series in the group. Other labels on the input time series are dropped from the output time series.

    The map expression can do much more than list labels; for more information, see the map reference page.

  • The second argument, mean(val()), determines how the time series in each group are combined, or aggregated, into one output time series. Each point in the output time series for a group is the result of aggregating the points with the same timestamp from all input time series in the group.

    The aggregation function, mean in this example, determines the aggregated value. The val() function returns the points to be aggregated, and the aggregation function is applied to those points. In this example, you get the mean of the CPU utilization of the virtual machines in the zone at each output time point.

    The expression mean(val()) is an example of an aggregating expression.

The group_by operation always combines grouping and aggregation. If you specify a grouping but omit the aggregation argument, group_by uses a default aggregation, aggregate(val()), which selects an appropriate function for the data type. See aggregate for the list of default aggregation functions.
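For example, the following query groups by zone but omits the aggregation argument, so the default aggregation is used; it is equivalent to writing group_by [zone], aggregate(val()):

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| group_by [zone]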

Use group_by with a log-based metric

Suppose you have created a distribution log-based metric to extract the number of data points processed from a set of log entries that include strings like the following:

... entry ID 1 ... Processed data points 1000 ...
... entry ID 2 ... Processed data points 1500 ...
... entry ID 3 ... Processed data points 1000 ...
... entry ID 4 ... Processed data points 500 ...

To create a time series that shows the count of all processed data points, use an MQL query such as the following:

fetch global
| metric 'logging.googleapis.com/user/METRIC_NAME'
| group_by [], sum(sum_from(value))

To create a log-based distribution metric, see Configure distribution metrics.

Exclude columns from a group

You can use the drop modifier in a mapping to exclude columns from a group. For example, the Kubernetes core_usage_time metric has six columns:

fetch k8s_container :: kubernetes.io/container/cpu/core_usage_time
| group_by [project_id, location, cluster_name, namespace_name, pod_name, container_name]

If you don't need to group by pod_name, then you can exclude it with drop:

fetch k8s_container :: kubernetes.io/container/cpu/core_usage_time
| group_by drop [pod_name]

Select time series

The examples in this section illustrate ways to select particular timeseries out of an input table.

Select top or bottom time series

To see the time series data for the three Compute Engine instances with the highest CPU utilization within your project, enter the following query:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| top 3

The following screenshot shows the result from one project:

Chart shows the 3 highest-utilization time series.

You can retrieve the time series with the lowest CPU utilization by replacing top with bottom.

The top operation outputs a table with a specified number of time series selected from its input table. The time series included in the output have the largest value for some aspect of the time series.

Since this query doesn't specify a way to order the time series, it returns those time series with the largest value for the most recent point. To specify how to determine which time series have the largest value, you can provide an argument to the top operation. For example, the previous query is equivalent to the following query:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| top 3, val()

The val() expression selects the value of the most recent point in each time series it is applied to. Therefore, the query returns those time series with the largest value for the most recent point.

You can provide an expression that does aggregation over some or all points in a time series to give the sorting value. The following query takes the mean of all points within the last 10 minutes:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| top 3, mean(val()).within(10m)

If the within function isn't used, the mean function is applied to the values of all the displayed points in the time series.
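For example, the following query sorts the time series by the mean over all the displayed points instead of over the last 10 minutes:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| top 3, mean(val())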

The bottom operation works similarly. The following query finds the value of the largest point in each time series with max(val()) and then selects the three time series for which that value is smallest:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| bottom 3, max(val())

The following screenshot shows a chart displaying the streams with the smallest spikes:

Chart shows the 3 time series with the smallest spikes.

Exclude the top or bottom n results in the time series

Consider a scenario in which you have many Compute Engine VM instances. A few of these instances consume a lot more memory than most instances, and these outliers are making it harder to see the usage patterns in the larger group. Your CPU utilization charts look like the following:

Chart shows many CPU utilization lines, with several outliers.

You want to exclude the three outliers from the chart so that you can see the patterns in the larger group more clearly.

To exclude the top three time series in a query that retrieves the time series for Compute Engine CPU utilization, use the top table operation to identify the time series and the outer_join table operation to exclude the identified time series from the results. You can use the following query:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| { top 3 | value [is_default_value: false()]
  ; ident }
| outer_join true(), _
| filter is_default_value
| value drop [is_default_value]

The fetch operation returns a table of time series for CPU utilization from all instances. This table is then processed into two resulting tables:

  • The top n table operation outputs a table that contains the n time series with the highest values. In this case, n = 3. The resulting table contains the three time series to be excluded.

    The table containing the top three time series is then piped into a value table operation. This operation adds another column to each of the time series in the top-three table. This column, is_default_value, is given the boolean value false for all time series in the top-three table.

  • The ident operation returns the same table that was piped into it: the original table of CPU utilization time series. None of the time series in this table have the is_default_value column.

The top-three table and the original table are then piped into the outer_join table operation. The top-three table is the left table in the join, and the fetched table is the right table in the join. The outer join is set up to provide the value true as the value for any field that doesn't exist in a row being joined. The result of the outer join is a merged table, with the rows from the top-three table keeping the column is_default_value with the value false, and all the rows from the original table that weren't also in the top-three table getting the is_default_value column with the value true.

The table resulting from the join is then passed to the filter table operation, which filters out the rows that have a value of false in the is_default_value column. The resulting table contains the rows from the originally fetched table without the rows from the top-three table. This table contains the intended set of time series, with the added is_default_value column.

The final step is to drop the is_default_value column that was added by the join, so the output table has the same columns as the originally fetched table.

The following screenshot shows the chart for the prior query:

Chart shows many CPU utilization lines, with the outliers excluded.

You can create a query to exclude the time series with the lowest CPU utilization by replacing top n with bottom n.
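For example, the following sketch is the same exclusion query with bottom substituted for top; it removes the three time series with the lowest CPU utilization:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| { bottom 3 | value [is_default_value: false()]
  ; ident }
| outer_join true(), _
| filter is_default_value
| value drop [is_default_value]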

The ability to exclude outliers can be useful in cases where you want to set an alert but don't want the outliers to constantly trigger the alert. The following alert query uses the same exclusion logic as the prior query to monitor the CPU limit utilization by a set of Kubernetes pods after excluding the top two pods:

fetch k8s_container
| metric 'kubernetes.io/container/cpu/limit_utilization'
| filter (resource.cluster_name == 'CLUSTER_NAME'
          && resource.namespace_name == 'NAMESPACE_NAME'
          && resource.pod_name =~ 'POD_NAME')
| group_by 1m, [value_limit_utilization_max: max(value.limit_utilization)]
| { top 2 | value [is_default_value: false()]
  ; ident }
| outer_join true(), _
| filter is_default_value
| value drop [is_default_value]
| every 1m
| condition val(0) > 0.73 '1'

Select top or bottom from groups

The top and bottom table operations select time series from the entire input table. The top_by and bottom_by operations group the time series in a table and then pick some number of time series from each group.

The following query selects the time series in each zone with the largest peak value:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| top_by [zone], 1, max(val())

Chart shows largest peak by zone.

The [zone] expression indicates that a group consists of the time series with the same value of the zone column. The 1 in the top_by indicates how many time series to select from each zone's group. The max(val()) expression looks for the largest value in the chart's time range in each time series.

You can use any aggregation function in place of max. For example, the following uses the mean aggregator and uses within to specify the sorting range of 20 minutes. It selects the top 2 time series in each zone:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| top_by [zone], 2, mean(val()).within(20m)

Chart shows the 2 largest mean peaks by zone within 20 minutes.

In the previous example, there is only one instance in the zone us-central-c, so there is only one time series returned; there isn't a "top 2" in the group.

Combine selections with union

You can combine selection operations like top and bottom to create charts that show both. For example, the following query returns the single time series with the maximum value and the single time series with the minimum value:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| { top 1, max(val())
  ; bottom 1, min(val()) }
| union
Note: The indentation and alignment of the query components are for readability. MQL isn't a space-dependent language.

The resulting chart shows two lines, the one containing the highest value and the one containing the lowest:

Chart shows the time series with the highest and lowest values.

You can use braces, { }, to specify sequences of operations, each of which yields one table of time series as output. The individual operations are separated by a semicolon, ;.

In this example, the fetch operation returns a single table, which is piped to each of the two operations in the sequence, a top operation and a bottom operation. Each of these operations results in an output table based on the same input table. The union operation then combines the two tables into one, which is displayed on the chart.

See more about sequencing operations by using { } in the reference topic Query Structure.

Combine time series with different values for one label

Suppose that you have multiple time series for the same metric type and you want to combine a few of them together. If you want to select them based on the values of a single label, then you can't create the query by using the query-builder interface in Metrics Explorer. You need to filter on two or more different values of the same label, but the query-builder interface requires that a time series match all the filters to be selected: the label matching is an AND test. No time series can have two different values for the same label, and you can't create an OR test for filters in the query builder.

The following query retrieves the time series for the Compute Engine instance/disk/max_read_ops_count metric for two specific Compute Engine instances and aligns the output over 1-minute intervals:

fetch gce_instance
| metric 'compute.googleapis.com/instance/disk/max_read_ops_count'
| filter (resource.instance_id == '1854776029354445619' ||
          resource.instance_id == '3124475757702255230')
| every 1m

The following chart shows a result of this query:

Chart shows two time series selected by value of the same label.

If you want to find the sum of the maximum max_read_ops_count values for these two VMs, you can do the following:

  • Find the maximum value for each time series by using the group_by table operator, specifying the same 1-minute alignment period and aggregating over the period with the max aggregator to create a column named max_val_of_read_ops_count_max in the output table.
  • Find the sum of the time series by using the group_by table operator and the sum aggregator on the max_val_of_read_ops_count_max column.

The following shows the query:

fetch gce_instance
| metric 'compute.googleapis.com/instance/disk/max_read_ops_count'
| filter (resource.instance_id == '1854776029354445619' ||
          resource.instance_id == '3124475757702255230')
| group_by 1m, [max_val_of_read_ops_count_max: max(value.max_read_ops_count)]
| every 1m
| group_by [], [summed_value: sum(max_val_of_read_ops_count_max)]

The following chart shows a result of this query:

Chart shows the sum of two time series selected by value of the same label.

Compute percentile statistics across time and across streams

To compute a percentile stream value over a sliding window separately for each stream, use a temporal group_by operation. For example, the following query computes the 99th percentile value of a stream over a 1-hour sliding window:

fetch gce_instance :: compute.googleapis.com/instance/cpu/utilization
| group_by 1h, percentile(val(), 99)
| every 1m

To compute the same percentile statistic at a point in time across streams, rather than across time within one stream, use a spatial group_by operation:

fetch gce_instance :: compute.googleapis.com/instance/cpu/utilization
| group_by [], percentile(val(), 99)

Compute ratios

Suppose you've built a distributed web service that runs on Compute Engine VM instances and uses Cloud Load Balancing.

You want to see a chart that displays the ratio of requests that return HTTP 500 responses (internal errors) to the total number of requests; that is, the request-failure ratio. This section illustrates several ways to compute the request-failure ratio.

Cloud Load Balancing uses the monitored-resource type https_lb_rule. The https_lb_rule monitored-resource type has a matched_url_path_rule label that records the prefix of URLs defined for the rule; the default value is UNMATCHED.

The loadbalancing.googleapis.com/https/request_count metric type has a response_code_class label. This label captures the class of response codes.

Use outer_join and div

The following query determines the count of 500 responses for each value of the matched_url_path_rule label in each https_lb_rule monitored resource in your project. It then joins this failure-count table with the original table, which contains all response counts, and divides the values to show the ratio of failure responses to total responses:

fetch https_lb_rule::loadbalancing.googleapis.com/https/request_count
| { filter response_code_class = 500
  ; ident }
| group_by [matched_url_path_rule]
| outer_join 0
| div

The following chart shows the result from one project:

Chart shows the request failure-to-total ratio by joining.

The shaded areas around the lines on the chart are min/max bands; for more information, see Min/max bands.

The fetch operation outputs a table of time series containing counts of requests for all load-balanced queries. This table is processed in two ways by the two operation sequences in the braces:

  • filter response_code_class = 500 outputs only the time series that have a response_code_class label with the value 500. The resulting time series counts the requests with HTTP 5xx (error) response codes.

    This table is the numerator of the ratio.

  • The ident, or identity, operation outputs its input, so this operation returns the originally fetched table. That's the table that contains time series with counts for every response code.

    This table is the denominator of the ratio.

The numerator and denominator tables, produced by the filter and ident operations respectively, are processed separately by the group_by operation. The group_by operation groups the time series in each table by the value of the matched_url_path_rule label and sums the counts for each value of the label. This group_by operation doesn't explicitly state the aggregator function, so a default, sum, is used.

  • For the filtered table, the group_by result is the number of requests returning a 500 response for each matched_url_path_rule value.

  • For the identity table, the group_by result is the total number of requests for each matched_url_path_rule value.

These tables are piped to the outer_join operation, which pairs time series with matching label values, one from each of the two input tables. The paired time series are zipped up by matching the timestamp of each point in one time series to the timestamp of a point in the other time series. For each matched pair of points, outer_join produces a single output point with two values, one from each of the input tables. The zipped-up time series is output by the join with the same labels as the two input time series.

With an outer join, if a point from the second table doesn't have a matching point in the first, a stand-in value must be provided. In this example, a point with the value 0, the argument to the outer_join operation, is used.

Note: Why use 0 as the value? If a monitored resource has never responded with a 500 response, then there is no count value for 500 responses. The stand-in value provides a value of 0 for the count.

Finally, the div operation takes each point with two values and divides the values to produce a single output point: the ratio of 500 responses to all responses for each URL map.

The string div here is actually the name of the div function, which divides two numeric values. But it's used here as an operation. When used as operations, functions like div expect two values in each input point (which this join ensures) and produce a single value for the corresponding output point.

The | div part of the query is a shortcut for | value val(0) / val(1). The value operation allows arbitrary expressions on the value columns of an input table to produce the value columns of the output table. For more information, see the reference pages for the value operation and for expressions.
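For example, writing the earlier join query with an explicit value operation instead of the div shortcut produces the same result:

fetch https_lb_rule::loadbalancing.googleapis.com/https/request_count
| { filter response_code_class = 500
  ; ident }
| group_by [matched_url_path_rule]
| outer_join 0
| value val(0) / val(1)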

Use ratio

The div function could be replaced with any function on two values, but because ratios are so frequently used, MQL provides a ratio table operation that computes ratios directly.

The following query is equivalent to the preceding version, using outer_join and div:

fetch https_lb_rule::loadbalancing.googleapis.com/https/request_count
| { filter response_code_class = 500
  ; ident }
| group_by [matched_url_path_rule]
| ratio

In this version, the ratio operation replaces the outer_join 0 | div operations in the earlier version and produces the same result.

Note that ratio only uses outer_join to supply a 0 for the numerator if both the numerator and denominator inputs have the same labels identifying each time series, which MQL outer_join requires. If the numerator input has extra labels, then there will be no output for any point missing in the denominator.

Use group_by and /

There's yet another way to compute the ratio of error responses to all responses. In this case, because the numerator and denominator for the ratio are derived from the same time series, you can also compute the ratio by grouping alone. The following query shows this approach:

fetch https_lb_rule::loadbalancing.googleapis.com/https/request_count
| group_by [matched_url_path_rule],
    sum(if(response_code_class = 500, val(), 0)) / sum(val())

This query uses an aggregation expression built on the ratio of two sums:

  • The first sum uses the if function to return each point's value when the response_code_class label has the value 500, and 0 otherwise. The sum function then computes the count of the requests that returned 500.

  • The second sum adds up the counts for all requests, val().

The two sums are then divided, resulting in the ratio of 500 responses to all responses. This query produces the same result as the queries in Use outer_join and div and Use ratio.

Use filter_ratio_by

Because ratios are frequently computed by dividing two sums derived from the same table, MQL provides the filter_ratio_by operation for this purpose. The following query does the same thing as the preceding version, which explicitly divides the sums:

fetch https_lb_rule::loadbalancing.googleapis.com/https/request_count
| filter_ratio_by [matched_url_path_rule], response_code_class = 500

The first operand of the filter_ratio_by operation, here [matched_url_path_rule], indicates how to group the responses. The second operand, here response_code_class = 500, acts as a filtering expression for the numerator.

  • The denominator table is the result of grouping the fetched table by matched_url_path_rule and aggregating by using sum.
  • The numerator table is the fetched table, filtered for time series with an HTTP response code of 5xx, and then grouped by matched_url_path_rule and aggregated by using sum.

Ratios and quota metrics

To set up queries and alerts on serviceruntime quota metrics and resource-specific quota metrics to monitor your quota consumption, you can use MQL. For more information, including examples, see Using quota metrics.

Arithmetic computation

Sometimes you might want to perform an arithmetic operation on data before you chart it. For example, you might want to scale time series, convert the data to log scale, or chart the sum of two time series. For a list of arithmetic functions available in MQL, see Arithmetic.

To scale a time series, use the mul function. For example, the following query retrieves the time series and then multiplies each value by 10:

fetch gce_instance
| metric 'compute.googleapis.com/instance/disk/read_bytes_count'
| mul(10)

To sum two time series, configure your query to fetch two tables of time series, join those results, and then call the add function. The following example illustrates a query that computes the sum of the number of bytes read from, and written to, Compute Engine instances:

fetch gce_instance
| { metric 'compute.googleapis.com/instance/disk/read_bytes_count'
  ; metric 'compute.googleapis.com/instance/disk/write_bytes_count' }
| outer_join 0
| add

To subtract the written byte counts from the read byte counts, replace add with sub in the previous expression.
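For example, the following query subtracts the written byte counts from the read byte counts:

fetch gce_instance
| { metric 'compute.googleapis.com/instance/disk/read_bytes_count'
  ; metric 'compute.googleapis.com/instance/disk/write_bytes_count' }
| outer_join 0
| sub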

MQL uses the labels in the sets of tables returned from the first and second fetch to determine how to join the tables:

  • If the first table contains a label not found in the second table, then MQL can't perform an outer_join operation on the tables, and therefore it reports an error. For example, the following query causes an error because the metric.instance_name label is present in the first table but not in the second table:

     fetch gce_instance
     | { metric 'compute.googleapis.com/instance/disk/write_bytes_count'
       ; metric 'compute.googleapis.com/instance/disk/max_write_bytes_count' }
     | outer_join 0
     | add

    One way to resolve this type of error is to apply grouping clauses to ensure the two tables have the same labels. For example, you can group away all the time-series labels:

     fetch gce_instance
     | { metric 'compute.googleapis.com/instance/disk/write_bytes_count'
         | group_by []
       ; metric 'compute.googleapis.com/instance/disk/max_write_bytes_count'
         | group_by [] }
     | outer_join 0
     | add
  • If the labels of the two tables match, or if the second table contains a label not found in the first table, then the outer join is allowed. For example, the following query doesn't cause an error even though the metric.instance_name label is present in the second table, but not the first:

     fetch gce_instance
     | { metric 'compute.googleapis.com/instance/disk/max_write_bytes_count'
       ; metric 'compute.googleapis.com/instance/disk/write_bytes_count' }
     | outer_join 0
     | sub

    A time series found in the first table might have label values that match multiple time series in the second table, so MQL performs the subtraction operation for each pairing.

Time shifting

Sometimes you want to compare what is going on now with what has happened in the past. To let you compare past data to current data, MQL provides the time_shift table operation to move data from the past into the current time period.

Over-time ratios

The following query uses time_shift, join, and div to compute the ratio of the mean utilization in each zone between now and one week ago.

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| group_by [zone], mean(val())
| { ident
  ; time_shift 1w }
| join
| div

The following chart shows a possible result of this query:

Chart shows the ratio of current and time-shifted data.

The first two operations fetch the time series and then group them, by zone, computing the mean values for each. The resulting table is then passed to two operations. The first operation, ident, passes the table through unchanged.

The second operation, time_shift, adds the period (1 week) to the timestamps for values in the table, which shifts data from one week ago forward. This change makes the timestamps for older data in the second table line up with the timestamps for the current data in the first table.

The unchanged table and the time-shifted table are then combined by using an inner join. The join produces a table of time series where each point has two values: the current utilization and the utilization a week ago. The query then uses the div operation to compute the ratio of the current value to the week-old value.

Past and present data

By combining time_shift with union, you can create a chart that shows past and present data simultaneously. For example, the following query returns the overall mean utilization now and from a week ago. Using union, you can display these two results on the same chart.

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| group_by []
| { add [when: "now"]
  ; add [when: "then"] | time_shift 1w }
| union

The following chart shows a possible result of this query:

Chart shows the current and past mean utilization.

This query fetches the time series and then uses group_by [] to combine them into a single time series with no labels, leaving the CPU-utilization data points. This result is passed to two operations. The first adds a column for a new label called when with the value now. The second adds a label called when with the value then and passes the result to the time_shift operation to shift the values by a week. This query uses the add map modifier; see Maps for more information.

The two tables, each containing data for a single time series, are passed to union, which produces one table containing the time series from both input tables.

What's next

For an overview of MQL language structures, see About the MQL language.

For a complete description of MQL, see the Monitoring Query Language reference.

For information about interacting with charts, see Working with charts.
