Kube Metrics Adapter
Kube Metrics Adapter is a general purpose metrics adapter for Kubernetes that can collect and serve custom and external metrics for Horizontal Pod Autoscaling.

It supports scaling based on Prometheus metrics, SQS queues and others out of the box.

It discovers Horizontal Pod Autoscaling resources and starts to collect the requested metrics and stores them in memory. It's implemented using the custom-metrics-apiserver library.

Here's an example of a `HorizontalPodAutoscaler` resource configured to get `requests-per-second` metrics from each pod of the deployment `myapp`:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
    metric-config.pods.requests-per-second.json-path/json-key: "$.http_server.rps"
    metric-config.pods.requests-per-second.json-path/path: /metrics
    metric-config.pods.requests-per-second.json-path/port: "9090"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: requests-per-second
      target:
        averageValue: 1k
        type: AverageValue
```
The `metric-config.*` annotations are used by the `kube-metrics-adapter` to configure a collector for getting the metrics. In the above example it configures a json-path pod collector.
Like the support policy offered for Kubernetes, this project aims to support the latest three minor releases of Kubernetes.
The default supported API is `autoscaling/v2` (available since `v1.23`). This API MUST be available in the cluster, which it is by default.
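A quick way to verify this is a standard `kubectl` check (not specific to this project):

```sh
# List the autoscaling API versions served by the cluster;
# autoscaling/v2 should appear in the output.
kubectl api-versions | grep autoscaling
```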
This project uses Go modules as introduced in Go 1.11, therefore you need Go >= 1.11 installed in order to build. If using Go 1.11 you also need to activate Module support.

Assuming Go has been set up with module support it can be built simply by running:
```sh
export GO111MODULE=on # needed if the project is checked out in your $GOPATH.
$ make
```
Clone this repository, and run as below:
```sh
$ cd kube-metrics-adapter/docs
$ kubectl apply -f .
```
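Once the adapter is running, one common way to verify that it registered itself with the Kubernetes API aggregation layer is to query the metrics API groups directly (a generic sketch; the exact group versions served may differ depending on the adapter release):

```sh
# The adapter should serve the external and custom metrics API groups.
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta2"
```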
Collectors are different implementations for getting metrics requested by an HPA resource. They are configured based on HPA resources and started on-demand by the `kube-metrics-adapter` to only collect the metrics required for scaling the application.

The collectors are configured either simply based on the metrics defined in an HPA resource, or via additional annotations on the HPA resource.
The pod collector allows collecting metrics from each pod matching the label selector defined in the HPA's `scaleTargetRef`. Currently only `json-path` collection is supported.

The Pod Collector utilizes the `scaleTargetRef` specified in an HPA resource to obtain the label selector from the referenced Kubernetes object. This enables the identification and management of pods associated with that object. Currently, the supported Kubernetes objects for this operation are: `Deployment`, `StatefulSet` and `Rollout`.
| Metric | Description | Type | K8s Versions |
|--------|-------------|------|--------------|
| `custom` | No predefined metrics. Metrics are generated from user defined queries. | Pods | `>=1.12` |
This is an example of using the pod collector to collect metrics from a JSON metrics endpoint of each pod matched by the HPA.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
    metric-config.pods.requests-per-second.json-path/json-key: "$.http_server.rps"
    metric-config.pods.requests-per-second.json-path/json-eval: "ceil($['active processes'] / $['total processes'] * 100)" # cannot use both json-eval and json-key
    metric-config.pods.requests-per-second.json-path/path: /metrics
    metric-config.pods.requests-per-second.json-path/port: "9090"
    metric-config.pods.requests-per-second.json-path/scheme: "https"
    metric-config.pods.requests-per-second.json-path/aggregator: "max"
    metric-config.pods.requests-per-second.json-path/interval: "60s" # optional
    metric-config.pods.requests-per-second.json-path/min-pod-ready-age: "30s" # optional
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: requests-per-second
      target:
        averageValue: 1k
        type: AverageValue
```
The pod collector is configured through the annotations which specify the collector name `json-path` and a set of configuration options for the collector. `json-key` defines the json-path query for extracting the right metric. This assumes the pod is exposing metrics in JSON format. For the above example the following JSON data would be expected:

```json
{
  "http_server": {
    "rps": 0.5
  }
}
```

The json-path query support depends on the github.com/spyzhov/ajson library. See its README for possible queries. It's expected that the metric you query returns something that can be turned into a `float64`.
The `json-eval` configuration option allows more complex calculations to be performed on the extracted metric. The `json-eval` expression is evaluated using ajson's script engine.
The other configuration options `path`, `port` and `scheme` specify where the metrics endpoint is exposed on the pod. The `path` and `port` options do not have default values so they must be defined. The `scheme` is optional and defaults to `http`.
The `aggregator` configuration option specifies the aggregation function used to aggregate values of JSONPath expressions that evaluate to arrays/slices of numbers. It's optional, but when the expression evaluates to an array/slice, its absence will produce an error. The supported aggregation functions are `avg`, `max`, `min` and `sum`.
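For example, assuming a hypothetical metrics payload like the one below, a `json-key` of `$.http_server.rps_per_route[*]` evaluates to an array, so an aggregator is required; with `aggregator: "max"` the reported metric would be `1.2`:

```json
{
  "http_server": {
    "rps_per_route": [0.5, 1.2, 0.7]
  }
}
```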
The `raw-query` configuration option specifies the query params to send along to the endpoint:

```yaml
metric-config.pods.requests-per-second.json-path/path: /metrics
metric-config.pods.requests-per-second.json-path/port: "9090"
metric-config.pods.requests-per-second.json-path/raw-query: "foo=bar&baz=bop"
```
will create a URL like this:
```
http://<podIP>:9090/metrics?foo=bar&baz=bop
```
There are also configuration options for custom (connect and request) timeouts when querying pods for metrics:
```yaml
metric-config.pods.requests-per-second.json-path/request-timeout: 2s
metric-config.pods.requests-per-second.json-path/connect-timeout: 500ms
```
The default for both of the above values is 15 seconds.
The `min-pod-ready-age` configuration option instructs the service to start collecting metrics from the pods only if they are "older" (time elapsed after pod reached "Ready" state) than the specified amount of time. This is handy when pods need to warm up before HPAs will start tracking their metrics. The default value is 0 seconds.
The Prometheus collector is a generic collector which can map Prometheus queries to metrics that can be used for scaling. This approach is different from how it's done in the k8s-prometheus-adapter, where all available Prometheus metrics are collected and transformed into metrics which the HPA can scale on, and there is no possibility to do custom queries. With the approach implemented here, users can define custom queries and only metrics returned from those queries will be available, reducing the total number of metrics stored.

One downside of this approach is that badly performing queries can slow down or kill Prometheus, so it can be dangerous to allow in a multi-tenant cluster. It's also not possible to restrict the available metrics using something like RBAC since any user would be able to create the metrics based on a custom query.

I still believe custom queries are more useful, but it's good to be aware of the trade-offs between the two approaches.
| Metric | Description | Type | Kind | K8s Versions |
|--------|-------------|------|------|--------------|
| `prometheus-query` | Generic metric which requires a user defined query. | External | | `>=1.12` |
| `custom` | No predefined metrics. Metrics are generated from user defined queries. | Object | any | `>=1.12` |
This is an example of an HPA configured to get metrics based on a Prometheus query. The query is defined in the annotation `metric-config.external.processed-events-per-second.prometheus/query` where `processed-events-per-second` is the query name which will be associated with the result of the query. This allows having multiple Prometheus queries associated with a single HPA.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # This annotation is optional.
    # If specified, then this prometheus server is used,
    # instead of the prometheus server specified as the CLI argument `--prometheus-server`.
    metric-config.external.processed-events-per-second.prometheus/prometheus-server: http://prometheus.my-namespace.svc
    # metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
    metric-config.external.processed-events-per-second.prometheus/query: |
      scalar(sum(rate(event-service_events_count{application="event-service",processed="true"}[1m])))
    metric-config.external.processed-events-per-second.prometheus/interval: "60s" # optional
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: processed-events-per-second
        selector:
          matchLabels:
            type: prometheus
      target:
        type: AverageValue
        averageValue: "10"
```
Note: Prometheus Object metrics are deprecated and will most likely be removed in the future. Use the Prometheus External metrics instead as described above.

This is an example of an HPA configured to get metrics based on a Prometheus query. The query is defined in the annotation `metric-config.object.processed-events-per-second.prometheus/query` where `processed-events-per-second` is the metric name which will be associated with the result of the query.

It also specifies an annotation `metric-config.object.processed-events-per-second.prometheus/per-replica` which instructs the collector to treat the results as an average over all pods targeted by the HPA. This makes it possible to mimic the behavior of `targetAverageValue` which is not implemented for metric type `Object` as of Kubernetes v1.10. (It will most likely come in v1.12).
```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
    metric-config.object.processed-events-per-second.prometheus/query: |
      scalar(sum(rate(event-service_events_count{application="event-service",processed="true"}[1m])))
    metric-config.object.processed-events-per-second.prometheus/per-replica: "true"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      metricName: processed-events-per-second
      target:
        apiVersion: v1
        kind: Pod
        name: dummy-pod
      targetValue: 10 # this will be treated as targetAverageValue
```
Note: The HPA object requires an `Object` to be specified. However when a Prometheus metric is used there is no need for this object. But to satisfy the schema we specify a dummy pod called `dummy-pod`.
The skipper collector is a simple wrapper around the Prometheus collector to make it easy to define an HPA for scaling based on Ingress or RouteGroup metrics when skipper is used as the ingress implementation in your cluster. It assumes you are collecting Prometheus metrics from skipper and it provides the correct Prometheus queries out of the box so users don't have to define those manually.
| Metric | Description | Type | Kind | K8s Versions |
|--------|-------------|------|------|--------------|
| `requests-per-second` | Scale based on requests per second for a certain ingress or routegroup. | Object | `Ingress`, `RouteGroup` | `>=1.19` |
This is an example of an HPA that will scale based on `requests-per-second` for an ingress called `myapp`.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: myapp
      metric:
        name: requests-per-second
        selector:
          matchLabels:
            backend: backend1 # optional backend
      target:
        averageValue: "10"
        type: AverageValue
```
This is an example of an HPA that will scale based on `requests-per-second` for a routegroup called `myapp`.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      describedObject:
        apiVersion: zalando.org/v1
        kind: RouteGroup
        name: myapp
      metric:
        name: requests-per-second
        selector:
          matchLabels:
            backend: backend1 # optional backend
      target:
        averageValue: "10"
        type: AverageValue
```
Skipper supports sending traffic to different backends based on annotations present on the `Ingress` object, or weights on the RouteGroup backends. By default the number of replicas will be calculated based on the full traffic served by that ingress/routegroup. If however only the traffic being routed to a specific backend should be used, then the backend name can be specified via the `backend` label under `matchLabels` for the metric. The ingress annotation where the backend weights can be obtained can be specified through the flag `--skipper-backends-annotation`.
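For illustration, a minimal sketch of how the pieces fit together; the annotation name `zalando.org/backend-weights` below is an assumption and only matters if it matches what you pass to `--skipper-backends-annotation`:

```yaml
# Hypothetical Ingress carrying backend weights; with the HPA above using
# `backend: backend1`, the metric would then reflect only the traffic share
# routed to that backend.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    zalando.org/backend-weights: '{"backend1": 80, "backend2": 20}'
```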
The External RPS collector, like the Skipper collector, is a simple wrapper around the Prometheus collector to make it easy to define an HPA for scaling based on the RPS measured for a given hostname. When skipper is used as the ingress implementation in your cluster everything should work automatically; in case another reverse proxy is used as ingress, like Nginx for example, it's necessary to configure which Prometheus metric should be used through the `--external-rps-metric-name <metric-name>` flag. Assuming `skipper-ingress` is being used, or the appropriate metric name is passed using the flag mentioned previously, this collector provides the correct Prometheus queries out of the box so users don't have to define those manually.
| Metric | Description | Type | Kind | K8s Versions |
|--------|-------------|------|------|--------------|
| `requests-per-second` | Scale based on requests per second for a certain hostname. | External | | `>=1.12` |
This is an example of an HPA that will scale based on `requests-per-second` for the RPS measured for the hostnames `www.example1.com` and `www.example2.com`, weighted by 42%.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    metric-config.external.example-rps.requests-per-second/hostnames: www.example1.com,www.example2.com
    metric-config.external.example-rps.requests-per-second/weight: "42"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: example-rps
        selector:
          matchLabels:
            type: requests-per-second
      target:
        type: AverageValue
        averageValue: "42"
```
This metric supports an n:1 relation between hostnames and metrics: the measured RPS is the sum of the RPS rates of each of the specified hostnames. This value is further modified by the weight parameter explained below.
Some ingress controllers, like skipper-ingress, support sending traffic to different backends based on some kind of configuration; in the case of skipper, annotations present on the `Ingress` object, or weights on the RouteGroup backends. By default the number of replicas will be calculated based on the full traffic served by these components. If however only the traffic being routed to a specific hostname should be used, then the weight for the configured hostname(s) may be specified via the `weight` annotation `metric-config.external.<metric-name>.request-per-second/weight` for the metric being configured.
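As a worked example (assuming the weight is applied as a simple percentage of the summed rate): if `www.example1.com` is serving 100 RPS and `www.example2.com` is serving 60 RPS, the HPA above with `weight: "42"` would see a metric of roughly (100 + 60) × 0.42 = 67.2.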
The InfluxDB collector maps Flux queries to metrics that can be used for scaling.

Note that the collector targets an InfluxDB v2 instance, that's why we only support Flux instead of InfluxQL.
| Metric | Description | Type | Kind | K8s Versions |
|--------|-------------|------|------|--------------|
| `flux-query` | Generic metric which requires a user defined query. | External | | `>=1.10` |
This is an example of an HPA configured to get metrics based on a Flux query. The query is defined in the annotation `metric-config.external.<metricName>.influxdb/query` where `<metricName>` is the query name which will be associated with the result of the query. This allows having multiple Flux queries associated with a single HPA.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # These annotations are optional.
    # If specified, then they are used for setting up the InfluxDB client properly,
    # instead of using the ones specified via CLI. Respectively:
    # - --influxdb-address
    # - --influxdb-token
    # - --influxdb-org
    metric-config.external.queue-depth.influxdb/address: "http://influxdbv2.my-namespace.svc"
    metric-config.external.queue-depth.influxdb/token: "secret-token"
    # This could be either the organization name or the ID.
    metric-config.external.queue-depth.influxdb/org: "deadbeef"
    # metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
    # <configKey> == query-name
    metric-config.external.queue-depth.influxdb/query: |
        from(bucket: "apps")
          |> range(start: -30s)
          |> filter(fn: (r) => r._measurement == "queue_depth")
          |> group()
          |> max()
          // Rename "_value" to "metricvalue" for letting the metrics server properly unmarshal the result.
          |> rename(columns: {_value: "metricvalue"})
          |> keep(columns: ["metricvalue"])
    metric-config.external.queue-depth.influxdb/interval: "60s" # optional
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queryd-v1
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: External
    external:
      metric:
        name: queue-depth
        selector:
          matchLabels:
            type: influxdb
      target:
        type: Value
        value: "1"
```
The AWS collector allows scaling based on external metrics exposed by AWS services, e.g. SQS queue lengths.

To integrate with AWS, the controller needs to run on nodes with access to the AWS API. Additionally the controller has to have a role with the following policy to get all required data from AWS:
```yaml
PolicyDocument:
  Statement:
    - Action: 'sqs:GetQueueUrl'
      Effect: Allow
      Resource: '*'
    - Action: 'sqs:GetQueueAttributes'
      Effect: Allow
      Resource: '*'
    - Action: 'sqs:ListQueues'
      Effect: Allow
      Resource: '*'
    - Action: 'sqs:ListQueueTags'
      Effect: Allow
      Resource: '*'
  Version: 2012-10-17
```
| Metric | Description | Type | K8s Versions |
|--------|-------------|------|--------------|
| `sqs-queue-length` | Scale based on SQS queue length | External | `>=1.12` |
This is an example of an HPA that will scale based on the length of an SQS queue.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: my-sqs
        selector:
          matchLabels:
            type: sqs-queue-length
            queue-name: foobar
            region: eu-central-1
      target:
        averageValue: "30"
        type: AverageValue
```
The `matchLabels` are used by `kube-metrics-adapter` to configure a collector that will get the queue length for an SQS queue named `foobar` in region `eu-central-1`.

The AWS account of the queue currently depends on how `kube-metrics-adapter` is configured to get AWS credentials. The normal assumption is that you run the adapter in a cluster running in the AWS account where the queue is defined. Please open an issue if you would like support for other use cases.
The ZMON collector allows scaling based on external metrics exposed by ZMON checks.
| Metric | Description | Type | K8s Versions |
|--------|-------------|------|--------------|
| `zmon-check` | Scale based on any ZMON check results | External | `>=1.12` |
This is an example of an HPA that will scale based on the specified value exposed by a ZMON check with id `1234`.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
    metric-config.external.my-zmon-check.zmon/key: "custom.*"
    metric-config.external.my-zmon-check.zmon/tag-application: "my-custom-app-*"
    metric-config.external.my-zmon-check.zmon/interval: "60s" # optional
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: my-zmon-check
        selector:
          matchLabels:
            type: zmon
            check-id: "1234" # the ZMON check to query for metrics
            key: "custom.value"
            tag-application: my-custom-app
            aggregators: avg # comma separated list of aggregation functions, default: last
            duration: 5m # default: 10m
      target:
        averageValue: "30"
        type: AverageValue
```
The `check-id` specifies the ZMON check to query for the metrics. `key` specifies the JSON key in the check output to extract the metric value from. E.g. if you have a check which returns the following data:

```json
{
  "custom": {
    "value": 1.0
  },
  "other": {
    "value": 3.0
  }
}
```

Then the value `1.0` would be returned when the key is defined as `custom.value`.
The `tag-<name>` labels define the tags used for the KairosDB query. In a normal ZMON setup the following tags will be available:

- `application`
- `alias` (name of Kubernetes cluster)
- `entity` - full ZMON entity ID
`aggregators` defines the aggregation functions applied to the metrics query. For instance if you define the entity filter `type=kube_pod,application=my-custom-app` you might get three entities back, and then you might want to get an average over the metrics for those three entities. This would be possible by using the `avg` aggregator. The default aggregator is `last`, which returns only the latest metric point from the query. The supported aggregation functions are `avg`, `count`, `last`, `max`, `min`, `sum` and `diff`. See the KairosDB docs for details.
The `duration` defines the duration used for the timeseries query. E.g. if you specify a duration of `5m` then the query will return metric points for the last 5 minutes and apply the specified aggregation with the same duration, e.g. `max(5m)`.
The annotations `metric-config.external.my-zmon-check.zmon/key` and `metric-config.external.my-zmon-check.zmon/tag-<name>` can optionally be used if you need to define a `key` or other `tag` with a "star" query syntax like `values.*`. This hack is in place because it's not allowed to use `*` in the metric label definitions. If both an annotation and the corresponding label are defined, then the annotation takes precedence.
The Nakadi collector allows scaling based on Nakadi Subscription API stats metrics `consumer_lag_seconds` or `unconsumed_events`.
| Metric Type | Description | Type | K8s Versions |
|-------------|-------------|------|--------------|
| `unconsumed-events` | Scale based on number of unconsumed events for a Nakadi subscription | External | `>=1.24` |
| `consumer-lag-seconds` | Scale based on number of max consumer lag seconds for a Nakadi subscription | External | `>=1.24` |
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
    metric-config.external.my-nakadi-consumer.nakadi/interval: "60s" # optional
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-consumer
  minReplicas: 0
  maxReplicas: 8 # should match number of partitions for the event type
  metrics:
  - type: External
    external:
      metric:
        name: my-nakadi-consumer
        selector:
          matchLabels:
            type: nakadi
            subscription-id: "708095f6-cece-4d02-840e-ee488d710b29"
            metric-type: "consumer-lag-seconds|unconsumed-events"
      target:
        # value is compatible with the consumer-lag-seconds metric type.
        # It describes the amount of consumer lag in seconds before scaling
        # additionally up.
        # if an event-type has multiple partitions the value of
        # consumer-lag-seconds is the max of all the partitions.
        value: "600" # 10m
        type: Value
        # averageValue is compatible with unconsumed-events metric type.
        # This means for every 30 unconsumed events a pod is added.
        # unconsumed-events is the sum of unconsumed_events over all
        # partitions.
        averageValue: "30"
        type: AverageValue
```
The `subscription-id` is the Subscription ID of the relevant consumer. The `metric-type` indicates whether to scale on `consumer-lag-seconds` or `unconsumed-events` as outlined below.
`unconsumed-events` - is the total number of unconsumed events over all partitions. When using this `metric-type` you should also use the target `averageValue` which indicates the number of events which can be handled per pod. To best estimate the number of events per pod, you need to understand the average time for processing an event as well as the rate of events.
Example: You have an event type producing 100 events per second between 00:00 and 08:00. Between 08:01 and 23:59 it produces 400 events per second. Let's assume that on average a single pod can consume 100 events per second, then we can define 100 as `averageValue` and the HPA would scale to 1 between 00:00 and 08:00, and scale to 4 between 08:01 and 23:59. If there for some reason is a short spike of 800 events per second, then it would scale to 8 pods to process those events until the rate goes down again.
`consumer-lag-seconds` - describes the age of the oldest unconsumed event for a subscription. If the event type has multiple partitions the lag is defined as the max age over all partitions. When using this `metric-type` you should use the target `value` to indicate the max lag (in seconds) before the HPA should scale.
Example: You have a subscription with a defined SLO of "99.99% of events are consumed within 30 min.". In this case you can define a target `value` of e.g. 20 min. (1200s) (to include a safety buffer) such that the HPA only scales up from 1 to 2 if the target of 20 min. is breached and it needs to work faster with more consumers. For this case you should also account for the average time for processing an event when defining the target.
As an alternative to defining `subscription-id` you can also filter based on `owning_application`, `event-types` and `consumer-group`:
```yaml
metrics:
- type: External
  external:
    metric:
      name: my-nakadi-consumer
      selector:
        matchLabels:
          type: nakadi
          owning-application: "example-app"
          # comma separated list of event types
          event-types: "example-event-type,example-event-type2"
          consumer-group: "abcd1234"
          metric-type: "consumer-lag-seconds|unconsumed-events"
```
This is useful in dynamic environments where the subscription ID might not be known before deployment time (e.g. because it's created by the same deployment).
The http collector allows collecting metrics from an external endpoint specified in the HPA. Currently only `json-path` collection is supported.
| Metric | Description | Type | K8s Versions |
|--------|-------------|------|--------------|
| `custom` | No predefined metrics. Metrics are generated from user defined queries. | Pods | `>=1.12` |
This is an example of using the HTTP collector to collect metrics from a JSON metrics endpoint specified in the annotations.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  annotations:
    # metric-config.<metricType>.<metricName>.<collectorType>/<configKey>
    metric-config.external.unique-metric-name.json-path/json-key: "$.some-metric.value"
    metric-config.external.unique-metric-name.json-path/json-eval: ceil($['active processes'] / $['total processes'] * 100) # cannot use both json-eval and json-key
    metric-config.external.unique-metric-name.json-path/endpoint: "http://metric-source.app-namespace:8080/metrics"
    metric-config.external.unique-metric-name.json-path/aggregator: "max"
    metric-config.external.unique-metric-name.json-path/interval: "60s" # optional
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: unique-metric-name
        selector:
          matchLabels:
            type: json-path
      target:
        averageValue: 1
        type: AverageValue
```
The HTTP collector is similar to the Pod Metrics collector. The following configuration values are supported:

- `json-key` to specify the JSON path of the metric to be queried
- `json-eval` to specify an expression to evaluate on the script engine; cannot be used in conjunction with `json-key`
- `endpoint` the fully formed path to query for the metric. In the above example a Kubernetes Service in the namespace `app-namespace` is called.
- `aggregator` is only required if the metric is an array of values and specifies how the values are aggregated. Currently this option supports the values: `sum`, `max`, `min`, `avg`.
It's possible to configure the scrape interval for each of the metric types via an annotation:

```yaml
metric-config.<metricType>.<metricName>.<collectorType>/interval: "30s"
```

The default is `60s` but can be reduced to let the adapter collect metrics more often.
The `ScalingSchedule` and `ClusterScalingSchedule` collectors allow collecting time-based metrics from the respective CRD objects specified in the HPA.

These collectors are disabled by default; you have to start the server with the `--scaling-schedule` flag to enable them. Remember to deploy the CRDs `ScalingSchedule` and `ClusterScalingSchedule` and allow the service account used by the server to read, watch and list them.
| Metric | Description | Type | K8s Versions |
|--------|-------------|------|--------------|
| ObjectName | The metric is calculated and stored for each `ScalingSchedule` and `ClusterScalingSchedule` referenced in the HPAs | `ScalingSchedule` and `ClusterScalingSchedule` | `>=1.16` |
To avoid abrupt scaling due to time based metrics, the `ScalingSchedule` collector has a feature to ramp the metric up and down over a specific period of time. The duration of the scaling window can be configured individually in the `[Cluster]ScalingSchedule` object, via the optional `scalingWindowDurationMinutes` field, or globally for all scheduled events, and defaults to a globally configured value if not specified. The default for the latter is set to 10 minutes, but can be changed using the `--scaling-schedule-default-scaling-window` flag.
This spreads the scale events around, creating less load on the other components, and helping the rest of the metrics (like the CPU ones) to adjust as well.
The HPA algorithm does not make changes if the metric change is less than the tolerance specified by the `horizontal-pod-autoscaler-tolerance` flag:

> We'll skip scaling if the ratio is sufficiently close to 1.0 (within a globally-configurable tolerance, from the `--horizontal-pod-autoscaler-tolerance` flag, which defaults to 0.1).
With that in mind, the ramp-up and ramp-down feature divides the scaling over the specified period of time into buckets, trying to achieve changes bigger than the configured tolerance. The number of buckets defaults to 10 and can be configured by the `--scaling-schedule-ramp-steps` flag.
Important: note that the ramp-up and ramp-down feature can lead to deployments achieving less than the specified number of pods, due to the HPA 10% change rule and the ceiling function applied to the desired number of pods (check the algorithm details). It varies with the configured metric for `ScalingSchedule` events, the number of pods and the configured `horizontal-pod-autoscaler-tolerance` flag of your Kubernetes installation. This gist contains the code to simulate the situations a deployment with different numbers of pods, with a metric of 10000, can face with 10 buckets (max of 90% of the metric returned) and 5 buckets (max of 80% of the metric returned). The ramp-up and ramp-down feature can be disabled by setting `--scaling-schedule-default-scaling-window` to 0, and abrupt scalings can be handled via scaling policies.
This is an example of using the ScalingSchedule collectors to collect metrics from a deployed object of the CRD kind. First, the schedule object:
```yaml
apiVersion: zalando.org/v1
kind: ClusterScalingSchedule
metadata:
  name: "scheduling-event"
spec:
  schedules:
  - type: OneTime
    date: "2021-10-02T08:08:08+02:00"
    durationMinutes: 30
    value: 100
  - type: Repeating
    durationMinutes: 10
    value: 120
    period:
      startTime: "15:45"
      timezone: "Europe/Berlin"
      days:
      - Mon
      - Wed
      - Fri
```
This resource defines a scheduling event named `scheduling-event` with two schedules of the kind `ClusterScalingSchedule`.

`ClusterScalingSchedule` objects aren't namespaced, which means they can be referenced by any HPA in any namespace in the cluster. `ScalingSchedule` objects have the exact same fields and behavior, but can be referenced only by HPAs in the same namespace. The schedules can have the type `Repeating` or `OneTime`.
This example configuration will generate the following result: at `2021-10-02T08:08:08+02:00`, for 30 minutes, a metric with the value of 100 will be returned. Every Monday, Wednesday and Friday, starting at 15 hours and 45 minutes (Berlin time), a metric with the value of 120 will be returned for 10 minutes. It's not the case in this example, but if multiple schedules collide in time, the biggest value is returned.
Check the CRD definitions (ScalingSchedule, ClusterScalingSchedule) for a better understanding of the possible fields and their behavior.
An HPA can reference the deployed `ClusterScalingSchedule` object as in this example:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: "myapp-hpa"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 15
  metrics:
  - type: Object
    object:
      describedObject:
        apiVersion: zalando.org/v1
        kind: ClusterScalingSchedule
        name: "scheduling-event"
      metric:
        name: "scheduling-event"
      target:
        type: AverageValue
        averageValue: "10"
```
The name of the metric is equal to the name of the referenced object. The `target.averageValue` in this example is set to 10. This value will be used by the HPA controller to define the desired number of pods, based on the metric obtained (check the HPA algorithm details for more context). This HPA configuration explicitly says that each pod of this application supports 10 units of the `ClusterScalingSchedule` metric. Multiple applications can share the same `ClusterScalingSchedule` or `ScalingSchedule` event and have a different number of pods based on their `target.averageValue` configuration.
In our specific example, at `2021-10-02T08:08:08+02:00`, as the metric has the value 100, this application will scale to 10 pods (100/10). Every Monday, Wednesday and Friday, starting at 15 hours and 45 minutes (Berlin time), the application will scale to 12 pods (120/10). Both scale-ups will last at least the configured duration of the schedules. After that, regular HPA scale-down behavior applies.

Note that these pod counts consider only these custom metrics; the normal HPA behavior still applies, such as: in case of multiple metrics the biggest number of pods is the utilized one, HPA max and min replica configuration, autoscaling policies, etc.