Description
Hey team,
We've observed a couple of issues regarding the webhooks (MutatingWebhookConfiguration) installed on a Kubernetes cluster interfering with cluster-critical resources and possibly other resources. The situation we are facing is that these webhooks introduce failures in node operations when the webhooks themselves are failing/unreachable.
Specifically, the kube-apiserver tries calling these webhooks on each request and waits up to the timeout provided in the webhook configuration before completing the request. Under a failure scenario where these webhooks are unreachable for any reason, this introduces `n * timeout` latency to each apiserver request, with `n` being the number of webhooks. In an example deployment, `timeoutSeconds: 10` is set, which means that with 10 webhooks an added 100s of latency to apiserver calls can be observed.
Note that the webhook is served by the model operator pod in the cluster, and requests to it are expected to be routed through the CNI.
The main culprit, in my opinion, is that the filters and rules on these webhooks are too wide. Again from an example k8s cloud, for `juju-model-admission-controller-myk8s` we can see:
```yaml
failurePolicy: Ignore
matchPolicy: Equivalent
name: admission.juju.is
namespaceSelector:
  matchLabels:
    controller.juju.is/id: dbd899ca-70a6-4bb0-8831-2053eec5d5ce
    model.juju.is/name: controller
objectSelector:
  matchExpressions:
  - key: model.juju.is/disable-webhook
    operator: DoesNotExist
reinvocationPolicy: Never
rules:
- apiGroups:
  - '*'
  apiVersions:
  - '*'
  operations:
  - CREATE
  - UPDATE
  resources:
  - '*'
  scope: '*'
sideEffects: None
timeoutSeconds: 10
```
Here we can observe that `CREATE` and `UPDATE` operations on any `apiGroups` and any `resources`, for both `Cluster` and `Namespaced` scopes, will trigger these webhooks. The main filters we seem to have here are the `namespaceSelector` and `objectSelector`.
From the upstream docs:

> If the object is a cluster scoped resource other than a Namespace, `namespaceSelector` has no effect.
Based on this information, these webhooks run for any cluster-scoped resource, so resources such as `TokenRequest` and `TokenReview` (used for authentication by the kubelet), CNI CRs for Cilium, and more get affected by this.
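Since `namespaceSelector` cannot exclude cluster-scoped resources, the only per-object escape hatch in the configuration above appears to be the `objectSelector`: objects carrying the `model.juju.is/disable-webhook` label are skipped. A minimal sketch of exempting a cluster-scoped object this way (the resource kind and label value are illustrative; a `DoesNotExist` expression only checks that the key is present):

```yaml
# Illustrative cluster-scoped object labelled so the Juju webhook skips it.
# The value "true" is arbitrary; the objectSelector only checks key presence.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-cluster-role
  labels:
    model.juju.is/disable-webhook: "true"
rules: []
```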
An example failure scenario: after restarting a control plane machine in a cluster with these webhooks, the restarted node is not able to recover and serve requests properly. The root cause is that these webhooks will fail until pod networking is set up by the CNI (e.g. Cilium), while Cilium itself needs to create/update certain resources (possibly cluster scoped) on startup, and those requests time out/do not complete in time since the webhooks introduce 100s of latency. This gets us into a cyclic dependency where Cilium cannot start because the webhooks are failing, and the webhooks are failing because Cilium is unable to set up pod networking on the node. To break out of this cycle one has to disable the admission webhooks by setting `--disable-admission-plugins=MutatingAdmissionWebhook` on the kube-apiserver, restart Cilium to get the node ready, and then roll back this change to re-enable the webhooks.
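For reference, a rough sketch of that break-glass step, assuming kube-apiserver runs as a static pod with its manifest at `/etc/kubernetes/manifests/kube-apiserver.yaml` (the path and manifest layout are assumptions and vary by distribution):

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt, assumed layout)
# Temporarily disable mutating admission webhooks, let Cilium come up,
# then remove the flag again to re-enable the Juju webhooks.
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --disable-admission-plugins=MutatingAdmissionWebhook
    # ... remaining flags unchanged ...
```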
Juju version
3.6.11
Cloud
Kubernetes
Expected behaviour
I am not knowledgeable about why and how these webhooks are used, but I believe it is best to narrow down the rules, scopes, and filtering here to make sure these webhooks only apply to specific resources, for example along the lines of the sketch below.
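I don't know the exact set of resources the webhook actually needs to mutate, so the following is illustrative only: restricting the rules to namespaced resources in specific groups, and lowering the per-webhook timeout, would bound both the blast radius and the added latency.

```yaml
# Illustrative only: the concrete apiGroups/resources Juju needs to mutate
# would have to be determined by the Juju team.
rules:
- apiGroups:
  - ""
  - apps
  apiVersions:
  - '*'
  operations:
  - CREATE
  - UPDATE
  resources:
  - pods
  - deployments
  - statefulsets
  scope: Namespaced
# A lower timeout also caps the latency added when the webhook is unreachable.
timeoutSeconds: 5
```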
Reproduce / Test
- Create multiple models on a k8s multi-node cluster
- Restart the control plane VMs
- Observe webhook / timeout failures in the `kube-apiserver` logs
- Observe Cilium failing to start
Notes & References
No response