Description
Hello,
We are running Calico for Windows nodes deployed via the Tigera Operator, and we are facing issues caused by the current exec-based readiness and liveness probes in the calico-windows-node DaemonSet.
Each exec probe runs a command inside the container, and in our environment this causes gradual memory growth in the containerd shim process. This appears to be related to known containerd issues on Windows:
When Calico is installed via the Tigera Operator, the operator fully manages the calico-windows-node DaemonSet. Any manual modifications to the DaemonSet (such as changing readiness/liveness probes from exec to httpGet) are immediately reverted.
This prevents us from applying a workaround that replaces exec probes with httpGet, even though:
- calico-windows-node already exposes HTTP health endpoints
- Switching the probes to HTTP successfully mitigates the memory growth in our tests
Our current workaround is to disable the Tigera Operator and manually patch the DaemonSet, which is not ideal and breaks the desired operator-managed workflow.
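For illustration, the manual patch described above can be sketched as a small script that generates a JSON Patch replacing every exec probe with an httpGet probe. This is only a rough sketch, not what the operator or Calico ships: the port (`9099`), the paths (`/liveness`, `/readiness`), and the helper names are assumptions and need to be checked against the endpoints your calico-windows-node containers actually expose.

```python
import copy
import json

# Assumed values, NOT taken from this issue: adjust to your deployment.
HEALTH_PORT = 9099
PROBE_PATHS = {"livenessProbe": "/liveness", "readinessProbe": "/readiness"}


def exec_to_http_probe(probe, path, port=HEALTH_PORT):
    """Return a copy of `probe` with its exec handler replaced by httpGet.

    Timing fields such as periodSeconds and timeoutSeconds are preserved.
    """
    new_probe = copy.deepcopy(probe)
    new_probe.pop("exec", None)
    new_probe["httpGet"] = {"path": path, "port": port}
    return new_probe


def build_json_patch(daemonset):
    """Build RFC 6902 JSON Patch operations for every exec probe found."""
    ops = []
    containers = daemonset["spec"]["template"]["spec"]["containers"]
    for i, container in enumerate(containers):
        for field, path in PROBE_PATHS.items():
            probe = container.get(field)
            # Only touch probes that currently use an exec handler.
            if probe and "exec" in probe:
                ops.append({
                    "op": "replace",
                    "path": f"/spec/template/spec/containers/{i}/{field}",
                    "value": exec_to_http_probe(probe, path),
                })
    return ops


if __name__ == "__main__":
    # Hypothetical, trimmed-down DaemonSet used only to demonstrate the output.
    example = {
        "spec": {"template": {"spec": {"containers": [{
            "name": "node",
            "livenessProbe": {
                "exec": {"command": ["placeholder-health-command"]},
                "periodSeconds": 10,
            },
        }]}}}
    }
    print(json.dumps(build_json_patch(example), indent=2))
```

The resulting operations could be passed to `kubectl patch daemonset calico-windows-node --type=json -p '<patch>'` in the namespace where the DaemonSet runs. As long as the operator is reconciling the DaemonSet, the change is reverted, which is exactly the limitation this issue is about.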
When installed via the operator, Calico on Windows uses hcsshim directly rather than NSSM / Windows services.
The HTTP health endpoints are already available, so no application-level changes appear necessary; only operator-level configurability is needed.
Environment
- Kubernetes: 1.33.4
- Calico: (add your version)
- Installation method: Tigera Operator
- OS: Windows Server 2022