
Portal:Toolforge/Admin/Kubernetes/Pod tracing

From Wikitech
This page is currently a draft.
Material may not yet be complete, information may presently be omitted, and certain parts of the content may be subject to radical, rapid alteration. More information pertaining to this may be available on the talk page.

This article describes procedures for tracing a pod that is misbehaving. A typical case is an API being hammered by an unknown tool in Toolforge, and the need to shut down the corresponding pod.

In all cases, once you have identified the offending tool/pod, you can disable it as described at Help:Toolforge/Kubernetes#Monitoring_your_job (i.e., become the tool on a bastion, and then delete the deployment).

Case 1: Unknown Tool hammering an API

This case happened before. See T204267 for example.

In this example, the Wikidata API was being hammered from the IP address of a k8s worker node in Toolforge. The tool was using a non-meaningful User-Agent, so we had no quick way of identifying which tool was causing it.

First, get an overview of the pods running on the offending k8s node (you should know which k8s worker node it is because it would be present in the API server logs):

Command example
aborrero@tools-k8s-master-01:~$ kubectl get pod -o wide --all-namespaces | grep tools-worker-1021
base-php-cli             interactive                               1/1   Running   0   25d   192.168.165.3    tools-worker-1021.tools.eqiad1.wikimedia.cloud
citations                interactive                               1/1   Running   0   53d   192.168.165.8    tools-worker-1021.tools.eqiad1.wikimedia.cloud
fireflytools             fireflytools-3361574769-010jg             1/1   Running   0   53d   192.168.165.2    tools-worker-1021.tools.eqiad1.wikimedia.cloud
intuition                interactive                               1/1   Running   0   23d   192.168.165.7    tools-worker-1021.tools.eqiad1.wikimedia.cloud
magog                    magog-3723068317-3i4x9                    1/1   Running   0   15d   192.168.165.14   tools-worker-1021.tools.eqiad1.wikimedia.cloud
ordia                    ordia-2331477560-stoeq                    1/1   Running   0   15d   192.168.165.10   tools-worker-1021.tools.eqiad1.wikimedia.cloud
ores-support-checklist   ores-support-checklist-1319410377-t54a7   1/1   Running   0   53d   192.168.165.9    tools-worker-1021.tools.eqiad1.wikimedia.cloud
phpinfo                  phpinfo-2217838168-rm5ui                  1/1   Running   0   21d   192.168.165.13   tools-worker-1021.tools.eqiad1.wikimedia.cloud
proxies                  proxies-1745127721-f9yjq                  1/1   Running   0   44d   192.168.165.11   tools-worker-1021.tools.eqiad1.wikimedia.cloud
strephit                 strephit-1006144150-x7q69                 1/1   Running   0   77d   192.168.165.4    tools-worker-1021.tools.eqiad1.wikimedia.cloud
topicmatcher             topicmatcher-187403292-suxnb              1/1   Running   0   73d   192.168.165.5    tools-worker-1021.tools.eqiad1.wikimedia.cloud
verification-pages       verification-pages-3591681152-nfeew       1/1   Running   0   22h   192.168.165.15   tools-worker-1021.tools.eqiad1.wikimedia.cloud
w-slackbot               w-slackbot-3270543702-ljgri               1/1   Running   0   14d   192.168.165.12   tools-worker-1021.tools.eqiad1.wikimedia.cloud
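
When the listing is long, it can help to reduce it to the columns that matter for tracing: the pod IP, the namespace (which is the tool name here), and the pod name. The following is a minimal sketch, not an established Toolforge helper: the function name pods_on_node is hypothetical, and it assumes the eight-column layout shown above (namespace, name, ready, status, restarts, age, IP, node). Other kubectl versions may print additional or differently spaced columns, in which case the field numbers would need adjusting.

```shell
# pods_on_node NODE
# Reads `kubectl get pod -o wide --all-namespaces` output on stdin and
# prints "IP NAMESPACE POD" for every pod scheduled on NODE.
# Assumption: the IP is column 7 and the node is column 8, as in the
# listing above. NODE may be the short host name or the full name.
pods_on_node() {
    awk -v node="$1" '
        $8 == node || index($8, node ".") == 1 { print $7, $1, $2 }
    '
}

# Hypothetical usage on the k8s master:
#   kubectl get pod -o wide --all-namespaces | pods_on_node tools-worker-1021
```

The output is one line per pod, which makes it easy to grep later for an internal IP address found with the other tools.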

Check whether, at first glance, any tool is an obvious suspect for the high traffic. If not, try running tcpdump on the k8s node to identify a traffic pattern that lets you match the traffic to a given internal k8s IP address (i.e., 192.168.x.x). You can also inspect other resources, such as conntrack -L, which maintains a list of current NAT connections, and iptables-save, whose rule comments contain the mapping between internal k8s IP addresses and tool names.
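
One way to turn a tcpdump capture into a traffic pattern is to count packets per source address. The sketch below is only an illustration under stated assumptions: the function name count_sources is hypothetical, it expects the usual one-line-per-packet text format of tcpdump -n for IPv4 ("... IP SRC.PORT > DST.PORT: ..."), and the capture filter in the usage comment reuses the destination address from this example. Depending on the interface you capture on, you may see packets before or after NAT, so only the 192.168.x.x sources are useful for identifying a pod.

```shell
# count_sources
# Reads tcpdump -n text output on stdin and prints "COUNT SOURCE_IP",
# busiest source first.
count_sources() {
    awk '
        {
            # The source is the field right after the literal "IP" token.
            for (i = 1; i < NF; i++) {
                if ($i == "IP") {
                    src = $(i + 1)
                    sub(/\.[0-9]+$/, "", src)   # drop the trailing .port
                    count[src]++
                    break
                }
            }
        }
        END { for (ip in count) print count[ip], ip }
    ' | sort -k1,1nr -k2,2
}

# Hypothetical usage on the worker node (capture 1000 packets, then stop):
#   sudo tcpdump -n -l -c 1000 -i any dst host 208.80.153.224 and dst port 443 | count_sources
```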

Command examples
aborrero@tools-worker-1021:~$ sudo conntrack -L | grep 208.80.153.224
conntrack v1.4.2 (conntrack-tools): 54 flow entries have been shown.
tcp      6 86399 ESTABLISHED src=192.168.165.16 dst=208.80.153.224 sport=49614 dport=443 src=208.80.153.224 dst=10.68.22.153 sport=443 dport=49614 [ASSURED] mark=0 use=1
tcp      6 86399 ESTABLISHED src=192.168.165.16 dst=208.80.153.224 sport=53066 dport=443 src=208.80.153.224 dst=10.68.22.153 sport=443 dport=53066 [ASSURED] mark=0 use=1
tcp      6 86398 ESTABLISHED src=192.168.165.16 dst=208.80.153.224 sport=49616 dport=443 src=208.80.153.224 dst=10.68.22.153 sport=443 dport=49616 [ASSURED] mark=0 use=1
tcp      6 86398 ESTABLISHED src=192.168.165.16 dst=208.80.153.224 sport=49618 dport=443 src=208.80.153.224 dst=10.68.22.153 sport=443 dport=49618 [ASSURED] mark=0 use=1
tcp      6 4 CLOSE src=192.168.165.16 dst=208.80.153.224 sport=49598 dport=443 src=208.80.153.224 dst=10.68.22.153 sport=443 dport=49598 [ASSURED] mark=0 use=1
tcp      6 86399 ESTABLISHED src=192.168.165.16 dst=208.80.153.224 sport=49606 dport=443 src=208.80.153.224 dst=10.68.22.153 sport=443 dport=49606 [ASSURED] mark=0 use=1
tcp      6 86399 ESTABLISHED src=192.168.165.16 dst=208.80.153.224 sport=49600 dport=443 src=208.80.153.224 dst=10.68.22.153 sport=443 dport=49600 [ASSURED] mark=0 use=1
aborrero@tools-worker-1021:~$ sudo iptables-save | grep 192.168.165.16
-A KUBE-SEP-YGEO2XDTYUD5YAL4 -s 192.168.165.16/32 -m comment --comment "corhist/corhist:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-YGEO2XDTYUD5YAL4 -p tcp -m comment --comment "corhist/corhist:http" -m tcp -j DNAT --to-destination 192.168.165.16:8000

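
The two lookups above can be chained: first rank internal source addresses by the number of tracked flows towards the hammered address, then resolve the top address to a tool through the iptables rule comments. This is a sketch rather than an established procedure. The function names flows_by_source and tool_for_ip are hypothetical, and the parsing assumes the conntrack and iptables-save output formats shown in this example, including rule comments that contain no spaces.

```shell
# flows_by_source DEST_IP
# Reads `conntrack -L` output on stdin and prints "COUNT SOURCE_IP" for
# flows whose original-direction destination is DEST_IP, busiest first.
flows_by_source() {
    awk -v dest="$1" '
        {
            # The first src=/dst= pair on a line is the original direction;
            # the second pair is the reply direction after NAT.
            src = ""; dst = ""
            for (i = 1; i <= NF; i++) {
                if (src == "" && $i ~ /^src=/) src = substr($i, 5)
                if (dst == "" && $i ~ /^dst=/) dst = substr($i, 5)
            }
            if (dst == dest) count[src]++
        }
        END { for (ip in count) print count[ip], ip }
    ' | sort -k1,1nr -k2,2
}

# tool_for_ip POD_IP
# Reads `iptables-save` output on stdin and prints the comment of rules
# whose source is POD_IP/32 (in this example, "corhist/corhist:http").
tool_for_ip() {
    grep -F -- "-s $1/32 " |
        sed -nE 's/.*--comment "?([^" ]+)"?.*/\1/p' |
        sort -u
}

# Hypothetical usage on the worker node:
#   sudo conntrack -L 2>/dev/null | flows_by_source 208.80.153.224
#   sudo iptables-save | tool_for_ip 192.168.165.16
```

The part of the comment before the slash is the namespace, which is the tool to disable.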
Retrieved from "https://wikitech.wikimedia.org/w/index.php?title=Portal:Toolforge/Admin/Kubernetes/Pod_tracing&oldid=1883472"
