After a problematic Kubernetes cluster upgrade, in which all physical nodes were replaced with new ones, I am trying to resolve a problem where the databend-query Pods cannot talk to the databend-meta servers because the leader information is incorrect. The error is:

fail to update leader: "node(10.240.3.40:9191) to set is not in the nodes list"; endpoints: current:Some("databend-meta-0.databend-meta.databend-meta.svc:9191"), all:[databend-meta-0.databend-meta.databend-meta.svc:9191, databend-meta-1.databend-meta.databend-meta.svc:9191, databend-meta-2.databend-meta.databend-meta.svc:9191]

How do I get the query pods to drop knowledge about the non-existent node 10.240.3.40:9191 and use the actual leader?

$ kubectl logs -n databend-query tenant1-databend-query-2
2024-11-01T11:55:53.996580Z ERROR databend_common_meta_client::established_client: src/meta/client/src/established_client.rs:72 **fail to update leader: "node(10.240.3.40:9191) to set is not in the nodes list"; endpoints:** current:Some("databend-meta-0.databend-meta.databend-meta.svc:9191"), all:[databend-meta-0.databend-meta.databend-meta.svc:9191, databend-meta-1.databend-meta.databend-meta.svc:9191, databend-meta-2.databend-meta.databend-meta.svc:9191]
2024-11-01T11:55:54.009058Z ERROR databend_common_meta_client::established_client: src/meta/client/src/established_client.rs:72 fail to update leader: "node(10.240.3.40:9191) to set is not in the nodes list"; endpoints: current:Some("databend-meta-0.databend-meta.databend-meta.svc:9191"), all:[databend-meta-0.databend-meta.databend-meta.svc:9191, databend-meta-1.databend-meta.databend-meta.svc:9191, databend-meta-2.databend-meta.databend-meta.svc:9191]
2024-11-01T11:55:54.021456Z ERROR databend_common_meta_client::established_client: src/meta/client/src/established_client.rs:72 fail to update leader: "node(10.240.3.40:9191) to set is not in the nodes list"; endpoints: current:Some("databend-meta-0.databend-meta.databend-meta.svc:9191"), all:[databend-meta-0.databend-meta.databend-meta.svc:9191, databend-meta-1.databend-meta.databend-meta.svc:9191, databend-meta-2.databend-meta.databend-meta.svc:9191]
2024-11-01T11:55:54.024934Z WARN databend_query::api::http_service: src/query/service/src/api/http_service.rs:165 Http API TLS not set
2024-11-01T11:55:54.047197Z ERROR databend_common_meta_client::established_client: src/meta/client/src/established_client.rs:72 fail to update leader: "node(10.240.3.40:9191) to set is not in the nodes list"; endpoints: current:Some("databend-meta-0.databend-meta.databend-meta.svc:9191"), all:[databend-meta-0.databend-meta.databend-meta.svc:9191, databend-meta-1.databend-meta.databend-meta.svc:9191, databend-meta-2.databend-meta.databend-meta.svc:9191]
Databend Query
Version: v1.2.410-4b8cd16f0c(rust-1.77.0-nightly-2024-04-08T12:20:44.288903419Z)
Logging:
    file: enabled=false, level=INFO, dir=/var/log/databend, format=json
    stderr: enabled=true, level=WARN, format=text
    otlp: enabled=false, level=INFO, endpoint=http://127.0.0.1:4317, labels=
    query: enabled=false, dir=, otlp_endpoint=, labels=
    tracing: enabled=false, capture_log_level=INFO, otlp_endpoint=http://127.0.0.1:4317
Meta: connected to endpoints [
    "databend-meta-0.databend-meta.databend-meta.svc:9191",
    "databend-meta-1.databend-meta.databend-meta.svc:9191",
    "databend-meta-2.databend-meta.databend-meta.svc:9191",
]
Memory:
    limit: unlimited
    allocator: jemalloc
    config: percpu_arena:percpu,oversize_threshold:0,background_thread:true,dirty_decay_ms:5000,muzzy_decay_ms:5000
2024-11-01T11:55:54.079710Z ERROR databend_common_meta_client::established_client: src/meta/client/src/established_client.rs:72 fail to update leader: "node(10.240.3.40:9191) to set is not in the nodes list"; endpoints: current:Some("databend-meta-0.databend-meta.databend-meta.svc:9191"), all:[databend-meta-0.databend-meta.databend-meta.svc:9191, databend-meta-1.databend-meta.databend-meta.svc:9191, databend-meta-2.databend-meta.databend-meta.svc:9191]
Cluster: [3] nodes
Storage: azblob | container=databend,root=,endpoint=https://hospitalblob.blob.core.windows.net
Cache: none
Builtin users: databend
Admin listened at 10.240.2.112:8080
MySQL listened at 0.0.0.0:3307
    connect via: mysql -u${USER} -p${PASSWORD} -h0.0.0.0 -P3307
Clickhouse(http) listened at 0.0.0.0:8124
    usage: echo 'create table test(foo string)' | curl -u${USER} -p${PASSWORD}: '0.0.0.0:8124' --data-binary @-
    echo '{"foo": "bar"}' | curl -u${USER} -p${PASSWORD}: '0.0.0.0:8124/?query=INSERT%20INTO%20test%20FORMAT%20JSONEachRow' --data-binary @-
Databend HTTP listened at 0.0.0.0:8000
    usage: curl -u${USER} -p${PASSWORD}: --request POST '0.0.0.0:8000/v1/query/' --header 'Content-Type: application/json' --data-raw '{"sql": "SELECT avg(number) FROM numbers(100000000)"}'
2024-11-01T11:56:25.705791Z ERROR databend_common_meta_client::established_client: src/meta/client/src/established_client.rs:72 fail to update leader: "node(10.240.3.40:9191) to set is not in the nodes list"; endpoints: current:Some("databend-meta-0.databend-meta.databend-meta.svc:9191"), all:[databend-meta-0.databend-meta.databend-meta.svc:9191, databend-meta-1.databend-meta.databend-meta.svc:9191, databend-meta-2.databend-meta.databend-meta.svc:9191]

Looking in the metadata dump, the values for grpc_api_advertise_address are all incorrect and these physical VMs no longer exist:

"grpc_api_advertise_address":"10.240.0.41:9191"
"grpc_api_advertise_address":"10.240.1.1:9191"
"grpc_api_advertise_address":"10.240.1.40:9191"

Is there a way to forcibly update the IPs advertised?

"raft_log",{"Logs":{"key":6072664,"value":{"log_id":{"leader_id":{"term":103,"node_id":1},"index":6072664},"payload":{"Normal":{"txid":null,"time_ms":1730462812940,"cmd":{"AddNode":{"node_id":2,"node":{"name":"2","endpoint":{"addr":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"10.240.0.41:9191"},"overriding":true}}}}}}}]
["raft_log",{"Logs":{"key":6072702,"value":{"log_id":{"leader_id":{"term":106,"node_id":2},"index":6072702},"payload":{"Normal":{"txid":null,"time_ms":1730463157677,"cmd":{"AddNode":{"node_id":1,"node":{"name":"1","endpoint":{"addr":"databend-meta-1.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"10.240.1.1:9191"},"overriding":true}}}}}}}]
["raft_log",{"Logs":{"key":6072715,"value":{"log_id":{"leader_id":{"term":106,"node_id":2},"index":6072715},"payload":{"Normal":{"txid":null,"time_ms":1730463281633,"cmd":{"AddNode":{"node_id":0,"node":{"name":"0","endpoint":{"addr":"databend-meta-0.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"10.240.1.40:9191"},"overriding":true}}}}}}}]
["raft_log",{"Logs":{"key":6129817,"value":{"log_id":{"leader_id":{"term":109,"node_id":1},"index":6129817},"payload":{"Normal":{"txid":null,"time_ms":1730713039741,"cmd":{"AddNode":{"node_id":2,"node":{"name":"2","endpoint":{"addr":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local:9191"},"overriding":true}}}}}}}]
["raft_log",{"Logs":{"key":6129853,"value":{"log_id":{"leader_id":{"term":113,"node_id":2},"index":6129853},"payload":{"Normal":{"txid":null,"time_ms":1730713081174,"cmd":{"AddNode":{"node_id":1,"node":{"name":"1","endpoint":{"addr":"databend-meta-1.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"databend-meta-1.databend-meta.databend-meta.svc.cluster.local:9191"},"overriding":true}}}}}}}]
["raft_log",{"Logs":{"key":6129860,"value":{"log_id":{"leader_id":{"term":113,"node_id":2},"index":6129860},"payload":{"Normal":{"txid":null,"time_ms":1730713104683,"cmd":{"AddNode":{"node_id":0,"node":{"name":"0","endpoint":{"addr":"databend-meta-0.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"databend-meta-0.databend-meta.databend-meta.svc.cluster.local:9191"},"overriding":true}}}}}}}]
["state_machine/0",{"Nodes":{"key":0,"value":{"name":"0","endpoint":{"addr":"databend-meta-0.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"databend-meta-0.databend-meta.databend-meta.svc.cluster.local:9191"}}}]
["state_machine/0",{"Nodes":{"key":1,"value":{"name":"1","endpoint":{"addr":"databend-meta-1.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"databend-meta-1.databend-meta.databend-meta.svc.cluster.local:9191"}}}]
["state_machine/0",{"Nodes":{"key":2,"value":{"name":"2","endpoint":{"addr":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local:9191"}}}]
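
To make the dump easier to read, here is a minimal Python sketch that summarises which grpc_api_advertise_address each node ends up with. It is not part of Databend; the helper names (dig, is_raw_ip, advertised_addresses) are hypothetical. It assumes the dump holds one JSON array per line, shaped like the entries above, and that lines appear in log order so a later AddNode for a node supersedes an earlier one. SAMPLE holds three entries copied from the dump.

```python
import ipaddress
import json


def dig(obj, *keys):
    """Follow nested dict keys; return None as soon as one is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def is_raw_ip(address):
    """True if the host part of 'host:port' is a literal IP, not a DNS name."""
    host = address.rsplit(":", 1)[0].strip("[]")
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def advertised_addresses(dump_lines):
    """Return (last AddNode address per node, state-machine address per node)."""
    last_in_log, in_state_machine = {}, {}
    for line in dump_lines:
        line = line.strip()
        if not line:
            continue
        tree, record = json.loads(line)
        add_node = dig(record, "Logs", "value", "payload", "Normal", "cmd", "AddNode")
        if tree == "raft_log" and add_node:
            # Assumption: later lines overwrite earlier ones for the same node.
            last_in_log[add_node["node_id"]] = add_node["node"]["grpc_api_advertise_address"]
        node = dig(record, "Nodes", "value")
        if tree.startswith("state_machine") and node:
            in_state_machine[record["Nodes"]["key"]] = node["grpc_api_advertise_address"]
    return last_in_log, in_state_machine


# Three entries copied from the dump above (node 2 only).
SAMPLE = [
    '["raft_log",{"Logs":{"key":6072664,"value":{"log_id":{"leader_id":{"term":103,"node_id":1},"index":6072664},"payload":{"Normal":{"txid":null,"time_ms":1730462812940,"cmd":{"AddNode":{"node_id":2,"node":{"name":"2","endpoint":{"addr":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"10.240.0.41:9191"},"overriding":true}}}}}}}]',
    '["raft_log",{"Logs":{"key":6129817,"value":{"log_id":{"leader_id":{"term":109,"node_id":1},"index":6129817},"payload":{"Normal":{"txid":null,"time_ms":1730713039741,"cmd":{"AddNode":{"node_id":2,"node":{"name":"2","endpoint":{"addr":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local:9191"},"overriding":true}}}}}}}]',
    '["state_machine/0",{"Nodes":{"key":2,"value":{"name":"2","endpoint":{"addr":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local","port":28004},"grpc_api_advertise_address":"databend-meta-2.databend-meta.databend-meta.svc.cluster.local:9191"}}}]',
]

log_addrs, sm_addrs = advertised_addresses(SAMPLE)
for node_id in sorted(sm_addrs):
    flag = "raw IP" if is_raw_ip(sm_addrs[node_id]) else "DNS name"
    print(f"node {node_id}: state machine advertises {sm_addrs[node_id]} ({flag})")
```

Run against the full dump, this would show whether the raw IPs survive only in older raft_log entries or also in the state_machine Nodes records, which is the part that matters when deciding what still needs to be overwritten.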