- Notifications
You must be signed in to change notification settings - Fork914
fix: use TSMP for pings and checking reachability#11306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Current dependencies on/for this PR: Thisstack of pull requests is managed byGraphite. |
5933e85
to0576b65
CompareSigned-off-by: Spike Curtis <spike@coder.com>
@@ -186,7 +186,7 @@ func TestAgent_Stats_Magic(t *testing.T) { | |||
// If it isn't, it's set to -1. | |||
s.ConnectionMedianLatencyMS >= 0 | |||
}, testutil.WaitLong, testutil.IntervalFast, | |||
"never saw stats: %+v", s, | |||
"never saw stats", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This gives misleading information since the pointer is passed on the initial call toEventually
. It is not updated since<-stats
returns a new pointer, rather than updating the data at the old pointer. Changed in favor of logging the relevant info as we get it.
s, ok := <-stats | ||
t.Logf("got stats after disconnect %t, %d", | ||
ok, s.SessionCountJetBrains) | ||
return ok && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
TSMP ping counts as a "connection" in the agent stats so it never goes to zero. I'm not sure why we were testing it going to zero in the first place@code-asher
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think we wanted to make sureConnectionCount
was also updating correctly, but did not realize there would be other connections going on.
Merge activity
|
Uh oh!
There was an error while loading.Please reload this page.
We're seeing some flaky tests related to agent connectivity -https://github.com/coder/coder/actions/runs/7286675441/job/19856270998
I'm pretty sure what happened in this one is that the client opened a connection while the wgengine was in the process of reconfiguring the wireguard device, so the fact that the peer became "active" as a result of traffic being sent was not noticed.
The test calls
AwaitReachable()
but this only tests the disco layer, so it doesn't wait for wireguard to come up.I think we should be using TSMP for pinging and reachability, since this operates at the IP layer, and therefore requires that wireguard comes up before being successful.
This should also help with the problems we have seen where a TCP connection starts before wireguard is up and the initial round trip has to wait for the 5 second wireguard handshake retry.
fixes:#11294