Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

fix(agent): send metadata in batches#10225

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
mafredri merged 2 commits intomainfrommafredri/feat-agent-send-metadata-in-batches
Oct 13, 2023

Conversation

mafredri
Copy link
Member

@mafredrimafredri commentedOct 11, 2023
edited
Loading

Fixes#9782


I recommend reviewing with ignore whitespace.

@mafredriGraphite App
Copy link
MemberAuthor

@mafredrimafredri changed the titlefeat(agent): send metadata in batchesfix(agent): send metadata in batchesOct 11, 2023
@mafredrimafredriforce-pushed themafredri/feat-agentsdk-use-agent-metadata-batch-endpoint branch fromeeb4adb to69ebb05CompareOctober 11, 2023 18:42
@mafredrimafredriforce-pushed themafredri/feat-agent-send-metadata-in-batches branch from82b8659 tob7958c3CompareOctober 11, 2023 18:44
@mafredrimafredriforce-pushed themafredri/feat-agentsdk-use-agent-metadata-batch-endpoint branch from69ebb05 toa3395acCompareOctober 12, 2023 08:30
@mafredrimafredriforce-pushed themafredri/feat-agent-send-metadata-in-batches branch fromb7958c3 todd6935eCompareOctober 12, 2023 08:30
@mafredrimafredriforce-pushed themafredri/feat-agentsdk-use-agent-metadata-batch-endpoint branch froma3395ac to7e7f6c0CompareOctober 12, 2023 10:26
@mafredrimafredriforce-pushed themafredri/feat-agent-send-metadata-in-batches branch fromdd6935e to431637aCompareOctober 12, 2023 10:27
updatedMetadata[mr.key]=mr.result
continue
case<-report:
iflen(updatedMetadata)>0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

If I'm understanding this correctly - say you have two agent metadata definitions, one which updates every 1 second, the other which updates every 10 seconds.

In the previous request-per-metadata-key approach, this would cause (0.1 + 1) 1.1 requests per second, while in this new approach we would end up with a minimum of 1 requests per second, as the more frequently updated metadatum would cause a batch request.

I'm think we should update the documentation with this change to reflect the new behaviour. I think it would also make sense to recommend users to keep in mind the minimum metadata refresh interval when writing their templates; any metadatum with a refresh interval of 1 second will cause frequent metadata updates from what I understand here.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

In the previous request-per-metadata-key approach, this would cause (0.1 + 1) 1.1 requests per second, while in this new approach we would end up with a minimum of 1 requests per second.

I'm not sure where0.1 comes from in your example, but for simplicity sake, let's say we have 1s and 2s intervals for the metadata, and each take 0ns to execute. In the previous implementation we would approximately do:

14:00:00 POST meta114:00:00 POST meta214:00:01 POST meta114:00:02 POST meta114:00:02 POST meta214:00:03 POST meta1

In the new implementation, we would:

14:00:00 POST meta1, meta214:00:01 POST meta114:00:02 POST meta1, meta214:00:03 POST meta1

With 2s and 3s metadata, it would look like this:

Old:

14:00:00 POST meta114:00:00 POST meta214:00:02 POST meta114:00:03 POST meta214:00:04 POST meta114:00:06 POST meta114:00:06 POST meta2

New:

14:00:00 POST meta1, meta214:00:02 POST meta114:00:03 POST meta214:00:04 POST meta114:00:06 POST meta1, meta2

This is an approximation of ideal conditions, though. And perhaps we should separatecollect andsend triggers by, say, 500ms. This would increase the likelyhood of ideal batching.

With regards to RPS the new implementation reduces RPS and doesn't necessarily guarantee 1 RPS, it depends on interval and how long commands take to execute.

I'm think we should update the documentation with this change to reflect the new behaviour.

Are you referring to thishttps://coder.com/docs/v2/latest/templates/agent-metadata#db-write-load?

The write load will be about the same, but the writes will be more performant since they're batched. I suppose we could call one batch one write, even though we're updating multiple rows. I'll amend this part.

Copy link
Member

@johnstcnjohnstcnOct 12, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Wouldn't an agent metadatum with a refresh interval of 10 seconds cause 1 request roughly every 10 seconds (i.e. 0.1 RPS)?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes, this is the case in both implementations.

johnstcn reacted with thumbs up emoji
Copy link
Member

@johnstcnjohnstcn left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The change to the docs is the cherry on top here!

-You can expect(10 * 6 * 2) / 4 or 30 writes per second.
+You can expect at most(10 * 2 / 4) + (10 * 2 / 6) or ~8 writes per second.

Copy link
Member

@mtojekmtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

As you refactored this function a bit, I'm wondering if it is possible to prepare extra unit tests for healthy/unhealthy conditions. I'm thinking about singleflight vs. duplicate backpressure,coderd is unavailable, etc.

case<-collect:
}

iflen(manifest.Metadata)>metadataLimit {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We don't need this condition anymore? just in case...

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Previously there was a buffered channel, but that's no longer required so this artificial restriction got lifted.If we want to limit this, it should be in the coder tf provider, saying too many metadata items were added.

mtojek reacted with thumbs up emoji
reportSemaphore<-struct{}{}
}()

err:=a.client.PostMetadata(ctx, agentsdk.PostMetadataRequest{Metadata:metadata})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This behavior doesn't change, right:

If the agent fails to send the metadata, it will be lost. There is no "retry" mechanism in place now?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Correct, we don't retry since trying to send stale data wouldn't make much sense, we instead wait until the next update is available and try to send it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ok! I guess we can debate about the lack of source data for insights, but that's a separate issue.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ah, this is not used for insights, so it will not have an effect on that.

// benefit from canceling the previous send and starting a new one.
var (
updatedMetadata=make(map[string]*codersdk.WorkspaceAgentMetadataResult)
reportTimeout=30*time.Second
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

There are many magic timeouts hidden in this file. I'm wondering if we shouldn't move them to the top of file, and make them more dependent. For instance: instead of5sec ->reportTimeout * 1/6

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

With 5 sec, are you referring to the case wheremd.Timeout is not set (or zero), and we fall back to 5 sec?

I wanted to change as little about the metadata collection as possible, but I can go over this and see what can be consolidated.

This 30 second timeout was randomly picked by me as a crutch for scenarios where the network is super slow or down. It should be enough for at least something to get through, whilst avoiding sending very stale data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I see. I'm afraid that there are too many timeouts depending on each other in the entire implementation, and it might be hard to debug potential queueing problems.

Comment on lines +376 to +377
report=make(chanstruct{},1)
collect=make(chanstruct{},1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Would it change something if we increase buffers here?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes. They are single buffer to be non-blocking and not cause back-pressure. If we increase channel sizes then we may have back-pressure if an operation is taking longer than expected. Say collect was 10, if one iteration of starting collection took 10s then this channel would fill and afterwards 10 collections would start immediately after each other, even if they now finish in 0.1s each, possibly leading to 10 collections (and load) within a second (vs once per second).

mtojek reacted with thumbs up emoji
// The last collected value isn't quite stale yet, so we skip it.
ifcollectedAt.Add(time.Duration(md.Interval)*intervalUnit).After(time.Now()) {
return
ctxTimeout:=time.Duration(timeout)*time.Second
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I might be lost in the flow here... why is this 1sec timeout needed?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This also didn't change in this PR.

The timeout here can be 1s or it can be higher. I suppose the logic is that if you want metadata to update every 1s, collection should run faster than 1s, but this is likely not ideal.

We could change this in the future or if it becomes a problem.

mtojek reacted with thumbs up emoji
Copy link
Member

@mtojekmtojek left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I have better understanding of the code now. LGTM!

@ammarioammario removed their request for reviewOctober 12, 2023 15:32
@mafredrimafredriforce-pushed themafredri/feat-agentsdk-use-agent-metadata-batch-endpoint branch from7e7f6c0 to161c535CompareOctober 13, 2023 13:10
@mafredrimafredriforce-pushed themafredri/feat-agent-send-metadata-in-batches branch from4e3730e to44586d1CompareOctober 13, 2023 13:10
@mafredrimafredriforce-pushed themafredri/feat-agentsdk-use-agent-metadata-batch-endpoint branch 2 times, most recently from06d1c13 toe89259aCompareOctober 13, 2023 14:19
@mafredrimafredriforce-pushed themafredri/feat-agent-send-metadata-in-batches branch from44586d1 to3dd45f4CompareOctober 13, 2023 14:19
Base automatically changed frommafredri/feat-agentsdk-use-agent-metadata-batch-endpoint tomainOctober 13, 2023 14:32
@mafredriGraphite App
Copy link
MemberAuthor

mafredri commentedOct 13, 2023
edited
Loading

Merge activity

  • Oct 13, 10:32 AM:Graphite rebased this pull request after merging its parent, because this pull request is set to merge when ready.
  • Oct 13, 10:48 AM:@mafredri merged this pull request withGraphite.

@mafredrimafredriforce-pushed themafredri/feat-agent-send-metadata-in-batches branch from3dd45f4 to4758603CompareOctober 13, 2023 14:32
@mafredrimafredri merged commit76c65b1 intomainOct 13, 2023
@mafredrimafredri deleted the mafredri/feat-agent-send-metadata-in-batches branchOctober 13, 2023 14:48
@github-actionsgithub-actionsbot locked and limited conversation to collaboratorsOct 13, 2023
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.

Reviewers

@johnstcnjohnstcnjohnstcn approved these changes

@mtojekmtojekmtojek approved these changes

Assignees

@mafredrimafredri

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

batch coder agent metadata

3 participants

@mafredri@johnstcn@mtojek

[8]ページ先頭

©2009-2025 Movatter.jp