- Notifications
You must be signed in to change notification settings - Fork1.1k
WIP perf: significantly reduce DB calls toGetWorkspaceByAgentID via caching workspace information in Agent API#20662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Open
cstyan wants to merge2 commits intomainChoose a base branch fromcallum/workspace-agent-db-call-pt2
base:main
Could not load branches
Branch not found:{{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline, and old review comments may become outdated.
+780 −29
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
Signed-off-by: Callum Styan <callumstyan@gmail.com>
GetWorkspaceByAgentID for agent metadata updatesSigned-off-by: Callum Styan <callumstyan@gmail.com>
GetWorkspaceByAgentID via caching workspace information in Agent APIGetWorkspaceByAgentID via caching workspace information in Agent APISign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading.Please reload this page.
This query is still relatively expensive because of its call volume, at least 1 million times per day over the last week. After#20430 the two main remaining call paths are via the Workspace Agents Stats and Metadata APIs.
Disclaimer: Blink helped write tests, identify usage of
database.Workspacefields in downstream functions, identify existing pattern for and write the ticker function, and write comments.This PR adds caching of the Workspace information on the AgentAPI struct for usage in both child APIs, Notably, for the Metadata API we need to also implement a "fast path" for the section that currently calls
GetWorkspaceByAgentIDsince it's nested within a dbauthz call. We do this by attaching the relevant RBAC object from the Workspace object to the request context, then checking inUpdateWorkspaceAgentMetadataif we have the RBAC object in the contextand if it is correct for authorization purposes. If we do not have the object or it is invalid we fall back to still callingGetWorkspaceByAgentID.Note that while the workspace agent API struct is constructed at startup and (IIUC) is persistent for the lifetime of the agent/workspace, and makes a call to
GetWorkspaceByAgentIDduring this process, we also need to handle updating of some of the cached fields since the entries within the databasecan change as part of the workspace prebuild process. We could potentially due this via subscribing, via pubsub, to updates about prebuild workspace claims. However, for now this has simply been solved with a 5m ticker to make a call toGetWorkspaceByAgentID.This should mean we on average call
GetWorkspaceByAgentIDonce every 5 minutes per agent, as opposed to multiple times per-minute per agent for the Stats API and again for the Metadata API. Even in the worst case scenario of a workspace prebuild being claimed at t=N and the next ticker not occurring until t=N+5minutes we should have at most 5 minutes of the current behaviour for each agent in a workspace.NOTE: there is a remaining issue here; caching of the workspace for stats as it is currently implemented is actually invalid, at least for the ~5 minute period between when a prebuild is claimed and the next update of the ticker the cached Owner information will be incorrect and we'll update some incorrect metrics + skip the entire
if !workspace.IsPrebuild()block.We can either:
GetWorkspaceByAgentIDfor every stats updateGetWorkspaceByAgentIDif the workspace is a prebuild, and the first time itis not we could grab the lock and update the cached workspace informationGetWorkspaceByAgentID