- Notifications
You must be signed in to change notification settings - Fork905
feat: add cli command to report task status#18262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Uh oh!
There was an error while loading.Please reload this page.
Conversation
7d3cfe5
to80543df
Comparee2b8d8a
to04b4e91
CompareSince we will be sending status updates without messages on an interval,we need to prevent duplicates and preserve the previous message.Also, I think this makes the endpoint more in alignment with PATCHsemantics.
It was using the latest state for all statuses.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
if err != nil && !errors.Is(err, context.Canceled) { | ||
cliui.Warnf(inv.Stderr, "failed to fetch status: %s", err) | ||
} else { | ||
// Currently we only update the status, which leaves the last summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
AgentAPI reports that an agent is running whenever the underlying agent's terminal output changes. This also happens when a user uses the "control" mode to send raw keystrokes to the terminal. Once we implement web push notifications, they might start going off annoyingly in control mode.
A more robust solution might also take into account:
- agentapi conversation length
- latest app summary
If we reported that the status is stable for a given length and summary, we could stop sending updates until either length or summary changes. If these values don't change, it likely means the user is just fiddling in the terminal rather than the agent doing actual work.
To avoid data races with the agent, the status update endpoint would:
- need to be atomic (i.e. a single transaction with a for update lock on the workspace)
- optionally accept a "latest_summary" argument, which would be the latest app status summary the reporting client has seen. If it doesn't match the current summary, the endpoint would skip the update.
In cases where the conversation length increases but the app status doesn't change, we could send a "Working..." summary update, which the agent would later replace with a richer description via MCP.
code-asherJun 6, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
That makes sense.
If we reported that the status is stable for a given length and summary, we could stop sending updates until either length or summary changes. If these values don't change, it likely means the user is just fiddling in the terminal rather than the agent doing actual work.
Should we make these checks in agentapi itself? It feels surprising to me (as a casual user of agentapi) that I could hit/status
and it might reportrunning
because of my own typing, especially since the description says it is the status of the agent.
optionally accept a "latest_summary" argument, which would be the latest app status summary the reporting client has seen. If it doesn't match the current summary, the endpoint would skip the update
I have the endpoint skipping the update if the summary is unchanged by checking the last summary in the DB. We could have the client send the last summary, but we already have it in the DB, so it seems more robust to me if we just fetch it as part of the API call. Thoughts?
need to be atomic (i.e. a single transaction with a for update lock on the workspace)
Oh this is perfect, I was not entirely sure what mechanisms I could or should use for this, the for update lock looks perfect ty tyEdit: maybe I should do the lock in code, to not block anything unrelated to statuses trying to update the workspace? Another thought I had was that maybe it could be done as some kind of procedure or trigger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Should we make these checks in agentapi itself?
It's a good feature request, but it's a bit hard to do because of how agentapi is currently implemented. The "control mode" is an escape hatch that bypasses standard agentapi message processing. I'm worried that if I started refactoring the logic now it'd take too long, and we only have until the June 25th code freeze to ship tasks early access. I believe adding the check here would be more straightforward. Would you be ok with that?
it seems more robust to me if we just fetch it as part of the API call. Thoughts?
I'm worried about the following data race:
- agent submits a "working" update with summary A. coderd receives it
- task reporter sees that the agent went idle and submits a "completed" update. the update is in transit, coderd hasn't received it yet
- user sends a new message to the agent, agent submits a "working" update with summary B, coderd receives it
- coderd receives the "completed" update from point 2. It marks status with summary B as "completed" even though the agent is still working on a task.
If the task reporter submitted the summary it wanted to update as part of the message, coderd could detect a mismatch by comparing it with what's in the db and ignore the update in point 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It's a good feature request, but it's a bit hard to do because of how agentapi is currently implemented
Aw I was hoping I could make it work by just checking if the message count or summary had changed in agentapi, but I guess I forgot we have no summary there so I see what you mean, we would have to fix it in some other way that requires a more fundamental change. 😭
I'm worried about the following data race
Ahhhhhhhhh I see. If it was just the task reporter, there would be no out-of-order requests because it blocks waiting for the response, but with both the task reporter and the agent there could be out-of-order requests. 🤔
I think, though, because the task reporter would need to receive the last summary out of band, this introduces its own race. Something like:
- agent sends
{status: working, summary: doing thing 1, last_summary: <blank>}
, coderd receives - task reporter sees idle, requests the current summary
- agent sends
{status: working, summary: doing thing 2, last_summary: doing thing 1}
, coderd receives - coderd receives 2, sends back
{current_summary: doing thing 2}
- task reporter sends
{status: complete, summary: <blank>, last_summary: doing thing 2}
- coderd does not discard 5 when it should because
last_summary == current_summary
Also I wonder if the agent could race against itself, I am not sure if there are concurrency limits on the report tool in the MCP server.
Discarding might be unfortunate anyway though because then you lose the status history which we display on the workspace page. Personally I am not sure we even need the history though, maybe we could just store the latest status?
Hmmm...maybe we could add aDate
header so we can insert the requests in the right order? Or, if we move this code into the MCP server itself and ensure correct ordering for all requests in one place? The date seems cool though, lets any number of clients update the status.
If we reported that the status is stable for a given length and summary, we could stop sending updates until either length or summary changes. If these values don't change, it likely means the user is just fiddling in the terminal rather than the agent doing actual work.
- Right now the task reporter does not have access to the last summary, so we would have to fetch it but for the racy reasons above I think we cannot do this. If we merged the code into the MCP server though?
- Even if we have the summary, the agent not sending it reliably could be an issue. So if the user types a new prompt without using the API and the agent starts running, that all goes into the last message (I think I have that right? From what I read we only add a new message after a user message is explicitly added through the API). So we may never get a new summary, the message count will be unchanged, and we will skip the update (similar to the original problem).
Maybe we could try detecting if someone is typing in the prompt area, or if it looks like the screen is only changing because it is being scrolled or something, seems tricky though.
The only other thing I can think of, if we need to continue supporting direct terminal access, is if we abandon running the TUIs and create our own wrapper that lets us accurately (maybe??) report when an agent is executing.
Uh oh!
There was an error while loading.Please reload this page.
Plus some modification to the PATCH endpoint to make it more PATCH-like.
The command loops itself, but I wonder if we should handle looping externally.