Description
Context
@Kludex suggested opening an issue based on this Slack thread.
I believe that access to the request–response duration for a model call should be a built-in metric, on par with what is provided in `usage`.
Currently, you have to manually `iter()` through the agent graph to track inner request–response timings, which is cumbersome.
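For illustration, here is roughly what that manual timing looks like today (a sketch only: the model name is arbitrary, and the measurement is approximate, since it times the gap between a `ModelRequestNode` being yielded and the next node appearing):

```python
import time

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')  # illustrative model

async def timed_run(prompt: str):
    # Walk the agent graph manually. A node yielded by the iterator is
    # executed before the next node is yielded, so the gap between a
    # ModelRequestNode and its successor approximates the model call time.
    async with agent.iter(prompt) as run:
        start: float | None = None
        async for node in run:
            if start is not None:
                print(f'model call took {time.monotonic() - start:.2f}s')
                start = None
            if Agent.is_model_request_node(node):
                start = time.monotonic()
        return run.result
```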
Problem
`ModelResponse` holds a `timestamp`, but it is somewhat unreliable, as it is defined as:

> The timestamp of the response. If the model provides a timestamp in the response (as OpenAI does), that will be used.

Conversely, `ModelRequest` does *not* have a timestamp. Its individual parts do (presumably set at build time), but the full request itself is not timestamped when it is actually sent.
`ModelResponse` does have a timestamp, presumably set when the response is received, but its parts do not, which makes sense since all parts arrive in the same response.
This leads to ambiguity and unnecessary work for anyone wanting to measure or log latency.
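For example, the closest approximation one can compute from the message history today mixes two different clocks (a sketch; assumes the run ends with a plain request/response pair and that the last request part carries a timestamp):

```python
from datetime import timedelta

def approx_latency(result) -> timedelta:
    # `result` is an agent run result. The request *part* timestamp is set
    # when the part is built, and the response timestamp may come from the
    # provider's clock, so this difference is only a rough latency estimate.
    request, response = result.all_messages()[-2:]
    return response.timestamp - request.parts[-1].timestamp
```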
Proposal
- Add a `timestamp` field (local, UTC) to `ModelRequest`, marking the actual send time.
- Clarify that the `timestamp` in `ModelResponse` is the local time the response was received.
- Add a `provider_timestamp` field to `ModelResponse` (if available, otherwise `None`).
- Add a `duration` field (`timedelta` or `float` in seconds) to `ModelResponse`, computed as `response.timestamp - request.timestamp`.
- (Optional) Add a `duration` field to agent runs, to capture total run time.
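A rough sketch of how these fields could look on the message dataclasses (defaults and docstrings are illustrative, not the actual pydantic-ai definitions):

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

def _now_utc() -> datetime:
    return datetime.now(timezone.utc)

@dataclass
class ModelRequest:
    # ... existing fields (parts, instructions, ...) elided ...
    timestamp: datetime = field(default_factory=_now_utc)
    """Proposed: local (UTC) time at which the request is actually sent."""

@dataclass
class ModelResponse:
    # ... existing fields (parts, usage, model_name, ...) elided ...
    timestamp: datetime = field(default_factory=_now_utc)
    """Clarified: local (UTC) time at which the response was received."""
    provider_timestamp: datetime | None = None
    """Proposed: timestamp reported by the provider, if available, else None."""
    duration: timedelta | None = None
    """Proposed: response.timestamp - request.timestamp."""
```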
Note also that the graph persistence API already tracks both `ts` (timestamp) and `duration` for `NodeSnapshot` and `EndSnapshot`.