Expose usage counts in OpenAI streamed responses (Fixes #2003) #2016
Conversation
…acking-in-streaming-functions: Add streaming usage totals to AI chunks
CLAassistant commented Nov 24, 2025
Nihhaar Saini seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
f3799eb to 44d31c2 (compare)
Salazareo left a comment
Need to run tests on this before merging, but it generally looks good to me.
Salazareo left a comment
Some changes are required here.
I'm also wondering if this should just be implemented as part of the puterai stream instead of here in Claude, but it might be too specific to the model.
@ProgrammerIn-wonderland, since you've been mostly working on the AI stuff, what do you think?
const init_chat_stream = async ({ chatStream }) => {
    const completion = await anthropic.messages.stream(sdk_params);
    const usageSum = {};
    const runningUsage = {
- const runningUsage = {
+ const runningUsage = this.usageFormatterUtil({});
This should match the actual Claude usage fields to make it more visible.
// Each emitted content block now carries an incremental usage object
// ({ input_tokens, output_tokens, total_tokens }) for live metering.
const getUsage = () => ({
You can just spread the data to copy it:
{ ...runningUsage }
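The reviewer's point can be sketched as follows: spreading produces a shallow snapshot of the accumulator, which is exactly what the `getUsage` helper returns, so the helper is redundant. Field names below mirror the PR's `runningUsage` shape; the update values are illustrative.

```javascript
// A mutable accumulator, as in the PR's runningUsage object.
const runningUsage = { input_tokens: 0, output_tokens: 0, total_tokens: 0 };

// Closure-based helper, as written in the PR under review.
const getUsage = () => ({ ...runningUsage });

// Simulate a metering update.
runningUsage.input_tokens += 10;
runningUsage.output_tokens += 5;
runningUsage.total_tokens += 15;

// Both approaches yield an independent shallow copy:
const viaHelper = getUsage();
const viaSpread = { ...runningUsage };

// Later mutations of the accumulator do not affect either snapshot.
runningUsage.output_tokens += 99;
console.log(viaHelper.output_tokens, viaSpread.output_tokens); // 5 5
```

Since both forms are equivalent, inlining `{ ...runningUsage }` at each emit site removes one level of indirection.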
const payload = {
    type: 'text',
    text,
    usage: getUsage(),
- usage: getUsage(),
+ usage: { ...runningUsage },
input: JSON.parse(buffer),
...(block.contentBlock?.text ? {} : { text: '' }),
type: 'tool_use',
usage: getUsage(),
- usage: getUsage(),
+ usage: { ...runningUsage },
...block.contentBlock,
input: JSON.parse(buffer),
...(block.contentBlock?.text ? {} : { text: '' }),
type: 'tool_use',
This needs to go at the top of the block, since spreading block.contentBlock might override it; that also matches the existing method.
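The ordering concern comes down to how object spread resolves duplicate keys: the last occurrence wins. A minimal sketch, with an assumed `contentBlock` shape chosen only to show the collision:

```javascript
// Suppose the streamed contentBlock already carries a `type` field.
const contentBlock = { type: 'tool_use_delta', id: 'tu_1', text: 'partial' };

// Spread last: contentBlock's `type` clobbers the explicit one.
const wrong = { type: 'tool_use', ...contentBlock };

// Spread first (as the reviewer asks): explicit fields take precedence.
const right = { ...contentBlock, type: 'tool_use' };

console.log(wrong.type); // 'tool_use_delta'
console.log(right.type); // 'tool_use'
```

Placing `...block.contentBlock` first guarantees the handler's own `type` and `usage` fields survive whatever the SDK puts on the block.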
if (!usageSum[key]) usageSum[key] = 0;
usageSum[key] += meteredData[key];
});
runningUsage.input_tokens += meteredData.input_tokens || 0;
- runningUsage.input_tokens += meteredData.input_tokens || 0;
+ for (const usageType in runningUsage) {
+     runningUsage[usageType] += meteredData[usageType] || 0;
+ }
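The suggested loop generalizes the per-field additions. One detail worth keeping: streamed chunks may report only some fields, so defaulting missing values to 0 keeps the totals numeric. A self-contained sketch (field names follow Anthropic's usage shape; `meteredData` is the PR's variable name):

```javascript
const runningUsage = { input_tokens: 0, output_tokens: 0, total_tokens: 0 };

// Fold one chunk's metered data into the running totals.
// Missing fields default to 0 so the totals never become NaN.
const addUsage = (meteredData) => {
    for (const usageType in runningUsage) {
        runningUsage[usageType] += meteredData[usageType] || 0;
    }
};

addUsage({ input_tokens: 12 });                      // no output tokens yet
addUsage({ output_tokens: 7, total_tokens: 19 });

console.log(runningUsage); // { input_tokens: 12, output_tokens: 7, total_tokens: 19 }
```

Iterating over `runningUsage`'s own keys (rather than `meteredData`'s) also means unexpected fields on a chunk are ignored instead of silently added to the accumulator.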
Fixes #2003
This PR adds token usage exposure for streamed OpenAI responses: it sets stream_options.include_usage when stream: true. The Claude implementation is already complete; this PR finishes OpenAI support.
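For consumers of the stream, the behavior this enables looks roughly like the sketch below: with `stream_options: { include_usage: true }`, OpenAI delivers token counts on a final chunk whose `choices` array is empty. The chunks here are simulated objects in that shape, not real API calls.

```javascript
// Simulated chunk sequence in the shape OpenAI emits when
// stream_options.include_usage is set alongside stream: true.
const chunks = [
    { choices: [{ delta: { content: 'Hel' } }], usage: null },
    { choices: [{ delta: { content: 'lo' } }], usage: null },
    // Final chunk: empty choices, cumulative usage for the whole stream.
    { choices: [], usage: { prompt_tokens: 9, completion_tokens: 2, total_tokens: 11 } },
];

let text = '';
let usage = null;
for (const chunk of chunks) {
    for (const choice of chunk.choices) {
        text += choice.delta.content ?? '';
    }
    if (chunk.usage) usage = chunk.usage; // arrives once, at the end
}

console.log(text, usage.total_tokens); // Hello 11
```

Callers that ignore the trailing usage-only chunk keep working, since it contributes no text deltas.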