Support image output #1130
@Kludex Are you planning to work on this, or are we better off closing it for now?
This is still on my radar; I prefer to keep it open.
ollz272 commented Jun 30, 2025
Hi, I would love access to this feature. Is there an ETA?
lshamis commented Jul 7, 2025
I think something like this will be necessary sooner rather than later. Many models can/will generate interleaved multimodal content. Slightly philosophical question, but why are the output types of an LLM different from those of ToolCall?
@lshamis Because the types of data LLMs support as input (whether that's via the user prompt or as a tool call result) are not the same as the types of data they can output. For example, all models support text input and text output, and many support image, video, audio, and document input, but only a handful support image output, and as far as I know none can output e.g. PDF files. So there's necessarily a difference between the types of things we allow tools to output (anything that can be sent back to the model as input) and what models themselves can output.
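To make the distinction in the comment above concrete, here is a minimal sketch in plain Python. All of the names (`Text`, `Image`, `Document`, `ToolOutput`, `ModelOutput`, `is_valid_model_output`) are hypothetical and chosen only for illustration; they are not the library's actual types, and the exact set of supported kinds is an assumption based on the examples in the comment.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical content types, for illustration only.
@dataclass
class Text:
    content: str


@dataclass
class Image:
    data: bytes
    media_type: str = "image/png"


@dataclass
class Document:
    data: bytes
    media_type: str = "application/pdf"


# A tool result is sent back to the model as input, so a tool may
# return anything the model accepts as input.
ToolOutput = Union[Text, Image, Document]

# Models can only produce a narrower set of content themselves
# (assumed here: text and images, but not documents such as PDFs).
ModelOutput = Union[Text, Image]


def is_valid_model_output(part: object) -> bool:
    """Return True if `part` is a kind of content a model could emit."""
    return isinstance(part, (Text, Image))
```

Under these assumptions, a `Document` is a valid tool result but is rejected as model output, which is why the two unions cannot simply be merged.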
Still a lot to do, and decide... It's still not type safe, and can't use `message_history` properly. The `main.py` in the files already works, though.