Support image output #1130

Draft
Kludex wants to merge 1 commit into main from playing-with-gemini-images-output

Conversation

@Kludex (Member) commented Mar 15, 2025 (edited)

Still a lot to do, and decide... It's still not type safe, and can't use message_history properly.

The main.py in the files already works tho.

[image]

@Kludex marked this pull request as draft March 15, 2025 13:20
@github-actions (GitHub Actions)

Docs Preview

commit: e8ba35b
Preview URL: https://6c9c1503-pydantic-ai-previews.pydantic.workers.dev

@DouweM (Contributor)

@Kludex Are you planning to work on this or are we better off closing it for now?

@Kludex (Member, Author)

This is still on my radar; I'd prefer to keep it open.

@ollz272

Hi, would love access to this feature. Is there an ETA?

@lshamis

I think something like this will be necessary sooner rather than later. Many models can/will generate interleaved multimodal content.

Slightly philosophical question, but why are the output types of an LLM different from those of ToolCall?

@DouweM (Contributor)

> Slightly philosophical question, but why are the output types of an LLM different from those of ToolCall?

@lshamis Because the types of data LLMs support as input (whether that's via the user prompt or as a tool call result) are not the same as the types of data they can output. For example, all models support text input and text output, and many support image, video, audio, and document input, but only a handful support image output, and as far as I know none can output e.g. PDF files. So there's necessarily a difference between the types of things we allow tools to output (as it's anything that can be sent back to the model as input) and what models themselves can output.
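
To illustrate the distinction above, here is a minimal sketch in plain Python. It is not the library's actual type hierarchy: `Text`, `Image`, `Document`, `ToolReturn`, `ModelOutput`, and `can_be_model_output` are all hypothetical names invented for this example. The idea is only that tool results may be anything a model accepts as input, while model outputs are a narrower set that depends on the model.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical content parts, for illustration only.
@dataclass
class Text:
    content: str


@dataclass
class Image:
    data: bytes
    media_type: str = "image/png"


@dataclass
class Document:
    data: bytes
    media_type: str = "application/pdf"


# A tool result is sent back to the model as input, so it can be
# anything the model is able to read.
ToolReturn = Union[Text, Image, Document]

# What a model can emit is narrower: text always, images only for
# some models, and (per the comment above) no PDF documents.
ModelOutput = Union[Text, Image]


def can_be_model_output(part: ToolReturn, supports_image_output: bool) -> bool:
    """Return True if a hypothetical model could emit this kind of part."""
    if isinstance(part, Text):
        return True
    if isinstance(part, Image):
        return supports_image_output
    # Documents are valid tool results but not valid model outputs here.
    return False
```

Under these assumptions, a `Document` is a valid `ToolReturn` but never a `ModelOutput`, and an `Image` only qualifies when the model in question is flagged as supporting image output.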

Reviewers: No reviews
Assignees: @Kludex
Labels: None yet
Projects: None yet
Milestone: No milestone
Development

Successfully merging this pull request may close these issues.

Support for model Gemini Flash 2.0 Image Generation
4 participants: @Kludex, @DouweM, @ollz272, @lshamis
