- Notifications
You must be signed in to change notification settings - Fork673
Decoder metadata interface#2672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
mstoeckl commentedNov 27, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
With this framing, I think Limits::max_image_width and Limits::max_image_height no longer need to be communicated to or handled by the ImageDecoder trait, because the external code can check ImageDecoder::dimensions() before invoking ImageDecoder::read_image(); only the memory limit (Limits::max_alloc) is essential. That being said, the current way Limits are handled by ImageDecoder isn't that awkward to implement, so to reduce migration costs keeping the current ImageDecoder::set_limits() API may be OK. |
fintelia commentedNov 27, 2025
A couple thoughts... I do like the idea of handling animation decoding with this same trait. To understand, are you thinking of "sequences" as being animations or also stuff like the multiple images stored in a TIFF file? Even just handling animation has some tricky cases though. For instance in PNG, the default image that you get if you treat the image as non-animated may be different from the first frame of the animation. We might need both a The addition of an |
197g commentedNov 27, 2025
It's a dyn-compatible way that achieves the goal of the constructor so it is actually an abstraction.
What do you by this? The main problem in I'm also not suggesting that calling |
197g commentedDec 4, 2025
@fintelia This now includes the other changes including to
|
RunDevelopment commentedDec 6, 2025
I can't speak about image metadata, but I really don't like the new
Regarding rectangle decoding, I think it would be better if we force decoders to support arbitrary rects. That's because the current interface is actually less efficient by allowing decoder to support only certain rects. To read a specific rect that is not supported as is, However, most image formats are based on lines of block (macro pixels). So we can do a trick. Decode a line according to the too-large rect, and then only copy the pixels in the real rect to the output buffer. This reduces the memory overhead for unsupported rects from And if a format can't do the line-based trick for unsupported rects, then decoders should just allocate a temp buffer for the too-large rect and then crop (=copy what is needed). This is still just as efficient as the best For use cases where users can use rowpitch to ignore the exccess parts of the too-large rect, we could just have a method that gives back a preferred rect, which can be decoded very efficiently. So the API could look like this: traitImageDecoder{// .../// Returns a viewbox that contains all pixels of the given rect but can potentially be decoded more efficiently./// If rect decoding is not supported or no more-efficient rect exists, the given rect is returned as is.fnpreferred_viewbox(&self,viewbox:Rect) ->Rect{ viewbox// default impl}fnread_image_rect(&mutself,buf,viewbox) ->ImageResult{Err(ImageError::Decoding(Decoding::RectDecodingNotSupported))// or similar} This API should make rect decoding easier to use, easier to implement, and allow for more efficient implementations. |
86c9194 tocdc0363Compare197g commentedDec 7, 2025
That was one of the open questions, the argument you're presenting makes it clear it should return the layout and that's it. Renamed to
It's suppose to be to the full image. Yeah, that needs more documentation and pointers to the proper implementation. |
| self.limits.check_dimensions(layout.width, layout.height)?; | ||
| // Check that we do not allocate a bigger buffer than we are allowed to | ||
| // FIXME: should this rather go in `DynamicImage::from_decoder` somehow? | ||
| self.limits.reserve(layout.total_bytes())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This has a subtle behavior change. Previously we reserved the space for the output buffer andthen passed the reduced memory limit onto the decoder to limit its metadata/scratch buffer allocations. But at this point in the code we've already told the decoder that it is free to use the entire memory limit amount for those purposes. So we effectively only limiting memory to 2x the requested amount.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
That's true. The quicker fix here is to pass limits only once we have gotten the layout which would be closest to what we're doing right now on the main branch but obviously not really address if decoders are able to handle it. Ingif we .. duplicate the allocation arbitrarily, right now, too:
Lines 172 to 174 in7f44147
| self.limits.reserve_usize(buffer_size)?; | |
| letmut frame_buffer =vec![0; buffer_size]; | |
| self.limits.free_usize(buffer_size); |
We must rework the limits system regardless if it is to work with animations, too. That is to say at some point the struct will have been moved into the decoder but some higher level part of IO is still going to do allocations; but also de-allocations (!). Should those count? How do we access the limits if so?
For consuming animations you'd typically have a swapchain-like fixed capacity queue setup. The decoder would consume some of the limits while allocating for each new frame, then those limits should get refreshed over time when the frames have been fully consumed and dropped.. We can't fully handle this either right now (as seen in gif).
src/io/image_reader_type.rs Outdated
| /// methods than [`ImageFile`] and under the hood dispatches into the set of supported | ||
| /// [`ImageDecoder`] implementations. For decoder interface that are provided for efficiency it | ||
| /// negotiates support with the underlying decoder and then emulates them if necessary. | ||
| pubstructImageReader<'lt>{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I like having two structs, though it would be nice to pick names that make the connection clearer. PerhapsImageReader /ImageReaderOptions orImageFile /ImageFileBuilder or similar? We could make it a bit less verbose by adding constructors directly on the second struct for the common operations:
implImageReader{fnnew<R>(reader:R) ->ImageResult<Self>{ImageReaderOptions::from_reader(reader).with_guessed_format().load()}fnopen<P>(path:P) ->ImageResult<Self>{ImageReaderOptions::from_path(path).load()}}
| } | ||
| } | ||
| implImageReader<'_>{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
nit: it is easier for scrolling through the file if the impl blocks for each struct immediately follow the definitions
| /// The decoder should attempt to restrict the amount of decoder image data to the indicated | ||
| /// box. It should return `Ok` if it supports the viewbox directly, or `Err` if it does not | ||
| /// support that exact viewbox. In the either case the decoder must indicate the new image | ||
| /// dimensions in subsequent calls to [`Self::init`]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Which formats do we expect to be able to implement viewboxes?
TIFF should be able to do it by only reading the necessarily strips/tiles. And DDS can because GPU compressed formats are designed for random access. But for other compressed formats, I think it might be rather annoying?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Evenjpeg could profit. We'll need to be more precise in the description but I had intended that, in either return case, thevalues outside the initially indicated viewbox will be allowed to be treated as 'dont-care' by the decoder in their original (or only partially cropped) layout. For these pixels the decoder not need present the actual underlying image data.
With some integration that would allow wins beyond just a binary decoding decision. Some form of integration would be possible for:
- all the PAM varieties, bmp, tga save on IO with full support; for small viewboxes if we
Seek/read_at - in
jpegwe need not process the data from any of the MCU's that do not intersect the data even if the bitstream must be decoded to align with the file avifandwebpseem similar to that where they might avoid processing values (obviously needing integration of it).
For anything non-RGB we get to save on conversion, too. I expect we uncover different forms of gains once we have a better understanding of a declarative form.
| self.viewbox =Some(viewbox); | ||
| } | ||
| /// Get the previously decoded EXIF metadata if any. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The "previously decoded" part here makes me a bit nervous. I think we'll want to be clearer about what the user has to do to make sure they don't getNone for images that actually do have EXIF data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This is a bit odd and will depend on the format and metadatum. For instance, XMP is encoded per-image in tiff but only once in gif (despite us representing this as an image sequence) and also only once in png (no word about APNG). Problematically, in gif and png the standard requires absolutely no ordering with any of the other chunks. So it might be encountered before all of the header information is done; or after all the images have been consumed.
The problem with back references is of course the unclear association. And when multiple are included we always have a problem with consuming them 'after the end' since it should need to be buffered or the decoder able to store seek-back points (like png). Encoding all that in the interface is a challenge, iwill incur some unavoidable complexity.
Uh oh!
There was an error while loading.Please reload this page.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
cdc0363 to1a114c3CompareThis was one of the silently broken features when adding the newmethods. While reviewing how to address the underlying cause with aclippy fix it became obvious a few other methods were missing, too.
Change ImageDecoder::read_image to `fn(&mut self)`. There are somecodecs where metadata and other information is only available afterdecoding.Add ImageDecoder::init for codecs where reading the header needs to takeinto account limits and other configuration we are to introduce.
Renames it `peek_layout` to clarify the relation to `read_image`, thatit is to be called potentially multiple times for each such call.Document the trait more with this relationship, moving some methodsaround to clarify.
The latter interface will be something that interfaces and interactswith the underlying decoder. Introduces ImageDecoder::set_viewport as astand-in for what is intended here: color, subsampling, gain mapapplication will take more interaction, clarification of how thelibrary will request their data, and adjustments to the layoutdefinition by the decoder.
The purpose of the trait is to be careful with the boundary to the`moxcms` dependency. We use it because it is a quality implementationbut it is heavy weight for what we need. There's other possible ways toprovide transfer functions and color space transforms. Now this alsointroduces ICC profile parsing but again that could be done with a*much* lighter dependency as we only need basic information from it. Thetrait should make every little additional cross-dependency a consciousdecision.Also it should be the start of a customization point, by feature flag oractually at runtime.
1a114c3 to306c6d2Compare197g commentedDec 16, 2025
Resolving the naming question as |
No longer responsible for ensuring the size constraints are met underthe new policy and with available of constructing a reader from aninstance of a boxed decoder.
This allows us to write a generic iterator which uses the same decoderfunction to generate a whole sequence of frames. The attributes aredesigned to be extensible to describe changes in available metadata aswell, very concretely some formats require that XMP/ICC/… are polled foreach individual image whereas others have one for the whole file and putthat at the end. So there's no universal sequence for querying themetadata, and we need to hold runtime information. This will be thefocus of the next commit.
Uh oh!
There was an error while loading.Please reload this page.
See#2245, the intended
ImageDecoderchanges.This changes the
ImageDecodertrait to fix some underlying issues. The main change is a clarification to the responsibilities; the trait is an interface from an implementortowards theimagelibrary. That is, the protocol established from its interface should allowus to drive the decoder into our buffers and our metadata. It isnot optimized to be used by an external caller which should prefer the use ofImageReaderand other inherent methods instead.This is a work-in-progress, below motivates the changes and discusses open points.
ImageDecoder::peek_layoutencourages decoders to read headers after the constructor. This fixes the inherent problem we had with communicating limits. The sequences for internal use is roughly:ImageDecoder::read_image(&mut self)no longer consumesself. We no longer need the additionalboxedmethod and its trait work around, the trait is now dyn-compatible.Discussion
next_layoutmore consistentlyinitpeek_layoutshould return the full layout information in a single struct. We have a similar open issue forpngin its own crate, and the related work fortiffis in the pipeline where itsBufferLayoutPreferencealready exists to be extended with said information.imageside.1.1, but it's not highly critical.read_imageis 'destructive' in all decoders, i.e. re-reading an image and reading an image beforeinitshould never access an incorrect part of the underlying stream but instead return an error. Affects pnm and qoi for instance where the read will interpret bytes based on the dimensions and color, which would be invalid before reading the header and only valid forone read.read_imagethen switching to a sequence reader. But that is supposed to become mainly an adapter that implements the iterator protocol.ImageReaderwith a new interface to return some of it. That may be better suited for a separate PR though.CicpRgband apply it to a decodedDynamicImage.