Allow seeking back to read skipped chunks #657
Conversation
197g left a comment
I like it in principle; seeking is a capability that will be highly useful in the streaming decoder. Since all the seeks are relative, we should make sure that intermittent interrupts work correctly with the position tracking. I don't foresee many problems with this, but a few tests should still be added.
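As a sketch of the kind of helper such a test might use (the `Interrupting` wrapper below is purely illustrative and not part of this PR):

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Illustrative test helper: wraps a reader and returns `ErrorKind::Interrupted`
/// on every other `read` call, so a test can check that the recorded chunk
/// positions survive intermittent interrupts.
struct Interrupting<R> {
    inner: R,
    interrupt_next: bool,
}

impl<R: Read> Read for Interrupting<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if std::mem::replace(&mut self.interrupt_next, false) {
            return Err(io::Error::new(io::ErrorKind::Interrupted, "injected interrupt"));
        }
        self.interrupt_next = true;
        self.inner.read(buf)
    }
}

impl<R: Seek> Seek for Interrupting<R> {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        self.inner.seek(pos)
    }
}
```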
```rust
if self.exif_position.is_none() {
    self.exif_position = Some(chunk_start);
}
```
The error treatment here differs from parsing the eXIf and iCCP chunks. In the latter case we error first with a Format error and then later, probably, downgrade that to BadAncillaryChunk in most cases. Here, however, we do nothing.
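For comparison, a minimal self-contained sketch of what aligning the two paths might look like; the `ChunkError` and `PositionTracker` names below are made up for illustration and are not the crate's types:

```rust
// Illustration only: a stand-in for the crate's Format / BadAncillaryChunk
// distinction; none of these names exist in the PR.
#[derive(Debug)]
enum ChunkError {
    DuplicateExif,
}

#[derive(Default)]
struct PositionTracker {
    exif_position: Option<u64>,
}

impl PositionTracker {
    // Error on a duplicate eXIf chunk instead of silently keeping the first
    // recorded position, mirroring the eager-parsing path's behaviour.
    fn record_exif(&mut self, chunk_start: u64) -> Result<(), ChunkError> {
        if self.exif_position.is_some() {
            return Err(ChunkError::DuplicateExif);
        }
        self.exif_position = Some(chunk_start);
        Ok(())
    }
}
```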
Also, do we consider the option changing through set_ignore_exif_chunk while it is being read? I.e. first reading one chunk fully and then only recording a position for a later one would still amount to a duplicate chunk. That may not happen in practice, but without the interface really assuring it, we should know whether that is in scope for this implementation or not.
```rust
let chunk_start = stream_position - 8 - self.current_chunk.raw_bytes.len() as u64;
match self.current_chunk.type_ {
    chunk::tEXt | chunk::zTXt | chunk::iTXt => {
```
Wouldn't we still want to know the exact kind of text chunk? zTXt is handled very differently from the other chunks. We can of course parse it back when we seek back and re-read the chunk anew, but it's fixed-size metadata, so we might as well record it.
Yeah, I wasn't sure whether to save the chunk size/type or just re-read it when requested. Easy enough to track it alongside the chunk start position.
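For illustration, tracking the kind alongside the position could look something like this (the names here are hypothetical, not the PR's actual types):

```rust
/// Hypothetical sketch: keep the text chunk kind next to the recorded start
/// position so zTXt can be handled differently when the chunk is re-read.
#[derive(Clone, Copy, Debug)]
enum TextChunkKind {
    Text, // tEXt
    Ztxt, // zTXt
    Itxt, // iTXt
}

#[derive(Clone, Copy, Debug)]
struct SkippedTextChunk {
    kind: TextChunkKind,
    /// Byte offset of the chunk's length field from the start of the stream.
    start: u64,
}
```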
With #630 in place, we now have most of the pieces to load PNG metadata without consuming an unbounded amount of space.
The key here is that we track just the start position of each unbounded-size chunk we might care about, which means that we can track thousands of text chunks even with a tiny space limit. When it comes time to read the chunks, we can seek back to them and provide the desired info. But if the chunk info is never requested, we don't pay the overhead of storing it. When in recording mode, I believe the only non-constant memory usage should be the recorded text chunk positions and the two rows of space reserved for the unfiltering buffer.
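As a rough, self-contained illustration of the seek-back step (this is not the crate's API; the helper below is made up for this description):

```rust
use std::io::{Read, Seek, SeekFrom};

// Toy version of the deferred read: given the recorded start offset of a
// chunk, seek back, read its payload, and restore the decoder's position.
fn read_chunk_at<R: Read + Seek>(r: &mut R, start: u64) -> std::io::Result<Vec<u8>> {
    let resume = r.stream_position()?;   // remember where decoding left off
    r.seek(SeekFrom::Start(start))?;
    let mut header = [0u8; 8];           // 4-byte length + 4-byte chunk type
    r.read_exact(&mut header)?;
    let len = u32::from_be_bytes(header[..4].try_into().unwrap()) as usize;
    let mut data = vec![0u8; len];
    r.read_exact(&mut data)?;            // chunk payload (CRC left unread here)
    r.seek(SeekFrom::Start(resume))?;    // put the stream back
    Ok(data)
}
```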
This is particularly useful for the image crate, which needs to know whether there's a tRNS chunk (to determine whether the color type is RGB or RGBA) before it knows whether the EXIF or ICC profile will be requested. By doing things this way, it neatly sidesteps the question of setting decoding limits before reading metadata. The main thing decoding limits were being used for was preventing zip bombs in compressed iCCP chunks and, to a lesser degree, preventing unreasonably sized EXIF or text chunks.