Movatterモバイル変換


[0]ホーム

URL:


homepage

Issue27164

This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title:zlib can't decompress DEFLATE using shared dictionary
Type:behaviorStage:resolved
Components:Extension ModulesVersions:Python 3.6, Python 3.5
process
Status:closedResolution:fixed
Dependencies:Superseder:
Assigned To:Nosy List: Vladimir Mihailenco, gregory.p.smith, martin.panter, nadeem.vawda, python-dev, twouters, xiang.zhang
Priority:normalKeywords:patch

Created on2016-05-31 09:41 byVladimir Mihailenco, last changed2022-04-11 14:58 byadmin. This issue is nowclosed.

Files
File nameUploadedDescriptionEdit
issue27164.patchxiang.zhang,2016-06-01 11:02review
issue27164_with_test.patchxiang.zhang,2016-06-01 14:59review
raw-zdict.v3.patchmartin.panter,2016-06-05 06:21review
Messages (9)
msg266747 -(view)Author: Vladimir Mihailenco (Vladimir Mihailenco)Date: 2016-05-31 09:41
Consider following code:```import zlibhello = b'hello'# Compress data using deflate and shared/predefined dict.co = zlib.compressobj(wbits=-zlib.MAX_WBITS, zdict=hello)data = co.compress(hello) + co.flush()# Decompress deflate by providing same dict.do = zlib.decompressobj(wbits=-zlib.MAX_WBITS, zdict=hello)data = do.decompress(data)print(data)```Decompression fails with "zlib.error: Error -3 while decompressing data: invalid distance too far back".My original task was to decompress data I get from Golang library -https://golang.org/pkg/compress/flate/#NewWriterDict
msg266804 -(view)Author: Xiang Zhang (xiang.zhang)*(Python committer)Date: 2016-06-01 07:48
This seems related with the difference between zlib format and raw format. When you do raw inflate, you have to inflateSetDictionary before any inflate. You cannot rely on the first inflate to return Z_NEED_DICT and then do inflateSetDictionary.Although I have the solution with experiment, but there is nothing can be found in the official zlib documentation. It is rather vague. I'll make more research and submit a patch later..Right now, you can use zlib format (positive wbits) if it can achieve your goal. It's behaviour is right.
msg266814 -(view)Author: Xiang Zhang (xiang.zhang)*(Python committer)Date: 2016-06-01 11:02
I think my conclusion in last message is right. A similar bug in CPAN can be found inhttps://rt.cpan.org/Public/Bug/Display.html?id=36046. And I can find in zlib's changelog inflateSetDictionary starts to support raw inflate from 1.2.2.1.Patch attached now allows Python to decompress raw deflate format with a dictionary. Hope some core devs are willing to review.
msg266819 -(view)Author: Xiang Zhang (xiang.zhang)*(Python committer)Date: 2016-06-01 14:59
Forget to add test. Upload a new one with test included.
msg266871 -(view)Author: Martin Panter (martin.panter)*(Python committer)Date: 2016-06-02 03:59
Thanks for the patches. Maybe Gregory knows more about this than I do (I never used zdict). But I think I have figured it out today, so I left some review comments :)I’m not sure whether this really is a bug fix, although it would have little if any impact on existing code. It seems zdict was implemented with just the non-raw zlib format in mind, so supporting raw deflate mode is a new feature. If we were to add full support of inflateSetDictionary() for raw deflate mode, I imagine it would be a new decompressobj.set_dictionary() method.I did some quick tests with specifying an unnecessary zdict buffer to the decompressor (when zdict not used for compression), and it doesn’t _seem_ to break anything, but I am not 100% sure.Notes from my learning what zdict means:Zdict seems to be a bunch of user data to setup or train the (de)compressor with. It was added byIssue 14684. (Not a Python dictionary, nor any special zlib dictionary format. The Python documentation is confusing, and the Go documentation made more sense, assuming it is equivalent.)See the documentation of in/deflateSetDictionary() in <http://www.zlib.net/manual.html> or zlib.h. Looking through the history at <https://github.com/madler/zlib> can also be useful.
msg266876 -(view)Author: Xiang Zhang (xiang.zhang)*(Python committer)Date: 2016-06-02 05:59
Thanks for your review, Martin. :) I have replied with after thinking and try.Yes, this is more like a feature request. But for the end user, this behaviour is quite like a bug.> If we were to add full support of inflateSetDictionary() for raw deflate mode, I imagine it would be a new decompressobj.set_dictionary() method.I don't support adding a new set_dictionary() method. Right now decompressobj handles raw deflate format, zlib format and gzip format. Adding a set_dictionary() method for raw deflate format seems weird. If we are going to do so, I suggest reorganize the zlib like PHP and nodejs do, separating different formats into different objects.> I did some quick tests with specifying an unnecessary zdict buffer to the decompressor (when zdict not used for compression), and it doesn’t _seem_ to break anything, but I am not 100% sure.This patch only affects raw inflate. You have to set wbits negative. If you are decompressing with an unnecessary zdict, I think it ought to emit an error.With Python background, zlib dictionary is really hard to understand. I think manual doesn't tell much. I understand why zdict is needed until readinghttps://blog.cloudflare.com/improving-compression-with-preset-deflate-dictionary/.
msg267391 -(view)Author: Martin Panter (martin.panter)*(Python committer)Date: 2016-06-05 06:21
Here is a patch which I propose to use that factors out a set_inflate_zdict() function. I think this will help with maintainence, e.g. if someone changes the UINT_MAX checking, but forgets that there are two places than need changing. Let me know if that is okay with you :)
msg267405 -(view)Author: Xiang Zhang (xiang.zhang)*(Python committer)Date: 2016-06-05 09:05
It looks good. I don't think that is a matter since usually I will do a search before change. But of course extracting them to a subroutine helps.
msg267416 -(view)Author: Roundup Robot (python-dev)(Python triager)Date: 2016-06-05 12:36
New changesetd91b951e676f by Martin Panter in branch '3.5':Issue#27164: Allow decompressing raw Deflate streams with predefined zdicthttps://hg.python.org/cpython/rev/d91b951e676fNew changeset470954641f3b by Martin Panter in branch 'default':Issue#27164: Merge raw Deflate zdict support from 3.5https://hg.python.org/cpython/rev/470954641f3b
History
DateUserActionArgs
2022-04-11 14:58:31adminsetgithub: 71351
2016-06-06 01:48:25martin.pantersetstatus: open -> closed
resolution: fixed
stage: commit review -> resolved
2016-06-05 12:36:56python-devsetnosy: +python-dev
messages: +msg267416
2016-06-05 09:05:31xiang.zhangsetmessages: +msg267405
2016-06-05 06:21:19martin.pantersetfiles: +raw-zdict.v3.patch
versions: + Python 3.6
messages: +msg267391

components: + Extension Modules, - Library (Lib)
stage: patch review -> commit review
2016-06-02 18:50:16gregory.p.smithsetassignee:gregory.p.smith ->
2016-06-02 05:59:08xiang.zhangsetmessages: +msg266876
2016-06-02 03:59:46martin.pantersetmessages: +msg266871
stage: patch review
2016-06-01 21:30:16gregory.p.smithsetassignee:gregory.p.smith

nosy: +gregory.p.smith
2016-06-01 14:59:58xiang.zhangsetfiles: +issue27164_with_test.patch

messages: +msg266819
2016-06-01 11:02:41xiang.zhangsetfiles: +issue27164.patch
keywords: +patch
messages: +msg266814
2016-06-01 07:48:27xiang.zhangsetnosy: +twouters,nadeem.vawda,martin.panter
messages: +msg266804
2016-05-31 10:33:32xiang.zhangsetnosy: +xiang.zhang
2016-05-31 09:41:03Vladimir Mihailencocreate
Supported byThe Python Software Foundation,
Powered byRoundup
Copyright © 1990-2022,Python Software Foundation
Legal Statements

[8]ページ先頭

©2009-2026 Movatter.jp