- Notifications
You must be signed in to change notification settings - Fork425
Introduced new chunked transfer encoding parser to remove chunk markers#720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
chunk_encoding_parser() : state(state_t::header), chunk_size(0) {} | ||
enum state_t { header, header_end, data, data_end }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think you can use anenum class
here instead.
size_t chunk_size; | ||
std::array<typename char_<Tag>::type, 1024> buffer; | ||
void update_chunk_size(boost::iterator_range<typename std::array<typename char_<Tag>::type, 1024>::const_iterator> const& range) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yeah, this is a really long line. We prefer usingclang-format
to make sure we have consistent formatting in the project, so if you don't mind running this patch through it, that would be really good for readability.
ss << std::hex << range; | ||
size_t size; | ||
ss >> size; | ||
chunk_size = (chunk_size << (range.size()*4)) + size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can you help me understand this math?
Ifrange
has a size of 1024, this meanschunk_size
is shifted left 4096 times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yes, but range must correspond to the number of digits and 1024 digits would be an unrealistically large number (it would not throw an error though cause there is no overflow checking). I replaced the '+' with a '|' to make it more clear that this expression is just about appending the new bits to the lower end (and shifting the existing bits by the number of the new bits). Every hex digit corresponds to a quadruple of bits, that's why i multiply the number of digits by four. I didn't want to introduce another char buffer to accumulate the hex digits, so I decided to incrementally compute the chunk size.
// } | ||
// return body; | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Consider just removing these lines?
typename client::options options; | ||
options.remove_chunk_markers(true); | ||
client client_; | ||
typename TypeParam::request request("http://www.boost.org:80/"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I know this might be out-of-scope, but could you make this point tocpp-netlib.org
instead? That would be great, thanks. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Unfortunately cpp-netlib.org does not yield chunk encoded content. It does not matter much for this particular test, but for the streaming test we'd have all scenarios covered withwww.boost.org: it yields chunk encoded content for http/1.1 clients and non encoded content for http/1.0 clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ah, good point, thanks.
Is there any chance to merge this PR in the near future? Is there still something missing? |
Hi@umennel -- I'm so sorry about the delay, this just slipped through the cracks. :( Let me have a quick look and merge. |
LGTM |
body_string.append(this->part.begin(), this->part.begin() + bytes_transferred); | ||
if (this->is_chunk_encoding) { | ||
this->body_promise.set_value(parse_chunk_encoding(body_string)); | ||
if (this->is_chunk_encoding && remove_chunk_markers_) { |
igorpeshanskyMar 7, 2018 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@umennel Careful -- this actually breaks previous behavior. Now, unless you setremove_chunk_markers
totrue
, the following code will produce a chunked-encoded string:
http::client client;http::client::requestrequest("http://www.boost.org:80/");http::client::response response = client.get(request);std::cerr << body(response) << std::endl;
@deanberris Is this what we want?
To put it another way, can you think of a case where we wouldn't wantremove_chunk_markers
to default totrue
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@igorpeshansky, indeed, the default should be true. Otherwise it is hard to keep the behavior consistent. We should decide on this before merging into 0.13.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Great, I'll send a PR shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Sent#830.
I introduced a new client option called remove_chunk_markers. With this option selected, a new chunked transfer encoding parser is enabled which is able to parse a stream of small chunks and incrementally remove the chunk markers. The parser requires to keep internal state, so I had to parser function with a function object.
I selected the new option in some of the tests to cover the new code. Proper unit tests are missing yet.
Compiled with gcc 5.4 only.
Some issues might be worth to discuss a bit further: