Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Selected patches from Calibre#245

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
gsnedders merged 5 commits intohtml5lib:masterfromgsnedders:calibre-selected-1
May 22, 2016

Conversation

gsnedders
Copy link
Member

See#119. CC@kovidgoyal.

This cherry-picks a few things fromhttps://github.com/gsnedders/html5lib-python/commits/calibre-patches, which was a complete set of Calibre's patches from November 2013.https://github.com/kovidgoyal/calibre/commits/master/src/html5lib has very little changed in it since then, primarily a move to 0.999999-dev and a separate downstream fix for0c551c9.

So, of those on that branch…

  • gsnedders@49f37d2,gsnedders@64e8b0b, andgsnedders@cc9f28a are all things I'm against landing upstream (they should just be special-cased in the downstream tree builder rather than requiring invoking separate handling for all tree builders).
  • gsnedders@7702d80 is something I find very inelegant, and goes against the recommendations laid out in PEP 8. Also, with CPython, in theTrue/False cases it's likely slower, therefore failing at its stated goal, as it results in more byte code andPOP_JUMP_IF_FALSE andPOP_JUMP_IF_TRUE special-case the condition beingTrue orFalse (oddly, they don't specialiseNone, though it is inPyObject_IsTrue; if that makes any notable performance difference then I'd suggest fixing that in CPython).
  • gsnedders@a2d2e05 is unlikely to land because the input stream isn't public API (not that we document what is anywhere…).
  • gsnedders@1576515 in principle is something we want, but I think we want for all tokens and not just elements. (See alsoAdd line/column number in tokens from HTMLTokenizer #87 andAdd element.sourceline (lxml) #97.) That or something similar will likely land in future.
  • gsnedders@124e975 is included here, after9dca7d8 which fixes the DOM tree builder to follow the tree builder API correctly. (It exposed bugs in the DOM tree builder, yay!)
  • gsnedders@db0b5a0 is what most of the commits on this PR deal with, fixing up small issues with the Calibre code.

@gsnedders
Copy link
MemberAuthor

This seems to have broken 2.6 badly. Huh.

@codecov-io
Copy link

codecov-io commentedMay 7, 2016
edited
Loading

Current coverage is89.15%

Merging#245 intomaster will decrease coverage by1.39%

@@             master       #245   diff @@==========================================  Files            51         50     -1     Lines          6817       6726    -91     Methods           0          0            Messages          0          0            Branches       1316       1307     -9   ==========================================- Hits           6172       5996   -176- Misses          485        559    +74- Partials        160        171    +11

Powered byCodecov. Last updated by143b0d4...82b971c

@gsnedders
Copy link
MemberAuthor

Oh, right, this isviewkeys (and dictionary views in general) not existing on 2.6.

@gsneddersgsnedders modified the milestone:0.99999999May 8, 2016
@kovidgoyal
Copy link
Contributor

kovidgoyal commentedMay 8, 2016
edited
Loading

How do you suggest I override the application of attributes for html and body tags in my builder? Since without those patches, it would require overriding the entire getPhases() method in html5parser.py Remember that the problem those patches is solving is that there can be multiple<html> and<body> tags in malformed documents that I need to handle by merging the attributes from those tags into the first / tag. Doing that after parsing would require an whole extra tree traversal which can be slow for large trees.

If you dont want to mergegsnedders/html5lib-python@a2d2e05 then how do you suggest I replace the the stream input class? The one in html5lib is too slow. The only alternative I can see is monkey patching -- which is less than optimal for obvious reasons.

@gsnedders
Copy link
MemberAuthor

@kovidgoyal I'll take a look at dealing with html/body attributes later (I'm literally amount to board a plane).

When it comes to the input stream, if it yields good perf increases when given a byte/unicode object we should just specialise them in html5lib.

@kovidgoyal
Copy link
Contributor

Sure you are welcome to take the input stream class from calibre for dealing with unicode objects. It is faster because it avoids wrapping the unicode in StringIO. And it actually implements tracking of positions. For my use case, that is important, since I need line and col numbers.

@landscape-bot
Copy link

Code Health
Code quality remained the same when pulling1f04a3f on gsnedders:calibre-selected-1 intocc99095 on html5lib:master.

@landscape-bot
Copy link

Code Health
Code quality remained the same when pulling96643a2 on gsnedders:calibre-selected-1 intocc99095 on html5lib:master.

@landscape-bot
Copy link

Code Health
Code quality remained the same when pulling76bf242 on gsnedders:calibre-selected-1 intocc99095 on html5lib:master.

@landscape-bot
Copy link

Code Health
Code quality remained the same when pulling761f3ab on gsnedders:calibre-selected-1 into143b0d4 on html5lib:master.

@gsneddersgsnedders merged commit5288737 intohtml5lib:masterMay 22, 2016
@gsneddersgsnedders deleted the calibre-selected-1 branchMay 22, 2016 02:05
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers
No reviews
Assignees
No one assigned
Projects
None yet
Milestone
0.99999999
Development

Successfully merging this pull request may close these issues.

4 participants
@gsnedders@codecov-io@kovidgoyal@landscape-bot

[8]ページ先頭

©2009-2025 Movatter.jp