etree lxml.etree walker can't serialize full documents #345

Open
@cjerdonek


The etree walker with implementation lxml.etree doesn't work when passed a full HTML document (an object of type lxml.etree._ElementTree).

To reproduce:

    import html5lib
    import lxml.etree
    from html5lib.serializer import HTMLSerializer

    def serialize(element, treebuilder, implementation=None):
        # Build a walker for the given tree type and render it back to HTML.
        walker_cls = html5lib.getTreeWalker(treebuilder, implementation=implementation)
        walker = walker_cls(element)
        serializer = HTMLSerializer(omit_optional_tags=False)
        html = serializer.render(walker)
        print(html)

    html = """<!DOCTYPE html>
    <html>
    <head>
        <title>foo</title>
    </head>
    <body>
        <p>a</p><p>b</p>
    </body>
    </html>
    """
    builder = html5lib.getTreeBuilder('lxml')
    parser = html5lib.HTMLParser(builder, namespaceHTMLElements=False)
    # parse() returns a full document (lxml.etree._ElementTree), not an element.
    element = parser.parse(html)
    serialize(element, 'lxml')  # works
    serialize(element, 'etree', implementation=lxml.etree)  # fails

The last line fails with the following error:

    Traceback (most recent call last):
      File "test-html5lib.py", line 98, in <module>
        parse_and_serialize(element, 'etree', implementation=lxml.etree)
      File "test-html5lib.py", line 79, in serialize
        html = serializer.render(walker)
      File "/.../python3.6/site-packages/html5lib/serializer.py", line 323, in render
        return "".join(list(self.serialize(treewalker)))
      File "/.../python3.6/site-packages/html5lib/serializer.py", line 209, in serialize
        for token in treewalker:
      File "/.../python3.6/site-packages/html5lib/treewalkers/base.py", line 128, in __iter__
        firstChild = self.getFirstChild(currentNode)
      File "/.../python3.6/site-packages/html5lib/treewalkers/etree.py", line 88, in getFirstChild
        if element.text:
    AttributeError: 'lxml.etree._ElementTree' object has no attribute 'text'
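
For anyone unfamiliar with the tree/element split behind that AttributeError: the standard library's xml.etree.ElementTree has the same distinction, so it can be illustrated without html5lib or lxml. This is only an analogy using the stdlib classes, not html5lib code, and the markup is a made-up minimal document.

```python
import xml.etree.ElementTree as ET

# A document wrapper (ElementTree) around a root element (Element).
root = ET.fromstring("<html><body><p>a</p><p>b</p></body></html>")
tree = ET.ElementTree(root)

# The wrapper has no .text attribute; only elements do.
tree_has_text = hasattr(tree, "text")
root_has_text = hasattr(tree.getroot(), "text")
print(tree_has_text, root_has_text)  # False True
```

Code that reads .text on whatever it is handed will therefore work for an element and fail for the document wrapper.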

The walker should probably call root = element.getroot() first. This seems to be on the same wavelength as the issue with treewalkers/etree.py that I described in this comment: #338 (comment)
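
As a rough sketch of that idea on the caller's side, something like the helper below could unwrap a document before it reaches the etree walker. The name unwrap_root is hypothetical and not part of html5lib. It is demonstrated here with the standard library's ElementTree only, on the assumption that lxml's tree objects expose getroot() the same way. I have not verified it against the html5lib walker itself.

```python
import xml.etree.ElementTree as ET

def unwrap_root(node):
    """Return the root element if node is a tree wrapper, else node unchanged.

    Hypothetical helper: relies on duck typing (tree objects have getroot(),
    plain elements do not).
    """
    getroot = getattr(node, "getroot", None)
    return getroot() if callable(getroot) else node

tree = ET.ElementTree(ET.fromstring("<html><body><p>a</p></body></html>"))
root = unwrap_root(tree)            # tree wrapper -> its root element
same = unwrap_root(root) is root    # an element passes through untouched
print(root.tag, same)
```

In the reproduction above this would mean passing unwrap_root(element) to the 'etree' walker. Note that starting from the root element presumably drops anything stored outside it, such as the doctype, so a proper fix inside the walker would likely need to handle that separately.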
