Hi,
I'm facing an issue with this code: "AttributeError: 'unicode' object has no attribute 'tag'"
    import html5lib

    parser = html5lib.HTMLParser(tree=html5lib.treebuilders.getTreeBuilder("lxml"), namespaceHTMLElements=False)
    serializer = html5lib.serializer.HTMLSerializer(omit_optional_tags=False)
    walker = html5lib.treewalkers.getTreeWalker("lxml")

    # works
    src = u"experiences"
    tree = parser.parseFragment(src, container="div")
    stream = walker(tree)
    output = serializer.serialize(stream)
    print("\n".join(output))

    # Doesn't work
    src = u"exp\xe9riences"
    tree = parser.parseFragment(src, container="div")
    stream = walker(tree)
    output = serializer.serialize(stream)
    print("\n".join(output))
I think the error lies in the isstring check of the FragmentWrapper class in treewalkers/lxmletree.py.
Changing:
    def ensure_str(s):
        if s is None:
            return None
        elif isinstance(s, text_type):
            return s
        else:
            return s.decode("utf-8", "strict")

    class FragmentWrapper(object):
        def __init__(self, fragment_root, obj):
            ...
            self.isstring = isinstance(obj, str) or isinstance(obj, bytes)
            # Support for bytes here is Py2
            if self.isstring:
                self.obj = ensure_str(self.obj)
to
    def ensure_str(s):
        if s is None:
            return None
        elif isinstance(s, text_type):
            return s
        else:
            return s.decode("utf-8", "strict")

    class FragmentWrapper(object):
        def __init__(self, fragment_root, obj):
            ...
            self.isstring = isinstance(obj, str) or isinstance(obj, bytes) or isinstance(obj, text_type)
            # Support for bytes here is Py2
            if self.isstring:
                self.obj = ensure_str(self.obj)
seems to do the job... What do you think?
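For anyone who wants to try the type check in isolation, here is a minimal standalone sketch of the proposed logic. It is not the html5lib source: is_string is a hypothetical helper name, and text_type is defined locally on the assumption that it behaves like six.text_type (unicode on Python 2, str on Python 3). On Python 2, str is the same type as bytes, so a check against only str and bytes would miss unicode objects, which seems consistent with the error above.

```python
import sys

# Assumption: text_type mirrors six.text_type (unicode on Py2, str on Py3).
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def ensure_str(s):
    """Return s as a text string, decoding UTF-8 bytes if necessary."""
    if s is None:
        return None
    elif isinstance(s, text_type):
        return s
    else:
        return s.decode("utf-8", "strict")


def is_string(obj):
    """Hypothetical helper: True for byte strings and text strings alike."""
    return isinstance(obj, (bytes, text_type))


if __name__ == "__main__":
    for value in (u"experiences", u"exp\xe9riences", b"experiences"):
        print(repr(value), is_string(value), repr(ensure_str(value)))
```

Passing a tuple to isinstance covers both types in one call and should behave the same on Python 2 and 3.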