xml.dom.pulldom
--- Support for building partial DOM trees
The xml.dom.pulldom module provides a "pull parser" which can also be asked to produce DOM-accessible fragments of the document where necessary. The basic concept involves pulling "events" from a stream of incoming XML and processing them. In contrast to SAX, which also employs an event-driven processing model together with callbacks, the user of a pull parser is responsible for explicitly pulling events from the stream, looping over those events until either processing is finished or an error condition occurs.
Warning
The xml.dom.pulldom module is not secure against maliciously constructed data. If you need to parse untrusted or unauthenticated data, see XML vulnerabilities.
Changed in version 3.7.1: The SAX parser no longer processes general external entities by default, to increase security. To enable processing of external entities, pass a custom parser instance in:
from xml.dom.pulldom import parse
from xml.sax import make_parser
from xml.sax.handler import feature_external_ges

parser = make_parser()
parser.setFeature(feature_external_ges, True)
parse(filename, parser=parser)
Example:
from xml.dom import pulldom

doc = pulldom.parse('sales_items.xml')
for event, node in doc:
    if event == pulldom.START_ELEMENT and node.tagName == 'item':
        if int(node.getAttribute('price')) > 50:
            doc.expandNode(node)
            print(node.toxml())
event is a constant and can be one of:
START_ELEMENT
END_ELEMENT
COMMENT
START_DOCUMENT
END_DOCUMENT
CHARACTERS
PROCESSING_INSTRUCTION
IGNORABLE_WHITESPACE
node is an object of type xml.dom.minidom.Document, xml.dom.minidom.Element or xml.dom.minidom.Text.
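As a rough illustration of these constants, the sketch below tallies the events produced for a small inline document. The sample XML and the variable names are invented for this example. The number of CHARACTERS events can depend on how the underlying parser chunks text, so only the element counts should be relied on.

```python
from collections import Counter
from xml.dom import pulldom

# Hypothetical sample document, used only for this illustration.
xml_text = "<root><item>one</item><item>two</item></root>"

# Each item yielded by the stream is an (event, node) pair.
counts = Counter(event for event, node in pulldom.parseString(xml_text))

# Print one line per event type that was actually seen.
for name, total in sorted(counts.items()):
    print(name, total)
```

Because the event constants are plain values, they can be used directly as dictionary keys, as done here with Counter.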
Since the document is treated as a "flat" stream of events, the document "tree" is implicitly traversed and the desired elements are found regardless of their depth in the tree. In other words, one does not need to consider hierarchical issues such as recursive searching of the document nodes, although if the context of elements were important, one would either need to maintain some context-related state (i.e. remembering where one is in the document at any given point) or to make use of the DOMEventStream.expandNode() method and switch to DOM-related processing.
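One possible way to maintain such context-related state is to keep a stack of the currently open element names. The following is a minimal sketch under that assumption. The sample document and the names path and book_titles are hypothetical and not part of the module.

```python
from xml.dom import pulldom

# Hypothetical sample: one <title> inside <book>, one directly under <catalog>.
xml_text = "<catalog><book><title>Dune</title></book><title>Index</title></catalog>"

path = []         # names of the elements that are currently open
book_titles = []  # text fragments found in catalog/book/title only

for event, node in pulldom.parseString(xml_text):
    if event == pulldom.START_ELEMENT:
        path.append(node.tagName)
    elif event == pulldom.END_ELEMENT:
        path.pop()
    elif event == pulldom.CHARACTERS and path[-2:] == ["book", "title"]:
        # Text may arrive in more than one chunk, so collect fragments.
        book_titles.append(node.data)

print("".join(book_titles))
```

If expandNode() were called inside this loop, the stack would need adjusting by hand, since the events of the expanded subtree, including its closing END_ELEMENT, would no longer reach the loop.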
- class xml.dom.pulldom.PullDOM(documentFactory=None)
Subclass of xml.sax.handler.ContentHandler.
- class xml.dom.pulldom.SAX2DOM(documentFactory=None)
Subclass of xml.sax.handler.ContentHandler.
- xml.dom.pulldom.parse(stream_or_string, parser=None, bufsize=None)
Return a DOMEventStream from the given input. stream_or_string may be either a file name, or a file-like object. parser, if given, must be an XMLReader object. This function will change the document handler of the parser and activate namespace support; other parser configuration (like setting an entity resolver) must have been done in advance.
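The sketch below shows one way parse() might be used with a file-like object rather than a file name. The in-memory io.BytesIO stream, the sample data, and the small bufsize value are assumptions chosen for illustration only.

```python
import io
from xml.dom import pulldom

# Hypothetical data standing in for a file opened in binary mode.
data = b"<log><entry id='1'/><entry id='2'/><entry id='3'/></log>"
stream = io.BytesIO(data)

# A deliberately small bufsize makes the parser read the stream in pieces.
events = pulldom.parse(stream, bufsize=16)

ids = [
    node.getAttribute("id")
    for event, node in events
    if event == pulldom.START_ELEMENT and node.tagName == "entry"
]
print(ids)
```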
If you have XML in a string, you can use the parseString() function instead:
- xml.dom.pulldom.parseString(string, parser=None)
Return a DOMEventStream that represents the (Unicode) string.
- xml.dom.pulldom.default_bufsize
Default value for the bufsize parameter to parse(). The value of this variable can be changed before calling parse() and the new value will take effect.
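A short sketch of changing the default follows. It reads the bufsize attribute of the returned stream to confirm the effect. That attribute exists in CPython's implementation but is not described in this section, so treat it as an implementation detail. The value 64 and the sample document are arbitrary.

```python
import io
from xml.dom import pulldom

original = pulldom.default_bufsize
pulldom.default_bufsize = 64  # arbitrary value for demonstration
try:
    # No bufsize argument is given, so the module-level default applies.
    events = pulldom.parse(io.BytesIO(b"<a><b/></a>"))
    used = events.bufsize  # assumption: attribute set by CPython's DOMEventStream
finally:
    # Restore the module-level default so other callers are unaffected.
    pulldom.default_bufsize = original

print(used)
```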
DOMEventStream Objects
- class xml.dom.pulldom.DOMEventStream(stream, parser, bufsize)
Changed in version 3.11: Support for __getitem__() method has been removed.
- getEvent()
Return a tuple containing event and the current node as xml.dom.minidom.Document if event equals START_DOCUMENT, xml.dom.minidom.Element if event equals START_ELEMENT or END_ELEMENT or xml.dom.minidom.Text if event equals CHARACTERS. The current node does not contain information about its children, unless expandNode() is called.
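Instead of iterating over the stream, events can be pulled one at a time with getEvent(). The sketch below assumes that getEvent() returns None once the input is exhausted. CPython's implementation behaves this way, but this section does not state it, so the loop's end condition should be treated as an assumption. The sample document is hypothetical.

```python
from xml.dom import pulldom

stream = pulldom.parseString("<a><b>text</b></a>")

seen = []
while True:
    item = stream.getEvent()
    if item is None:  # assumption: None signals the end of the input
        break
    event, node = item
    seen.append((event, node.nodeName))

for event, name in seen:
    print(event, name)
```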
- expandNode(node)
Expands all children of node into node. Example:
from xml.dom import pulldom

xml = '<html><title>Foo</title> <p>Some text <div>and more</div></p> </html>'
doc = pulldom.parseString(xml)
for event, node in doc:
    if event == pulldom.START_ELEMENT and node.tagName == 'p':
        # Following statement only prints '<p/>'
        print(node.toxml())
        doc.expandNode(node)
        # Following statement prints node with all its children '<p>Some text <div>and more</div></p>'
        print(node.toxml())
- reset()