xml.dom.pulldom --- Support for building partial DOM trees

Source code: Lib/xml/dom/pulldom.py


The xml.dom.pulldom module provides a "pull parser" which can also be asked to produce DOM-accessible fragments of the document where necessary. The basic concept involves pulling "events" from a stream of incoming XML and processing them. In contrast to SAX, which also employs an event-driven processing model together with callbacks, the user of a pull parser is responsible for explicitly pulling events from the stream, looping over those events until either processing is finished or an error condition occurs.

Note

If you need to parse untrusted or unauthenticated data, see XML security.

Changed in version 3.7.1: The SAX parser no longer processes general external entities by default, to increase security. To enable processing of external entities, pass a custom parser instance in:

    from xml.dom.pulldom import parse
    from xml.sax import make_parser
    from xml.sax.handler import feature_external_ges

    parser = make_parser()
    parser.setFeature(feature_external_ges, True)
    parse(filename, parser=parser)

Example:

    from xml.dom import pulldom

    doc = pulldom.parse('sales_items.xml')
    for event, node in doc:
        if event == pulldom.START_ELEMENT and node.tagName == 'item':
            if int(node.getAttribute('price')) > 50:
                doc.expandNode(node)
                print(node.toxml())

event is a constant and can be one of:

  • START_ELEMENT

  • END_ELEMENT

  • COMMENT

  • START_DOCUMENT

  • END_DOCUMENT

  • CHARACTERS

  • PROCESSING_INSTRUCTION

  • IGNORABLE_WHITESPACE

node is an object of type xml.dom.minidom.Document, xml.dom.minidom.Element or xml.dom.minidom.Text.
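To make the event/node pairing concrete, here is a minimal sketch that lists the events produced for a small in-memory document. The sample XML and the summarize() helper are illustrative assumptions, not part of the module.

```python
from xml.dom import pulldom

# Illustrative sample document (not taken from the module documentation).
SAMPLE = "<order><item price='75'>lamp</item></order>"


def summarize(xml_text):
    """Return a list of (event, nodeName) pairs, one per event pulled."""
    return [(event, node.nodeName)
            for event, node in pulldom.parseString(xml_text)]


for event, name in summarize(SAMPLE):
    # Element nodes report their tag name; document and text nodes
    # report the generic DOM names '#document' and '#text'.
    print(event, name)
```

Comparing the event value against the constants listed above (for example pulldom.START_ELEMENT) is the way to decide how to treat the accompanying node.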

Since the document is treated as a "flat" stream of events, the document "tree" is implicitly traversed and the desired elements are found regardless of their depth in the tree. In other words, one does not need to consider hierarchical issues such as recursive searching of the document nodes, although if the context of elements were important, one would either need to maintain some context-related state (i.e. remembering where one is in the document at any given point) or to make use of the DOMEventStream.expandNode() method and switch to DOM-related processing.
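One way to maintain the context-related state mentioned above is to keep a stack of open element names while consuming events. The following is only a sketch: element_paths() and the sample document are hypothetical names chosen for illustration.

```python
from xml.dom import pulldom


def element_paths(xml_text):
    """Return the slash-separated path of every element, in document order."""
    stack = []   # names of the elements that are currently open
    paths = []
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            stack.append(node.tagName)
            paths.append("/" + "/".join(stack))
        elif event == pulldom.END_ELEMENT:
            stack.pop()
    return paths


paths = element_paths("<catalog><section><item/></section><item/></catalog>")
print(paths)
```

With the stack available, the two item elements can be told apart by their paths even though the event stream itself is flat.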

class xml.dom.pulldom.PullDOM(documentFactory=None)

Subclass of xml.sax.handler.ContentHandler.

class xml.dom.pulldom.SAX2DOM(documentFactory=None)

Subclass of xml.sax.handler.ContentHandler.

xml.dom.pulldom.parse(stream_or_string, parser=None, bufsize=None)

Return a DOMEventStream from the given input. stream_or_string may be either a file name, or a file-like object. parser, if given, must be an XMLReader object. This function will change the document handler of the parser and activate namespace support; other parser configuration (like setting an entity resolver) must have been done in advance.
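As a rough illustration of the two accepted input kinds, the sketch below feeds the same bytes to parse() once as a file-like object and once as a file name. The start_tags() helper, the sample data and the temporary file name are assumptions made for this example.

```python
import io
import os
import tempfile
from xml.dom import pulldom


def start_tags(source):
    """Return the tag names of all elements found in source."""
    stream = pulldom.parse(source)
    return [node.tagName for event, node in stream
            if event == pulldom.START_ELEMENT]


data = b"<root><child/></root>"

# 1. A file-like object opened in binary mode.
from_object = start_tags(io.BytesIO(data))

# 2. A file name; parse() opens the file itself.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xml")
    with open(path, "wb") as fh:
        fh.write(data)
    from_path = start_tags(path)

print(from_object, from_path)
```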

If you have XML in a string, you can use the parseString() function instead:

xml.dom.pulldom.parseString(string, parser=None)

Return a DOMEventStream that represents the (Unicode) string.
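A self-contained variant of the price filter shown earlier can be written with parseString(), so no file on disk is needed. The inline data and the expensive_items() helper are illustrative assumptions.

```python
from xml.dom import pulldom

# Illustrative inline data standing in for a sales file.
SALES = (
    "<sales>"
    "<item price='20'>pen</item>"
    "<item price='75'>lamp</item>"
    "</sales>"
)


def expensive_items(xml_text, threshold=50):
    """Return the serialized <item> elements whose price exceeds threshold."""
    doc = pulldom.parseString(xml_text)
    found = []
    for event, node in doc:
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            if int(node.getAttribute("price")) > threshold:
                # Pull the children of this element into the node.
                doc.expandNode(node)
                found.append(node.toxml())
    return found


expensive = expensive_items(SALES)
print(expensive)
```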

xml.dom.pulldom.default_bufsize

Default value for the bufsize parameter to parse().

The value of this variable can be changed before calling parse() and the new value will take effect.
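The sketch below changes default_bufsize before calling parse() and restores it afterwards. It assumes that bufsize is the size of the chunks read from the stream, which is how the CPython implementation appears to use it; the value 16 is arbitrary and chosen only so that the small sample is read in more than one chunk.

```python
import io
from xml.dom import pulldom

original = pulldom.default_bufsize
try:
    # Assumption: bufsize is the chunk size used when reading the stream.
    pulldom.default_bufsize = 16
    stream = pulldom.parse(io.BytesIO(b"<root><child/></root>"))
    tags = [node.tagName for event, node in stream
            if event == pulldom.START_ELEMENT]
finally:
    # Restore the module-level default so other callers are unaffected.
    pulldom.default_bufsize = original

print(tags)
```

Passing bufsize directly to parse() avoids touching module-level state and is usually the more contained choice.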

DOMEventStream Objects

class xml.dom.pulldom.DOMEventStream(stream, parser, bufsize)

Changed in version 3.11: Support for the __getitem__() method has been removed.

getEvent()

Return a tuple containing event and the current node as xml.dom.minidom.Document if event equals START_DOCUMENT, xml.dom.minidom.Element if event equals START_ELEMENT or END_ELEMENT or xml.dom.minidom.Text if event equals CHARACTERS. The current node does not contain information about its children, unless expandNode() is called.
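Events can also be pulled one at a time instead of iterating over the stream. The sketch below assumes that getEvent() returns None once no more events are available, which matches the behaviour of the CPython implementation but is not stated in the description above.

```python
from xml.dom import pulldom

stream = pulldom.parseString("<root><a/><b/></root>")
started = []
while True:
    pair = stream.getEvent()
    if pair is None:
        # Assumption: None signals that the stream is exhausted.
        break
    event, node = pair
    if event == pulldom.START_ELEMENT:
        started.append(node.tagName)

print(started)
```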

expandNode(node)

Expands all children of node into node. Example:

    from xml.dom import pulldom

    xml = '<html><title>Foo</title> <p>Some text <div>and more</div></p> </html>'
    doc = pulldom.parseString(xml)
    for event, node in doc:
        if event == pulldom.START_ELEMENT and node.tagName == 'p':
            # Following statement only prints '<p/>'
            print(node.toxml())
            doc.expandNode(node)
            # Following statement prints node with all its children '<p>Some text <div>and more</div></p>'
            print(node.toxml())
reset()