Issue 36407

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: xml.dom.minidom wrong indentation writing for CDATA section
Type: enhancement
Stage: resolved
Components: XML
Versions: Python 3.8
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: eli.bendersky, scoder, serhiy.storchaka, vsurjaninov
Priority: normal
Keywords: patch

Created on 2019-03-23 15:38 by vsurjaninov, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL       Status  Linked
PR 12514  merged  vsurjaninov, 2019-03-23 16:00
PR 12578  closed  miss-islington, 2019-03-27 06:19

Messages (5)
msg338681 - Author: Vladimir Surjaninov (vsurjaninov) Date: 2019-03-23 15:38
If we write XML that contains a CDATA section and pass non-empty indentation and newline parameters, the parent node of the section will contain useless indentation, which will be parsed as text.

Example:

>>> from xml.dom import minidom
>>> doc = minidom.Document()
>>> root = doc.createElement('root')
>>> doc.appendChild(root)
>>> node = doc.createElement('node')
>>> root.appendChild(node)
>>> data = doc.createCDATASection('</data>')
>>> node.appendChild(data)
>>> print(doc.toprettyxml(indent='  ' * 4))
<?xml version="1.0" ?>
<root>
    <node>
<![CDATA[</data>]]>    </node>
</root>

If we try to parse this output document, we won't get the CDATA value correctly.

The following code returns a string that contains only indentation characters:

>>> doc = minidom.parseString(xml_text)
>>> doc.getElementsByTagName('node')[0].firstChild.nodeValue

This returns a string with the CDATA value and the indentation characters:

>>> doc.getElementsByTagName('node')[0].firstChild.wholeText

But we have a workaround:

>>> data.nodeType = data.TEXT_NODE
…
>>> print(doc.toprettyxml(indent='  ' * 4))
<?xml version="1.0" ?>
<root>
    <node><![CDATA[</data>]]></node>
</root>

It will be parsed correctly:

>>> doc.getElementsByTagName('node')[0].firstChild.nodeValue
</data>

But I think it would be better to fix the writing function so that this becomes the default behavior.
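The round trip described in this message can be sketched as a small self-contained script. The helper name roundtrip_cdata is hypothetical and introduced only for illustration. The script builds the same document, pretty-prints it, parses the result back, and reports how many children the parsed <node> element has and what its first child holds. On Python 3.8 and later, which include the change from this issue, the element should have a single child carrying the CDATA value. On earlier versions the first child is expected to be whitespace, as the report describes.

```python
from xml.dom import minidom


def roundtrip_cdata(value, indent="    "):
    """Pretty-print <root><node><![CDATA[value]]></node></root>, parse it
    back, and return (pretty_xml, child_count, first_child_value).

    Hypothetical helper, written for illustration only.
    """
    doc = minidom.Document()
    root = doc.createElement("root")
    doc.appendChild(root)
    node = doc.createElement("node")
    root.appendChild(node)
    node.appendChild(doc.createCDATASection(value))

    # Serialize with indentation, then parse the serialized text again.
    pretty = doc.toprettyxml(indent=indent)
    reparsed = minidom.parseString(pretty)
    parsed_node = reparsed.getElementsByTagName("node")[0]
    return pretty, len(parsed_node.childNodes), parsed_node.firstChild.nodeValue


if __name__ == "__main__":
    pretty, count, value = roundtrip_cdata("</data>")
    print(pretty)
    print("children of <node>:", count)
    print("first child value:", repr(value))
```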
msg338701 - Author: Stefan Behnel (scoder) (Python committer) Date: 2019-03-23 21:33
Yes, this case is incorrect. Pretty printing should not change character content inside of a simple tag.

The PR looks good to me.
msg338936 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2019-03-27 05:59
New changeset 384b81d923addd52125e94470b11d2574ca266a9 by Serhiy Storchaka (Vladimir Surjaninov) in branch 'master':
bpo-36407: Fix writing indentations of CDATA section (xml.dom.minidom). (GH-12514)
https://github.com/python/cpython/commit/384b81d923addd52125e94470b11d2574ca266a9
msg338939 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2019-03-27 06:19
Should we backport this change? I am not sure.
msg338943 - Author: Stefan Behnel (scoder) (Python committer) Date: 2019-03-27 07:04
I don't think this should be backported. Pretty-printing is not a production relevant feature, more of a "debugging, diffing and help users see what they get" kind of feature. It's good to have it fixed for the future, but we shouldn't bother users with it during a point release.
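Since the change was not backported, code that has to read pretty-printed output produced by older versions may want a reader that tolerates the extra whitespace. The following is a minimal sketch, not something proposed in this issue. The helper cdata_text is hypothetical, and OLD_STYLE_XML is a hand-written string that approximates the output shape described in the report. The helper joins only the CDATA section children of an element and ignores surrounding text nodes.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def cdata_text(element):
    """Join the data of all CDATA section children, skipping text nodes.

    Hypothetical helper, written for illustration only.
    """
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == Node.CDATA_SECTION_NODE
    )


# Assumption: a hand-written approximation of the pre-fix output,
# with whitespace text around the CDATA section.
OLD_STYLE_XML = (
    '<?xml version="1.0" ?>\n'
    "<root>\n"
    "    <node>\n"
    "<![CDATA[</data>]]>    </node>\n"
    "</root>\n"
)

doc = minidom.parseString(OLD_STYLE_XML)
node = doc.getElementsByTagName("node")[0]

# The first child is a whitespace-only text node, not the CDATA section.
print(repr(node.firstChild.nodeValue))
# Collecting only CDATA children recovers the original value.
print(repr(cdata_text(node)))
```

The same helper also works on output from fixed versions, where the CDATA section is the only child of the element.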
History
Date                 User              Action  Args
2022-04-11 14:59:12  admin             set     github: 80588
2019-03-27 12:08:27  serhiy.storchaka  set     status: open -> closed
                                               resolution: fixed
                                               stage: patch review -> resolved
2019-03-27 07:04:43  scoder            set     messages: +msg338943
2019-03-27 06:19:42  serhiy.storchaka  set     messages: +msg338939
2019-03-27 06:19:22  miss-islington    set     pull_requests: +pull_request12522
2019-03-27 05:59:02  serhiy.storchaka  set     messages: +msg338936
2019-03-23 21:33:28  scoder            set     messages: +msg338701
                                               versions: + Python 3.8
2019-03-23 16:00:14  vsurjaninov       set     keywords: +patch
                                               stage: patch review
                                               pull_requests: +pull_request12465
2019-03-23 15:40:39  xtreak            set     nosy: +scoder, eli.bendersky, serhiy.storchaka
2019-03-23 15:38:49  vsurjaninov       create