
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2019-03-23 15:38 by vsurjaninov, last changed 2022-04-11 14:59 by admin. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 12514 | merged | vsurjaninov, 2019-03-23 16:00 | |
| PR 12578 | closed | miss-islington, 2019-03-27 06:19 | |
| Messages (5) | |||
|---|---|---|---|
| msg338681 | Author: Vladimir Surjaninov (vsurjaninov) | Date: 2019-03-23 15:38 | |
If we write XML containing a CDATA section and pass non-empty indentation and newline parameters, the parent node of the section ends up with useless indentation that is parsed back as text.

Example:

>>> from xml.dom import minidom
>>> doc = minidom.Document()
>>> root = doc.createElement('root')
>>> doc.appendChild(root)
>>> node = doc.createElement('node')
>>> root.appendChild(node)
>>> data = doc.createCDATASection('</data>')
>>> node.appendChild(data)
>>> print(doc.toprettyxml(indent=' ' * 4))
<?xml version="1.0" ?>
<root>
    <node>
<![CDATA[</data>]]>    </node>
</root>

If we try to parse this output, we won't get the CDATA value back correctly.

The following code returns a string that contains only indentation characters:

>>> doc = minidom.parseString(xml_text)
>>> doc.getElementsByTagName('node')[0].firstChild.nodeValue

This returns a string with the CDATA value and the indentation characters:

>>> doc.getElementsByTagName('node')[0].firstChild.wholeText

But we have a workaround:

>>> data.nodeType = data.TEXT_NODE
…
>>> print(doc.toprettyxml(indent=' ' * 4))
<?xml version="1.0" ?>
<root>
    <node><![CDATA[</data>]]></node>
</root>

It will be parsed correctly:

>>> doc.getElementsByTagName('node')[0].firstChild.nodeValue
</data>

But I think it would be better to fix the writing function so that this becomes the default behavior.
| msg338701 | Author: Stefan Behnel (scoder) | Date: 2019-03-23 21:33 | |
Yes, this case is incorrect. Pretty printing should not change character content inside of a simple tag. The PR looks good to me. | |||
| msg338936 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2019-03-27 05:59 | |
New changeset 384b81d923addd52125e94470b11d2574ca266a9 by Serhiy Storchaka (Vladimir Surjaninov) in branch 'master': bpo-36407: Fix writing indentations of CDATA section (xml.dom.minidom). (GH-12514) https://github.com/python/cpython/commit/384b81d923addd52125e94470b11d2574ca266a9 | |||
| msg338939 | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2019-03-27 06:19 | |
Should we backport this change? I am not sure. | |||
| msg338943 | Author: Stefan Behnel (scoder) | Date: 2019-03-27 07:04 | |
I don't think this should be backported. Pretty-printing is not a production-relevant feature, more of a "debugging, diffing and help users see what they get" kind of feature. It's good to have it fixed for the future, but we shouldn't bother users with it during a point release. | |||
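
The round trip described in the first message can be checked with a small self-contained script. This is only a sketch: the helper names `build_document` and `roundtrip_first_child` are hypothetical and not part of `xml.dom.minidom`, and the result depends on whether the interpreter includes the change from GH-12514.

```python
from xml.dom import minidom


def build_document(cdata_text):
    """Build <root><node><![CDATA[...]]></node></root>, as in the report."""
    doc = minidom.Document()
    root = doc.createElement("root")
    doc.appendChild(root)
    node = doc.createElement("node")
    root.appendChild(node)
    node.appendChild(doc.createCDATASection(cdata_text))
    return doc


def roundtrip_first_child(cdata_text, indent="    "):
    """Pretty-print the document, parse it back, and return the
    nodeValue of the first child of <node>."""
    xml_text = build_document(cdata_text).toprettyxml(indent=indent)
    parsed = minidom.parseString(xml_text)
    return parsed.getElementsByTagName("node")[0].firstChild.nodeValue


if __name__ == "__main__":
    # With the fix, the CDATA text should come back unchanged; without it,
    # the first child is a whitespace-only text node instead.
    print(repr(roundtrip_first_child("</data>")))
```

On an interpreter that includes the fix, the first child of `<node>` should be the CDATA section itself, so the returned value equals the original text.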
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:12 | admin | set | github: 80588 |
| 2019-03-27 12:08:27 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
| 2019-03-27 07:04:43 | scoder | set | messages: +msg338943 |
| 2019-03-27 06:19:42 | serhiy.storchaka | set | messages: +msg338939 |
| 2019-03-27 06:19:22 | miss-islington | set | pull_requests: +pull_request12522 |
| 2019-03-27 05:59:02 | serhiy.storchaka | set | messages: +msg338936 |
| 2019-03-23 21:33:28 | scoder | set | messages: +msg338701; versions: +Python 3.8 |
| 2019-03-23 16:00:14 | vsurjaninov | set | keywords: +patch; stage: patch review; pull_requests: +pull_request12465 |
| 2019-03-23 15:40:39 | xtreak | set | nosy: +scoder, eli.bendersky, serhiy.storchaka |
| 2019-03-23 15:38:49 | vsurjaninov | create | |
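
Since the history above lists only Python 3.8 and the change was not backported, code that has to read pretty-printed output written by an older version may prefer not to rely on `firstChild`. The sketch below is one possible approach, not something proposed in the thread: the helper `cdata_text` is hypothetical, and the whitespace in `OLD_STYLE_XML` is an assumed reconstruction of the output shown in the report.

```python
from xml.dom import Node, minidom


def cdata_text(element):
    """Join the data of all CDATA children of element, skipping the
    whitespace-only text nodes that pretty-printing may add around them."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == Node.CDATA_SECTION_NODE
    )


# Assumed shape of the output produced without the fix (see the report).
OLD_STYLE_XML = (
    '<?xml version="1.0" ?>\n'
    "<root>\n"
    "    <node>\n"
    "<![CDATA[</data>]]>    </node>\n"
    "</root>\n"
)

doc = minidom.parseString(OLD_STYLE_XML)
node = doc.getElementsByTagName("node")[0]
print(repr(cdata_text(node)))
```

Selecting children by `nodeType` keeps the reader independent of how the writer indented the parent element, at the cost of ignoring any genuine text mixed in with the CDATA section.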