Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-50002: xml.dom.minidom now preserves whitespaces in attributes#107947

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation

@serhiy-storchaka
Copy link
Member

@serhiy-storchakaserhiy-storchaka commentedAug 14, 2023
edited by bedevere-bot
Loading

Also double quotes (") are now only quoted in attributes.

It is based on changes inbpo-17582 andbpo-39011 for ElementTree.

It alsofixes#81555.

Also double quotes (") are now only quoted in attributes.
@serhiy-storchaka
Copy link
MemberAuthor

It turned out that there were no any tests for quoting in the XML output. The tests would pass even if quoting& and< was broken.

Copy link
Contributor

@scoderscoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks. Seeing that there are issues with whitespace that suggest a change in output, I think it's ok to solve#81555 (quote escaping) along the way. I'll reopen the ticket.

data=data.replace("&","&amp;").replace("<","&lt;"). \
replace("\"","&quot;").replace(">","&gt;")
writer.write(data)
iftext:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I usually prefer representing easy special cases in a short

Suggested change
iftext:
ifnottext:
return

rather than adding indentation to long chunks of code that only makes readers look down to see if something joined is still done at the end. Better make it clear right away that we're done handling this case.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Agree. I just kept the style of the old code.

if"\n"intext:
text=text.replace("\n","&#10;")
if"\t"intext:
text=text.replace("\t","&#09;")
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I wonder, why&#09; and not simply&#9;? Is there a reason?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

No technical reason. Probably based on Python's hex character spelling or due to consistency with the other two-digit codes above. The XML character spec does not need (or mention) leading zeros.

I'm happy to keep the leading zero. If you need compact data, use compression. That's way more effective than stripping some zeros from rare tab characters.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

xml.sax.saxutils.quoteattr() uses&#9;.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Which makes me wonder why we need a new implementation here, rather than importing the existing one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I now looked up the implementation insaxutils.py – it looks fairly slow.minidom will probably not become high-performance by any accident, but it doesn't feel good to slow it down even more. It's probably worth a new implementation..

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It is sad, but there is a copy of escaping function in almost every module which outputs XML or HTML. On one hand, it is a trivial function, and we want to avoid unneeded dependencies. On other hand, efficient and complete implementation is not so trivial. Butxml.sax.saxutils version is too generalized and far from been efficient.

It was worse in the past. Now many code just usehtml.escape().

if"\n"intext:
text=text.replace("\n","&#10;")
if"\t"intext:
text=text.replace("\t","&#09;")
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It is sad, but there is a copy of escaping function in almost every module which outputs XML or HTML. On one hand, it is a trivial function, and we want to avoid unneeded dependencies. On other hand, efficient and complete implementation is not so trivial. Butxml.sax.saxutils version is too generalized and far from been efficient.

It was worse in the past. Now many code just usehtml.escape().

@bedevere-bot
Copy link

There's a new commit after the PR has been approved.

@scoder: please review the changes made to this pull request.

@serhiy-storchakaserhiy-storchaka deleted the minidom-preserve-whitespaces-in-attributes branchAugust 23, 2023 12:23
@serhiy-storchaka
Copy link
MemberAuthor

Thank you for your review@scoder!

@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure⚠️⚠️⚠️

Hi! The buildbotAMD64 RHEL8 LTO + PGO 3.x has failed when building commit154477b.

What do you need to do:

  1. Don't panic.
  2. Checkthe buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/568/builds/4615) and take a look at the build logs.
  4. Check if the failure is related to this commit (154477b) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/568/builds/4615

Summary of the results of the build (if available):

== Tests result: ENV CHANGED ==

431 tests OK.

10 slowest tests:

  • test_concurrent_futures: 2 min 14 sec
  • test_multiprocessing_spawn: 1 min 33 sec
  • test_multiprocessing_forkserver: 1 min 12 sec
  • test_multiprocessing_fork: 1 min 6 sec
  • test_signal: 1 min 1 sec
  • test_io: 34.8 sec
  • test_socket: 31.9 sec
  • test_imaplib: 30.4 sec
  • test_subprocess: 29.5 sec
  • test_xmlrpc: 27.5 sec

1 test altered the execution environment:
test_ssl

15 tests skipped:
test.test_asyncio.test_windows_events
test.test_asyncio.test_windows_utils test_devpoll test_gdb
test_ioctl test_kqueue test_launcher test_startfile test_tkinter
test_ttk test_winconsoleio test_winreg test_winsound test_wmi
test_zipfile64

Total duration: 2 min 18 sec

Click to see traceback logs
Note:switching to '154477be722ae5c4e18d22d0860e284006b09c4f'.You are in 'detached HEAD' state. You can look around, make experimentalchanges and commit them, and you can discard any commits you make in thisstate without impacting any branches by switching back to a branch.If you want to create a new branch to retain commits you create, you maydo so (now or later) by using -c with the switch command. Example:  git switch -c <new-branch-name>Or undo this operation with:  git switch -Turn off this advice by setting config variable advice.detachedHead to falseHEAD is now at 154477be72 gh-50002: xml.dom.minidom now preserves whitespaces in attributes (GH-107947)Switched to and reset branch 'main'find:‘build’: No such file or directoryfind:‘build’: No such file or directoryfind:‘build’: No such file or directoryfind:‘build’: No such file or directorymake[2]:[Makefile:2809: clean-retain-profile] Error 1 (ignored)./Modules/_decimal/libmpdec/context.c:57: warning: mpd_setminalloc: ignoring request to set MPD_MINALLOC a second time./Modules/_decimal/libmpdec/context.c:57: warning: mpd_setminalloc: ignoring request to set MPD_MINALLOC a second timemake:*** [Makefile:2034: buildbottest] Error 3

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@scoderscoderAwaiting requested review from scoder

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

Minidom does not have to escape quote inside text segments

3 participants

@serhiy-storchaka@bedevere-bot@scoder

[8]ページ先頭

©2009-2025 Movatter.jp