Google Code Issue 157: Add "escape invisible characters" option #38

Open: gsnedders wants to merge 2 commits into master from gcode-157

gsnedders (Member)

Reported by @fantasai, Jul 27, 2010
Having invisible characters in the source code can be confusing to someone who's trying to figure out what's going on. Adding an escape_invisible option would make those characters visible in the source code.

The attached patch implements an escape_invisible option. The list of invisible characters is probably incomplete for this iteration, but you get the idea. It depends on the patch in issue 156.
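The attached patch itself is not reproduced in this thread. As a rough sketch of what such an option might do, assuming a hand-picked (and, as the report says, incomplete) character list and numeric character references as the visible form, it could look like the following. The names INVISIBLE and escape_invisible_chars are hypothetical and not taken from the patch.

```python
# Hypothetical, deliberately incomplete list of invisible characters.
# This is an illustration only, not the list used by the attached patch.
INVISIBLE = {
    "\u00a0",  # NO-BREAK SPACE
    "\u00ad",  # SOFT HYPHEN
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u200e",  # LEFT-TO-RIGHT MARK
    "\u200f",  # RIGHT-TO-LEFT MARK
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (BOM)
}


def escape_invisible_chars(text):
    """Replace each listed invisible character with a hex character reference."""
    return "".join(
        "&#x%X;" % ord(ch) if ch in INVISIBLE else ch
        for ch in text
    )


# A zero width space between "a" and "b" becomes visible in the output.
escaped = escape_invisible_chars("a\u200bb")  # "a&#x200B;b"
```

A fixed list like this is easy to audit but will always lag behind Unicode, which is the weakness the later comment in this thread addresses.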

@ghost assigned gsnedders, May 4, 2013
@gsnedders modified the milestones: 1.1, 0.99999999, May 8, 2016
@gsnedders removed this from the 0.99999999 milestone, May 20, 2016
@gsnedders removed their assignment, Sep 1, 2017
gsnedders (Member, Author) commented Sep 1, 2017 (edited)

My preference would be something based on unicodedata that blacklists General Category C*. That has a few problems, though: you'll end up blacklisting different sets of characters depending on the Python version and the Unicode version; generating that set is expensive, so it likely should be precomputed at dist build time; and it likely needs to be represented as a segment tree rather than a set of millions of characters, out of concern for memory consumption.
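To make the idea concrete, here is a minimal sketch assuming Python 3. It collapses the C* code points into a sorted list of inclusive ranges and looks characters up with bisect, which is a simpler stand-in for the segment tree mentioned above, not the same structure. The names build_c_ranges and is_category_c are hypothetical, and the scan runs at import time here only for brevity; the comment suggests doing it at build time instead.

```python
import bisect
import sys
import unicodedata


def build_c_ranges(limit=sys.maxunicode + 1):
    """Return sorted inclusive (start, end) ranges of General Category C* code points."""
    ranges = []
    start = None
    for cp in range(limit):
        is_c = unicodedata.category(chr(cp)).startswith("C")
        if is_c and start is None:
            start = cp                       # a new run of C* code points begins
        elif not is_c and start is not None:
            ranges.append((start, cp - 1))   # the run ended at the previous code point
            start = None
    if start is not None:
        ranges.append((start, limit - 1))    # close a run that reaches the limit
    return ranges


# Scans every code point once; the result depends on the Unicode data
# bundled with the running interpreter (see unicodedata.unidata_version).
C_RANGES = build_c_ranges()
C_STARTS = [start for start, _ in C_RANGES]


def is_category_c(ch):
    """Return True if ch falls inside one of the precomputed C* ranges."""
    cp = ord(ch)
    i = bisect.bisect_right(C_STARTS, cp) - 1
    return i >= 0 and cp <= C_RANGES[i][1]
```

Note that C* includes Cc, so tab and newline are matched as well; a real option would presumably need to exempt ordinary whitespace controls. Conversely, NO-BREAK SPACE is category Zs and is not matched by this approach.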

We also need to be careful on narrow Python builds and make sure we don't encode surrogate pairs, as \uD800\uDC00 needs to end up unchanged.
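The surrogate concern can be sketched as follows. This is an illustration under assumptions, not code from the patch: it is written for Python 3, where a string literal can hold the two surrogate code points separately, which mimics how a narrow build stores a character outside the BMP. The name escape_preserving_pairs and the should_escape callback are hypothetical.

```python
import unicodedata


def escape_preserving_pairs(text, should_escape):
    """Escape flagged characters but copy high+low surrogate pairs through untouched."""
    out = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        is_high = 0xD800 <= ord(ch) <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= ord(text[i + 1]) <= 0xDFFF:
            out.append(text[i:i + 2])  # well-formed pair: leave both halves as-is
            i += 2
            continue
        out.append("&#x%X;" % ord(ch) if should_escape(ch) else ch)
        i += 1
    return "".join(out)


def in_category_c(ch):
    # Surrogates are category Cs, so a naive C* check would flag both halves of a pair.
    return unicodedata.category(ch).startswith("C")


# The zero width space is escaped; the surrogate pair passes through unchanged.
result = escape_preserving_pairs("a\u200b\ud800\udc00", in_category_c)
```

What to do with a lone surrogate is left open here; the sketch simply hands it to should_escape like any other character.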

It's also notable that AFAICT the original reason for this patch no longer holds true (the CSS testsuite build system is basically a historical artefact now and hasn't used an html5lib fork with this for years), though as #197 shows other people do care.

2 participants: @gsnedders, @fantasai
