- Notifications
You must be signed in to change notification settings - Fork5
Addis_potential_mixed_script_confusable_char function#13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Addis_potential_mixed_script_confusable_char function#13
Uh oh!
There was an error while loading.Please reload this page.
Conversation
35d49f3 to1fb9c04Compare1fb9c04 to72cefffComparecrlf0710 commentedMay 6, 2020 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There's a debug boolean value in the python script (search "debug = False",https://github.com/unicode-rs/unicode-security/pull/13/files#diff-c87c196441d88317a5ea3bf97e9fde0aR536), if anyone's curious why these codepoints are confusable, you can toggle it and regenerate the table file, which will include comments on why these code points are considered mixed script confusable. |
crlf0710 commentedMay 7, 2020
The python calculation part is a little complex there. So i'll leave a brief description. The main idea is creating equivalence classes from |
Manishearth left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It's really hard to review the python stuff, it would be nice if each function had a long comment explaining what its purpose is. currently i have to kinda figure that out from the code, and it's hard to review things when you don't know what they're doing
| returnconfusables | ||
| defaliases(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
perhaps link to the original source in unicode-script so things can be manually updated if necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Added comments below this line.
crlf0710 commentedMay 12, 2020
@Manishearth Will add some comments, thanks! |
crlf0710 commentedMay 12, 2020
Added some comments. Also cc@pyfisch here. |
dac6a41 toa18eb9cComparecrlf0710 commentedMay 30, 2020
Adjusted the APIs and added a very simple test. |
is_potential_mixed_script_confusable_char function
Manishearth left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Overall looks good, would appreciate docs on each function
| defis_script_ignored_in_mixedscript(source): | ||
| returnsource=='Zinh'orsource=='Zyyy'orsource=='Zzzz' | ||
| defprocess_mixedscript_single_to_multi(item_i,script_i,proto_lst,scripts): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
what is this function trying to do? what does it operate on? where did those rules come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I will try to explain in comments.
| returnTrue | ||
| returnFalse | ||
| defload_potential_mixedscript_confusables(f,identifier_allowed,scripts): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
what this function returns, and what each argument is, should be documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
ok, i will add more comments.
crlf0710 commentedJun 9, 2020
Added more comments to document the details. |
…=ManishearthImplement mixed script confusable lint.This implements the mixed script confusable lint defined in RFC 2457.This is blocked onrust-lang#72069 andunicode-rs/unicode-security#13, and will need a Cargo.toml version bump after those are resolved.The lint message warning is sub-optimal for now. We'll need a mechanism to properly output `AugmentScriptSet` to screen, this is to be added in `unicode-security` crate.r?@Manishearth
…=ManishearthImplement mixed script confusable lint.This implements the mixed script confusable lint defined in RFC 2457.This is blocked onrust-lang#72069 andunicode-rs/unicode-security#13, and will need a Cargo.toml version bump after those are resolved.The lint message warning is sub-optimal for now. We'll need a mechanism to properly output `AugmentScriptSet` to screen, this is to be added in `unicode-security` crate.r?@Manishearth
…=ManishearthImplement mixed script confusable lint.This implements the mixed script confusable lint defined in RFC 2457.This is blocked onrust-lang#72069 andunicode-rs/unicode-security#13, and will need a Cargo.toml version bump after those are resolved.The lint message warning is sub-optimal for now. We'll need a mechanism to properly output `AugmentScriptSet` to screen, this is to be added in `unicode-security` crate.r?@Manishearth
Uh oh!
There was an error while loading.Please reload this page.
This is a prototype of the data required from mixed_script_confusable lint of rustc. I'm not really sure whether we need to give more detailed data to rustc for diagnostics. (e.g. This code point is potentially confusable with which code point or which script?)Putting those aside, maybe we can have a early review first.
Implements
is_potential_mixed_script_confusable_charfunction.A few other issues raised up duringadding the actual lint -#15#16