- Notifications
You must be signed in to change notification settings - Fork10
Fix set semantics for Common and Inherited#21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Manishearth commentedAug 13, 2025
Yeah, I think that that behavior is a bug. |
dcsommer commentedAug 13, 2025
Thanks. If you think the test looks right, I'll go ahead and make it pass and update the PR. |
Manishearth left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The test looks good
…ited/CommonAfter finding the iterator after a union with Common only yielded a singleelement, I overhauled the representation and semantics of`ScriptExtension`. This is a breaking change for most APIs.Summary of improvements to `ScriptExtension`:* Improved representation to be able to track multiple scripts as well as Inherited/Common* "Inherited" and "Common" no longer intersect with everything and have no subset/superset relationship between them.* `for_str` is a union, not intersection, of all chars* Added `is_subset_or_equal()` for easier comparison of unions and intersections* Changed `Debug` impl to a vanilla derive to allow comparing hex bits* Fixed `Display` impl to properly show each script, separated by pluses* New test for iterator
dcsommer commentedAug 13, 2025
@Manishearth it ended up being a breaking change for several APIs, but I think the behavior is much easier to reason about now and more useful overall. LMK what you think when you have a moment. |
Manishearth left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
No, this is incorrect: Common and Inherited are not allowed to coexist with other scx values
That behavior is deliberate
https://www.unicode.org/reports/tr24/#Scx_Bad_Example_Table
I misunderstood the point of the test, sorry.
dcsommer commentedAug 14, 2025
Got it, will close this PR then. |
Description: On reviewing the source code, it appears that the iterator implementation for
ScriptExtensiononly ever yields a single element. I did not see this documented, and assuming this is a bug, I have written a test case to clarify the expected subset, union, and iterator behavior. If maintainers believe this behavior as shown in the proposed test is correct, I can go ahead and implement the fixes in the rest of the library.