Support Unicode 15.1#124

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Merged

Manishearth merged 1 commit intounicode-rs:masterfromsyvb:unicode-15-1

Sep 25, 2023

Merged

Support Unicode 15.1#124

Manishearth merged 1 commit intounicode-rs:masterfromsyvb:unicode-15-1

Sep 25, 2023

Conversation

Copy link

Contributor

syvb commentedSep 22, 2023•
edited
Loading

Adds Unicode 15.1 support.

Updating tests

Turns outscripts/unicode_gen_breaktests.py was last run for Unicode 11 - every subsequent updater forgot to run it. I updated the GitHub Action that checksscripts/unicode.py was run to also check forscripts/unicode_gen_breaktests.py being run.

Devanagari mis-segmentation

There are a few cases where Devanagari grapheme segmentation fails after updating the test data from Unicode 11 to Unicode 15. I just skipped those failing tests for now.

syvb force-pushed theunicode-15-1 branch 5 times, most recently from83dcbc1 toa909537Compare

September 22, 2023 21:56

Support Unicode 15.1

69f3b02

syvb force-pushed theunicode-15-1 branch froma909537 to69f3b02Compare

September 22, 2023 21:58

Copy link

ContributorAuthor

syvb commentedSep 22, 2023•
edited
Loading

I originally described a categorization issue with ۝ - turns out the Unicode data files are correct, I was just using outdated ones. Oops. I kept the tests that verify ۝ (and the Syriac abbreviation mark) are categorized correctly.

Manishearth reviewed

Sep 25, 2023

View reviewed changes

.github/workflows/rust.yml

		run:cargo fmt --check
		-name:Verify regenerated files
		run:./scripts/unicode.py && diff tables.rs src/tables.rs
		run:./scripts/unicode.py && diff tables.rs src/tables.rs