This repository was archived by the owner on Dec 15, 2022. It is now read-only.
/language-php (Public archive)

limit allowed characters to UTF-8 range (0x10FFFF) #444

Merged
darangi merged 2 commits into atom:master from IonBazan:utf-8-compliance
Jan 13, 2022

Conversation

@IonBazan (Contributor) commented Oct 21, 2021 (edited)

Requirements

This change limits the characters recognized as valid to a maximum of 0x10FFFF, in line with the UTF-8 specification.

Description of the Change

This makes the regular expressions PCRE-compliant: matched characters must not fall outside the UTF-8 bounds (see https://www.pcre.org/original/doc/html/pcreunicode.html).
Since UTF-8 is the standard encoding for PHP files according to PSR-1, I don't see a point in supporting any invalid Unicode characters.
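
To illustrate the idea of capping a character range at 0x10FFFF, here is a minimal sketch in Python. It is only an analogy: Python's `re` module is not PCRE, and the `IDENTIFIER` pattern and both helper functions are hypothetical, not copied from the grammar. The sketch assumes an identifier pattern whose upper bound is the highest valid code point, and shows that Python's engine, like a strict PCRE build, refuses a bound such as 7fffffff.

```python
import re

# Upper bound named in this PR: the highest valid Unicode code point.
MAX_CODE_POINT = 0x10FFFF

# Hypothetical identifier pattern (an assumption, not the grammar's regex).
# The range ends at U+10FFFF rather than at an out-of-range value.
IDENTIFIER = re.compile(
    r"[a-zA-Z_\x7f-\U0010ffff][a-zA-Z0-9_\x7f-\U0010ffff]*"
)


def is_identifier(name: str) -> bool:
    """Return True if the whole string matches the sketch pattern."""
    return IDENTIFIER.fullmatch(name) is not None


def bound_is_accepted(hex_bound: str) -> bool:
    """Return True if a class ending at the given 8-digit hex bound compiles."""
    try:
        re.compile(r"[\x7f-\U%s]" % hex_bound)
    except re.error:
        # Python's engine rejects escapes above U+10FFFF.
        return False
    return True
```

With this sketch, `bound_is_accepted("0010ffff")` succeeds while `bound_is_accepted("7fffffff")` does not, which mirrors the strictness the description attributes to PCRE.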

Alternate Designs

Benefits

The actual reason for this change is to make GitHub Linguist support this grammar, since Linguist sticks to strict PCRE rules.

Before: [image]

After: [image]

Possible Drawbacks

Any invalid character will no longer be recognized as part of a variable name, but that shouldn't occur in the real world.

Applicable Issues

github-linguist/linguist#5522

While waiting for workflow run approval, let me just confirm that the tests pass locally.

@KapitanOczywisty
Contributor

@sadick254 This is ready to be merged. Other PRs could introduce more occurrences of 7fffffff, so this should probably be merged after them, and a "replace all" might be needed.


@darangi
Contributor

@IonBazan could you take a look at the conflicts?

@IonBazan
Contributor, Author

@darangi fixed 😉


@darangi
Contributor

Thanks for the contribution @IonBazan 🙇🏾


@darangi merged commit b029889 into atom:master on Jan 13, 2022
@IonBazan deleted the utf-8-compliance branch on January 13, 2022 10:50
Reviewers

@KapitanOczywisty approved these changes