
Airsequel/double-x-encoding


Encoding scheme to encode any Unicode string with only characters from `[0-9a-zA-Z_]`. Therefore it's quite similar to URL percent-encoding. It's especially useful for GraphQL ID generation.

Constraints for the encoding scheme:

  1. Common IDs like `file_format`, `fileFormat`, `FileFormat`, `FILE_FORMAT`, `__file_format__`, … must not be altered
  2. Support all Unicode characters
  3. Characters of the ASCII range must lead to shorter encodings
  4. Optional support for encoding leading digits (like in `1_file_format`) to fulfill constraints of some ID schemes (e.g. GraphQL's).
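
To make constraint 4 concrete: the GraphQL specification defines a name as a letter or underscore followed by letters, digits, or underscores, and reserves names that start with two underscores for introspection. The Python sketch below checks both conditions. The function name is hypothetical and not part of this library.

```python
import re

# GraphQL Name grammar: a letter or underscore first, then letters, digits, underscores.
GRAPHQL_NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def is_usable_graphql_name(name: str) -> bool:
    """True if `name` matches the Name grammar and avoids the reserved `__` prefix."""
    return GRAPHQL_NAME.fullmatch(name) is not None and not name.startswith("__")


print(is_usable_graphql_name("file_format"))      # True
print(is_usable_graphql_name("1_file_format"))    # False (leading digit)
print(is_usable_graphql_name("__file_format__"))  # False (reserved prefix)
```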

Examples

| Input | Output |
|---|---|
| `camelCaseId` | `camelCaseId` |
| `snake_case_id` | `snake_case_id` |
| `__Schema` | `__Schema` |
| `doxxing` | `doxxing` |
| `DOXXING` | `DOXXXXXXING` |
| `id with spaces` | `idXX0withXX0spaces` |
| `id-with.special$chars!` | `idXXDwithXXEspecialXX4charsXX1` |
| `id_with_ümläutß` | `id_with_XXaaapmmlXXaaaoeutXXaaanp` |
| `Emoji: 😅` | `EmojiXXGXX0XXbpgaf` |
| `Multi Byte Emoji: 👨‍🦲` | `MultiXX0ByteXX0EmojiXXGXX0XXbpegiXXacaanXXbpjlc` |
| `\u{100000}` | `XXYbaaaaa` |
| `\u{10ffff}` | `XXYbapppp` |

With encoding of leading digit and double underscore activated (necessary for GraphQL ID generation):

| Input | Output |
|---|---|
| `1FileFormat` | `XXZ1FileFormat` |
| `__index__` | `XXRXXRindexXXRXXR` |
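
Digits and underscores pass through the base encoding unchanged, so the two optional rules can be illustrated as a post-processing step on an already encoded string. The Python sketch below is not the library's implementation: the function and parameter names are hypothetical, and replacing underscores pair by pair (leaving an odd trailing underscore untouched) is an assumption that the two examples above do not settle.

```python
import string


def apply_optional_rules(
    encoded: str,
    leading_digit: bool = True,
    double_underscore: bool = True,
) -> str:
    """Apply the optional rules to a string that is already base-encoded."""
    if double_underscore:
        # Assumption: each non-overlapping pair "__" becomes "XXRXXR".
        encoded = encoded.replace("__", "XXRXXR")
    if leading_digit and encoded and encoded[0] in string.digits:
        # A leading digit is prefixed with XXZ and otherwise kept.
        encoded = "XXZ" + encoded
    return encoded


print(apply_optional_rules("1FileFormat"))  # XXZ1FileFormat
print(apply_optional_rules("__index__"))    # XXRXXRindexXXRXXR
```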

Explanation

The encoding scheme is based on the following rules:

  1. All characters in `[0-9A-Za-z_]` except for `XX` are encoded as is
  2. `XX` is encoded as `XXXXXX`
  3. All other printable characters inside the ASCII range are encoded as a sequence of 3 characters: `XX[0-9A-W]`
  4. All other Unicode code points up to `U+fffff` (e.g. Emojis) are encoded as a sequence of 7 characters: `XX[a-p]{5}`, where the 5 characters are the hexadecimal representation with an alternative hex alphabet ranging from `a` to `p` instead of `0` to `f`.
  5. All Unicode code points in the Supplementary Private Use Area-B (`U+100000` to `U+10ffff`) are encoded as a sequence of 9 characters: `XXY[a-p]{6}`
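
The five rules can be sketched in Python as shown below. This is an illustration rather than the library's code. All names are hypothetical. The order of the ASCII table used for rule 3 is inferred from the examples (space is `0`, `!` is `1`, `$` is `4`, `-` is `D`, `.` is `E`, `:` is `G`). Handling non-printable ASCII characters under rule 4 is an assumption.

```python
# Rule 3 table: a character's position here selects the third escape character.
# Assumption: the order is inferred from the README examples ("_" is only a
# placeholder for position R, it is normally emitted as is by rule 1).
ASCII_SPECIALS = " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
ESCAPE_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVW"
ALT_HEX = "abcdefghijklmnop"  # stands in for the hex digits 0-9a-f
PLAIN = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_")


def alt_hex(code_point: int, width: int) -> str:
    """Zero-padded hex representation, written with the a-p alphabet."""
    return "".join(ALT_HEX[int(d, 16)] for d in format(code_point, f"0{width}x"))


def encode(text: str) -> str:
    out = []
    i = 0
    while i < len(text):
        if text.startswith("XX", i):  # rule 2: escape the escape prefix itself
            out.append("XXXXXX")
            i += 2
            continue
        ch = text[i]
        if ch in PLAIN:  # rule 1: kept as is
            out.append(ch)
        elif ch in ASCII_SPECIALS:  # rule 3: 3-character escape
            out.append("XX" + ESCAPE_CHARS[ASCII_SPECIALS.index(ch)])
        elif ord(ch) <= 0xFFFFF:  # rule 4: 7-character escape
            out.append("XX" + alt_hex(ord(ch), 5))
        else:  # rule 5: 9-character escape
            out.append("XXY" + alt_hex(ord(ch), 6))
        i += 1
    return "".join(out)


print(encode("id with spaces"))  # idXX0withXX0spaces
print(encode("DOXXING"))         # DOXXXXXXING
```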

If the optional leading digit encoding is enabled, a leading digit is encoded as `XXZ[0-9]`.

If the optional double underscore encoding is enabled, double underscores are encoded as `XXRXXR`.
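
Every escape, including the two optional ones, starts with `XX`, and the character that follows identifies which rule produced it. That makes decoding a single left-to-right pass, sketched below in Python. As before, this is not the library's code: the names are hypothetical, the ASCII table order is inferred from the examples, and malformed or truncated input is not validated.

```python
# Assumption: same inferred rule 3 table as used for encoding.
ASCII_SPECIALS = " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
ESCAPE_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVW"
ALT_HEX = "abcdefghijklmnop"


def from_alt_hex(digits: str) -> str:
    """Turn a run of a-p digits back into the character it stands for."""
    return chr(int("".join(format(ALT_HEX.index(d), "x") for d in digits), 16))


def decode(encoded: str) -> str:
    out = []
    i = 0
    while i < len(encoded):
        if encoded.startswith("XXXXXX", i):  # escaped literal "XX"
            out.append("XX")
            i += 6
        elif (encoded.startswith("XX", i) and i + 2 < len(encoded)
              and encoded[i + 2] != "X"):
            kind = encoded[i + 2]
            if kind == "Y":  # Supplementary Private Use Area-B
                out.append(from_alt_hex(encoded[i + 3:i + 9]))
                i += 9
            elif kind == "Z":  # optional leading digit
                out.append(encoded[i + 3])
                i += 4
            elif kind in ALT_HEX:  # other Unicode code points
                out.append(from_alt_hex(encoded[i + 2:i + 7]))
                i += 7
            else:  # printable ASCII, including XXR for "_"
                out.append(ASCII_SPECIALS[ESCAPE_CHARS.index(kind)])
                i += 3
        else:  # plain character, or a lone "X" in front of an escape
            out.append(encoded[i])
            i += 1
    return "".join(out)


print(decode("XXRXXRindexXXRXXR"))  # __index__
print(decode("XXZ1FileFormat"))     # 1FileFormat
```

The first branch has to come before the generic `XX` check, because a plain `X` directly followed by an escape (for example `XXX0` for `X` plus a space) would otherwise be misread as the start of `XXXXXX`.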

Installation

  • Haskell: Via Hackage
  • Other languages:
    The code is not yet available via common package managers. Please copy the code into your project for the time being.
