scru128/specPublic

NotificationsYou must be signed in to change notification settings
Fork0
Star78

SCRU128 Specification

License

CC-BY-4.0 license

78 stars 0 forks Branches Tags Activity

Star

Notifications

You must be signed in to change notification settings

Branches Tags

Folders and files

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
base36_128.c		base36_128.c

Repository files navigation

SCRU128: Sortable, Clock and Random number-based Unique identifier

SCRU128 ID is yet another attempt to supersedeUUID for the users who needdecentralized, globally unique time-ordered identifiers. SCRU128 is inspired byULID andKSUID and has the following features:

128-bit unsigned integer type
Sortable by generation time (as integer and as text)
25-digit case-insensitive textual representation (Base36)
48-bit millisecond Unix timestamp that ensures useful life until year 10889
Up to 281 trillion time-ordered but unpredictable unique IDs per millisecond
80-bitthree-layer randomness forglobal uniqueness

Examples in the 25-digit canonical textual representation:

0372hg16csmsm50l8dikcvukc0372hg16csmsm50l8djl6xi250372hg16csmsm50l8dmgepzz10372hg16csmsm50l8doir38270372hg16cy3nowracls909wcd0372hg16cy3nowraclvp355ce0372hg16cy3nowraclxf2ctzh0372hg16cy3nowraclyunyjke

SCRU128's size (128 bits) might not fit in some use cases because of storageefficiency. If you need compact, time-ordered unique identifiers generated bydistributed nodes, considerSCRU64. See the following comparison table for theproperties of the two schemes.

	SCRU64	SCRU128
Long name	Sortable, Clock-based, Realm-specifically Unique identifier	Sortable, Clock and Random number-based Unique identifier
Binary size	63 bits	128 bits
Textual size	12 digits	25 digits
Textual encoding	Base36 with`0-9a-z` (case-insensitive)	Base36 with`0-9a-z` (case-insensitive)
Timestamp range	January 1, 1970 - February 27, 4261 (UTC)	January 1, 1970 - August 2, 10889 (UTC)
Timestamp resolution	256 milliseconds	1 millisecond
Number of IDs per timestamp	Up to 8.4 million per 256 milliseconds per node (configurable)	140.7 trillion per millisecond per node (on average)
Number of distributed nodes	Up to 8.4 million (configurable)	No specific limitation
Source of uniqueness	Centrally pre-assigned node ID	Independently generated random numbers
Choose it when you ...	Prefer 64-bit integer for storage, indexing, and other reasons	Want unique IDs without hassle to coordinate generators

Implementations

If you are interested in implementing SCRU128, see alsoSCRU128 GeneratorTester.

Specification v2.1.1

A SCRU128 ID is a 128-bit unsigned integer consisting of four terms:

timestamp*2^80+counter_hi*2^56+counter_lo*2^32+entropy

Where:

timestamp is a 48-bit Unix timestamp in milliseconds (i.e., millisecondselapsed since 1970-01-01 00:00:00+00:00, ignoring leap seconds).
counter_hi is a 24-bit randomly initialized counter that is incremented byone whencounter_lo reaches its maximum value.counter_hi is reset to arandom number whentimestamp has moved forward by one second or more sincethe last renewal ofcounter_hi.
- Note:counter_hi effectively works like an entropy component (rather thana counter) that is refreshed only once per second.
counter_lo is a 24-bit randomly initialized counter that is incremented byone for each new ID generated within the sametimestamp.counter_lo isreset to a random number whenevertimestamp moves forward. Whencounter_loreaches its maximum value,counter_hi is incremented andcounter_lo isreset to zero.
entropy is a 32-bit random number renewed for each new ID generated.

This definition is equivalent to allocating four unsigned integer fields to a128-bit space according to the following layout:

Bit numbers	Field name	Size	Data type	Byte order
Msb 0 - 47	timestamp	48 bits	Unsigned integer	Big-endian
Msb 48 - 71	counter_hi	24 bits	Unsigned integer	Big-endian
Msb 72 - 95	counter_lo	24 bits	Unsigned integer	Big-endian
Msb 96 - 127	entropy	32 bits	Unsigned integer	Big-endian

Note that this specification does not specify a canonical bit layout of SCRU128ID. An implementation may employ any binary form of a 128-bit unsigned integerto represent a SCRU128 ID.

Textual representation

A SCRU128 ID is encoded in a string using theBase36 encoding. The Base36denotes a SCRU128 ID as a 128-bit unsigned integer in the radix of 36 using thedigits of0-9a-z (0123456789abcdefghijklmnopqrstuvwxyz), with leading zerosadded to form a 25-digit canonical representation. The following pseudo equationillustrates the encoding algorithm:

1993501768880490086615869617690763354    =  0  * 36^24 +  3  * 36^23 +  7  * 36^22 + ... + 27  * 36^2 + 29  * 36^1 + 22    = '0' * 36^24 + '3' * 36^23 + '7' * 36^22 + ... + 'r' * 36^2 + 't' * 36^1 + 'm'    = "0372ijojuxuhjsfkeryi2mrtm"

Although a 25-digit Base36 numeral can encode more than 128-bit information, anynumeral greater thanf5lxx1zz5pnorynqglhzmsp33 (2^128 - 1, the largest128-bit unsigned integer) is not a valid SCRU128 ID.

For the sake of uniformity, an encoder should use lowercase letters in encodingIDs. A decoder, on the other hand, must always ignore cases when interpreting orlexicographically sorting encoded IDs.

The Base36 encoding shown above is available by default in several languages(e.g.,BigInteger#toString(int radix) andBigInteger(String val, int radix)constructor in Java). Another easy way to implement it is by using 128-bit orarbitrary-precision integer division and modulo operations. The following C codeillustrates a naive algorithm based on normal arrays and integers:

constuint8_tid[16]= {1,127,239,57,194,100,27,165,106,148,131,24,136,65,224,90};// convert byte array into digit value arrayuint8_tdigit_values[25]= {0};for (inti=0;i<16;i++) {unsignedintcarry=id[i];for (intj=24;j >=0;j--) {carry+=digit_values[j]*256;digit_values[j]=carry %36;carry=carry /36;  }}// convert digit value array into stringstaticconstchardigits[]="0123456789abcdefghijklmnopqrstuvwxyz";chartext[26];for (inti=0;i<25;i++) {text[i]=digits[digit_values[i]];}text[25]='\0';puts(text);// 0372ijojuxuhjsfkeryi2mrtm

Seethe attached reference code for a comprehensive example and test vectors.

Special-purpose IDs

The IDs withtimestamp set at zero or2^48 - 1 are reserved for specialpurposes (e.g., use as dummy, error, or example values) and must not be used orassigned as an identifier of anything.

Considerations

Quality of random numbers

A generator should employ a cryptographically strong random or pseudorandomnumber generator to generate unpredictable IDs.

Counter overflow handling

Counter overflow occurs at an extremely low probability when the randomlyinitializedcounter_hi andcounter_lo do not provide sufficient space forthe IDs generated within a millisecond. The recommended approach to handlecounter overflow is to incrementtimestamp and continue in the following way:

Incrementtimestamp by one.
Resetcounter_hi to zero.
Resetcounter_lo to a random number.

This approach is recommended over other options such as the "sleep till nexttick" approach because this technique allows the generation of monotonicallyordered IDs in a non-blocking manner. Raising an error on a counter overflow isgenerally not recommended because a counter overflow is not a fault of users ofSCRU128.

This approach results in a greatertimestamp value than the real-time clock.Such a gap betweentimestamp and the wall clock should be handled as a smallclock rollback discussed below.

Clock rollback handling

A SCRU128 generator relies on a real-time clock to ensure the monotonic order ofgenerated IDs; therefore, it cannot guarantee monotonicity when the clock movesback. When a generator detects a clock rollback by comparing the up-to-datetimestamp from the system clock and the one embedded in the last generated ID,the recommended treatment is:

If the rollback is small enough (e.g., a few seconds), treat thetimestampof the last generated ID as the up-to-date one, betting that the wall clockwill catch up soon.
Otherwise, resettimestamp to the wall clock andcounter_hi andcounter_lo to random numbers if the monotonic order of IDs is notcritically important, or raise an error if it is.

This approach keeps the monotonic order of IDs when a clock rollback is small,while it otherwise resets the generator and proceeds as if another new generatorwere created to minimize the chance of collision.

Stateless variant

A generator may fillcounter_hi andcounter_lo with random numbers if itgenerates IDs infrequently. Such a stateless implementation is acceptable,though not recommended, because the outcome is not distinguishable fromcompliant IDs.

Design notes: three-layer randomness

SCRU128 utilizes timestamps and counters to ensure the uniqueness of IDsgenerated by a single generator, whereas it relies on 80-bit entropy in the usecases with distributed generators. SCRU128 fills the 80-bit field with a randomnumber when a new ID is infrequently (less than one ID per second) generated.For the distributed high-load use cases, SCRU128 assigns different lifetimes tothe three entropy components to improve the collision resistance:

24-bitcounter_hi: reset to a random number every second
24-bitcounter_lo: reset to a random number every millisecond
32-bitentropy: reset to a random number for every new ID generated

The longer lifetimes ofcounter_hi andcounter_lo reduce the number ofrandom numbers consumed and accordingly reduce the probability of at least onecollision because, for a given length of random bits, the less the number ofdice throws, the lower the chance of collision.

In other words, generators are assigned to 24-bitcounter_hi buckets everysecond, and thus they will not collide with each other as long as their bucketsdiffer, even if each generates a bunch of IDs. 24-bit random numbers usuallycollide if millions of instances are generated, but the one-second interval ofcounter_hi renewals decreases the number of trials drastically. Nevertheless,counter_hi is refreshed every second to prevent potential attackers fromexploiting this field as a generator's fingerprint.

Even within the same bucket, the generators will not collide as long as initialcounter_lo values are sufficiently distant from each other. Such a near matchprobability, if tried only once a millisecond, is much lower thanthe simplebirthday collision probability calculated over all the IDs generated within amillisecond.entropy provides additional protection in the extremely rarecases where bothcounter_hi andcounter_lo collide, but it is primarilyintended to ensure a certain level of unguessability of consecutive IDsgenerated by a single generator.

License

This work is licensed under aCreative Commons Attribution 4.0 International(CC BY 4.0) License.

About

SCRU128 Specification

Releases

5tags

Languages

C100.0%

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

License

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

SCRU128: Sortable, Clock and Random number-based Unique identifier

Implementations

Specification v2.1.1

Textual representation

Special-purpose IDs

Considerations

Quality of random numbers

Counter overflow handling

Clock rollback handling

Stateless variant

Design notes: three-layer randomness

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Uh oh!

Languages

Movatterモバイル変換

License

scru128/spec

Folders and files

Latest commit

History

Repository files navigation

SCRU128: Sortable, Clock and Random number-based Unique identifier

Implementations

Specification v2.1.1

Textual representation

Special-purpose IDs

Considerations

Quality of random numbers

Counter overflow handling

Clock rollback handling

Stateless variant

Design notes: three-layer randomness

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Uh oh!

Languages