- Notifications
You must be signed in to change notification settings - Fork0
scru128/spec
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
SCRU128 ID is yet another attempt to supersedeUUID for the users who needdecentralized, globally unique time-ordered identifiers. SCRU128 is inspired byULID andKSUID and has the following features:
- 128-bit unsigned integer type
- Sortable by generation time (as integer and as text)
- 25-digit case-insensitive textual representation (Base36)
- 48-bit millisecond Unix timestamp that ensures useful life until year 10889
- Up to 281 trillion time-ordered but unpredictable unique IDs per millisecond
- 80-bitthree-layer randomness forglobal uniqueness
Examples in the 25-digit canonical textual representation:
0372hg16csmsm50l8dikcvukc0372hg16csmsm50l8djl6xi250372hg16csmsm50l8dmgepzz10372hg16csmsm50l8doir38270372hg16cy3nowracls909wcd0372hg16cy3nowraclvp355ce0372hg16cy3nowraclxf2ctzh0372hg16cy3nowraclyunyjke
SCRU128's size (128 bits) might not fit in some use cases because of storageefficiency. If you need compact, time-ordered unique identifiers generated bydistributed nodes, considerSCRU64. See the following comparison table for theproperties of the two schemes.
SCRU64 | SCRU128 | |
---|---|---|
Long name | Sortable, Clock-based, Realm-specifically Unique identifier | Sortable, Clock and Random number-based Unique identifier |
Binary size | 63 bits | 128 bits |
Textual size | 12 digits | 25 digits |
Textual encoding | Base36 with0-9a-z (case-insensitive) | Base36 with0-9a-z (case-insensitive) |
Timestamp range | January 1, 1970 - February 27, 4261 (UTC) | January 1, 1970 - August 2, 10889 (UTC) |
Timestamp resolution | 256 milliseconds | 1 millisecond |
Number of IDs per timestamp | Up to 8.4 million per 256 milliseconds per node (configurable) | 140.7 trillion per millisecond per node (on average) |
Number of distributed nodes | Up to 8.4 million (configurable) | No specific limitation |
Source of uniqueness | Centrally pre-assigned node ID | Independently generated random numbers |
Choose it when you ... | Prefer 64-bit integer for storage, indexing, and other reasons | Want unique IDs without hassle to coordinate generators |
- C/C++
- Go
- Java (with Kotlin and Android compatibility)
- JavaScript
- Python (and command-line tools)
- Rust
- Swift
If you are interested in implementing SCRU128, see alsoSCRU128 GeneratorTester.
A SCRU128 ID is a 128-bit unsigned integer consisting of four terms:
timestamp*2^80+counter_hi*2^56+counter_lo*2^32+entropy
Where:
timestamp
is a 48-bit Unix timestamp in milliseconds (i.e., millisecondselapsed since 1970-01-01 00:00:00+00:00, ignoring leap seconds).counter_hi
is a 24-bit randomly initialized counter that is incremented byone whencounter_lo
reaches its maximum value.counter_hi
is reset to arandom number whentimestamp
has moved forward by one second or more sincethe last renewal ofcounter_hi
.- Note:
counter_hi
effectively works like an entropy component (rather thana counter) that is refreshed only once per second.
- Note:
counter_lo
is a 24-bit randomly initialized counter that is incremented byone for each new ID generated within the sametimestamp
.counter_lo
isreset to a random number whenevertimestamp
moves forward. Whencounter_lo
reaches its maximum value,counter_hi
is incremented andcounter_lo
isreset to zero.entropy
is a 32-bit random number renewed for each new ID generated.
This definition is equivalent to allocating four unsigned integer fields to a128-bit space according to the following layout:
Bit numbers | Field name | Size | Data type | Byte order |
---|---|---|---|---|
Msb 0 - 47 | timestamp | 48 bits | Unsigned integer | Big-endian |
Msb 48 - 71 | counter_hi | 24 bits | Unsigned integer | Big-endian |
Msb 72 - 95 | counter_lo | 24 bits | Unsigned integer | Big-endian |
Msb 96 - 127 | entropy | 32 bits | Unsigned integer | Big-endian |
Note that this specification does not specify a canonical bit layout of SCRU128ID. An implementation may employ any binary form of a 128-bit unsigned integerto represent a SCRU128 ID.
A SCRU128 ID is encoded in a string using theBase36 encoding. The Base36denotes a SCRU128 ID as a 128-bit unsigned integer in the radix of 36 using thedigits of0-9a-z
(0123456789abcdefghijklmnopqrstuvwxyz
), with leading zerosadded to form a 25-digit canonical representation. The following pseudo equationillustrates the encoding algorithm:
1993501768880490086615869617690763354 = 0 * 36^24 + 3 * 36^23 + 7 * 36^22 + ... + 27 * 36^2 + 29 * 36^1 + 22 = '0' * 36^24 + '3' * 36^23 + '7' * 36^22 + ... + 'r' * 36^2 + 't' * 36^1 + 'm' = "0372ijojuxuhjsfkeryi2mrtm"
Although a 25-digit Base36 numeral can encode more than 128-bit information, anynumeral greater thanf5lxx1zz5pnorynqglhzmsp33
(2^128 - 1
, the largest128-bit unsigned integer) is not a valid SCRU128 ID.
For the sake of uniformity, an encoder should use lowercase letters in encodingIDs. A decoder, on the other hand, must always ignore cases when interpreting orlexicographically sorting encoded IDs.
The Base36 encoding shown above is available by default in several languages(e.g.,BigInteger#toString(int radix)
andBigInteger(String val, int radix)
constructor in Java). Another easy way to implement it is by using 128-bit orarbitrary-precision integer division and modulo operations. The following C codeillustrates a naive algorithm based on normal arrays and integers:
constuint8_tid[16]= {1,127,239,57,194,100,27,165,106,148,131,24,136,65,224,90};// convert byte array into digit value arrayuint8_tdigit_values[25]= {0};for (inti=0;i<16;i++) {unsignedintcarry=id[i];for (intj=24;j >=0;j--) {carry+=digit_values[j]*256;digit_values[j]=carry %36;carry=carry /36; }}// convert digit value array into stringstaticconstchardigits[]="0123456789abcdefghijklmnopqrstuvwxyz";chartext[26];for (inti=0;i<25;i++) {text[i]=digits[digit_values[i]];}text[25]='\0';puts(text);// 0372ijojuxuhjsfkeryi2mrtm
Seethe attached reference code for a comprehensive example and test vectors.
The IDs withtimestamp
set at zero or2^48 - 1
are reserved for specialpurposes (e.g., use as dummy, error, or example values) and must not be used orassigned as an identifier of anything.
A generator should employ a cryptographically strong random or pseudorandomnumber generator to generate unpredictable IDs.
Counter overflow occurs at an extremely low probability when the randomlyinitializedcounter_hi
andcounter_lo
do not provide sufficient space forthe IDs generated within a millisecond. The recommended approach to handlecounter overflow is to incrementtimestamp
and continue in the following way:
- Increment
timestamp
by one. - Reset
counter_hi
to zero. - Reset
counter_lo
to a random number.
This approach is recommended over other options such as the "sleep till nexttick" approach because this technique allows the generation of monotonicallyordered IDs in a non-blocking manner. Raising an error on a counter overflow isgenerally not recommended because a counter overflow is not a fault of users ofSCRU128.
This approach results in a greatertimestamp
value than the real-time clock.Such a gap betweentimestamp
and the wall clock should be handled as a smallclock rollback discussed below.
A SCRU128 generator relies on a real-time clock to ensure the monotonic order ofgenerated IDs; therefore, it cannot guarantee monotonicity when the clock movesback. When a generator detects a clock rollback by comparing the up-to-datetimestamp from the system clock and the one embedded in the last generated ID,the recommended treatment is:
- If the rollback is small enough (e.g., a few seconds), treat the
timestamp
of the last generated ID as the up-to-date one, betting that the wall clockwill catch up soon. - Otherwise, reset
timestamp
to the wall clock andcounter_hi
andcounter_lo
to random numbers if the monotonic order of IDs is notcritically important, or raise an error if it is.
This approach keeps the monotonic order of IDs when a clock rollback is small,while it otherwise resets the generator and proceeds as if another new generatorwere created to minimize the chance of collision.
A generator may fillcounter_hi
andcounter_lo
with random numbers if itgenerates IDs infrequently. Such a stateless implementation is acceptable,though not recommended, because the outcome is not distinguishable fromcompliant IDs.
SCRU128 utilizes timestamps and counters to ensure the uniqueness of IDsgenerated by a single generator, whereas it relies on 80-bit entropy in the usecases with distributed generators. SCRU128 fills the 80-bit field with a randomnumber when a new ID is infrequently (less than one ID per second) generated.For the distributed high-load use cases, SCRU128 assigns different lifetimes tothe three entropy components to improve the collision resistance:
- 24-bit
counter_hi
: reset to a random number every second - 24-bit
counter_lo
: reset to a random number every millisecond - 32-bit
entropy
: reset to a random number for every new ID generated
The longer lifetimes ofcounter_hi
andcounter_lo
reduce the number ofrandom numbers consumed and accordingly reduce the probability of at least onecollision because, for a given length of random bits, the less the number ofdice throws, the lower the chance of collision.
In other words, generators are assigned to 24-bitcounter_hi
buckets everysecond, and thus they will not collide with each other as long as their bucketsdiffer, even if each generates a bunch of IDs. 24-bit random numbers usuallycollide if millions of instances are generated, but the one-second interval ofcounter_hi
renewals decreases the number of trials drastically. Nevertheless,counter_hi
is refreshed every second to prevent potential attackers fromexploiting this field as a generator's fingerprint.
Even within the same bucket, the generators will not collide as long as initialcounter_lo
values are sufficiently distant from each other. Such a near matchprobability, if tried only once a millisecond, is much lower thanthe simplebirthday collision probability calculated over all the IDs generated within amillisecond.entropy
provides additional protection in the extremely rarecases where bothcounter_hi
andcounter_lo
collide, but it is primarilyintended to ensure a certain level of unguessability of consecutive IDsgenerated by a single generator.
This work is licensed under aCreative Commons Attribution 4.0 International(CC BY 4.0) License.
About
SCRU128 Specification
Topics
Resources
License
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Uh oh!
There was an error while loading.Please reload this page.