- Notifications
You must be signed in to change notification settings - Fork182
K-Sortable Globally Unique IDs
License
segmentio/ksuid
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
ksuid is an efficient, comprehensive, battle-tested Go library forgenerating and parsing a specific kind of globally unique identifiercalled aKSUID. This library serves as its reference implementation.
go get -u github.com/segmentio/ksuid
KSUID is for K-Sortable Unique IDentifier. It is a kind of globallyunique identifier similar to aRFC 4122 UUID, built from the ground-up to be "naturally"sorted by generation timestamp without any special type-aware logic.
In short, running a set of KSUIDs through the UNIXsort
command will resultin a list ordered by generation time.
There are numerous methods for generating unique identifiers, so why KSUID?
- Naturally ordered by generation time
- Collision-free, coordination-free, dependency-free
- Highly portable representations
Even if only one of these properties are important to you, KSUID is a greatchoice! :) Many projects chose to use KSUIDsjust because the textrepresentation is copy-and-paste friendly.
For a follow up read on the topic:A brief history of UUID
Unlike the more ubiquitous UUIDv4, a KSUID contains a timestamp componentthat allows them to be loosely sorted by generation time. This is not a strongguarantee (an invariant) as it depends on wall clocks, but is still incrediblyuseful in practice. Both the binary and text representations will sort bycreation time without any special sorting logic.
While RFC 4122 UUIDv1sdo include a time component, there aren't enoughbytes of randomness to provide strong protection against collisions(duplicates). With such a low amount of entropy, it is feasible for amalicious party to guess generated IDs, creating a problem for systems whosesecurity is, implicitly or explicitly, sensitive to an adversary guessingidentifiers.
To fit into a 64-bit number space,Snowflake IDsand its derivatives require coordination to avoid collisions, whichsignificantly increases the deployment complexity and operational burden.
A KSUID includes 128 bits of pseudorandom data ("entropy"). This number spaceis 64 times larger than the 122 bits used by the well-accepted RFC 4122 UUIDv4standard. The additional timestamp component can be considered "bonus entropy"which further decreases the probability of collisions, to the point of physicalinfeasibility in any practical implementation.
The textand binary representations are lexicographically sortable, whichallows them to be dropped into systems which do not natively support KSUIDsand retain their time-ordered property.
The text representation is an alphanumeric base62 encoding, so it "fits"anywhere alphanumeric strings are accepted. No delimiters are used, sostringified KSUIDs won't be inadvertently truncated or tokenized wheninterpreted by software that is designed for human-readable text, a commonproblem for the text representation of RFC 4122 UUIDs.
Binary KSUIDs are 20-bytes: a 32-bit unsigned integer UTC timestamp anda 128-bit randomly generated payload. The timestamp uses big-endianencoding, to support lexicographic sorting. The timestamp epoch is adjustedto May 13th, 2014, providing over 100 years of life. The payload isgenerated by a cryptographically-strong pseudorandom number generator.
The text representation is always 27 characters, encoded in alphanumericbase62 that will lexicographically sort by timestamp.
This library is designed to be used in code paths that are performancecritical. Its code has been tuned to eliminate all non-essentialoverhead. TheKSUID
type is derived from a fixed-size array, whicheliminates the additional reference chasing and allocation involved ina variable-width type.
The API provides an interface for use in code paths which are sensitiveto allocation. For example, theAppend
method can be used to parse thetext representation and replace the contents of aKSUID
valuewithout additional heap allocation.
All public package level "pure" functions are concurrency-safe, protectedby a global mutex. For hot loops that generate a large amount of KSUIDsfrom a single Goroutine, theSequence
type is provided to elide thepotential contention.
By default, out of an abundance of caution, the cryptographically-securePRNG is used to generate the random bits of a KSUID. This can be relaxedin extremely performance-critical code using the includedFastRander
type.FastRander
uses the standard PRNG with a seed generated by thecryptographically-secure PRNG.
NOTE: While there is no evidence thatFastRander
will increase theprobability of a collision, it shouldn't be used in scenarios whereuniqueness is important to security, as there is an increased chancethe generated IDs can be predicted by an adversary.
This code has been used in production at Segment for several years,across a diverse array of projects. Trillions upon trillions ofKSUIDs have been generated in some of Segment's mostperformance-critical, large-scale distributed systems.
Designed to be integrated with other libraries, theKSUID
typeimplements many standard library interfaces, including:
Stringer
database/sql.Scanner
anddatabase/sql/driver.Valuer
encoding.BinaryMarshal
andencoding.BinaryUnmarshal
encoding.TextMarshal
andencoding.TextUnmarshal
(encoding/json
friendly!)
This package comes with a command-line toolksuid
, useful forgenerating KSUIDs as well as inspecting the internal components ofexisting KSUIDs. Machine-friendly output is provided for scriptinguse cases.
Given a Go build environment, it can be installed with the command:
$ go install github.com/segmentio/ksuid/cmd/ksuid
$ ksuid0ujsswThIGTUYm2K8FjOOfXtY1K
$ ksuid -n 40ujsszwN8NRY24YaXiTIE2VWDTS0ujsswThIGTUYm2K8FjOOfXtY1K0ujssxh0cECutqzMgbtXSGnjorm0ujsszgFvbiEr7CDgE3z8MAUPFt
$ ksuid -f inspect 0ujtsYcgvSTl8PAuAdqWYSMnLOvREPRESENTATION: String: 0ujtsYcgvSTl8PAuAdqWYSMnLOv Raw: 0669F7EFB5A1CD34B5F99D1154FB6853345C9735COMPONENTS: Time: 2017-10-09 21:00:47 -0700 PDT Timestamp: 107608047 Payload: B5A1CD34B5F99D1154FB6853345C9735
$ ksuid -f inspectREPRESENTATION: String: 0ujzPyRiIAffKhBux4PvQdDqMHY Raw: 066A029C73FC1AA3B2446246D6E89FCD909E8FE8COMPONENTS: Time: 2017-10-09 21:46:20 -0700 PDT Timestamp: 107610780 Payload: 73FC1AA3B2446246D6E89FCD909E8FE8
$ ksuid -f template -t'{{ .Time }}: {{ .Payload }}' 0ujtsYcgvSTl8PAuAdqWYSMnLOv2017-10-09 21:00:47 -0700 PDT: B5A1CD34B5F99D1154FB6853345C9735
$ ksuid -f template -t'{{ .Time }}: {{ .Payload }}'$(ksuid -n 4)2017-10-09 21:05:37 -0700 PDT: 304102BC687E087CC3A811F21D113CCF2017-10-09 21:05:37 -0700 PDT: EAF0B240A9BFA55E079D887120D962F02017-10-09 21:05:37 -0700 PDT: DF0761769909ABB0C7BB9D66F79FC0412017-10-09 21:05:37 -0700 PDT: 1A8F0E3D0BDEB84A5FAD702876F46543
$ ksuid -f template -t'{ "timestamp": "{{ .Timestamp }}", "payload": "{{ .Payload }}", "ksuid": "{{.String}}"}' -n 4{"timestamp":"107611700","payload":"9850EEEC191BF4FF26F99315CE43B0C8","ksuid":"0uk1Hbc9dQ9pxyTqJ93IUrfhdGq"}{"timestamp":"107611700","payload":"CC55072555316F45B8CA2D2979D3ED0A","ksuid":"0uk1HdCJ6hUZKDgcxhpJwUl5ZEI"}{"timestamp":"107611700","payload":"BA1C205D6177F0992D15EE606AE32238","ksuid":"0uk1HcdvF0p8C20KtTfdRSB9XIm"}{"timestamp":"107611700","payload":"67517BA309EA62AE7991B27BB6F2FCAC","ksuid":"0uk1Ha7hGJ1Q9Xbnkt0yZgNwg3g"}
There are times when you are sure your ksuid is correct. But you need to get it from bytes or string and pass itit's to the structure. For this, there are OrNil functions that return ksuid.Nil on error and can be calleddirectly in the structure.
Functions:
ParseOrNil()
FromPartsOrNil()
FromBytesOrNil()
An example of using the function without OrNil:
funcgetPosts(before,after []byte) {b,err:=ksuid.FromBytes(before)iferr!=nil {// handle error}a,err:=ksuid.FromBytes(after)iferr!=nil {// handle error}sortOptions:=SortOptions{Before:b,After:a}}
It is much more convenient to do it like this:
funcgetPosts(before,after []byte) {sortOptions:=SortOptions{Before:ksuid.FromBytesOrNil(before),After:ksuid.FromBytesOrNil(after),}}
OrNil functions are also used in many other libraries:
- satori/go.uuid
- oklog/ulid (panic)
- Python:svix-ksuid
- Python:cyksuid
- Ruby:ksuid-ruby
- Java:ksuid
- Java:ksuid-creator
- Rust:svix-ksuid
- dotNet:Ksuid.Net
- dotnet:KsuidDotNet
- Erlang:erl-ksuid
- Zig:zig-ksuid
ksuid source code is available under an MITLicense.
About
K-Sortable Globally Unique IDs