Server

We've been working on Keybase.io for a little over half a year now, and we would like it to succeed, but we're a little bit nervous. The more successful we are, the more valuable a target we become.

Here are the attacks we are most concerned about:

1. Server DDoS'ed
2. Server compromised; attacker corrupts server-side code and keys to send bad data to clients
3. Server compromised; attacker distributes corrupted client-side code

We've taken some steps to protect the service from these attacks, and we wanted to describe them so you know what to look for.

What Keybase is Really Doing

Before we can describe how we protect Keybase, we have to describe what it's actually doing, and what warrants protection. The central function of Keybase is to store, in a standardized format, public signatures for our users. The important signatures are of the form:

Identity proofs: "I am Joe on Keybase and MrJoe on Twitter"
Follower statements: "I am Joe on Keybase and I just looked at Chris's identity"
Key ownership: "I am Joe on Keybase and here's my public key"
Revocations: "I take back what I said earlier"

For instance, when Joe wants to establish a connection to an identity on Twitter, he would sign a statement of the first form, and then post that statement both on Twitter and Keybase. Outside observers can then reassure themselves that the accounts Joe on Keybase and MrJoe on Twitter are controlled by the same person. This person is usually the intended keyholder, but of course could be an attacker who broke into both accounts.

When an honest Joe signs such a proof, he also signs the hash of his previous signature. Thus, outside observers who want to verify all of Joe's signatures need only verify the last in the chain; the others follow. For example, I last signed a statement that I follow Keybase user al3x. I signed a JSON blob that contains relevant information about me and Alex, and also the key-value pair "prev":"d0bd03...", where d0bd03... is the SHA-256 hash of the previous JSON blob I signed.
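The chaining described above can be sketched in a few lines of Python. This is only an illustration: the helper names (`link_hash`, `append_link`, `verify_chain`) and the serialization (`json.dumps` with sorted keys) are assumptions made for the example, not Keybase's actual wire format, and signature checking is left out entirely.

```python
import hashlib
import json


def link_hash(link):
    """SHA-256 hex digest of a link under a deterministic JSON encoding (an assumption)."""
    encoded = json.dumps(link, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_link(chain, body):
    """Append a statement that commits to the hash of the previous link."""
    prev = link_hash(chain[-1]) if chain else None
    chain.append({"body": body, "prev": prev, "seqno": len(chain) + 1})
    return chain


def verify_chain(chain):
    """Walk the chain and check that every `prev` matches the preceding link."""
    expected_prev = None
    for link in chain:
        if link["prev"] != expected_prev:
            return False
        expected_prev = link_hash(link)
    return True


# Hypothetical two-link chain: an identity proof followed by a follower statement.
chain = []
append_link(chain, {"type": "web_service_binding", "service": "twitter"})
append_link(chain, {"type": "track", "username": "al3x"})
```

Because each link commits to the one before it, changing an early link breaks every later `prev` value. That is why a valid signature over the last link is enough to pin down the whole history.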

For a given user, the sum total of their signatures captures the state they wish to remember and to advertise to the world. For instance, my current profile shows that I am maxtaco on Twitter, that I was TacoPlusPlus on GitHub, but now I'm maxtaco there, too, and that I believe the Chris who is malgorithms on Twitter and malgorithms on GitHub is the "correct" Chris Coyne. Five signatures (one of which is a revocation) comprise this state, and an honest Keybase server should always show everyone these five signatures, so we can faithfully reconstruct my state in our clients.

Attacks 1 and 2: DDOS and Corrupted Data

We mentioned three attacks on this system. Consider the first two, which aim to prevent honest clients from retrieving signature data for honest users. A blunt attacker might DDoS Keybase's servers, preventing anyone from accessing Keybase's data. A more sophisticated attacker might root Keybase's server, compromise its signing keys, and start sending back corrupted data to honest clients.

Two mechanisms, enforced by clients and third-party observers, defend against both attacks:

1. All user signature chains must grow monotonically, and can never be "rolled back"
2. Whenever a user posts an addition to a signature chain, the site must sign and advertise a change in global site state, and these updates are totally ordered.

Untrusted Mirrors

The first implication of these requirements is that untrusted third parties can mirror the site state, and clients can access data from either the Keybase server or the mirrors. By requirement (2), the server must publish and sign all site updates. A client doesn't care where these updates come from, as long as the signature verifies, and the site state jibes with the signature.

(We're not aware of third-party mirrors yet, and our reference client would need some modifications to handle a read-only server. However, we encourage all to scrape our APIs in preparation.)

Be Honest or Get Caught

The second implication of these requirements is that a compromised server has a choice of acting like an honest server, or making "mistakes" that honest users can detect. An attacker who gains control of the server can:

Selectively roll back a user's signature chain and/or suppress updates
Fake a "key update", and append signatures at the end of a user's chain
Show different versions of the site state to different users

Since version v0.3.0, the Keybase command-line client defends clients from these server attacks. Take the example of what happens when I "follow" Alex. My client downloads both of our signature chains from the server, and runs them through cryptographic verification, checking that our hash chains are well-formed and signed. It furthermore checks new data against cached data and complains if the server has "rolled back" either chain. My client prevents a compromised server from changing Alex's key the same way it prevents Eve from impersonating Alex: it checks for corroboration of Alex's identity and key proofs on other services (like Twitter, GitHub and DNS).
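The rollback check can be illustrated with a small Python sketch. It assumes the client keeps a cached list of link hashes per user, ordered by sequence number; the function name and the data layout are hypothetical, and the real client performs more checks than this.

```python
def check_no_rollback(cached_hashes, fetched_hashes):
    """Return True if the fetched chain extends (or equals) the cached chain.

    Both arguments are lists of link hashes ordered by sequence number.
    """
    if len(fetched_hashes) < len(cached_hashes):
        return False  # the server is showing a shorter chain than before
    # Every link seen earlier must still be present, in the same position.
    return fetched_hashes[: len(cached_hashes)] == cached_hashes
```

A chain that only grows passes the check; a chain that got shorter, or whose earlier links changed, does not.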

To prevent the server from "forking" my view of the site data from Alex's, my client checks that all signature chains are accurately captured in the site's global Merkle Tree data structure. It downloads the root of this tree from the server, and verifies it against the site's public key. If the check passes, it fetches the signed root block. My UID is dbb165..., so my client follows the db... path down the tree, which is block 68b5d3.... Now, my leaf is visible, showing my signature chain finishing off at link 42, with hash d0bd03..., which matches the data it fetched earlier. My client does the same for Alex's chain. After all checks succeed, my client signs my chain, Alex's chain and also the Merkle root at the time of the signature; it posts this signature as a follower statement.
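Here is a rough Python sketch of walking such a tree from a trusted root hash down to a user's leaf. The block layout, the one-byte-per-level prefix rule, the toy UID and the function names are all assumptions made for illustration; the real tree format differs, and verifying the site's signature over the root is omitted.

```python
import hashlib
import json


def block_hash(block):
    """Hash a tree block under a deterministic JSON encoding (an assumption)."""
    encoded = json.dumps(block, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def lookup_leaf(blocks, root_hash, uid):
    """Follow `uid` down the tree, checking each block against the hash that named it.

    `blocks` maps block hash -> block and stands in for data fetched from an
    untrusted server. Interior blocks look like
    {"type": "node", "children": {prefix: child_hash}}; leaves look like
    {"type": "leaf", "entries": {uid: [seqno, link_hash]}}.
    """
    current_hash = root_hash
    depth = 0
    while True:
        block = blocks[current_hash]
        if block_hash(block) != current_hash:
            raise ValueError("block does not match the hash that points to it")
        if block["type"] == "leaf":
            return block["entries"].get(uid)
        depth += 1
        prefix = uid[: 2 * depth]  # hypothetical: two hex characters per level
        if prefix not in block["children"]:
            return None
        current_hash = block["children"][prefix]


# Build a two-level toy tree holding one made-up user.
leaf = {"type": "leaf", "entries": {"ab12cd": [3, "feedbeef"]}}
root = {"type": "node", "children": {"ab": block_hash(leaf)}}
blocks = {block_hash(leaf): leaf, block_hash(root): root}
root_hash = block_hash(root)
```

Since every block is checked against the hash stored in its parent, a server that hands back a different leaf for the same root is caught at the first mismatching block.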

A very sophisticated attacker could show my client and Alex's client different signed Merkle roots, but must maintain these forks permanently and can never merge. Users "comparing notes" out-of-band immediately expose server duplicity.
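"Comparing notes" could be as simple as the following Python sketch. It assumes each observer records the signed roots it has seen as a mapping from root sequence number to root hash; that record format and the function name are hypothetical.

```python
def find_fork(my_roots, their_roots):
    """Return the lowest root seqno at which two observers saw different hashes.

    Each argument maps a Merkle root sequence number to the root hash that
    observer was shown. Returns None if the overlapping history agrees.
    """
    for seqno in sorted(set(my_roots) & set(their_roots)):
        if my_roots[seqno] != their_roots[seqno]:
            return seqno
    return None
```

Because site updates are totally ordered, two honest views must agree on every sequence number they share, so any disagreement is evidence of a fork.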

Keybase Client Integrity

Thus, the Keybase clients in the wild play a crucial role in keeping the Keybase server honest. They check the integrity of user signature chains, and can find evidence of malicious rollback. They alert Alice when her following of Bob breaks, if either Bob or the server was compromised. They check the site's published Merkle tree root for consistency against known signature chains. And they sign proofs when all these checks complete, setting up known safe checkpoints to hold the server accountable to in the future.

So everything depends on the integrity of the Keybase clients: that they are functioning properly and aren't compromised. We offer several safeguards to protect client integrity. First, we keep an open API and state that our open-source client is simply a reference client, and that developers are free to make new clients in different languages if they think we've done a bad job. Second, we sign all updates to the Keybase reference client, and provide an update mechanism to download new clients without trusting HTTPS, only the integrity of our key. We keep that private key offline, so that it wouldn't be compromised in the case of a server compromise.

We fully understand that users of the Keybase Web client don't get these guarantees. But our hope is that enough users will use the Keybase command-line client to keep the Web users safe, by catching server misbehavior in the case of a compromise.

Next Steps

The purpose of this article was to explain the security mechanisms the Keybase system currently has in place. Going forward, it would be great if third parties were interested in hosting untrusted mirrors. These mirrors could eventually become auditors, too, allowing Alice and Bob to compare notes and convince themselves they're seeing a consistent view of the site's state.

And...an update! We're now publishing the Merkle root into the bitcoin block chain.

Thanks for reading, and happy keybasing!

Meet your sigchain (and everyone else’s)

Every Keybase account has a public signature chain (called a sigchain), which is an ordered list of statements about how the account has changed over time. When you follow someone, add a key, or connect a website, your client signs a new statement (called a link) and publishes it to your sigchain.

As JSON (some fields removed), a sigchain looks like this:

    [
      {
        "body": {
          "device": { "name": "squares" },
          "key": { "kid": "01208…" },
          "type": "eldest"
        },
        "prev": null,
        "seqno": 1
      },
      {
        "body": {
          "device": { "name": "squares" },
          "key": { "kid": "01208…" },
          "type": "web_service_binding",
          "service": { "name": "github", "username": "keybase" }
        },
        "prev": "038cd…",
        "seqno": 2
      },
      {
        "body": {
          "device": { "name": "rectangles" },
          "key": { "kid": "01208…" },
          "type": "sibkey",
          "sibkey": { "kid": "01204…", "reverse_sig": "g6Rib…" }
        },
        "prev": "192fe…",
        "seqno": 3
      },
      {
        "body": {
          "device": { "name": "squares" },
          "key": { "kid": "01208…" },
          "type": "track",
          "track": {
            "basics": { "username": "cecileb" },
            "key": { "kid": "01014…" },
            "remote_proofs": [
              {
                "ctime": 1437414090,
                "remote_key_proof": {
                  "check_data_json": {
                    "name": "twitter",
                    "username": "cecileboucheron"
                  }
                }
              }
            ]
          }
        },
        "prev": "9fcc8…",
        "seqno": 4
      }
    ]

This sigchain is from a user who…

Signed up for Keybase from a device called “squares” which generated a NaCl device key
Proved their GitHub account
Used squares to add another device called “rectangles” with its own key
Used rectangles to follow cecileb

You can try browsing a real sigchain online or through the API. Since sigchains are public, you can do this for any user on Keybase!

Every sigchain link is signed by one of the user’s keys and includes a sequence number and the hash of the previous link. Because of this, the server can’t create links on its own or omit links without invalidating the whole sigchain. We use a public Merkle tree to make it difficult for us to roll back a sigchain to an earlier state without being noticed.

Sibkeys

A Keybase account can have any number of sibling keys (called sibkeys) which can all sign links. This is different from PGP, which has a “master key” that you’re expected to keep tucked away in a fireproof safe, because if you misplace a device that has a copy of it, your only option is to revoke the whole key and start from scratch. We discuss this problem in a blog post.

You add and remove sibkeys by adding links to your sigchain. Since every link is checked against the state of the account at that point in the sigchain, old links remain valid even if their signing keys are revoked later. Revoking a key doesn’t affect your identity proofs, other keys, or followers.

Playback

To find the current state of an account (e.g. when you run keybase id max), the client starts out assuming that the key specified for the account in the Merkle tree is a sibkey, then plays back the sigchain link by link, keeping track of valid sibkeys and the effects of other links.

An implementation detail: since accounts can be reset, it actually starts playback at the most recent link whose eldest_kid matches the one in the Merkle tree.
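To make playback concrete, here is a simplified Python sketch that tracks only the set of valid sibkeys. It assumes links shaped like the abridged JSON earlier on this page, starts from the first link rather than handling account resets, and skips signature and hash verification; `play_back` and the toy KIDs are hypothetical.

```python
def play_back(sigchain, eldest_kid):
    """Replay links in order and return the set of sibkey KIDs valid at the end.

    Signature and hash checks are omitted; this only tracks key state.
    """
    valid = {eldest_kid}  # start by trusting the key named in the Merkle tree
    for expected_seqno, link in enumerate(sigchain, start=1):
        if link["seqno"] != expected_seqno:
            raise ValueError("sequence numbers must increase by one")
        body = link["body"]
        signer = body["key"]["kid"]
        if signer not in valid:
            raise ValueError("link %d was signed by a key that is not a valid sibkey" % expected_seqno)
        if body["type"] == "sibkey":
            valid.add(body["sibkey"]["kid"])
        elif body["type"] == "revoke":
            valid -= set(body["revoke"].get("kids", []))
    return valid


# Toy chain: key A starts the account, adds key B, then B revokes A.
chain = [
    {"seqno": 1, "body": {"type": "eldest", "key": {"kid": "kid-A"}}},
    {"seqno": 2, "body": {"type": "sibkey", "key": {"kid": "kid-A"}, "sibkey": {"kid": "kid-B"}}},
    {"seqno": 3, "body": {"type": "revoke", "key": {"kid": "kid-B"}, "revoke": {"kids": ["kid-A"]}}},
]
```

Each link is judged against the key set as it stood when the link was made, so links 1 and 2 stay valid even though key A is revoked by link 3. A fourth link signed by key A would be rejected.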

Link structure

A complete version of the first link from the sigchain above looks like this:

    {
      "body": {
        "device": {
          "id": "ff07c…",
          "kid": "01208…",
          "name": "squares",
          "status": 1,
          "type": "desktop"
        },
        "key": {
          "host": "keybase.io",
          "kid": "01208…",
          "uid": "e560f…",
          "username": "sidney"
        },
        "type": "eldest",
        "version": 1
      },
      "client": {
        "name": "keybase.io go client",
        "version": "1.0.0"
      },
      "ctime": 1443241228,
      "expire_in": 504576000,
      "merkle_root": {
        "ctime": 1443217312,
        "hash": "06de9…",
        "seqno": 292102
      },
      "prev": null,
      "seqno": 1,
      "tag": "signature"
    }

Some properties are common to every type of link. Here’s an overview:

body – Information specific to the type of link, plus some common properties:
- type – The type of the link
- device – Optional details about the device that made the link
- key – Information about the key that will sign the link. Contains these properties:
  - host – Currently always “keybase.io”
  - eldest_kid – The KID of the eldest key in this subchain. If missing, then the eldest key is assumed to be the signing key (helps to identify account resets)
  - kid – The key’s KID
  - key_id – For a PGP key, the last eight bytes of its fingerprint (legacy)
  - fingerprint – For a PGP key, its full fingerprint
  - uid – The user ID of the sigchain’s owner
  - username – The username of the sigchain’s owner
  When a PGP key is being introduced or updated, there can also be a full_hash property which is a SHA-256 hash of an armored copy of the public key. This pins the key to a specific version.
client – Optional version information about the client that made the link
ctime – When the link was created, as a Unix timestamp
expire_in – How long the statement made by the link should be considered valid, in seconds, or 0 if it doesn’t expire
merkle_root – The creation time, hash, and sequence number of the Merkle tree root at the time the link was created
prev – The hash of the previous sigchain link when packed as canonical JSON, or null if this is the first link
seqno – Specifies that this is the nth link in the user’s sigchain
tag – Currently always “signature”. There may be other tag types in the future.
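As a small worked example of how ctime and expire_in combine, the Python sketch below computes whether a link's statement is still inside its validity window. The function name is hypothetical, and treating the exact expiry second as already expired is an assumption rather than documented behavior.

```python
def link_expired(link, now):
    """True if the link's statement is past its validity window at Unix time `now`."""
    if link["expire_in"] == 0:
        return False  # 0 means the statement does not expire
    return now >= link["ctime"] + link["expire_in"]


# Values taken from the complete example link above.
example = {"ctime": 1443241228, "expire_in": 504576000}
expires_at = example["ctime"] + example["expire_in"]
```

For the example link, `expires_at` works out to 1443241228 + 504576000 = 1947817228.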

Properties have been added and deprecated over time, so there’s some duplication and not all links in the wild have them all.

Link types

Each section below starts with an example body (and leaves out key, device, and version, which were described above).

`eldest`

{"type": "eldest"}

Appears at the beginning of a sigchain or after an account reset (may not have been inserted by legacy clients). The link’s signing key becomes the account’s first sibkey.

`sibkey`

{"type": "sibkey","sibkey": { "kid": "01204…", "reverse_sig": "g6Rib…" }}

Add a new sibkey to the account. reverse_sig is a signature of the link by the new sibkey itself, made with the reverse_sig field set to null, and makes sure that a user can’t claim another user’s key as their own.

`subkey`

{"type": "subkey","subkey": { "kid": "01216…", "parent_kid": "01204…" }}

Add a new encryption-only subkey to the account. We plan to use these in the future.

`pgp_update`

    {
      "type": "pgp_update",
      "pgp_update": {
        "kid": "01012ba0d60aa99320643f47eb787dc637821bc77cc89ccffbdbfd62124c1c22c1460a",
        "key_id": "0DAA1A4AB1D88291",
        "fingerprint": "5e685e60eb8733654dcb00570daa1a4ab1d88291",
        "full_hash": "e02a1871c01285608c5bac3fb00be419b982c3c312c2c517ec6a1d9f7323be4f"
      }
    }

Update a PGP key to a new version (which may have new subkeys, revoked subkeys, new user IDs…). The pgp_update section contains the same properties a key section would. full_hash is expected to have changed; the other properties should be unchanged.

`revoke`

    {
      "type": "revoke",
      "revoke": {
        "kids": [ "01201…", "01215…" ],
        "sig_ids": [ "038cd…", "f927c…" ]
      }
    }

Remove the keys in kids from your account. Any previous links they’ve signed are still valid, but they can no longer sign new links and other users should no longer encrypt for them after seeing the revoke link. Also reverse the effects of the links in sig_ids; this can be used to remove, for instance, a web_service_binding.

`web_service_binding`

{"type": "web_service_binding","service": { "name": "github", "username": "keybase" }}

Claim, “I am username on the website name”. The client will look for a copy of the link and signature on the website, in an appropriate place. The server searches for the proof when the link is first posted, and caches its permalink (e.g. the tweet, on Twitter, the Gist, on GitHub) so that the client doesn’t have to rediscover it each time.

The service section can also look like this, which claims that you control the given domain name (the client looks for the proof in DNS):

{ "domain": "keybase.io", "protocol": "dns" }

…or like this, which claims that you control the given website (the client looks for the proof at a known path):

{ "hostname": "keybase.io", "protocol": "http:" }

`track` (to 'follow' someone)

We call this "follow" around the interface now, but our old word is "track", so that's what you'll see in your sigchain:

    {
      "type": "track",
      "track": {
        "id": "673a7…",
        "basics": {
          "id_version": 30,
          "last_id_change": 1440211236,
          "username": "cecileb"
        },
        "key": {
          "kid": "01018…",
          "key_fingerprint": "6f989…"
        },
        "pgp_keys": [
          {
            "kid": "01018…",
            "key_fingerprint": "6f989…"
          }
        ],
        "remote_proofs": [
          {
            "ctime": 1437414090,
            "curr": "ee483…",
            "etime": 1595094090,
            "remote_key_proof": {
              "check_data_json": {
                "name": "twitter",
                "username": "cecileboucheron"
              },
              "proof_type": 2,
              "state": 1
            },
            "seqno": 1,
            "sig_id": "02ad8…",
            "sig_type": 2
          }
        ]
      }
    }

Make a snapshot of another user’s identity that your other devices trust. The track section contains these properties:

id – Their user ID
basics – Contains their username and identity generation. The server bumps the identity generation whenever the state of any of their proofs changes, as a hint to the client that it should recheck them all and possibly alert the user to the change.
key – Contains the KID and fingerprint (if applicable) of their eldest key.
pgp_keys – An array of the KID and fingerprint of every one of their PGP keys
remote_proofs – An array of objects which represent their proofs. Many properties are copied directly from the relevant links in the followee’s sigchain, but there are some non-obvious ones:
- curr – The hash of the link which contains the proof
- sig_type – An integer representation of the proof’s link type, currently always 2 (web_service_binding)
- remote_key_proof
  - check_data_json – The service section of the identity proof link
  - proof_type – An integer representation of the account being proven (Twitter, GitHub, etc.)
  - state – An integer representation of whether the client could successfully verify the proof when making the tracking statement.

A repeat follow link for the same user replaces the previous one (the user may have re-followed due to a change in proofs).

`untrack` (to "unfollow" someone)

{"type": "untrack","untrack": {"basics": { "username": "maria" },"id": "47968…"}}

Stop following a user. Your other devices will resume checking their identity proofs and presenting them to you whenever you interact with them. id is the user’s UID.
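The interaction between track and untrack links can be sketched in Python as a fold over the sigchain. The sketch assumes links that carry only the fields shown in the examples above and keys the result by the followee's UID; `current_follows` and the toy UIDs are hypothetical.

```python
def current_follows(sigchain):
    """Map followee UID -> latest `track` section, honoring later `untrack` links."""
    follows = {}
    for link in sigchain:
        body = link["body"]
        if body["type"] == "track":
            # A repeat follow of the same user replaces the old snapshot.
            follows[body["track"]["id"]] = body["track"]
        elif body["type"] == "untrack":
            follows.pop(body["untrack"]["id"], None)
    return follows


# Toy chain: follow two users, re-follow the first, then unfollow the second.
chain = [
    {"body": {"type": "track", "track": {"id": "uid-1", "basics": {"username": "cecileb", "id_version": 1}}}},
    {"body": {"type": "track", "track": {"id": "uid-2", "basics": {"username": "maria", "id_version": 1}}}},
    {"body": {"type": "track", "track": {"id": "uid-1", "basics": {"username": "cecileb", "id_version": 2}}}},
    {"body": {"type": "untrack", "untrack": {"id": "uid-2", "basics": {"username": "maria"}}}},
]
follows = current_follows(chain)
```

After playback only the first user remains followed, with the snapshot from the later of the two follow links.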

`cryptocurrency`

{"type": "cryptocurrency","cryptocurrency": { "address": "1BYzr…", "type": "bitcoin" }}

Advertise a cryptocurrency address. Currently Bitcoin, Zcash and Zcash Sapling addresses are supported.

`per_user_key`

    {
      "per_user_key": {
        "encryption_kid": "0121ef031c4b97e9e7febbfcce64952acba528a0f1b3f67b9c4264fa0a4ebefd401b0a",
        "generation": 15,
        "reverse_sig": "hKRib2R5hqhkZXRhY2hlZ...",
        "signing_kid": "0120eb42e0f5db28909adae170de9f5fc24016dc716b4fcc5f6b3956ee1e4937e9880a"
      },
      "type": "per_user_key"
    }

Add or rotate a per-user signing and encryption key. reverse_sig is the signature over the sigchain link with the new per-user signing key itself. The generation number starts at one and increments whenever the per-user keys are rotated, typically after a device revocation.

footnote 1: PGP key servers and lying by omission

When someone changes a PGP key (to update its expiration date or add a signature, for example), they’re expected to broadcast the change to a key server. That key server is responsible for forwarding the change to other key servers, and so on. Eventually, someone else can ask any other key server if there have been updates to the key, and receive them.

Notably, nothing stops someone from making a change to their PGP key on one computer, a different change on another computer, and sending each change to a different key server. The key servers are expected to share updates and offer their own combined versions of the key for download.

The design of PGP keys stops an attacker from creating fake updates, but a dishonest key server can still choose to ignore updates that revoke keys, revoke signatures, and add expiration dates, but publish updates that add new keys, add new signatures, and take away expiration dates.

Keybase sigchains aim to avoid this.

Understanding following (previously called "tracking")

We get some big questions about Keybase following:

When should I follow?
What does it get me?
Is it a "web of trust?"

Hopefully this page can clarify and answer your q's.

But first, the goal of Keybase

Keybase aims to provide public keys that can be trusted without any backchannel communication. If you need someone's public key, you should be able to get it, and know it's the right one, without talking to them in person.

This is a daunting proposition: servers can be hacked or coerced into lying about a key. So when you run a Keybase client - whether it's our reference client or someone else's - that client needs to be highly skeptical about what the server says.

When the Keybase server replies "this is twitter user @maria2929's public key", there has to be a protocol for verification.

Therefore, any cryptographic action on Maria follows 3 basic steps:

The server provides maria's info
Your client verifies her identity proofs on its own
You perform a human review of her usernames

Let's go over these three steps.

Step 1: the request

When you wish to encrypt a message for your friend Maria, you might execute a command like this:

    keybase encrypt maria -m "grab a beer tonight?"

So, first, your client asks the Keybase server who this mysterious maria is.

Keybase, the server, provides a response that explains its view of "maria". Technically speaking, it's a JSON object and there's a little more data in there, but the meat is something like this:

    {
      "keybase_username": "maria",
      "public_key": "---- BEGIN PGP PUBLIC KEY...",
      "twitter_username": "@maria2929",
      "twitter_proof": "https://twitter.com/maria2929/2423423423"
    }

Keybase has done its own server-side verification of maria, and it won't pass back identities that it hasn't checked.

Step 2: the computer review

The Keybase client does not trust the Keybase server. The server has just claimed that keybase:maria and @maria2929 are the same person. But are they? In step 2, the client checks on its own.

Fortunately, the server included a link to maria's tweet. The Keybase client scrapes it.

To satisfy the client, the tweet must be special. It must link to a signed statement which claims to be from maria on Keybase.

In simplest terms, the Keybase client guarantees that "maria" has access to three things: (1) the Keybase account, (2) the twitter account, and (3) the private key referenced back in step 1.

All this happens really fast in the client with no inconvenience to you. And it happens for all of maria's identities: her twitter account, her personal website, her github account, etc.

Step 3: the human review

Recall, in Step 2 your client proved "maria" has a number of identities, and it cryptographically verified all of them. Now you can review the usernames it verified, to determine if it's the maria you wanted.

    ✔ maria2929 on twitter: https://twitter.com/2131231232133333...
    ✔ pasc4l_programmer on github: https://gist.github.com/pasc4...
    ✔ admin of mariah20.com via HTTPS: https://mariah20/keybase.tx...

Is this the maria you wanted? [y/N]

If it is, the Keybase client encrypts and you're done.

Finally: following

Steps 2 and 3 were easy enough, but it would stink to keep repeating them, every time you switched computers. Especially the human review. Ideally, once you're satisfied with maria, you can just do this from any computer:

    # this should work with no interactivity
    keybase encrypt maria -m "another beer?"

But we have a problem: recall, you don't trust the Keybase server. So how can you get maria's info when you switch machines, without doing that username review thing again? The answer is following.

"Following" (which we used to call "tracking") is taking a signed snapshot.

Using your own private key, you can sign a snapshot of her identity. Specifically, you're signing the data from step 1, with some extra info about your own review.

When you switch computers, the Keybase server can provide you with your own definition of maria, which is signed by you, so it can't be tampered with.

Your client can continue to perform the computer review as often as it wants. If the tweet disappears, your client will want to know.
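One way to picture that recurring computer review is as a diff between the proofs recorded in your signed snapshot and the proofs your client can verify right now. The Python sketch below assumes each proof is reduced to a (service, username) pair; that reduction and the function name are illustrative assumptions, not the client's real data model.

```python
def proof_changes(snapshot_proofs, current_proofs):
    """Compare (service, username) pairs in a signed snapshot with what verifies now."""
    before = set(snapshot_proofs)
    after = set(current_proofs)
    return {"missing": sorted(before - after), "new": sorted(after - before)}


# Hypothetical data: the tweet from the snapshot has since disappeared.
snapshot = [("twitter", "maria2929"), ("github", "pasc4l_programmer")]
current = [("github", "pasc4l_programmer")]
changes = proof_changes(snapshot, current)
```

Anything reported as missing or new is a reason for the client to stop and ask you to review maria again instead of encrypting silently.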

The advantages of public following

When Maria is followed by 100 people, and they've all signed identical snapshots to yours, this is helpful.

If some of these statements are months old, but your own is only 1 day old, you can get some peace of mind that her identity was not compromised today, the day you decided to follow her.

This is not a web of trust. You can prove maria's identity, even if there are no other followers. But more followers means more confidence in the age of her account.

Why follow now?

As hinted above, an older follower statement is superior to a new one. It's hard for a hacker to maintain a public compromise of all of maria's accounts over a span of many months. Maria or maria's friends would surely notice.

By comparison, if you started following Maria right now, today could've been the day all her accounts were broken into, simultaneously.

Therefore, an older follower statement is a better one.

A gentle conclusion: if you find someone interesting on Keybase - say you know them, or you like to read things they write, or they're a software developer who might sign code - following them now makes sense. This will begin a long and auditable history of following their identity.

We hope this doc helps. We'll revise it as questions/suggestions arrive in our GitHub issue #100 (I don't understand tracking).

footnote 1: the PGP web of trust

In the web of trust model, you know you have Maria's key because you trust John, and John signed a statement saying that another key belongs to his friend "Carla", and then Carla in turn signed a statement saying that Maria is someone whose driver's license and key fingerprint she reviewed at a party. Your trust of Maria's key is a function of such connections.

you → john → carla → maria
you → herkimer → carla → maria

The PGP web of trust has existed for over 20 years. However it is very difficult to use, it requires in-person verifications, and it's hard to know what trust level to assign transitively. (Herkimer reports that Carla was drunk; John can't remember, but he was drunk too, and who's Carla again???)
