Distributed systems
• Understanding distributed systems is essential to understanding blockchain because blockchain is, at its core, a distributed system. More precisely, it is a decentralized distributed system.
5.
• A distributed system is a computing paradigm whereby two or more nodes work with each other in a coordinated fashion to achieve a common outcome. It is modeled in such a way that end users see it as a single logical platform.
7.
• A node can be defined as an individual player in a distributed system. All nodes are capable of sending and receiving messages to and from each other. Nodes can be honest, faulty, or malicious, and each has its own memory and processor. A node that can exhibit arbitrary behavior is also known as a Byzantine node.
8.
• This arbitrary behavior can be intentionally malicious, which is detrimental to the operation of the network. Generally, any unexpected behavior of a node on the network can be categorized as Byzantine. The term broadly encompasses any behavior that is unexpected or malicious.
9.
[Figure: Design of a distributed system; N4 is a Byzantine node, L2 is a broken or slow network link]
• The main challenge in distributed system design is coordination between nodes and fault tolerance. Even if some of the nodes become faulty or network links break, the distributed system should tolerate this and continue to work correctly in order to achieve the desired result. This has been an area of active research for many years, and several algorithms and mechanisms have been proposed to overcome these issues.
10.
• Distributed systems are so challenging to design that a theorem known as the CAP theorem has been proved, which states that a distributed system cannot have all of the much-desired properties simultaneously.
11.
CAP theorem
• This is also known as Brewer's theorem, introduced originally by Eric Brewer as a conjecture in 1998; in 2002 it was proved as a theorem by Seth Gilbert and Nancy Lynch.
12.
• The theorem states that a distributed system cannot have consistency, availability, and partition tolerance simultaneously:
– Consistency is a property that ensures that all nodes in a distributed system have a single, latest copy of the data.
– Availability means that the system is up, accessible for use, and accepting incoming requests and responding with data without any failures as and when required.
– Partition tolerance ensures that if a group of nodes fails, or is cut off from the rest by a network failure, the distributed system still continues to operate correctly.
13.
• Replication is a common and widely used method to achieve fault tolerance.
• Consistency is achieved using consensus algorithms, which ensure that all nodes have the same copy of the data.
• This is also called state machine replication. Blockchain is basically a method to achieve state machine replication.
14.
Byzantine Generals problem
• Before discussing consensus in distributed systems, events in history are presented that are precursors to the development of successful and practical consensus mechanisms.
15.
• In September 1962, Paul Baran introduced the idea of cryptographic signatures with his paper On Distributed Communications Networks. This is the paper where the concept of decentralized networks was also introduced for the very first time.
16.
• Then, in 1982, a thought experiment was proposed by Lamport et al. in which a group of army generals, each leading a different part of the Byzantine army, are planning to attack or retreat from a city.
• The only way of communicating between them is a messenger, and they need to agree to attack at the same time in order to win.
• The issue is that one or more generals can be traitors and can communicate a misleading message.
• Therefore, there is a need for a viable mechanism that allows agreement between generals even in the presence of treacherous generals, so that the attack can still take place at the same time.
17.
• As an analogy with distributed systems, generals can be considered as nodes, traitors can be considered Byzantine (malicious) nodes, and the messenger can be thought of as a channel of communication between the generals.
18.
• This problem was solved in 1999 by Castro and Liskov, who presented the Practical Byzantine Fault Tolerance (PBFT) algorithm.
• Later, in 2009, the first practical implementation was made with the invention of bitcoin, where the Proof of Work (PoW) algorithm was developed as a mechanism to achieve consensus.
19.
Consensus
• Consensus is a process of agreement between distrusting nodes on a final state of data.
• Different algorithms can be used to achieve consensus.
• It is easy to reach an agreement between two nodes (for example, in client-server systems), but when multiple nodes participate in a distributed system and need to agree on a single value, achieving consensus becomes very difficult.
• This concept of achieving consensus between multiple nodes is known as distributed consensus.
20.
Consensus mechanisms
• A consensus mechanism is a set of steps that are taken by all, or most, nodes in order to agree on a proposed state or value.
• For more than three decades this concept has been researched by computer scientists in industry and academia. Consensus mechanisms have recently come into the limelight and gained much popularity with the advent of bitcoin and blockchain.
21.
• There are various requirements that must be met in order for a consensus mechanism to provide the desired results.
• The following are the requirements, with brief descriptions:
22.
• Agreement: All honest nodes decide on the same value.
• Termination: All honest nodes terminate execution of the consensus process and eventually reach a decision.
• Validity: The value agreed upon by all honest nodes must be the same as the initial value proposed by at least one honest node.
23.
• Fault tolerant: The consensus algorithm should be able to run in the presence of faulty or malicious nodes (Byzantine nodes).
• Integrity: No node makes a decision more than once; each node decides only once in a single consensus cycle.
24.
Types of consensus mechanism
• There are various types of consensus mechanism; some common types are described as follows:
– Byzantine fault tolerance-based
– Leader-based consensus mechanisms
25.
Byzantine fault tolerance-based
• With no compute-intensive operations such as partial hash inversion, this method relies on a simple scheme of nodes publishing signed messages. Eventually, when a certain number of messages have been received, an agreement is reached.
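The "certain number of messages" idea can be illustrated with a minimal sketch. It assumes a hypothetical network of n nodes that tolerates f Byzantine nodes under the commonly used n >= 3f + 1 bound, with a decision threshold of 2f + 1 matching messages; neither number comes from the slides. HMAC tags stand in for real digital signatures, and all names (KEYS, sign, verify, decide) are illustrative.

```python
import hashlib
import hmac

# Hypothetical per-node secret keys; HMAC stands in for real digital signatures.
KEYS = {f"node{i}": f"secret-{i}".encode() for i in range(4)}

def sign(node: str, value: str) -> bytes:
    """Produce a tag that binds `value` to `node` (signature stand-in)."""
    return hmac.new(KEYS[node], value.encode(), hashlib.sha256).digest()

def verify(node: str, value: str, tag: bytes) -> bool:
    """Check that `tag` is the tag `node` would produce for `value`."""
    return node in KEYS and hmac.compare_digest(sign(node, value), tag)

def decide(messages, n: int, f: int):
    """Return a value once at least 2f + 1 distinct nodes have validly signed it."""
    assert n >= 3 * f + 1, "assumed fault threshold: n >= 3f + 1"
    voters = {}
    for node, value, tag in messages:
        if verify(node, value, tag):
            voters.setdefault(value, set()).add(node)
    for value, nodes in voters.items():
        if len(nodes) >= 2 * f + 1:
            return value
    return None  # not enough matching signed messages yet

msgs = [(name, "block-42", sign(name, "block-42")) for name in ("node0", "node1", "node2")]
msgs.append(("node3", "block-13", b"forged"))  # Byzantine node sends an invalid tag
decision = decide(msgs, n=4, f=1)
```

With three valid messages for the same value the sketch decides on it; with only two it keeps waiting.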
26.
Leader-based consensus mechanisms
• This type of mechanism requires nodes to compete for the leader-election lottery, and the node that wins it proposes a final value.
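One way such a lottery can be run is with the partial hash inversion mentioned earlier, as in Proof of Work. The following is only a sketch under that assumption: the function names, the message layout, and the tiny difficulty of 12 bits are all hypothetical and chosen so that the search finishes quickly.

```python
import hashlib

def _digest_value(node_id: str, proposal: str, nonce: int) -> int:
    """Interpret SHA-256(node_id | proposal | nonce) as a 256-bit integer."""
    data = f"{node_id}|{proposal}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def mine(node_id: str, proposal: str, difficulty_bits: int, max_tries: int = 1_000_000):
    """Search for a nonce whose hash falls below the target (partial hash inversion)."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_tries):
        if _digest_value(node_id, proposal, nonce) < target:
            return nonce
    return None

def is_winner(node_id: str, proposal: str, nonce: int, difficulty_bits: int) -> bool:
    """Any other node can check the winning ticket with a single hash."""
    return _digest_value(node_id, proposal, nonce) < (1 << (256 - difficulty_bits))

nonce = mine("nodeA", "value=7", difficulty_bits=12)
```

Finding the nonce takes many hash evaluations on average, while verifying it takes one, which is what makes the lottery cheap to audit.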
27.
Paxos
• Many practical implementations have been proposed, such as Paxos, the most famous protocol, introduced by Leslie Lamport in 1989. In Paxos, nodes are assigned various roles such as Proposer, Acceptor, and Learner. Nodes or processes are named replicas, and consensus is achieved in the presence of faulty nodes by agreement among a majority of nodes.
28.
RAFT
• An alternative to Paxos is RAFT, which works by assigning one of three states, that is, Follower, Candidate, or Leader, to each node. A Leader is elected after a Candidate node receives enough votes. All changes then have to go through the Leader, who commits the proposed changes once replication on the majority of follower nodes is completed.
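The election and commit rule can be sketched in a few lines. This is a heavily simplified illustration, not the full protocol: it omits terms, timeouts, and log matching, and it counts the Leader's own copy toward the majority, which is an assumption of this sketch. The Node class and function names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    state: str = "Follower"  # Follower, Candidate, or Leader
    log: list = field(default_factory=list)

def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1

def run_election(candidate: Node, cluster: list, votes_granted: int) -> bool:
    """The candidate becomes Leader if it holds votes from a majority (its own included)."""
    candidate.state = "Candidate"
    if votes_granted >= majority(len(cluster)):
        candidate.state = "Leader"
    return candidate.state == "Leader"

def replicate(leader: Node, cluster: list, entry: str, reachable: set) -> bool:
    """Append the entry on the leader and on reachable followers.

    Returns True (committed) only if a majority of the cluster stores the entry.
    """
    assert leader.state == "Leader"
    leader.log.append(entry)
    stored = 1
    for node in cluster:
        if node is not leader and node.name in reachable:
            node.log.append(entry)
            stored += 1
    return stored >= majority(len(cluster))

cluster = [Node(f"n{i}") for i in range(5)]
leader = cluster[0]
elected = run_election(leader, cluster, votes_granted=3)
committed = replicate(leader, cluster, "x=1", reachable={"n1", "n2"})
not_committed = replicate(leader, cluster, "x=2", reachable={"n1"})
```

In a five-node cluster, three stored copies are enough to commit, while two are not.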
29.
• Introduction – Basic ideas behind blockchain – Basic cryptographic primitives used in blockchain – Secure, collision-resistant hash functions, properties of a hash function, digital signature, public key cryptosystems, zero-knowledge proof systems – Basic distributed system concepts
30.
Basic ideas behind blockchain
• Blockchain at its core is a peer-to-peer distributed ledger that is cryptographically secure, append-only, immutable (extremely hard to change), and updateable only via consensus or agreement among peers.
31.
• Blockchain can be thought of as a layer of a distributed peer-to-peer network running on top of the Internet. It is analogous to SMTP, HTTP, or FTP running on top of TCP/IP. This is shown in the accompanying diagram.
• From a business point of view, a blockchain can be defined as a platform whereby peers can exchange values using transactions without the need for a central trusted arbitrator. This allows blockchain to be a decentralized consensus mechanism where no single authority is in charge of the database.
34.
Block
• A block is simply a selection of transactions bundled together in order to organize them logically. It is made up of transactions, and its size is variable depending on the type and design of the blockchain in use. A reference to a previous block is also included in the block unless it is a genesis block.
35.
• A genesis block is the first block in the blockchain and is hardcoded at the time the blockchain is started.
36.
• The structure of a block is also dependent on the type and design of a blockchain, but generally there are a few attributes that are essential to the functionality of a block, such as the
– block header,
– pointers to previous blocks,
– the time stamp,
– nonce,
– transaction counter,
– transactions, and
– other attributes.
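The attributes listed above can be pictured as a small data structure. The sketch below is illustrative only: field names are hypothetical, the header is serialized as JSON for simplicity, and a plain SHA-256 digest of the transaction list stands in for whatever transaction commitment a real blockchain uses.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field

@dataclass
class Block:
    previous_hash: str  # pointer to the previous block
    transactions: list
    timestamp: float = field(default_factory=time.time)
    nonce: int = 0

    @property
    def transaction_counter(self) -> int:
        return len(self.transactions)

    def header(self) -> dict:
        """Header fields; the transaction digest is a simplification for this sketch."""
        tx_digest = hashlib.sha256(json.dumps(self.transactions).encode()).hexdigest()
        return {
            "previous_hash": self.previous_hash,
            "timestamp": self.timestamp,
            "nonce": self.nonce,
            "transaction_counter": self.transaction_counter,
            "tx_digest": tx_digest,
        }

    def hash(self) -> str:
        encoded = json.dumps(self.header(), sort_keys=True).encode()
        return hashlib.sha256(encoded).hexdigest()

def chain_is_valid(chain: list) -> bool:
    """Every block must point at the hash of the block before it."""
    return all(chain[i].previous_hash == chain[i - 1].hash() for i in range(1, len(chain)))

genesis = Block(previous_hash="0" * 64, transactions=["genesis"], timestamp=0.0)
block1 = Block(previous_hash=genesis.hash(), transactions=["alice->bob:5"], timestamp=1.0)
chain = [genesis, block1]
```

Because each block stores the hash of its predecessor, changing a transaction in an earlier block breaks the link held by the next one.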
Various technical definitions of blockchains
• Blockchain is a decentralized consensus mechanism. In a blockchain, all peers eventually come to an agreement regarding the state of a transaction.
39.
Various technical definitions of blockchains
• Blockchain is a distributed shared ledger. Blockchain can be considered a shared ledger of transactions. The transactions are ordered and grouped into blocks. Currently, the real-world model is based on private databases that each organization maintains, whereas the distributed ledger can serve as a single source of truth for all member organizations that are using the blockchain.
Generic elements of a blockchain
Addresses
• Addresses are unique identifiers that are used in a transaction on the blockchain to denote senders and recipients.
• An address is usually a public key or derived from a public key. While addresses can be reused by the same user, addresses themselves are unique.
42.
Transaction
• A transaction is the fundamental unit of a blockchain. A transaction represents a transfer of value from one address to another.
43.
Block
• A block is composed of multiple transactions and some other elements such as the previous block hash (hash pointer), timestamp, and nonce.
44.
Peer-to-peer network
• As the name implies, this is a network topology whereby all peers can communicate with each other and send and receive messages.
45.
Scripting or programming language
• This element performs various operations on a transaction.
• Transaction scripts are predefined sets of commands for nodes to transfer tokens from one address to another and perform various other functions.
• A Turing-complete programming language is a desirable feature of blockchains; however, the security of such languages is a key question and an area of important and ongoing research.
46.
Virtual machine
• This is an extension of a transaction script. A virtual machine allows Turing-complete code to be run on a blockchain (as smart contracts), whereas a transaction script can be limited in its operation. Virtual machines are not available on all blockchains; however, various blockchains use virtual machines to run programs, for example Ethereum Virtual Machine (EVM) and Chain Virtual Machine (CVM).
47.
State machine
• A blockchain can be viewed as a state transition mechanism whereby a state is modified from its initial form to the next, and eventually to a final form, as a result of a transaction execution and validation process by nodes.
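A state transition can be sketched as a pure function from the current state and a transaction to the next state. The account-balance model, the dictionary layout of a transaction, and the validation rule below are assumptions made for illustration; they are not taken from any particular blockchain.

```python
def apply_transaction(state: dict, tx: dict) -> dict:
    """Return the next state, or the unchanged state if the transaction is invalid."""
    sender, recipient, amount = tx["from"], tx["to"], tx["amount"]
    if amount <= 0 or state.get(sender, 0) < amount:
        return state  # validation failed: no state transition
    next_state = dict(state)
    next_state[sender] -= amount
    next_state[recipient] = next_state.get(recipient, 0) + amount
    return next_state

initial = {"alice": 10, "bob": 0}
transactions = [
    {"from": "alice", "to": "bob", "amount": 4},
    {"from": "bob", "to": "carol", "amount": 5},  # rejected: bob only holds 4
    {"from": "bob", "to": "carol", "amount": 3},
]

final = initial
for tx in transactions:
    final = apply_transaction(final, tx)
```

Every node that applies the same ordered transactions to the same initial state arrives at the same final state, which is the point of state machine replication.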
48.
Nodes
• A node in a blockchain network performs various functions depending on the role it takes. A node can propose and validate transactions and perform mining to facilitate consensus and secure the blockchain. This is done by following a consensus protocol (most commonly PoW). Nodes can also perform other functions, such as simple payment verification (lightweight nodes), validation, and many other functions depending on the type of blockchain used and the role assigned to the node.
49.
Smart contracts
• These programs run on top of the blockchain and encapsulate the business logic to be executed when certain conditions are met. The smart contract feature is not available in all blockchains but is now becoming a very desirable feature due to the flexibility and power it provides to blockchain applications.
51.
Basic cryptographic primitives used in blockchain
• Cryptographic primitives are the basic building blocks of a security protocol or system.
• A security protocol is a set of steps taken in order to achieve required security goals by utilizing appropriate security mechanisms.
52.
• Various types of security protocols are in use, such as authentication protocols, non-repudiation protocols, and key management protocols.
53.
A generic cryptography model is shown in the following diagram:
• P - Plain text
• E - Encryption
• C - Cipher text
• D - Decryption
54.
• Entity: Either a person or a system that sends, receives, or performs operations on data
• Sender: An entity that transmits the data
• Receiver: An entity that takes delivery of the data
• Adversary: An entity that tries to circumvent the security service
• Key: Some data that is used to encrypt or decrypt data
• Channel: A medium of communication between entities
Symmetric cryptography
• Symmetric cryptography refers to a type of cryptography whereby the key that is used to encrypt the data is the same one used to decrypt it, and thus it is also known as shared key cryptography. The key must be established or agreed on before the data exchange between the communicating parties, and it must be kept secret, which is why it is also called secret key cryptography.
57.
• There are two types of symmetric ciphers:
– stream ciphers and
– block ciphers.
• Data Encryption Standard (DES) and Advanced Encryption Standard (AES) are common examples of block ciphers, whereas RC4 and A5 are commonly used stream ciphers.
58.
Stream ciphers
• These ciphers are encryption algorithms that apply encryption on a bit-by-bit basis to plain text using a key stream. There are two types of stream ciphers: synchronous and asynchronous (also called self-synchronizing). Synchronous stream ciphers are those whose key stream depends only on the key, whereas asynchronous stream ciphers have a key stream that also depends on the encrypted data.
59.
• In stream ciphers, encryption and decryption are basically the same function because each is a simple modulo 2 addition, that is, an XOR operation. The key requirement in stream ciphers is the security and randomness of key streams. Various techniques have been developed to generate random numbers, and it is vital that all key generators be cryptographically secure.
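The claim that encryption and decryption are the same XOR operation can be checked with a toy synchronous stream cipher. The key stream generator below (SHA-256 over the key and a running counter) is an assumption chosen for brevity and is not a vetted stream cipher; the function names are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy synchronous key stream: it depends only on the key, not on the data."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_stream(key: bytes, data: bytes) -> bytes:
    """XOR the data with the key stream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

ciphertext = xor_stream(b"shared key", b"attack at dawn")
recovered = xor_stream(b"shared key", ciphertext)
```

Applying xor_stream twice with the same key returns the original message, because XORing with the same key stream twice cancels out.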
Block ciphers
• These are encryption algorithms that break up the text to be encrypted (plain text) into blocks of fixed length and apply encryption block by block. Block ciphers are usually built using a design strategy known as a Feistel cipher. Recent block ciphers, such as AES (Rijndael), have been built using a combination of substitution and permutation called a substitution-permutation network (SPN).
62.
• Feistel ciphers are based on the Feistel network, a structure developed by Horst Feistel. This structure is based on the idea of combining multiple rounds of repeated operations to achieve desirable cryptographic properties known as confusion and diffusion. Feistel networks operate by dividing data into two blocks (left and right) and processing these blocks via keyed round functions.
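A toy Feistel network makes the left/right structure concrete. In this sketch the round function is a truncated SHA-256 of the round key and the right half, and the four round keys are arbitrary byte strings; both are assumptions for illustration and the result is not a secure cipher.

```python
import hashlib

def round_function(half: bytes, round_key: bytes) -> bytes:
    """Keyed round function F; it does not need to be invertible."""
    return hashlib.sha256(round_key + half).digest()[:len(half)]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def feistel(block: bytes, round_keys: list) -> bytes:
    """Run the Feistel rounds over a block of even length."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        # New left is the old right; new right is old left XOR F(old right, key).
        left, right = right, xor(left, round_function(right, key))
    return right + left  # final swap so decryption can reuse the same routine

keys = [b"k1", b"k2", b"k3", b"k4"]
plaintext = b"8 bytes!"
ciphertext = feistel(plaintext, keys)
decrypted = feistel(ciphertext, keys[::-1])  # same routine, round keys reversed
```

Decryption runs the identical routine with the round keys in reverse order, which is why the round function itself never has to be inverted.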
63.
• Confusion makes the relationship between the encrypted text and the plaintext complex. This is achieved by substitution in practice. For example, 'A' in plain text is replaced by 'X' in encrypted text. In modern cryptographic algorithms, substitution is performed using lookup tables called S-boxes. Confusion is required to make finding the encryption key very difficult even if many encrypted and decrypted data pairs are created using the same key.
• Diffusion spreads the statistical structure of the plaintext across the encrypted text, so that a small change in the input affects many bits of the output. In practice, this is achieved by transposition or permutation.
64.
• A key advantage of using a Feistel cipher is that encryption and decryption operations are almost identical and only require a reversal of the encryption process in order to achieve decryption. DES is a prime example of a Feistel-based cipher.
• Various modes of operation for block ciphers exist, including Electronic Code Book (ECB), Cipher Block Chaining (CBC), Output Feedback mode (OFB), and Counter mode (CTR). These modes specify the way in which an encryption function is applied to the plain text. They will be explained later in this section, but first, four categories of block cipher encryption modes are introduced here.
67.
Block encryption mode
• In this mode, plaintext is divided into blocks of fixed length depending on the type of cipher used, and then the encryption function is applied to each block.
68.
Keystream generation modes
• In this mode, the encryption function generates a keystream that is then XORed with the plaintext stream in order to achieve encryption.
69.
Message authentication modes
• In this mode, a message authentication code (MAC) is computed as a result of an encryption function. A MAC is basically a cryptographic checksum that provides an integrity service. The most common method of generating a MAC using block ciphers is CBC-MAC, where some part of the last block of the chain is used as the MAC.
70.
Cryptographic hashes
• Hash functions are basically used to compress a message to a fixed-length digest. In this mode, block ciphers are used as a compression function to produce a hash of the plain text. The most common block encryption modes are discussed briefly next.
71.
Electronic code book
• This is a basic mode of operation in which the encrypted data is produced by applying the encryption algorithm separately to each block of plain text, one by one. This is the simplest mode, but it should not be used in practice because it is insecure: identical plaintext blocks always produce identical ciphertext blocks, which reveals patterns in the data.
Cipher block chaining
• In this mode, each block of plain text is XORed with the previous encrypted block before being encrypted. The CBC mode uses an initialization vector (IV) in place of the previous encrypted block when encrypting the first block. It is recommended that the IV be randomly chosen.
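The difference between ECB and CBC can be seen by encrypting a message made of two identical blocks. The sketch assumes a hypothetical, insecure 4-round Feistel cipher on 8-byte blocks as the block encryption function (the standard library has no real block cipher), and it shows encryption only.

```python
import hashlib
import os

BLOCK = 8  # toy block size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical, insecure 4-round Feistel cipher used only for illustration."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:BLOCK // 2]
        left, right = right, xor(left, f)
    return right + left

def split_blocks(data: bytes) -> list:
    assert len(data) % BLOCK == 0, "this sketch skips padding"
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt every block independently."""
    return b"".join(toy_encrypt_block(key, b) for b in split_blocks(plaintext))

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """XOR each block with the previous ciphertext block (or the IV) before encrypting."""
    previous, out = iv, []
    for block in split_blocks(plaintext):
        previous = toy_encrypt_block(key, xor(block, previous))
        out.append(previous)
    return b"".join(out)

key = b"demo key"
message = b"SAMEBLOK" * 2  # two identical plaintext blocks
ecb = ecb_encrypt(key, message)
cbc = cbc_encrypt(key, os.urandom(BLOCK), message)
```

In the ECB output the two ciphertext blocks are equal, exposing the repetition, whereas chaining makes the CBC blocks differ.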
Counter mode
• The CTR mode effectively uses a block cipher as a stream cipher. In this case, a unique nonce is supplied and concatenated with the counter value in order to produce a key stream.
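A sketch of the nonce-plus-counter construction follows. CTR only ever uses the encryption direction of the block cipher, so this illustration substitutes truncated HMAC-SHA256 for that direction; this substitution, the 16-byte block size, and the function names are assumptions, not how any standard defines CTR.

```python
import hashlib
import hmac

BLOCK = 16  # bytes of key stream produced per counter value

def prf_block(key: bytes, block_input: bytes) -> bytes:
    """Stand-in for block cipher encryption (assumption: truncated HMAC-SHA256)."""
    return hmac.new(key, block_input, hashlib.sha256).digest()[:BLOCK]

def ctr_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Key stream = E(nonce || 0), E(nonce || 1), ... truncated to `length` bytes."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += prf_block(key, nonce + counter.to_bytes(8, "big"))
        counter += 1
    return bytes(out[:length])

def ctr_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing the data with the key stream."""
    stream = ctr_keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))

key, nonce = b"demo key", b"nonce-01"
message = b"counter mode turns a block cipher into a stream cipher"
ciphertext = ctr_crypt(key, nonce, message)
recovered = ctr_crypt(key, nonce, ciphertext)
other = ctr_crypt(key, b"nonce-02", message)
```

Reusing a nonce with the same key would repeat the key stream, which is why the nonce must be unique.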
• There are other modes, such as Cipher Feedback mode (CFB), Galois Counter mode (GCM), and Output Feedback mode (OFB), which are also used in various scenarios. In the following section, you will be introduced to the design and mechanism of a currently dominant block cipher known as AES. First, some history will be presented with regard to the Data Encryption Standard (DES) that led to the development of the new AES standard.
79.
Hash functions
• Hash functions are used to create fixed-length digests of arbitrarily long input strings. Hash functions are keyless and provide the data integrity service. They are usually built using iterated and dedicated hash function construction techniques. Various families of hash functions are available, such as MD, SHA-1, SHA-2, SHA-3, RIPEMD, and Whirlpool.
80.
Hash functions
• Hash functions are commonly used in digital signatures and message authentication codes. They have three security properties, namely pre-image resistance, second pre-image resistance, and collision resistance.
81.
Hash functions
• Hash functions are typically used to provide data integrity services. They can be used as one-way functions and to construct other cryptographic primitives, such as MACs and digital signatures. Some applications use hash functions to build pseudorandom number generators (PRNGs). Hash functions do not require a key. There are two practical properties and three security properties of hash functions that must be met, depending on the level of integrity required.
82.
Compression of arbitrary messages into fixed length digest
• This property is concerned with the fact that a hash function must be able to take an input text of any length and output a fixed-length compressed message. Hash functions produce a compressed output in various bit sizes, usually between 128 bits and 512 bits.
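The fixed-length property is easy to observe with Python's standard hashlib module. The helper name digest_bits is hypothetical; the algorithm names are ones hashlib is documented to always provide.

```python
import hashlib

def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length in bits for the given algorithm and input."""
    return len(hashlib.new(algorithm, data).digest()) * 8

inputs = [b"", b"a", b"a" * 1_000_000]
sizes = {
    name: {digest_bits(name, data) for data in inputs}
    for name in ("sha1", "sha256", "sha3_256", "sha512")
}
```

Each set in sizes contains a single value, showing that the digest length depends on the algorithm and not on the input length.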
83.
Easy to compute
• Hash functions are efficient and fast one-way functions. The requirement is that they be very quick to compute regardless of the message size. The efficiency may decrease if the message is too big, but the function should still be fast enough for practical use. In the following section, security properties of hash functions are discussed.
84.
Pre-image resistance
• Consider the equation h(x) = y, where h is the hash function, x is the input, and y is the hash. The first security property requires that, given y, it is computationally infeasible to compute x. x is considered a pre-image of y, hence the name pre-image resistance. This is also called the one-way property.
86.
Second pre-image resistance
• This property requires that, given x and h(x), it is almost impossible to find any other message m such that m != x and h(m) = h(x). This property is also known as weak collision resistance.
88.
Collision resistance
• This property requires that it be computationally infeasible to find two different input messages that hash to the same output, that is, a pair x and z with x != z and h(x) = h(z). This property is also known as strong collision resistance.
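Why the output length matters for collision resistance can be shown by deliberately weakening a hash. The sketch below truncates SHA-256 to 24 bits (an artificial assumption made only so a collision can be found quickly) and searches for two different inputs with the same truncated value; the function names are hypothetical.

```python
import hashlib
from itertools import count

def truncated_hash(message: bytes, bits: int) -> int:
    """Keep only the leading `bits` bits of the SHA-256 digest."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def find_collision(bits: int):
    """Hash successive messages until two of them share a truncated digest."""
    seen = {}
    for i in count():
        message = str(i).encode()
        h = truncated_hash(message, bits)
        if h in seen:
            return seen[h], message
        seen[h] = message

x, z = find_collision(bits=24)
```

A collision on 24 bits typically appears after only a few thousand attempts, whereas the same search against the full 256-bit output is computationally infeasible.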
89.
• Hash functions, due to their very nature, will always have some collisions, where two different messages hash to the same output, but these should be computationally infeasible to find. A concept known as the avalanche effect is desirable in all hash functions. The avalanche effect specifies that a small change, even a single character change in the input text, will result in a totally different hash output.
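The avalanche effect can be measured by counting how many output bits differ between the digests of two nearly identical inputs. The helper name bit_difference and the sample strings are illustrative.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between SHA-256(a) and SHA-256(b)."""
    ha = int.from_bytes(hashlib.sha256(a).digest(), "big")
    hb = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(ha ^ hb).count("1")

# The two inputs differ in a single character.
changed_bits = bit_difference(b"blockchain", b"blockchaim")
```

For a well-behaved 256-bit hash, roughly half of the output bits (around 128) are expected to flip.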
90.
• Hash functions are usually designed by following the iterated hash function approach. In this method, the input message is compressed in multiple rounds on a block-by-block basis to produce the compressed output. A popular type of iterated hash function is the Merkle-Damgard construction.
91.
• This construction is based on the idea of dividing the input data into blocks of equal size and then feeding them through the compression function in an iterative manner. The collision resistance of the compression function ensures that the hash output is also collision resistant. Compression functions can be built using block ciphers. In addition to Merkle-Damgard, there are various constructions of compression functions proposed by researchers, for example, Miyaguchi-Preneel and Davies-Meyer.
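The iterative structure can be sketched with a toy Merkle-Damgard hash. The compression function here is a truncated SHA-256 call, and the 16-byte block, 8-byte chaining value, and all-zero initial value are arbitrary assumptions; a real design would use a dedicated compression function such as one built from a block cipher.

```python
import hashlib

BLOCK = 16         # toy block size in bytes
STATE = 8          # toy chaining-value size in bytes
IV = bytes(STATE)  # assumed all-zero initial chaining value

def compress(chaining_value: bytes, block: bytes) -> bytes:
    """Toy compression function (assumption: truncated SHA-256, illustration only)."""
    return hashlib.sha256(chaining_value + block).digest()[:STATE]

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 8-byte message length."""
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % BLOCK)
    return padded + len(message).to_bytes(8, "big")

def md_hash(message: bytes) -> bytes:
    """Feed equal-sized blocks through the compression function one after another."""
    state = IV
    padded = pad(message)
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state

short_digest = md_hash(b"abc")
long_digest = md_hash(b"x" * 1000)
```

Whatever the input length, the output is the final chaining value, so it always has the size of the compression function's output.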
93.
Design of Secure Hash Algorithms (SHA)
• In the following section, you will be introduced to the design of SHA-256 and SHA-3. These are used in bitcoin and Ethereum, respectively. Ethereum does not use the NIST standard SHA-3 but Keccak, which is the original algorithm presented to NIST. NIST, after some modifications such as an increase in the number of rounds and simpler message padding, standardized Keccak as SHA-3.
94.
SHA-256
• SHA-256 accepts an input message of size < 2^64 bits. The block size is 512 bits and the word size is 32 bits. The output is a 256-bit digest. The compression function processes a 512-bit message block and a 256-bit intermediate hash value. There are two main components of this function: the compression function and a message schedule.
95.
• The algorithm works as follows:
• Pre-processing:
1. Padding of the message, which extends the message so that its total length is a multiple of the required block size of 512 bits.
96.
2. Parsing the message into message blocks, which ensures that the message and its padding are divided into equal blocks of 512 bits.
3. Setting up the initial hash value, which consists of eight 32-bit words obtained by taking the first 32 bits of the fractional parts of the square roots of the first eight prime numbers. Because these values are derived from a public formula rather than chosen arbitrarily, they give a level of confidence that no backdoor exists in the algorithm.
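The three pre-processing steps can be sketched as follows. The padding layout (a single 1 bit, zero bits, then the 64-bit message length) follows the published SHA-256 specification rather than the slides, and the function names are hypothetical.

```python
import math

def sha256_pad(message: bytes) -> bytes:
    """Append a 1 bit, zero bits, and the 64-bit message length (multiple of 512 bits)."""
    bit_length = len(message) * 8
    padded = message + b"\x80"
    padded += bytes((56 - len(padded)) % 64)
    return padded + bit_length.to_bytes(8, "big")

def parse_blocks(padded: bytes) -> list:
    """Split the padded message into 512-bit (64-byte) blocks."""
    return [padded[i:i + 64] for i in range(0, len(padded), 64)]

def first_primes(n: int) -> list:
    primes, candidate = [], 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def initial_hash_values() -> list:
    """First 32 bits of the fractional parts of the square roots of the first 8 primes."""
    # isqrt(p << 64) equals floor(sqrt(p) * 2**32); the low 32 bits are the fraction.
    return [math.isqrt(p << 64) & 0xFFFFFFFF for p in first_primes(8)]

blocks = parse_blocks(sha256_pad(b"a" * 100))
h0 = initial_hash_values()
```

The first computed word is 0x6a09e667, which matches the first initial hash value in the SHA-256 specification.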
97.
• Hash computation:
1. Each message block is processed in sequence and requires 64 rounds to compute the full hash output. Each round uses a different constant to ensure that no two rounds are the same.
2. First, the message schedule is prepared.
3. Then, eight working variables are initialized.
4. Then, the intermediate hash value is calculated.
5. Finally, the message is processed and the output hash is produced.
99.
• In the preceding diagram, a, b, c, d, e, f, g, and h are the registers. Maj and Ch are applied bitwise. Σ0 and Σ1 perform bitwise rotations. Wj is the message schedule word and Kj is the round constant for round j; both are added mod 2^32.
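The whole computation can be put together in a compact, unoptimized sketch. Details not stated in the slides are taken from the published SHA-256 specification: the rotation amounts, the message schedule, and the round constants Kj, which are the first 32 bits of the fractional parts of the cube roots of the first 64 primes. Helper names are hypothetical, and hashlib is imported only to cross-check the result.

```python
import hashlib
import math

MASK = 0xFFFFFFFF

def _primes(n):
    primes, c = [], 2
    while len(primes) < n:
        if all(c % p for p in primes):
            primes.append(c)
        c += 1
    return primes

def _icbrt(n):
    """Integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

H0 = [math.isqrt(p << 64) & MASK for p in _primes(8)]  # initial hash value
K = [_icbrt(p << 96) & MASK for p in _primes(64)]      # round constants K_j

def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def sha256(message: bytes) -> bytes:
    # Pre-processing: pad to a multiple of 512 bits.
    padded = message + b"\x80"
    padded += bytes((56 - len(padded)) % 64)
    padded += (len(message) * 8).to_bytes(8, "big")
    h = list(H0)
    for start in range(0, len(padded), 64):
        block = padded[start:start + 64]
        # Message schedule W_0..W_63.
        w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
        for j in range(16, 64):
            s0 = rotr(w[j - 15], 7) ^ rotr(w[j - 15], 18) ^ (w[j - 15] >> 3)
            s1 = rotr(w[j - 2], 17) ^ rotr(w[j - 2], 19) ^ (w[j - 2] >> 10)
            w.append((w[j - 16] + s0 + w[j - 7] + s1) & MASK)
        # Eight working variables (the registers a..h).
        a, b, c, d, e, f, g, hh = h
        for j in range(64):
            big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (hh + big_s1 + ch + K[j] + w[j]) & MASK
            big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK
            a, b, c, d, e, f, g, hh = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
        # Intermediate hash value: add the registers back into the chaining value.
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return b"".join(x.to_bytes(4, "big") for x in h)

matches_library = sha256(b"abc") == hashlib.sha256(b"abc").digest()
```

Comparing against hashlib is a quick way to confirm that the registers, rotations, and constants are wired together correctly.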