1. ETH - Ethereum Overview
Bitcoin and Ethereum are the two most important cryptocurrencies. Bitcoin is often called blockchain 1.0 and Ethereum blockchain 2.0.
Ethereum's design improves on problems that surfaced while Bitcoin was running, for example:
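- Block time: Bitcoin's block time is 10 minutes, while Ethereum cuts it to a little over ten seconds. To cope with such a short block time, Ethereum also designed a consensus mechanism based on GHOST.
- Mining puzzle: Bitcoin's mining puzzle is compute-bound, a contest of raw hash power. This led to specialized mining hardware, which is at odds with the decentralization originally promoted. Ethereum's mining puzzle is therefore memory-hard, with the goal of limiting the use of ASIC chips (ASIC resistance).
- Ethereum also plans a more radical change: replacing proof of work (PoW) with proof of stake (PoS). Under proof of stake there is no mining; how the next block is produced is decided by something like shareholder voting.
- Beyond that, Ethereum adds an important feature: support for smart contracts.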
1) The concept of decentralized currency and decentralized contracts
Bitcoin implements a decentralized currency. After Bitcoin succeeded, many people began to ask: besides currency, what else can be decentralized? One defining feature of Ethereum is its support for decentralized contracts.
Bitcoin: a decentralized currency. Its symbol is BTC and its smallest unit is the satoshi; 1 bitcoin equals 100 million satoshi. The unit is named after Bitcoin's creator, Satoshi Nakamoto.
Ethereum: decentralized contracts. Its symbol is ETH, its coin is commonly called ether, and its smallest unit is the wei, named in honor of the cryptography pioneer Wei Dai.
Decentralized currency:
Currency is traditionally issued by a government. Its value rests on the government's credibility, and the government uses legal means to keep the currency functioning. Bitcoin replaces these government functions with technical means: cryptography and a consensus mechanism keep the cryptocurrency system running properly.
Decentralized contracts:
In real life the validity of a contract is likewise upheld through the legal system, backed by the government. Suppose you sign a contract with someone and a dispute arises. You go to court, and the court checks who signed the contract, whether it carries the parties' valid signatures, what its terms say, which party breached it, and what penalty the contract prescribes for the breaching party. That is how real-world contracts work: legal means uphold their validity. Can these legal means also be replaced by technical means? That is the design goal of Ethereum's smart contracts.
If the content of a contract can be implemented as program code, the code can be placed on the blockchain, and the blockchain's immutability guarantees that the code runs as written. Of course, not every contract can be expressed in a programming language, and not every contract term can be quantified, but contracts whose logic is simple and clear can be written as smart contracts.
2) Benefits of decentralized currency and decentralized contracts
Benefits of a decentralized currency:
Example scenario: cross-border transfers
Sending money from the United States to Egypt in fiat currency, for example, is cumbersome: it takes a long time, involves a lot of paperwork, and the fees are high. Transferring bitcoin is much easier, which is one of Bitcoin's advantages. Even though Bitcoin produces a block only every ten minutes and is imperfect in various ways, a cross-border transfer in bitcoin is still much faster than one in fiat currency.
Benefits of decentralized contracts:
Example scenario: signing cross-border contracts
If the signatories come from all over the world and no single jurisdiction covers them, upholding the contract through legal means is difficult. Think of online crowdfunding: the participants come from all over the country, do not know each other, and would not even know where to file a lawsuit. In this situation, having pre-written program code ensure that everyone can only act according to the rules is a better solution.
Even when all parties are within the same jurisdiction, enforcing a contract through the courts is time-consuming and laborious. Litigation takes a lot of time and effort, and even if you win you may not get paid: you still have to apply to freeze the other party's assets, apply for compulsory enforcement, and so on. It is therefore best to use technical means to ensure that the parties cannot breach the contract in the first place.
The benefit of a smart contract is that once the code is published on the blockchain, the blockchain's immutability means it can only be executed according to the rules laid down in the code.
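2. ETH - Accounts
1) Bitcoin: a transaction-based ledger
Bitcoin uses a transaction-based ledger. In this model the system does not explicitly record how much money each account holds; balances have to be derived from the UTXO set. To find out how many bitcoins a person owns in total, you add up the coins in the UTXO set that belong to all of that person's accounts, that is, the accounts for which he holds the private keys.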
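As a rough illustration of deriving a balance from the UTXO set, the sketch below sums the unspent outputs that belong to one person's addresses. The `UTXO` record, its fields and the sample addresses are made up for illustration; real Bitcoin outputs are locked by scripts rather than by a plain owner string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UTXO:
    tx_id: str    # hypothetical id of the transaction that created this output
    index: int    # position of the output inside that transaction
    owner: str    # simplified: an address string instead of a locking script
    amount: int   # amount in satoshi


def balance_of(utxo_set, my_addresses):
    """Sum every unspent output whose owner is one of my addresses."""
    return sum(u.amount for u in utxo_set if u.owner in my_addresses)


utxo_set = [
    UTXO("tx1", 0, "addr_a1", 700_000_000),
    UTXO("tx2", 1, "addr_a2", 300_000_000),
    UTXO("tx3", 0, "addr_b1", 500_000_000),
]

# One person may control several addresses; the total is not stored anywhere.
alice_total = balance_of(utxo_set, {"addr_a1", "addr_a2"})
print(alice_total)
```

The total has to be recomputed from the outputs each time, which is exactly why no single place in the system states how much a person owns.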
The advantage of this model is good privacy protection: you may not be able to say exactly how much money you have yourself, so others know even less. But it makes the system awkward to use and unlike our everyday experience.
For example, when A transfers 10 bitcoins to B, A must state where the 10 coins come from: say 7 were received in one earlier transaction and 3 in another, which proves that the coins are legitimate. This is different from our usual experience with a bank, where you may have to explain the source of money when depositing it but never have to say where each amount came from when spending it.
Another awkward point is that coins received in an earlier transaction must be spent all at once; you cannot spend only part of them.
A transfers 10 bitcoins to B, and later B wants to transfer 3 bitcoins to C, citing the A-to-B transaction as the source of the coins. If B handles it as in the figure above, the remaining 7 bitcoins are paid out as the transaction fee.
So, as in the figure above, B must send the remaining 7 bitcoins back to himself.
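The change rule can be sketched as follows. This is a simplified model, not Bitcoin's transaction format: the function name `build_outputs`, the address labels and the whole-coin amounts are assumptions for illustration. The one property it relies on is that an input is consumed entirely and whatever the outputs do not claim becomes the fee.

```python
def build_outputs(input_amount, pay_to, pay_amount, change_to=None, fee=0):
    """Spend one whole input; return (outputs, implicit_fee)."""
    if pay_amount + fee > input_amount:
        raise ValueError("input does not cover payment plus fee")
    outputs = [(pay_to, pay_amount)]
    change = input_amount - pay_amount - fee
    if change_to is not None and change > 0:
        outputs.append((change_to, change))   # change goes back to the sender
    # Whatever the outputs do not claim is collected as the fee.
    implicit_fee = input_amount - sum(amount for _, amount in outputs)
    return outputs, implicit_fee


# B received 10 coins from A and pays 3 to C without a change output.
no_change_outputs, no_change_fee = build_outputs(10, "C", 3)

# Same payment, but the remaining 7 coins are sent to an address B controls.
with_change_outputs, with_change_fee = build_outputs(10, "C", 3, change_to="B_new")

print(no_change_outputs, no_change_fee)
print(with_change_outputs, with_change_fee)
```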
Many Bitcoin wallets automatically generate an address to receive the change and use a new one for every transaction, which also helps privacy but, again, differs from everyday habits. At a bank, if someone transfers 100,000 yuan to you and you want to send out 30,000, you simply leave the remaining 70,000 in the account. The point is that the Bitcoin system does not explicitly maintain an account-based notion of transactions.
2) Ethereum: an account-based ledger
Ethereum uses an account-based ledger. This model resembles bank accounts: the system explicitly records how much ether each account holds.
For example, when A transfers 10 ether to B, validating the transaction only requires checking that A's account holds enough money. If A's account holds 100 ether, transferring 10 to B is fine. There is no need to say which 10 of the 100 ether go to B, or which earlier transaction the coins came from. If B later transfers 3 ether to C, that also works, and B does not have to send the rest back to himself: because balances are explicit, the remaining coins simply stay in the account. The hash pointers used to indicate the source of coins are no longer needed either.
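A minimal sketch of the account-based check, assuming a plain dictionary from account name to balance (the names `balances` and `transfer` are illustrative and not taken from any Ethereum client):

```python
def transfer(balances, sender, receiver, amount):
    """Move `amount` between accounts; the only check is the sender's balance."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    balances[sender] -= amount
    balances[receiver] = balances.get(receiver, 0) + amount


balances = {"A": 100}
transfer(balances, "A", "B", 10)   # no need to say which coins are spent
transfer(balances, "B", "C", 3)    # B's remaining 7 simply stay in the account
print(balances)
```

Compared with the UTXO model there is no change output and no reference to earlier transactions; the explicit balance carries all the information the check needs.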
The challenge Bitcoin faces is the double spending attack, where a dishonest payer tries to spend money that has already been spent.
The account-based model has a natural defense against double spending: the source of coins does not matter, each payment is deducted from your account, and paying twice deducts twice.
The challenge Ethereum faces is the replay attack, where a dishonest payee who has already been paid tries to have the payment executed again.
A publishes a transaction transferring 10 ether to B. After a while the transaction is written into the blockchain and A considers the transfer complete. Suppose B is malicious and rebroadcasts the same transaction on the network. Other nodes take it for a new transfer, and A's money is deducted twice.
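Solution:
Add a counter (nonce) that records how many transactions the account has ever published. When making a transfer, the transaction count becomes part of the transaction content and is covered by the sender's signature.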
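The counter scheme can be sketched as below. It follows the convention used in this text, where a transaction carries the sender's stored count plus one; real clients number a sender's transactions starting from 0, so the exact comparison differs. Signature checking is omitted and the class and function names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class AccountState:
    balance: int
    nonce: int = 0   # how many of this account's transactions were executed


@dataclass(frozen=True)
class Transfer:
    sender: str
    receiver: str
    amount: int
    nonce: int       # part of the signed content, so nobody else can alter it


def apply_transfer(state, tx):
    """Execute tx and return True, or return False if it must be rejected."""
    sender = state[tx.sender]
    if tx.nonce != sender.nonce + 1:      # stale or replayed transaction
        return False
    if sender.balance < tx.amount:
        return False
    sender.balance -= tx.amount
    sender.nonce = tx.nonce
    receiver = state.setdefault(tx.receiver, AccountState(balance=0))
    receiver.balance += tx.amount
    return True


state = {"A": AccountState(balance=100, nonce=20)}
tx = Transfer("A", "B", 10, nonce=21)

first = apply_transfer(state, tx)      # executed, A's nonce becomes 21
replayed = apply_transfer(state, tx)   # same transaction rebroadcast by B
print(first, replayed, state["A"].balance)
```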
As in the figure above, A transfers 10 ether to B. A has published 20 transactions so far and this is the 21st, so the transaction carries nonce=21, and A signs the whole content.
Once the transaction is published, nobody else can alter the nonce, because it is protected by the signature. Every node in the system maintains A's state: not only the balance of A's account but also its nonce. The nonce starts at 0 and is incremented by 1 each time a transaction published by A is received.
Suppose a node currently holds nonce=20 for A. When this transaction is published, the node sees that it is valid and can be executed, and updates the nonce to 21. If someone later replays the transaction, the node sees that the nonce is already 21, so the transaction has already been executed and will not be executed again.
3) Types of Ethereum accounts
Ethereum has two types of accounts: externally owned accounts and contract accounts.
Externally owned account:
An externally owned account is controlled by a public/private key pair. The key pair is generated locally, and whoever holds the private key controls the account (similar to an account in Bitcoin). It is also called a normal account.
State of an externally owned account:
- balance (account balance)
- nonce (counter)
Smart contract account:
A contract account is not controlled by a public/private key pair.
State of a contract account:
- balance (account balance)
- nonce (counter)
- code (the contract code)
- storage (the contract's state, including the value of every variable)
One contract can call another contract, and a contract account also keeps a nonce; for contract accounts it counts the contracts the account has created.
A contract account cannot initiate a transaction on its own. Ethereum requires that all transactions be initiated by externally owned accounts. If a transaction from an externally owned account calls a contract account, that contract can send a message to call another contract, but it cannot start a transaction out of nothing.
Creating a contract returns an address. Anyone who knows the contract's address can call the contract. During a call the state changes: the code does not change, but the storage does.
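The two kinds of account state described above can be sketched as plain records. The class names, the placeholder code bytes and the `counter` storage variable are hypothetical; this only mirrors the fields listed in the text, not any client's actual encoding.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ExternallyOwnedAccount:
    balance: int = 0
    nonce: int = 0


@dataclass
class ContractAccount:
    balance: int = 0
    nonce: int = 0
    code: bytes = b""                                        # fixed at creation
    storage: Dict[str, int] = field(default_factory=dict)    # mutable state


user_addr = "0x" + "aa" * 20       # 20 bytes written as 40 hex digits
contract_addr = "0x" + "bb" * 20   # hypothetical address returned on creation

state = {
    user_addr: ExternallyOwnedAccount(balance=100),
    contract_addr: ContractAccount(code=b"placeholder-bytecode",
                                   storage={"counter": 0}),
}

contract = state[contract_addr]
code_before = contract.code
contract.storage["counter"] += 1   # a call may change storage, never the code
print(contract.storage, contract.code == code_before)
```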
4) Why design a new account model?
Ethereum's founder, Vitalik, was 19 years old at the time. When he created Ethereum, Bitcoin already had fairly mature code to draw on. Why not reuse Bitcoin's existing code instead of building a new system? Why not just improve Bitcoin's design, for example by changing the block time and the mining puzzle? Why change the account model too?
One benefit of Bitcoin's transaction-based model is privacy protection. Ethereum, however, is meant to support smart contracts, and contracts require participants to have a reasonably stable identity.
This is much like everyday life. Suppose you sign a contract with someone who has one identity at signing and a different one afterwards, so that you can no longer find him. That is a problem. Or someone else might suddenly turn up claiming to be the person who signed the contract with you, just under a different identity. This makes the contract hard to enforce. When a dispute arises later, you need to know who the contract was originally signed with.
Some people now propose using smart contracts to implement financial derivatives.
Take options or futures: you put money into a contract and predict a future price movement; if the prediction is right, the contract pays you a return and gives your money back. But if the account you paid from changes once the payment is made, how would the money be returned to you?
The problem is not limited to externally owned accounts. It is even more serious for contract accounts: if you put money into a contract account and its address changes afterwards, you would be in trouble because you could no longer find it.
So when Ethereum was created, its designers weighed the pros and cons of existing models and chose an account-based model instead of Bitcoin's transaction-based one.
Judging by the current situation, this was a sound decision. Ethereum accounts are meant to stay stable, whether they are personal accounts or contract accounts. If you need privacy protection, you can still create many accounts and use different ones for different transactions.
3. ETH - The State Tree
Ethereum uses an account-based model in which the system explicitly maintains each account's balance. Let us look at what data structure implements this account-based ledger.
The job is a mapping from account address to account state: addr->state
addr: the account address. Ethereum addresses are 160 bits, that is 20 bytes, usually written as 40 hexadecimal digits.
state: the state of an externally owned or contract account, including the balance and the transaction count; contract accounts also include code and storage.
So what data structure should be designed to implement this mapping?
1) How should the account data be organized?
1) Option 1: a hash table
Intuitively this looks like a typical key-value mapping: given an account address, find the corresponding account state. So an obvious idea is to use a hash table.
Every full node maintains a hash table and inserts each new account into it. Querying an account balance is a direct lookup in the hash table. Leaving hash collisions aside, lookups essentially take constant time, and updates are easy as well.
Problem: how do you provide a Merkle proof with a hash table?
Say you are about to sign a contract with someone and want him to prove how much money he has. How can he provide the proof?
One approach is to organize the elements of the hash table into a Merkle Tree and compute a root hash, which is stored in the block header. As long as the root hash is correct, the tree beneath it cannot have been tampered with.
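To make the idea of a Merkle proof concrete, here is a small sketch that builds a binary Merkle Tree over account entries, produces a proof for one entry and verifies it against the root. The choices here are assumptions for illustration: SHA-256 as the hash, `address:balance` strings as leaves, sorted entries for a fixed leaf order, and duplicating the last hash on odd levels.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every level of the tree, from the leaf hashes up to the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]       # pad odd levels
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def make_proof(levels, index):
    """Collect the sibling hash on each level along the path to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))   # (hash, is_left)
        index //= 2
    return proof


def verify(leaf, proof, root):
    acc = h(leaf)
    for sibling, sibling_is_left in proof:
        acc = h(sibling + acc) if sibling_is_left else h(acc + sibling)
    return acc == root


accounts = {"0xaa": 100, "0xbb": 7, "0xcc": 3}   # toy addresses and balances
leaves = [f"{addr}:{bal}".encode() for addr, bal in sorted(accounts.items())]
levels = build_levels(leaves)
root = levels[-1][0]            # this is what would go into the block header

proof = make_proof(levels, 1)   # proof for the entry of 0xbb
ok = verify(b"0xbb:7", proof, root)
forged = verify(b"0xbb:700", proof, root)
print(ok, forged)
```

The verifier needs only the claimed entry, the short list of sibling hashes and the trusted root, not the whole table.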
What happens when a new block is published? The new block contains new transactions, and executing them necessarily changes the contents of the hash table. When the next block is published, do we reorganize the contents of the hash table (key: account address, value: account state) into a Merkle Tree all over again?
That is too expensive. In practice only a small fraction of account states actually change, because only the accounts involved in that block's transactions change; most accounts stay the same. Rebuilding the Merkle Tree every time is therefore very costly.
Doesn't Bitcoin also build a new Merkle Tree for every block? Why is that not a problem?
Bitcoin organizes the transactions contained in a block into a Merkle Tree, and each new block brings a new batch of transactions. A Merkle Tree in Bitcoin is therefore immutable: each published block has its own Merkle Tree, which is never modified once built, and the next block gets a newly built tree.
How many transactions does a block contain? About 4,000 at most (a block is about 1 MB and each transaction is roughly 250 bytes). That is an upper bound: many blocks come nowhere near 4,000 transactions, plenty have only a few hundred, and some may have even fewer. So each time a block is published, Bitcoin builds a Merkle Tree over just a few hundred to a few thousand transactions.
What would happen if Ethereum took this approach?
All Ethereum accounts would have to be built into one Merkle Tree, which is several orders of magnitude more than the hundreds or thousands of transactions just mentioned. Every time a block is published you would traverse all accounts to build a Merkle Tree, and for the next block traverse all accounts again and build another one.
Besides supplying Merkle proofs of account balances, the Merkle Tree has another very important role: keeping the state of all full nodes consistent. If no root hash were published, each node would just maintain a local data structure, and there would be no way to tell whether its state matched everyone else's; full nodes must agree on the state. This is also why Bitcoin writes the root hash into the block header: all full nodes need consensus on which transactions the current block contains.
Conclusion: not feasible, because building the Merkle Tree each time is too expensive.
Having each full node maintain a local hash table, build a Merkle Tree from it whenever one is needed, and put the root hash into the block header does not work. The hash table itself is efficient for inserts and updates, but rebuilding the Merkle Tree every time costs too much.
2) Option 2: put all accounts directly into a Merkle Tree
Drop the hash table and put all accounts directly into one Merkle Tree, modifying the tree in place when something changes. Since each block updates only a small fraction of accounts, only a small part of the Merkle Tree changes.
Problem 1: a Merkle Tree offers no efficient way to look up or update an entry
In Bitcoin's Merkle Tree the bottom layer holds the transactions; their hashes go into the nodes above, which are combined pairwise and hashed again, level by level. The structure provides no method for fast lookup or update.
Problem 2: if accounts go directly into a Merkle Tree, should the tree be sorted (a Sorted Merkle Tree)?
What if it is not sorted?
- Lookups are slow
- The accounts form the Merkle Tree, with the account information in the leaf nodes. If the order in which accounts appear in the leaves is not specified, the resulting Merkle Tree is not unique
The system has many full nodes. If each full node builds its Merkle Tree in its own order, say the order in which it heard the transactions, then the leaf order is arbitrary and decided by each node individually. The trees they build differ, and so do the root hashes they compute.
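The effect of leaf order on the root hash is easy to demonstrate. The sketch below uses SHA-256 and pads odd levels by repeating the last hash; both are assumptions made for the example, as are the toy account entries.

```python
import hashlib


def merkle_root(leaves):
    """Root hash of a binary Merkle Tree over the leaves, in the given order."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


accounts = [b"addr1:100", b"addr2:7", b"addr3:3", b"addr4:50"]

# Two nodes hold the same accounts but arrange the leaves differently.
root_node_x = merkle_root(accounts)
root_node_y = merkle_root(list(reversed(accounts)))

# If both nodes agree on an ordering rule (here: sort), the roots match again.
sorted_root_x = merkle_root(sorted(accounts))
sorted_root_y = merkle_root(sorted(reversed(accounts)))

print(root_node_x == root_node_y, sorted_root_x == sorted_root_y)
```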
Bitcoin's Merkle Tree is not sorted either, so why is that not a problem for Bitcoin?
After all, each Bitcoin full node also receives transactions in a different order, so in theory the Merkle Trees they build would have different root hashes as well.
In Bitcoin, each node assembles a candidate block locally. The node decides for itself which transactions go into the block and in what order, and then competes for the right to record the block by mining. If it does not win, nobody else needs to know any of its decisions. Only if it wins, and the block it publishes ends up being accepted, does its ordering matter, and that ordering is the one chosen by the node that published the block.
In other words, although Bitcoin does not use a Sorted Merkle Tree either, the order is unique: it is determined by the node that publishes the block.
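Then why can't Ethereum do the same?
If Ethereum did this, the account states would have to be published in the block. Each full node could decide for itself how to organize the accounts into a Merkle Tree, compute the root hash and mine the block, but for others to learn the order, the Merkle Tree itself would have to be published in the block. What gets published would then be the state of all accounts rather than transactions, and the two differ by several orders of magnitude: a Bitcoin block only carries a few hundred to a few thousand transactions.
Conclusion: not feasible. An unsorted Merkle Tree does not work.
Transactions must be published, because otherwise nobody else can learn about them. Account state, however, can be maintained locally, and most of it does not change. The transactions in one block modify only a small number of accounts, while most accounts stay the same. Publishing would also be repetitive: a new block appears every ten-odd seconds, and packaging and publishing the entire state with every block is not feasible.
3) Option 3: a Sorted Merkle Tree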
Problem: what happens when a new account is added?
A new account's address is random, so its leaf will most likely be inserted somewhere in the middle, and the structure of everything after it has to change.
When a new account appears and starts interacting, it does need to be added to the data structure. The question is how much that insertion costs.
Possibly most of the Merkle Tree has to be rebuilt, which is too expensive.
Conclusion: not feasible. With a Sorted Merkle Tree, insertion and deletion are both too costly.
Moreover, a blockchain is tamper-resistant, which means adding things is easy and deleting things is hard. Ethereum has no explicit operation for deleting an account: even an account holding just one or two wei cannot be deleted.
2) Ethereum's data structure
1) Trie (dictionary tree, prefix tree)
Ethereum uses a structure called the MPT (Merkle Patricia Tree). Before describing it, let us look at a simpler data structure.
A trie also stores key-value pairs. Keys are most commonly strings, for example a set of words arranged in a trie.
Example: general, genesis (as in the genesis block, the first block of a blockchain), go, god, good
The figure above shows the trie. All the words start with g, and the tree splits at the second letter: e on the left and o on the right. On the left, both words continue with n and e and then split at r and s, which begin the last three letters of each word. On the right, under o, the word go already ends, which shows that a word can end at an internal node of the trie. Below that, d on the left gives god and o on the right leads down to good.
A trie saves storage by sharing the common prefixes of its strings. If the system holds a large number of strings with almost no common prefixes, the trie consumes a lot of memory, which is one of its drawbacks.
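A minimal trie over the five example words might look like the sketch below. The class and method names are chosen for illustration, and values are left out so that only the key structure is shown.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.is_end = False   # end marker: some key ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def contains(self, key):
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_end

    def node_count(self):
        def count(node):
            return 1 + sum(count(child) for child in node.children.values())
        return count(self.root)


words = ["general", "genesis", "go", "god", "good"]
trie = Trie()
for word in words:
    trie.insert(word)

total_chars = sum(len(word) for word in words)
# Shared prefixes mean fewer nodes than characters.
print(trie.contains("go"), trie.contains("goo"), trie.node_count(), total_chars)
```

The word go ends at an internal node, so the end marker is what distinguishes a stored key such as go from a mere prefix such as goo.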
Characteristic 1: the branching factor of each trie node depends on the range of values each element of the key can take
In this example every key is a lowercase English word, so each node has at most 26 branches, plus an end marker (indicating that a word ends at this node).
In Ethereum an address is written as 40 hexadecimal digits, so the branching factor is 17 (hexadecimal 0 to f, plus the end marker).
Characteristic 2: lookup cost depends on the length of the key. The longer the key, the more memory accesses a lookup needs
In this example, different words have different key lengths.
In Ethereum every key has length 40, because every address is 40 hexadecimal digits.
Bitcoin and Ethereum addresses are not interchangeable; their formats and lengths differ. One thing is similar, though: an Ethereum address is also derived from the public key. The public key is hashed, the leading part of the hash is discarded, and the trailing part is kept, giving a 160-bit address.
Characteristic 3: two different addresses always map to different branches of the tree, so a trie has no collisions
Characteristic 4: no matter in what order different nodes insert the accounts, the resulting tree is the same
As discussed for the Merkle Tree, one problem with not sorting is that inserting accounts in a different order produces a different tree.
What about a trie? If the five words in the figure above are inserted in a different order, do we get a different tree? No, the tree is the same. As long as the set of inputs is the same, the trie is the same however the inputs are shuffled before insertion.
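Order independence can be checked with a few lines of code. The sketch represents a trie as nested dictionaries with a `$` end marker, which is just one convenient encoding chosen for this example, and compares tries built from shuffled copies of the same word list.

```python
import random

END = "$"   # end-of-key marker; assumed not to occur inside any key


def build_trie(keys):
    """Build a trie as nested dicts: character -> subtree."""
    root = {}
    for key in keys:
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


words = ["general", "genesis", "go", "god", "good"]
reference = build_trie(words)

rng = random.Random(0)      # fixed seed so the run is repeatable
all_equal = True
for _ in range(20):
    shuffled = words[:]
    rng.shuffle(shuffled)
    all_equal = all_equal and build_trie(shuffled) == reference

print(all_equal)
```

Each key determines its own path through the tree, so the insertion order never influences where a key ends up.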
Characteristic 5: updates have good locality
Each time a block is published, the state of the vast majority of accounts is unchanged; only the few affected accounts change. Locality of updates is therefore important.
How local are trie updates? In the figure above, to update the value for the key genesis (the figure shows only keys, not values), you only visit the branch leading to genesis. No other branch is touched and there is no need to traverse the whole tree.
Drawback: wasted storage
In the figure above, many nodes in the left branch have only one child. If such single-child chains could be merged, storage overhead would drop and lookups would get faster, since you no longer descend one node at a time.
This leads to the Patricia Tree, also written Patricia Trie: a prefix tree with path compression, sometimes called a compressed prefix tree.
2) Patricia Tree/Patricia Trie
With path compression, the trie example becomes the tree in the figure above. Under g the tree still branches into e and o. Everything under e continues with ne, below which it branches into r and s, and the remaining letters of each word are merged together. The right branch is compressed in the same way.
What does compression buy us? The height of the tree is visibly reduced, so the number of memory accesses drops considerably and efficiency improves.
Note: in a Patricia Tree, inserting a new word may require a previously compressed path to be expanded
In this example, if geometry is added, the left branch can no longer be compressed that way.
When does path compression work well? When the keys are sparsely distributed.
Suppose, for instance, that the keys are English words, each one long, but there are only a few of them. Example: misunderstanding, decentralization, disintermediation (removing middlemen; intermediaries are the middlemen)
Inserting these three words into an ordinary trie gives the figure below. The structure is inefficient: it is almost a straight line.
With a Patricia Tree it looks like the figure below.
The height of the tree is visibly reduced. So path compression works well when keys are sparsely distributed.
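The height reduction can be reproduced with a small sketch: build an ordinary trie as nested dictionaries, then merge every chain of single-child nodes into one edge labelled with a string. The `$` end marker and the function names are assumptions for illustration; this compresses a finished trie and does not implement incremental insertion into a Patricia Tree.

```python
END = "$"   # end-of-key marker; assumed not to occur inside any key


def build_trie(keys):
    root = {}
    for key in keys:
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def compress(node):
    """Merge chains of single-child nodes into edges with multi-letter labels."""
    out = {}
    for label, child in node.items():
        if label == END:
            out[END] = True
            continue
        # Follow the chain while the child has one edge and no key ends there.
        while len(child) == 1 and END not in child:
            (next_label, next_child), = child.items()
            label += next_label
            child = next_child
        out[label] = compress(child)
    return out


def height(node):
    children = [child for label, child in node.items() if label != END]
    return 0 if not children else 1 + max(height(child) for child in children)


words = ["misunderstanding", "decentralization", "disintermediation"]
plain = build_trie(words)
compressed = compress(plain)

print(height(plain), height(compressed))
print(sorted(compressed), sorted(compressed["d"]))
```

Only the shared first letter d of the last two words remains a separate branching point; every other run of letters collapses into a single edge.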
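Are the keys in Ethereum sparse?
Ethereum keys are addresses, and an address is 160 bits, so the address space has $2^{160}$ values, an astronomically large number. If an algorithm needed $2^{160}$ operations, nobody alive would see it finish. The number of Ethereum accounts in the whole world combined is nowhere near that large; compared with $2^{160}$ it is negligible.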
为什么要弄这么稀疏,不把地址长度缩短一点,这样访问效率也快,也没必要那么稀疏了?
Why are you so thin? #xff0c; don't shorten the address a bit #xff0c; so access is efficient #xff0c; and there is no need to be so thin xff1f;
以太坊中普通账户跟比特币的创建方法是一样的,没有一个中央的节点,就每个用户独立创建账户。在本地产生一个公私钥对,就是一个账户
The normal accounts in the Taiku are created in the same way as bitcoin xff0c; there is no central node xff0c; accounts are created independently for each user. A public-private key pair xff0c is created locally; it's an account.
那怎么防止两个人的账户碰撞,产生的一样呢?
So how do we prevent two people from crashing into their accounts? #xff0c; same thing that happens? #xff1f;
这种可能性是存在的,但是这个概率比地球爆炸还要小。怎么达到这么小的概率,就是地址要足够长,分布足够稀疏,才不会产生碰撞。这个可能看上去有点浪费,但是这是去中心化的系统防止账户冲突的唯一办法。所以以太坊地址分布非常稀疏的,所以比较适合使用Patricia Tree
This possibility is xff0c; but it's less likely than an explosion on Earth. How do you get this small probability xff0c; xff0c; xff0c; distribution is too thin xff0c; do not create collisions . This may seem a little wastey xff0c; but
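To put a rough number on "smaller than the Earth exploding", the sketch below evaluates the standard union bound for collisions among independently generated addresses. It assumes addresses are uniformly distributed over the 160-bit space, and the account count of ten billion is a hypothetical figure chosen only for illustration.

```python
from fractions import Fraction

ADDRESS_BITS = 160


def collision_upper_bound(num_accounts: int, bits: int = ADDRESS_BITS) -> Fraction:
    """Union bound: P(some pair of n uniform keys collides) <= n(n-1)/2 / 2**bits."""
    pairs = num_accounts * (num_accounts - 1) // 2
    return Fraction(pairs, 2 ** bits)


# Hypothetical number of accounts, far more than one per person on Earth.
bound = collision_upper_bound(10 ** 10)
print(f"collision probability is at most about {float(bound):.1e}")
```

Even with this generous account count, the bound stays many orders of magnitude below anything that matters in practice, which is why no central registry is needed to keep addresses unique.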
3)MPT(Merkle Patricia Tree)
The difference between a Merkle Tree and a Binary Tree:
It is the same as the difference between a blockchain and an ordinary linked list: ordinary pointers are replaced by hash pointers.
The difference between a Merkle Patricia Tree and a Patricia Tree:
All accounts are organized into a Patricia Tree, with path compression for efficiency, and the ordinary pointers are then replaced by hash pointers, so a root hash can be computed. This root hash is also written into the block header.
Bitcoin's block header contains only one root hash: that of the transaction tree, the Merkle Tree built from the transactions in the block.
Ethereum's block header contains three root hashes: those of the transaction tree, the state tree and the receipt tree.
The account states are thus organized into a Merkle Patricia Tree. The root hash of the state tree serves two purposes:
Purpose 1: tamper resistance
As long as the root hash is unchanged, no part of the tree can have been altered, so the state of every account is guaranteed to be untampered.
Purpose 2: Merkle proof
1) Proving an account's balance
The branch from your account up to the root is sent to a light node as a Merkle proof, and the light node can verify how much money the account holds.
2) Proving that an account does not exist
One use of a Sorted Merkle Tree is proving non-membership, and the method here is similar.
For example, before transferring to an address, one may want to check whether the full node has any information on that account. Put plainly, this means proving that a key does not exist in the MPT.
Find the branch where the key would be if it existed and send that branch as the Merkle proof; it shows that the key is not there.
4)Modified MPT
Ethereum does not use the plain MPT but a Modified MPT: the MPT structure with a few modifications, none of them fundamental.
The figure above is an example of a Modified MPT. The upper right corner lists four accounts. To keep things simple the addresses are short: assume 7 hexadecimal digits instead of 40. Only the balance is shown for each account; the other state fields are omitted. The first account holds 45 Ether, and the second holds only 1 Wei (the smallest unit in Ethereum, a negligible amount).
In this example there are three kinds of nodes:
Extension Node:
Whenever path compression occurs in the tree there is an Extension Node. The first two digits of all four addresses are the same, a7, so the Root is an Extension Node whose shared nibble is a7 (a nibble is one hexadecimal digit).
Branch Node:
The addresses diverge at the third digit, which is 1, 7 or f, so a Branch Node follows.
Leaf Node:
Take 1 first: it is followed by 1355, and only one address goes this way, so a Leaf Node follows. Two addresses go through 7: they share the compressed path d3 and then split at 3 and 9, so there is a Branch Node with two Leaf Nodes below it, both holding 7. The last digit, f, is followed by a single Leaf Node: 9365.
In addition, hashing the root node of this tree gives the root hash that is written into the block header (upper left corner).
The tree uses hash pointers as well. For instance, slot 7 stores the hash of the node below it (an extension node). With ordinary pointers, slot 7 would store that node's address.
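The three node kinds and the hash pointers between them can be sketched as follows. This is a simplified model, not Ethereum's encoding: nodes are plain tuples, SHA-256 over `repr` stands in for Keccak-256 over the RLP encoding, and a branch node lists only its occupied slots. The pairing of 45 ETH and 1 WEI with specific addresses is assumed, and the other two balances are hypothetical placeholders.

```python
import hashlib

db = {}  # hash -> node; stands in for the node database of a full node


def put(node):
    """Store a node and return its hash, which serves as the hash pointer."""
    digest = hashlib.sha256(repr(node).encode()).hexdigest()
    db[digest] = node
    return digest


def leaf(key_end, value):
    return put(("leaf", key_end, value))


def extension(shared_nibbles, child_hash):
    return put(("ext", shared_nibbles, child_hash))


def branch(children):
    # children: {nibble: child_hash}; sorted so the hash is deterministic
    return put(("branch", tuple(sorted(children.items()))))


# Addresses a711355, a77d337, a77d397, a7f9365 from the example.
inner = branch({"3": leaf("7", "1 WEI"), "9": leaf("7", "balance C (hypothetical)")})
top = branch({
    "1": leaf("1355", "45 ETH"),
    "7": extension("d3", inner),
    "f": leaf("9365", "balance D (hypothetical)"),
})
root_hash = extension("a7", top)  # this value would go into the block header


def get(root, key):
    """Walk from the root, following hash pointers, until a leaf or a dead end."""
    node_hash = root
    while True:
        node = db[node_hash]
        if node[0] == "leaf":
            return node[2] if node[1] == key else None
        if node[0] == "ext":
            if not key.startswith(node[1]):
                return None
            key, node_hash = key[len(node[1]):], node[2]
        else:  # branch
            children = dict(node[1])
            if not key or key[0] not in children:
                return None
            key, node_hash = key[1:], children[key[0]]


print(get(root_hash, "a77d337"))
```

Because every parent stores the hash of its child, changing any leaf changes every hash on the path up to `root_hash`, which is what makes the root hash a commitment to the whole tree.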
Each time a new block is published, the values of some nodes in the state tree change. These changes are not made in place; new branches are created instead, and the previous state is preserved.
The example above shows two adjacent blocks:
Block N Header: State Root is the root hash of the state tree, and the tree itself is shown below it.
Block N+1 Header: the state tree of the new block.
Although every block has its own state tree, the two trees share most of their nodes. The tree on the right mostly points to nodes of the tree on the left; only the nodes that changed need a new branch.
In this example the account is a contract account, since it has Code and Storage. A contract account's storage is also kept in an MPT. The storage is a key-value store that maps variables to their values, and Ethereum uses an MPT for it too. So the overall structure in Ethereum is one big MPT containing many small MPTs: the storage of each contract account is a small MPT.
For this account in the new block, Nonce and Balance have changed. Code is unchanged, so Codehash points to the node in the original tree. Storage has changed (the tree under Storage is called the storage tree), but most nodes of the storage tree are unchanged as well. In this example only one node changed: an integer variable went from 29 to 45, so one new branch was created.
So each full node does not maintain a single MPT: every new block gets a new MPT. These state trees share most of their nodes, and only the few nodes that changed need new branches.
Why keep the historical states instead of modifying the tree in place?
Forks sometimes occur in the system, and temporary forks are very common. With Ethereum's block time reduced to a dozen or so seconds, temporary forks are the norm, because propagating a block across the network can itself take a dozen or so seconds.
In the figure above there is a fork: two nodes obtain the right to produce a block at the same time. Eventually the upper branch wins. A node that followed the lower branch must then roll back: it undoes the state it reached by accepting the lower block, returns to the state of the previous block, and then advances along the upper chain.
So the current state sometimes has to be reverted to the state before the transactions of a block were processed.
How is rollback implemented?
By maintaining these historical records.
This differs from Bitcoin. Bitcoin's transaction types are simple, and the previous state can sometimes be derived by reversing the operation. Take a simple transfer in which A sends 10 bitcoins to B. Its effect on the balances is that A has 10 fewer bitcoins and B has 10 more. To roll back to the previous state, subtract 10 bitcoins from B and add 10 back to A. Rolling back a simple transfer is easy.
Why does this not work in Ethereum? Because Ethereum has smart contracts. Smart contracts are Turing-complete and very expressive; in theory they can implement very complex functionality, unlike Bitcoin's simple scripts. If the previous state is not saved, it is impossible to derive it after a smart contract has executed. Supporting rollback therefore requires keeping the historical states.
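The idea of creating a new branch per block while sharing untouched nodes, and of rolling back by returning to an older root, can be sketched with a persistent trie. This is a deliberately simplified model: a plain nibble trie without path compression or hashing, with hypothetical balances, and with function names chosen here for illustration.

```python
def update(node, key, value):
    """Return a NEW root with key set to value; the old tree is never modified.

    Only the nodes on the path to key are copied; all others are shared."""
    if not key:
        return {"value": value}
    new_node = dict(node or {})  # shallow copy: children are shared by reference
    new_node[key[0]] = update(new_node.get(key[0]), key[1:], value)
    return new_node


def get(node, key):
    for nibble in key:
        if node is None or nibble not in node:
            return None
        node = node[nibble]
    return node.get("value")


# State after block N (hypothetical balances).
state_n = {}
for address, balance in [("a711355", 45), ("a77d337", 1), ("a7f9365", 7)]:
    state_n = update(state_n, address, balance)

# Block N+1 changes a single account: one new branch, everything else shared.
state_n1 = update(state_n, "a711355", 35)

roots = [state_n, state_n1]  # one state root per block

# Rollback after losing a fork: go back to the previous root.
current = roots[-2]
print(get(state_n1, "a711355"), get(current, "a711355"))
```

Rollback here costs nothing beyond switching roots, and it works no matter how complicated the code that produced the new state was, which is the property needed once smart contracts are involved.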
3)、Implementation of Ethereum's data structures
1) Structure of the block header
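Field in block header | Meaning |
---|---|
ParentHash | Hash of the parent block's header, i.e. the header of the previous block in the chain |
UncleHash | Hash of the uncle blocks' headers. A block can also have uncle blocks; in Ethereum an Uncle is not necessarily of the same generation as the Parent and may be many generations older |
Coinbase | Address of the miner who mined this block |
Root | Root hash of the state tree |
TxHash | Root hash of the transaction tree (similar to the root hash in Bitcoin) |
ReceiptHash | Root hash of the receipt tree |
Bloom | Bloom filter, an efficient way to query the execution results of transactions that match some condition (related to the receipt tree) |
Difficulty | Mining difficulty, adjusted as needed |
GasLimit | Maximum total Gas allowed in a single block (smart contracts consume gas fees, similar to transaction fees in Bitcoin) |
GasUsed | Total Gas consumed by the transactions in this block |
Time | Approximate time at which the block was produced |
Nonce | The random number guessed during mining (as in Bitcoin). Mining in Ethereum also means trying many random numbers; the one written in the header is the one finally found that meets the difficulty requirement |
MixDigest | Mix digest, a hash computed from the nonce through some calculation |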
2) Structure of a block
Field in block | Meaning |
---|---|
header | Pointer to the block header |
uncles | Pointers to the headers of the uncle blocks; an array, because a block can have several uncle blocks |
transactions | List of the transactions in this block |
extblock is the form in which a block is published on the network: the first three fields of block are what actually gets published.
4)、Storing the values of the state tree: RLP
The state tree stores key-value pairs. The key is the address; the discussion so far has mainly been about the keys, that is, how addresses are managed.
What about the value, the account state? How is it stored in the state tree? It is first serialized with RLP and then stored.
RLP: Recursive Length Prefix, a serialization method. Its hallmark is minimalism: the simpler the better.
Protocol Buffers: Protobuf for short, a well-known serialization library.
Compared with such libraries, RLP's philosophy is to be as simple as possible. It supports only one type: nested arrays of bytes, that is, byte arrays that can be nested. Every other type in Ethereum, such as integers or more complex hash tables, must ultimately be turned into nested arrays of bytes.
Implementing RLP is therefore much simpler than implementing Protocol Buffers, because the hard parts are pushed to the application layer.
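To show how little machinery "nested arrays of bytes" needs, here is a minimal encoder sketch. The prefix thresholds (0x80, 0xb7, 0xc0, 0xf7) follow the RLP rules as commonly documented for Ethereum; the text above does not spell them out, so treat them as background knowledge rather than something taken from this article. `int_to_bytes` illustrates the application-layer step of turning an integer into bytes before encoding.

```python
def encode_length(length: int, offset: int) -> bytes:
    """Length prefix: short form for payloads up to 55 bytes, long form otherwise."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode a bytes object or an arbitrarily nested list of bytes objects."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return encode_length(len(item), 0x80) + item
    if isinstance(item, list):
        payload = b"".join(rlp_encode(x) for x in item)
        return encode_length(len(payload), 0xC0) + payload
    raise TypeError("RLP only encodes bytes and nested lists of bytes")


def int_to_bytes(n: int) -> bytes:
    """Application-layer convention: big-endian, no leading zeros, 0 -> empty."""
    return n.to_bytes((n.bit_length() + 7) // 8, "big")


print(rlp_encode([b"cat", b"dog"]).hex())
print(rlp_encode(int_to_bytes(1024)).hex())
```

Everything richer than bytes and lists, such as the integer above, has to be converted by the caller first, which is the sense in which the hard parts are left to the application layer.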
4、ETH-Transaction Tree and Receipt Tree
1)、Transaction tree and receipt tree
Each time a block is published, the transactions it contains are organized into a transaction tree, which is also a Merkle Tree, much as in Bitcoin.
Ethereum also adds a receipt tree. After each transaction executes, a receipt is produced that records information about the transaction. The nodes of the transaction tree and the receipt tree correspond one to one. Because smart contract execution in Ethereum is fairly complex, the receipt tree makes it easy to look up execution results quickly.
In terms of data structure, the transaction tree and the receipt tree are both MPTs (Merkle Patricia Trees), whereas Bitcoin uses an ordinary Merkle Tree. Ethereum may have chosen this simply so that all three trees can reuse the same code.
An advantage of the MPT is that it supports lookup: given a key, the tree is searched from the top down. For the state tree the key is the account address. For the transaction tree and the receipt tree the key is the index of the transaction within the published block; the order of the transactions is decided by the node that publishes the block.
There is one important difference among the three trees: the transaction tree and the receipt tree only organize the transactions of the block being published, whereas the state tree contains the state of every account in the system, whether or not the account is involved in the current block's transactions.
The state trees of different blocks share nodes. When a new block is published, only the nodes whose state is changed by its transactions need a new branch; all other nodes are reused from the previous state tree. The transaction tree and receipt tree of each block, by contrast, are independent and share no nodes, since the transactions published in one block are considered independent of those in another.
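Uses of the transaction tree and the receipt tree:
- Providing Merkle proofs to light nodes. As in Bitcoin, the transaction tree can prove that a transaction was included in a given block, and such a Merkle proof can be given to a light node. The receipt tree works similarly: to prove the result of a transaction, a Merkle proof can be provided from the receipt tree.
- Ethereum also supports more complex queries. Suppose we want to find all transactions of the past ten days that relate to a certain smart contract. One approach is to scan every block produced in those ten days and check which transactions involve the contract, but this is expensive. Moreover, a light node has no transaction lists, only block headers, so it cannot find matching transactions by scanning transaction lists at all. Similar queries, such as finding all crowdfunding events or all new token issuance events of the past ten days, also need a more efficient method.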
2)、bloom filter
Ethereum introduces the bloom filter, a structure that efficiently tests whether an element belongs to a large set.
Say there is a set with many elements, and we want to know whether a given element is in it.
The most naive approach is to traverse the set and look for the element. Its complexity is linear, and it presupposes enough storage to hold the whole set. A light node has no transaction list and therefore no information about the elements of the set, so it cannot use this approach.
A bloom filter uses a clever idea to compute a very compact digest of such a large set with many elements.
In the figure above the set is (a, b, c) and we want to compute a digest for it. At the bottom is a bit vector, initially all zeros.
A hash function H maps each element to a position in the vector. Element a is hashed and the bit at the resulting position is changed from 0 to 1; b and c are mapped to their positions in the same way. In short, hash every element, find its position in the vector and set that bit to 1. Once all elements are processed, the vector is a digest of the original set, and it is much smaller than the set: in this example 128 bits suffice.
What the digest is for: suppose we want to know whether an element d is in the set, but we cannot necessarily store the set itself.
Hash d with H. If it maps to a position whose value is 0, d is definitely not in the set.
If it maps to a position whose value is 1, d may really be in the set (for example d = b), or it may not be: a hash collision may have mapped it to the same position as some element of the set. So a bloom filter can produce false positives but never false negatives. An element that is in the set is always reported as present, while an element that is not in the set may also be reported as present. Put differently, when the filter says an element is present it may be wrong; when it says an element is absent, it is certainly absent.
Bloom filters come in many variants. To mitigate hash collisions, some designs use a set of hash functions instead of one, each independently mapping the element to a position in the vector. The advantage is that when one hash function collides, the others generally will not all collide as well.
A bloom filter does not support deletion. Suppose a is removed: should its bit be reset? If it is set to 0, another element of the set may map to the same position (hash collisions are possible) and would then be wrongly reported as absent. A simple bloom filter therefore cannot delete. Supporting deletion means replacing each bit with a counter of how many elements map to that position, and then counter overflow has to be considered too. The data structure becomes much more complex, which defeats the original purpose of the bloom filter.
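The behaviour described in this section (set bits on insert, test bits on query, no deletion) fits in a few lines. The sketch below is a generic bloom filter, not Ethereum's: the 128-bit size matches the example above, while the use of three salted SHA-256 hashes is an illustrative assumption.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits: int = 128, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # the bit vector, held in a single Python int

    def _positions(self, item: bytes):
        # One position per hash function; salting SHA-256 with the index
        # is just a convenient way to get several independent-looking hashes.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: bytes) -> bool:
        """False means definitely absent; True means possibly present."""
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


digest = BloomFilter()
for element in (b"a", b"b", b"c"):
    digest.add(element)

print(digest.might_contain(b"b"))  # an inserted element is always reported
print(digest.might_contain(b"d"))  # usually False, but a false positive is possible
```

There is no `remove` method on purpose: clearing a bit could erase evidence of another element that happens to share it.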
3)、Uses of the bloom filter in Ethereum
After each transaction executes it produces a receipt, and the receipt contains a bloom filter that records information such as the transaction's type and addresses. The header of the published block also holds an overall bloom filter, which is the union of the bloom filters of all transactions in the block.
Suppose we want to find the transactions of the past ten days that relate to some smart contract. First check whether the bloom filter in a block's header contains the kind of transaction we are looking for. If not, the block is of no interest. If it does, check the bloom filters of the receipts in the block's receipt tree to see which of them match. Possibly none does, because the header match may be a false positive. For each receipt that matches, fetch the corresponding transaction and confirm it directly.
The benefit is that bloom filters quickly rule out large numbers of irrelevant blocks: for many blocks a glance at the header's bloom filter shows that they cannot contain the transactions we want, and only the few remaining candidate blocks need a closer look. A light node, for example, holds only block headers, yet the headers alone let it filter out many blocks; for the remaining candidates it asks a full node for more information.
4)、Supplementary notes
The root hashes of the state tree, the transaction tree and the receipt tree are all included in the block header. Ethereum's operation can be viewed as a transaction-driven state machine. The state of the machine is the state of all accounts, that is, the contents of the state tree; the transactions are those contained in each published block. Executing them drives the system from the current state to the next state.
Bitcoin can also be regarded as a transaction-driven state machine. Its state is the UTXO set (the outputs that have not been spent). Each newly published block consumes some outputs from the UTXO set and adds new ones, so the block drives the state machine from the current state to the next.
The two state machines have one thing in common: state transitions must be deterministic. Given a current state and a set of transactions (those contained in the block), the next state is uniquely determined, because all full nodes and all miners must perform the same state transition.
Question 1: Someone publishes a transfer A->B on Ethereum and a node receives it. Is it possible that the node has never heard of the recipient's address?
Yes. As in Bitcoin, creating an account does not require notifying anyone. Other nodes learn that an account exists only when it first receives money, and at that point a new node is inserted into the state tree for the newly added account.
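Both points, the deterministic state transition and the creation of an account on its first incoming transfer, can be seen in a toy model. The sketch assumes the state is just a dict from address to balance and that transactions are plain transfers; real Ethereum transactions (gas, nonces, contract calls) are left out, and skipping an underfunded transfer is a simplification made here.

```python
def apply_block(state: dict, transactions: list) -> dict:
    """Pure function: (current state, ordered transactions) -> next state."""
    next_state = dict(state)  # leave the previous state untouched
    for sender, recipient, amount in transactions:
        if next_state.get(sender, 0) < amount:
            continue  # simplification: ignore transfers the sender cannot afford
        next_state[sender] = next_state.get(sender, 0) - amount
        # A recipient never seen before gets its entry on first receipt.
        next_state[recipient] = next_state.get(recipient, 0) + amount
    return next_state


state = {"A": 50}
block = [("A", "B", 10)]
print(apply_block(state, block))
```

Because `apply_block` depends only on its two arguments, every node that starts from the same state and applies the same block ends up with the same result.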
Question 2: The state tree differs from the transaction tree and the receipt tree in that it contains the state of every account in the system, whether or not the account takes part in the current block's transactions. Could the design be changed so that each block's state tree contains only the accounts involved in that block's transactions? That would be consistent with the transaction and receipt trees and would greatly reduce the size of each block's state tree, since most account states do not change.
With such a design no block has a complete state tree, only the states of the accounts touched by its own transactions.
One problem is that looking up an account's state becomes inconvenient. Say a transaction transfers 10 Ether from A to B, and we need to check that A really has 10 Ether. The state tree of the latest block may not contain A, so we have to search backwards until we find the most recent block that includes A before we know A's balance. If A has not transacted for a long time, many blocks may have to be scanned from the newest backwards.
There is a bigger problem. When A transfers to B we need A's state to know whether A has enough money, and we also need B's state and balance, because 10 Ether must be added to B's balance. But B may be a newly created account. In that case we would search from the current block all the way back to the genesis block, only to discover that the account does not exist and must therefore be new.
5)、Concrete data structures in the code
The transaction tree and the receipt tree are created in the NewBlock function, which also obtains their root hashes.
Look at the transaction tree code first. It checks whether the transaction list is empty. If so, the transaction tree root hash in the block header is the empty hash; otherwise the root hash is obtained by calling DeriveSha. The block's transaction list is then created.
The code in the middle handles the receipt tree. It checks whether the receipt list is empty. If so, the receipt tree root hash in the header is the empty hash; otherwise the root hash is obtained by calling DeriveSha, and the bloom filter in the block header is then created. Every executed transaction yields one receipt, so the transaction list and the receipt list have the same length.
The last part of the code handles uncle blocks. It checks whether the uncle list is empty. If so, the uncle hash in the header is the empty hash; otherwise the hash is computed by calling CalcUncleHash, and a loop then builds the block's uncle array.
DeriveSha is the function that NewBlock calls when creating both the transaction tree and the receipt tree. The data structure it builds is a trie.
The trie data structure is an MPT. All three trees in Ethereum (transaction tree, receipt tree and state tree) are MPTs.
This is the Receipt data structure: the receipt produced after each transaction executes, recording the result of its execution.
Bloom is the receipt's bloom filter.
Logs is an array; each receipt can contain several Logs, and the receipt's bloom filter is generated from them.
This is the block header data structure. Its Bloom field is the bloom filter of the whole block, obtained by merging the bloom filters of all receipts.
This is the NewBlock function seen earlier. The code in the red box creates the bloom filter in the block header by calling CreateBloom.
These are the implementations of the three related functions.
The parameter of CreateBloom is the list of all receipts of the block. Its for loop calls LogsBloom on each receipt to generate that receipt's bloom filter, and then merges these bloom filters with an Or operation to obtain the bloom filter of the whole block.
LogsBloom generates the bloom filter of one receipt. Its parameter is the receipt's Log array; as the Receipt data structure showed, every Receipt contains an array of Logs. The function has two nested for loops. The outer loop processes each Log in the Logs array: it first hashes the Log's address and adds it to the bloom filter, where bloom9 is the hash function used by the bloom filter. The inner loop then adds each of the Log's Topics to the bloom filter. The result is the receipt's bloom filter.
bloom9 is the hash function used by the bloom filter. It maps its input to three positions in the digest and sets all three to 1.
The first line calls a function from crypto to produce a 256-bit hash; the variable b holds this 32-byte hash.
In the second line, r is the bloom filter to be returned, initialized to 0.
Next comes a loop. It takes the first six bytes of the 32-byte hash, combines them two at a time, and ANDs each pair with 2047, which is equivalent to taking the value modulo 2048 and yields a number in the range 0 to 2047. This is done because the bloom filter in Ethereum is 2048 bits long. The last line of the loop shifts t left by that many bits and merges it, with an Or operation, into the bloom filter obtained in the previous round. After three rounds, with three positions set to 1, the bloom filter is returned.
That covers how a bloom filter is generated. How do we query whether a bloom filter contains a topic we are interested in?
This is done by calling the BloomLookup function, which checks whether the bloom filter bin contains its second argument, topic. It first uses bloom9 to convert topic into a bytesBacked, then ANDs it with the bloom filter and checks whether the result equals the bytesBacked value. Note that the bloom filter may contain other topics besides the one we are looking for, which is why the AND is taken first and the result is compared with the value itself: this tests whether all the positions that the topic maps to in the bloom filter are 1.
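The four functions walked through above can be mirrored in a short Python sketch, written from the description in this section rather than copied from the Go source. Two assumptions are made explicit: `hashlib.sha3_256` stands in for Keccak-256 (the two differ in padding, so the actual bit positions will not match Ethereum's), and logs and receipts are modelled as plain dicts with hypothetical keys `address`, `topics` and `logs`.

```python
import hashlib

BLOOM_BITS = 2048


def bloom9(data: bytes) -> int:
    """Map the input to three positions in a 2048-bit filter and set them to 1."""
    digest = hashlib.sha3_256(data).digest()  # stand-in for Keccak-256
    result = 0
    for i in (0, 2, 4):  # first six bytes, taken two at a time
        position = ((digest[i] << 8) + digest[i + 1]) & (BLOOM_BITS - 1)
        result |= 1 << position
    return result


def logs_bloom(logs) -> int:
    """Bloom filter of one receipt: its log addresses and all log topics."""
    bloom = 0
    for log in logs:
        bloom |= bloom9(log["address"])
        for topic in log["topics"]:
            bloom |= bloom9(topic)
    return bloom


def create_bloom(receipts) -> int:
    """Bloom filter of a block: the Or of all receipt bloom filters."""
    bloom = 0
    for receipt in receipts:
        bloom |= logs_bloom(receipt["logs"])
    return bloom


def bloom_lookup(bloom: int, topic: bytes) -> bool:
    """True if every bit that topic maps to is set (possibly a false positive)."""
    wanted = bloom9(topic)
    return (bloom & wanted) == wanted


receipts = [
    {"logs": [{"address": b"\x01" * 20, "topics": [b"Transfer", b"alice"]}]},
    {"logs": []},
]
block_bloom = create_bloom(receipts)
print(bloom_lookup(block_bloom, b"Transfer"))
```

The comparison `(bloom & wanted) == wanted` is the step explained last: the AND masks out bits contributed by other topics, so only the three positions of the queried topic decide the answer.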
Related course:
Peking University open course "Blockchain Technology and Applications" by Xiao Zhen