Bitcoin Source Code Study Notes (Part 3)

Chapter 3

This chapter introduces some new data structures. Unless otherwise specified, all classes and functions mentioned in this chapter are located in main.h or main.cpp.

Each node stores a copy of the blockchain. The blockchain consists of interconnected blocks (CBlock instances). Each block contains multiple transactions (CTransaction instances). In order to store, search, and read block and transaction information in memory and disk, Bitcoin introduces some access classes. They include: CBlockIndex and CDiskBlockIndex for indexing blocks, CDiskTxPos and CTxIndex for indexing transactions.

CBlock

The CBlock class represents a block.

 class CBlock
{
public:
//header
int nVersion;
uint256 hashPrevBlock;
uint256 hashMerkleRoot;
unsigned int nTime;
unsigned int nBits;
unsigned int nNonce;
// network and disk
vector<CTransaction> vtx;
// memory only
    mutable vector<uint256> vMerkleTree;
CBlock()
{
SetNull();
}
IMPLEMENT_SERIALIZE
(
        READWRITE(this->nVersion);
        nVersion = this->nVersion;
        READWRITE(hashPrevBlock);
        READWRITE(hashMerkleRoot);
READWRITE(nTime);
READWRITE(nBits);
READWRITE(nNonce);
        // ConnectBlock depends on vtx being last so it can calculate offset
        if (!(nType & (SER_GETHASH|SER_BLOCKHEADERONLY)))
READWRITE(vtx);
else if (fRead)
            const_cast<CBlock*>(this)->vtx.clear();
)
void SetNull()
{
nVersion = 1;
hashPrevBlock = 0;
hashMerkleRoot = 0;
nTime = 0;
nBits = 0;
nNonce = 0;
vtx.clear();
vMerkleTree.clear();
}
bool IsNull() const
{
return (nBits == 0);
}
uint256 GetHash() const
{
        return Hash(BEGIN(nVersion), END(nNonce));
}
    uint256 BuildMerkleTree() const
{
vMerkleTree.clear();
        foreach(const CTransaction& tx, vtx)
            vMerkleTree.push_back(tx.GetHash());
int j = 0;
        for (int nSize = vtx.size(); nSize > 1; nSize = (nSize + 1) / 2)
{
            for (int i = 0; i < nSize; i += 2)
{
                int i2 = min(i+1, nSize-1);
                vMerkleTree.push_back(Hash(BEGIN(vMerkleTree[j+i]), END(vMerkleTree[j+i]),
                                           BEGIN(vMerkleTree[j+i2]), END(vMerkleTree[j+i2])));
}
j += nSize;
}
        return (vMerkleTree.empty() ? 0 : vMerkleTree.back());
}
    vector<uint256> GetMerkleBranch(int nIndex) const
{
        if (vMerkleTree.empty())
            BuildMerkleTree();
        vector<uint256> vMerkleBranch;
int j = 0;
        for (int nSize = vtx.size(); nSize > 1; nSize = (nSize + 1) / 2)
{
            int i = min(nIndex^1, nSize-1);
            vMerkleBranch.push_back(vMerkleTree[j+i]);
nIndex >>= 1;
j += nSize;
}
return vMerkleBranch;
}
    static uint256 CheckMerkleBranch(uint256 hash, const vector<uint256>& vMerkleBranch, int nIndex)
{
if (nIndex == -1)
return 0;
        foreach(const uint256& otherside, vMerkleBranch)
{
if (nIndex & 1)
                hash = Hash(BEGIN(otherside), END(otherside), BEGIN(hash), END(hash));
else
                hash = Hash(BEGIN(hash), END(hash), BEGIN(otherside), END(otherside));
nIndex >>= 1;
}
return hash;
}
    bool WriteToDisk(bool fWriteTransactions, unsigned int& nFileRet, unsigned int& nBlockPosRet)
{
        // Open history file to append
        CAutoFile fileout = AppendBlockFile(nFileRet);
if (!fileout)
            return error("CBlock::WriteToDisk() : AppendBlockFile failed");
        if (!fWriteTransactions)
            fileout.nType |= SER_BLOCKHEADERONLY;
// Write index header
        unsigned int nSize = fileout.GetSerializeSize(*this);
        fileout << FLATDATA(pchMessageStart) << nSize;
// Write block
        nBlockPosRet = ftell(fileout);
        if (nBlockPosRet == -1)
            return error("CBlock::WriteToDisk() : ftell failed");
fileout << *this;
return true;
}
    bool ReadFromDisk(unsigned int nFile, unsigned int nBlockPos, bool fReadTransactions)
{
SetNull();
        // Open history file to read
        CAutoFile filein = OpenBlockFile(nFile, nBlockPos, "rb");
if (!filein)
            return error("CBlock::ReadFromDisk() : OpenBlockFile failed");
        if (!fReadTransactions)
            filein.nType |= SER_BLOCKHEADERONLY;
// Read block
filein >> *this;
// Check the header
        if (CBigNum().SetCompact(nBits) > bnProofOfWorkLimit)
            return error("CBlock::ReadFromDisk() : nBits errors in block header");
        if (GetHash() > CBigNum().SetCompact(nBits).getuint256())
            return error("CBlock::ReadFromDisk() : GetHash() errors in block header");
return true;
}
void print() const
{
        printf("CBlock(hash=%s, ver=%d, hashPrevBlock=%s, hashMerkleRoot=%s, nTime=%u, nBits=%08x, nNonce=%u, vtx=%d)\n",
            GetHash().ToString().substr(0,14).c_str(),
nVersion,
            hashPrevBlock.ToString().substr(0,14).c_str(),
            hashMerkleRoot.ToString().substr(0,6).c_str(),
            nTime, nBits, nNonce,
vtx.size());
        for (int i = 0; i < vtx.size(); i++)
{
printf(" ");
vtx[i].print();
}
        printf(" vMerkleTree: ");
        for (int i = 0; i < vMerkleTree.size(); i++)
            printf("%s ", vMerkleTree[i].ToString().substr(0,6).c_str());
printf("\n");
}
    int64 GetBlockValue(int64 nFees) const;
    bool DisconnectBlock(CTxDB& txdb, CBlockIndex* pindex);
    bool ConnectBlock(CTxDB& txdb, CBlockIndex* pindex);
    bool ReadFromDisk(const CBlockIndex* blockindex, bool fReadTransactions);
    bool AddToBlockIndex(unsigned int nFile, unsigned int nBlockPos);
bool CheckBlock() const;
bool AcceptBlock();
};

A few notes here:

A block contains several transactions and is stored in vector<CTransaction> vtx.
The hash of a block, generated by the function GetHash(), is obtained by calculating the block header of the block instead of the entire block data. More specifically, the hash is generated by member variables nVersion to nNonce, and does not include the transaction container vtx.
The member variable uint256 hashMerkleRoot is part of the block header. It is generated by the member function BuildMerkleTree(), which takes all transactions vtx as input parameters and finally generates a 256-bit hash hashMerkleRoot.
uint256 hashPrevBlock is the hash of the block before the current block, included in the block header. Therefore, the hash of a block is related to the hash of the previous block in the blockchain.

CBlock::BuildMerkleTree()

BuildMerkleTree() builds a Merkle tree and returns the root. Each node of the tree, including leaf nodes and intermediate nodes, is a uint256 hash value. The tree is flattened and placed in vector<uint256> vMerkleTree. To flatten the Merkle tree, the function first puts the leaf nodes of level 0 into vMerkleTree, then puts the intermediate nodes of level 1 immediately after them, and repeats the process upwards until the root node is reached.

The variable j in line 25 is used to index the starting position of vMerkleTree to put the first node of each layer of the tree. The variable nSize is the number of nodes in each layer. Starting from layer 0, nSize = vtx.size(), and j = 0.

The outermost for loop visits each level of the tree. The number of nodes in each level is half of the previous level (nSize=(nSize+1)/2, line 26). The inner for loop visits each pair of nodes in a level in turn, and the pointer i increments by 2 at each step. The hash of the node pair (i and i2) is appended to the container vMerkleTree, where i2=i+1. When i reaches the last pair of nodes, i=nSize-1 when nSize is odd, or i=nSize-2 when nSize is even. In the former case, the last node (i2=i=nSize-1) forms a node pair with its own copy.

CBlock::GetMerkleBranch()

This function returns a Merkle tree branch. The entire Merkle tree has been flattened into the container vMerkleTree. The parameter nIndex points to the node in the container vMerkleTree[nIndex]. Its value ranges from 0 to vtx.size(). Therefore, nIndex always points to a leaf node, which is the hash of a transaction.

The returned Merkle tree branch contains all the nodes accompanying vMerkleTre[nIndex] to the root. Recall the process of building a Merkle tree, the nodes of each layer are paired to become the nodes of the next layer. At level 0, if nIndex is even, the node accompanying the node vMerkleTree[nIndex] is vMerkleTree[nIndex+1]; if nIndex is odd, the accompanying node is vMerkleTree[nIndex-1]. For simplicity, we call vMerkleTree[nIndex] node A0, and its accompanying node is B0. A0 and B0 are combined to generate another node A1 at level 1. At level 1, the companion node of A1 is B1, and the two are paired to generate node A2 at level 2. Repeat this process until the Merkle tree root is reached. The nodes [A0, A1, A2, ...] form a path from A0 to the root node, and their companion nodes [B0, B1, ...] constitute the returned Merkle branch.

To find the companion node, the function traverses each level of the tree (line 4). At each level, the pointer nIndex is halved from its value at the previous level. Therefore, nIndex always points to the path node [A0, A1, A2, ...]. Another pointer i is set to min(nIndex^1, nSize-1) at line 46. The or operator ^1 flips the last bit of nIndex without changing the other bits. This ensures that i always points to the companion node of the node pointed to by nIndex, for example, the node [B0, B1, B2, ...].

You may ask, what is the role of the Merkle tree? It can quickly verify whether a transaction is included in a block. We will introduce it below.

CBlock::CheckMerkleBranch()

This function takes its first argument hash, checks its second argument vMerkleBranch with it, and returns a resulting Merkle root. If the returned root is the same as hashMerkleRoot, the hash of the transaction pointed to by hash is included in the CBlock. The third argument nIndex is the position of the first argument in vMerkleTree; it is the same as the only argument to GetMerkleBranch().

Therefore, if you need to check whether a transaction is included in a CBlock, you only need to generate the hash of the transaction and call this function, verifying that the returned value is the same as hashMerkleroot.

CBlock::WriteToDisk() and CBlock::ReadFromDisk()

WriteToDisk writes a CBlock to disk. On line 80, it calls AppendBlockFile().

 FILE* OpenBlockFile(unsigned int nFile, unsigned int nBlockPos, const char* pszMode)
{
if (nFile == -1)
return NULL;
    FILE* file = fopen(strprintf("%s\\blk%04d.dat", GetAppDir().c_str(), nFile).c_str(), pszMode);
if (!file)
return NULL;
    if (nBlockPos != 0 && !strchr(pszMode, 'a') && !strchr(pszMode, 'w'))
{
        if (fseek(file, nBlockPos, SEEK_SET) != 0)
{
fclose(file);
return NULL;
}
}
return file;
}
static unsigned int nCurrentBlockFile = 1;
FILE* AppendBlockFile(unsigned int& nFileRet)
{
nFileRet = 0;
loop
{
        FILE* file = OpenBlockFile(nCurrentBlockFile, 0, "ab");
if (!file)
return NULL;
        if (fseek(file, 0, SEEK_END) != 0)
return NULL;
        // FAT32 filesize max 4GB, fseek and ftell max 2GB, so we must stay under 2GB
        if (ftell(file) < 0x7F000000 - MAX_SIZE)
{
            nFileRet = nCurrentBlockFile;
return file;
}
fclose(file);
nCurrentBlockFile++;
}
}

The first parameter nFileRet of AppendBlockFile() is used to store the return value. Analysis of the source code of AppendBlockFile() and OpenBlockFile() shows that nFileRet is a series of data files named in the form of "blk0001.data" and "blk0002.data". Blocks are saved in these files. They are called block data files.

Return to WriteToDisk(), the third parameter nBlockPosRet is the starting position of the current CBlock in the block data file. In line 89 of Block::WriteToDisk(), nBlockRet is assigned to the return value ftell(), which is the current pointer position of the block data file. After that, the current CBlock is stored in the block data file in a serialized form (line 92).

ReadFromDisk() is pretty straightforward to understand.

CBlockIndex and CDiskBlockIndex

The above analysis of CBlock::WriteToDisk() and CBlock::ReadFromDisk() shows that blocks are stored in a series of block data files. CBlockIndex encapsulates all this information in a single class CBlockIndex.

 //
// The block chain is a tree shaped structure starting with the
// genesis block at the root, with each block potentially having multiple
// candidates to be the next block. pprev and pnext link a path through the
// main/longest chain. A blockindex may have multiple pprev pointing back
// to it, but pnext will only point forward to the longest branch, or will
// be null if the block is not part of the longest chain.
//
class CBlockIndex
{
public:
    const uint256* phashBlock;
CBlockIndex* pprev;
CBlockIndex* pnext;
unsigned int nFile;
unsigned int nBlockPos;
int nHeight;
//block header
int nVersion;
uint256 hashMerkleRoot;
unsigned int nTime;
unsigned int nBits;
unsigned int nNonce;
//......
    uint256 GetBlockHash() const
{
return *phashBlock;
}
//......
};

In addition to nFile and nBlockPos, CBlockIndex also includes a copy of the block header it points to (except for the field hashPrevBlock, which can be accessed through pprev->phashBlock). This allows the block hash to be calculated without reading the entire block from the block data file.

From the commented lines 1-6, we can know that pprev points to the previous block index in the blockchain, and pnext points to the next block index in the blockchain.

If you pay close attention, you will notice that I mentioned "block indexes in the blockchain" instead of "blocks in the blockchain". This is a bit confusing, after all, the blockchain is made up of blocks, not block indexes, right? Yes, the blockchain is made up of blocks, but you can also say it another way: the blockchain is made up of block indexes. This is a block whose complete data is stored in the block data file on disk, and a block index refers to this block through the block's member variables nFile and nBlockPos. The pointers pprev and pnext point to the previous and next blocks of a block respectively to form the entire blockchain. So, the block index is as meaningful as the blockchain.

Despite being called a blockchain, Bitcoin's blockchain is not actually a linear structure, but a tree structure. The root of this tree is the genesis block, which is hard-coded into the source code. Other blocks are accumulated on top of the genesis block. It is legal to accumulate two or more blocks on top of one block. When this happens, the blockchain will have a fork. A blockchain may contain several forks, and the forked branches may further fork.

Due to forks, the pprev fields of multiple block indices may point to the same predecessor. Each of these block indices starts a branch, and these branches are based on the same predecessor block. However, the pnext field of the predecessor block can only point to the successor block that starts the longest branch. The path from the root of the tree (genesis block) to the last block of the longest branch is called the longest chain, or the best chain, or the main chain of this blockchain.

The pointer phashBlock points to the hash of the block, which can be calculated on-site through the block header embedded in CBlockIndex.

nHeight is the height of the block in the blockchain. The blockchain starts with a block with a height of 0, the genesis block. The next block has a height of 1, and so on.

CBlockIndex instances are only stored in memory. To store the block index on disk, the derived class CDiskBlockIndex is defined here.

 class CDiskBlockIndex : public CBlockIndex
{
public:
uint256 hashPrev;
uint256 hashNext;
CDiskBlockIndex()
{
hashPrev = 0;
hashNext = 0;
}
    explicit CDiskBlockIndex(CBlockIndex* pindex) : CBlockIndex(*pindex)
{
        hashPrev = (pprev ? pprev->GetBlockHash() : 0);
        hashNext = (pnext ? pnext->GetBlockHash() : 0);
}
IMPLEMENT_SERIALIZE
(
        if (!(nType & SER_GETHASH))
            READWRITE(nVersion);
READWRITE(hashNext);
READWRITE(nFile);
READWRITE(nBlockPos);
READWRITE(nHeight);
//block header
        READWRITE(this->nVersion);
READWRITE(hashPrev);
        READWRITE(hashMerkleRoot);
READWRITE(nTime);
READWRITE(nBits);
READWRITE(nNonce);
)
    uint256 GetBlockHash() const
{
CBlock block;
        block.nVersion = nVersion;
        block.hashPrevBlock = hashPrev;
        block.hashMerkleRoot = hashMerkleRoot;
        block.nTime = nTime;
        block.nBits = nBits;
        block.nNonce = nNonce;
        return block.GetHash();
}
//......
};

This class has two other member variables: the hashes of the previous and next block indexes, hashPrev and hashNext. Do not confuse them with pprev and pnext in CBlockIndex, the latter is a pointer to the block index, while the former is the hash of the block.

Neither CBlockIndex nor CDiskBlockIndex itself has a hash; they always identify themselves with the hash of the block they point to. This can be concluded from the constructor of CDiskBlockIndex (line 15). In this constructor, hashPrev is assigned to pprev->GetBlockHash(), where pprev is a pointer to a CBlockIndex. If you check the member function GetBlockHash of CBlockIndex, you can see that it returns the hash of the previous block. For CDiskBlockIndex, its member function GetBlockHash() also returns the hash of the block it points to.

CBlockIndex has no serialization method, while CDiskBlockIndex implements serialization through the macro IMPLEMENT_SERIALIZE. This is because the latter needs to be saved on disk, while the former only exists in memory.

CMerkleTx and CWallet

After understanding Merkle trees and CBlock, let's check two other important classes derived from CTransaction: CMerkleTx and CWallet.

 classCMerkleTx : public CTransaction
{
public:
uint256 hashBlock;
    vector<uint256> vMerkleBranch;
int nIndex;
// memory only
    mutable bool fMerkleVerified;
CMerkleTx()
{
Init();
}
    CMerkleTx(const CTransaction& txIn) : CTransaction(txIn)
{
Init();
}
void Init()
{
hashBlock = 0;
nIndex = -1;
        fMerkleVerified = false;
}
int64 GetCredit() const
{
        // Must wait until coinbase is safely deep enough in the chain before valuing it
        if (IsCoinBase() && GetBlocksToMaturity() > 0)
return 0;
        return CTransaction::GetCredit();
}
IMPLEMENT_SERIALIZE
(
        nSerSize += SerReadWrite(s, *(CTransaction*)this, nType, nVersion, ser_action);
        nVersion = this->nVersion;
READWRITE(hashBlock);
        READWRITE(vMerkleBranch);
READWRITE(nIndex);
)
    int SetMerkleBranch(const CBlock* pblock=NULL);
    int GetDepthInMainChain() const;
    bool IsInMainChain() const { return GetDepthInMainChain() > 0; }
    int GetBlocksToMaturity() const;
    bool AcceptTransaction(CTxDB& txdb, bool fCheckInputs=true);
    bool AcceptTransaction() { CTxDB txdb("r"); return AcceptTransaction(txdb); }
};

Instances of CTransaction are collected and a CBlock is generated. For a given CTransaction instance, its Merkle branch, and its position in the vector<CTranaction> vtx field of the CBlock can be used to find out whether the instance is contained in this CBlock. This is why CMerkleTx contains two other fields: a Merkle branch vMerkleBranch, and a position index nIndex. A CMerkleTx instance carries these two fields so that it can be easily verified whether it belongs to a block.

uint256 hashBlock is the hash of this transaction in CBlock.

CWalletTx extends CMerkleTx further to include information that only the owner cares about. These fields will be mentioned as we encounter them in the code.

 class CWalletTx : public CMerkleTx
{
public:
    vector<CMerkleTx> vtxPrev;
    map<string, string> mapValue;
    vector<pair<string, string> > vOrderForm;
    unsigned int fTimeReceivedIsTxTime;
    unsigned int nTimeReceived; // time received by this node
char fFromMe;
char fSpent;
    //// probably need to sign the order info so know it came from payer
// memory only
    mutable unsigned int nTimeDisplayed;
//......
};

CDiskTxPos and CTxIndex

These two classes are used to index transactions.

 class CDiskTxPos
{
public:
unsigned int nFile;
unsigned int nBlockPos;
unsigned int nTxPos;
//......
}
class CTxIndex
{
public:
CDiskTxPos pos;
    vector<CDiskTxPos> vSpent;
}

nFile, nBlockPos and nTxPos in CDiskTxPos are the serial number of the block data file, the position of the block in the block data file and the position of the transaction in the block data. Therefore, CDiskTxPos contains all the information to find a transaction from the block data file.

A CTxIndex is a database record "containing a transaction and the location on disk of the transaction that spends the output of this transaction". A transaction can easily find its source transaction (see Chapter 1), but not vice versa. vector<CDiskTxPos> vSpent contains the location of all transactions that have a transaction pointed to by CDiskTxPos pos as a source transaction, i.e. all transactions that spend the transaction pointed to by CDiskTxPos pos.

<<: Bitcoin ETF regains hope as SEC approves reconsideration petition

>>: Litecoin founder plans to activate Segregated Witness and convince miners to do so

Analyzing a person's personality through walking posture

What does it mean if the index finger is longer than the ring finger? What does it mean if the index finger is longer than the ring finger?

Many people use palmistry to judge their own dest...

Bitcoin Source Code Study Notes (Part 3)

Chapter 3

CBlock

CBlock::BuildMerkleTree()

CBlock::GetMerkleBranch()

CBlock::CheckMerkleBranch()

CBlock::WriteToDisk() and CBlock::ReadFromDisk()

CBlockIndex and CDiskBlockIndex

CMerkleTx and CWallet

CDiskTxPos and CTxIndex

Analyzing a person's personality through walking posture

The core driving force behind this wave of Bitcoin craze: Japanese speculators with 15 times high leverage?

What does a man with a small nose mean? See the life of a man with a small nose

Estonia's Supreme Court restricts Bitcoin

There are several types of Chuan-Zi Palms. Analysis of the fate of Chuan-Zi Palms

What kind of women would men be willing to spend money on?

Is it good to have a beauty spot under the lip? Is there any way to activate it?

How to identify the four characteristics of evil people

Is it good for a woman to have curved eyebrows? Eyebrow fortune analysis

PoS, DPoS, PBFT...Innovative consensus emerges one after another. Is PoW outdated?

Recommend

What kind of man is unlucky in marriage? Just look at these few places

Wu said weekly mining information: Jiangxi incident "revealed" and the impact spread 1108-1114

Causes of hairy fingers

The personality of a person who eats cleanly

The short-term peak of the big pie is reached, pay attention to the risk of decline

How to tell from his face whether a man is cheating? Which kinds of men are more likely to cheat?

What does it mean if the index finger is longer than the ring finger? What does it mean if the index finger is longer than the ring finger?

How does eyebrow length look like?

In-depth | As the dry season approaches, an analysis of the entire Bitcoin mining industry chain

Musk, the "dog dealer", announced that he will accept Bitcoin for car purchases again!

Palmistry characteristics of starting from scratch

Eight banking giants experiment with Bitcoin and blockchain technology

Column: Blockchain can help achieve globalization

Wall Street Journal: Dun & Bradstreet conducts blockchain trade finance test

How to tell if your marriage is going to be successful by looking at your facial features