As our release of a Substrate-to-Substrate Bridge is around the corner, we are testing it intensively and comprehensively. This article is an introduction to the design and implementation of Substrate-to-Substrate bridges.
Before we dive into the discussion of Substrate-to-Substrate bridges, we need to define the scopes of Bridges and Substrate blockchains.
What is a Bridge and Why do We Need Bridges?
Blockchains are launching every day. Massive amounts of assets, data, and smart contracts reside in these siloed blockchains. Judging from what has happened to the Internet over the past decades, we believe that if they can circulate across chains, their value will multiply and eventually be recognized.
In a narrow sense, a cross-chain bridge is a means of transferring assets or value between two Layer-1 blockchains. Broadly speaking, however, any measure to provide interoperability and exchange of messages between two heterogeneous chains can be seen as a bridge.
What is a Substrate Blockchain?
Substrate is a modular framework for building custom blockchains that can connect to the Polkadot Network or Kusama Network. In this article, we treat all types of Substrate-based blockchains alike, be it a parachain, a relay chain, or even a solo chain.
How to Build a Cross-chain Bridge?
We have a variety of technologies to choose from when building a cross-chain bridge. Centralized crypto exchanges such as Binance are a type of bridge with which one can transfer assets (USDT) from one blockchain (Ethereum) to another (Tron) by a series of deposits and withdrawals. More decentralized solutions include custody by MultiSig or Oracle service, which provide a higher level of decentralization and security. Another type of bridge, including Rainbow from NEAR and relayer bridges from Darwinia, uses light clients to connect two blockchains. To enable one-way communication, we need to build a light client into the target chain’s runtime to verify the source chain’s integrity. Similarly, we can do so in the opposite direction (from the target chain to the source chain) and thus establish a two-way bridge.
Overview of Light Client Bridges
This article will first introduce the structure of a light-client-based bridge and then conduct a bottom-up analysis to discuss the Light Client Layer (GRANDPA), the Bridge Message Layer, and the Application Layer. In the Application Layer part, we will take Mapping Token Protocol as an example to introduce some use cases in this layer. In the end, we will touch on the concept of Bridge Network. The following diagram depicts the structure of our Darwinia<>Crab bridge.
One essential aspect that distinguishes a light client bridge from a custody-based one is the level of trust. Unlike a custody-based bridge which feeds messages to a target chain via a set of trusted nodes or oracles, the credibility of a light client bridge stems from certain consensus and the verification of finality. The synchronized headers and corresponding finality proofs must comply with pre-defined rules and logic. A variety of cryptographic technologies can be applied in such verification. Merkle Trees in Bitcoin and Merkle Patricia Tries in Ethereum are two well-known examples. Polynomial commitment and zero-knowledge proof are among the promising technologies in this domain.
We can see from Fig 1 that the Application Layer rests on the Message Layer which in turn relies on the light client for message verification. There are four key components in this structure: Header Sync, Headers Relayer, Message Delivery & Dispatch and Message Relayer. The Header Sync component manages the synchronization of the source chain block headers and finality proof to the target chain and coordinates with an off-chain executable named Headers Relayer to accomplish the task. The Message Delivery & Dispatch component is responsible for the delivery, dispatch and interpretation of cross-chain messages from the source chain to the target chain. It also has an accompanying off-chain executable called the Message Relayer.
The Process of Cross-chain Bridging
The process of cross-chain bridging can be broken down into seven steps, as shown in the above diagram (Fig 2). The Header Relayer keeps track of headers on the source chain (Step 1) and delivers new headers and their finality proofs to the target chain without any modification (Step 2). The light client on the target chain verifies the headers and records them (Step 3) for later use. Similarly, the Message Relayer keeps listening for new messages (Step 4) and synchronizes them to the target chain (Step 5). The light client on the target chain can check the validity of the messages using the corresponding headers (Step 6). After a message passes the verification, the payload contained in the message can be interpreted and executed accordingly (Step 7).
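The seven steps can be sketched as a toy simulation. This is a minimal illustration under strong assumptions, not the Bridger implementation: `Header`, `TargetLightClient`, and `relay` are hypothetical names, the `finalized` flag stands in for a real finality proof, and each header commits to exactly one message by its SHA-256 hash.

```python
import hashlib
from dataclasses import dataclass, field


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class Header:
    number: int
    message_root: bytes  # assumed: commitment to the single message in this block
    finalized: bool      # assumed: stand-in for a real finality proof


@dataclass
class TargetLightClient:
    headers: dict = field(default_factory=dict)
    executed: list = field(default_factory=list)

    def import_header(self, header: Header) -> bool:
        # Step 3: verify the header and record it for later use.
        if not header.finalized:
            return False
        self.headers[header.number] = header
        return True

    def receive_message(self, number: int, payload: bytes) -> bool:
        # Step 6: check the message against the recorded header.
        header = self.headers.get(number)
        if header is None or digest(payload) != header.message_root:
            return False
        # Step 7: "execute" the payload (here we only record it).
        self.executed.append(payload)
        return True


def relay(source_blocks, client: TargetLightClient) -> None:
    # source_blocks: iterable of (header, payload) pairs seen on the source chain.
    for header, payload in source_blocks:                # Steps 1 and 4
        client.import_header(header)                     # Steps 2 and 3
        client.receive_message(header.number, payload)   # Steps 5 to 7
```

In this sketch a message is executed only if its header was accepted first, which mirrors the ordering of Steps 3 and 6.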
We can tailor implementation details to different situations. In Darwinia, we have adopted a Bridger framework that provides all such functionalities in various cross-chain scenarios.
GRANDPA Finality Light Client
We will start with the light client and discuss the most important component in a Substrate-to-Substrate bridge: the GRANDPA protocol. By Parity’s definition, GRANDPA (GHOST-based Recursive ANcestor Deriving Prefix Agreement) is a finality gadget for blockchains, implemented in Rust. It needs more than 2/3 of the stake to finalize a block. A remarkable feature of GRANDPA is that it can finalize more than one block in one round of voting in most cases.
Fig 3 depicts the general idea of GRANDPA. While every validator node votes on a specific block and hence its ancestors along the chain (Node 2 votes on Block F1 and implicitly F1’s ancestors E1, D2, C), GRANDPA finds their latest common ancestor (C) and finalizes it. With this block (C) being finalized, so are its ancestors (B and A).
Fig 4 is a snapshot of the voting process with GRANDPA finality. The rectangles on the left are blocks and the ovals on the right are validators. The two left-most grey blocks have been finalized. Validators with different staking weights vote for the newest blocks on their own branches, as the lines show. After enough votes are collected in one round, some blocks are finalized while competing blocks are discarded, based on the calculation of the GRANDPA justification. The validators can then go on to the next round of voting on the updated canonical chain.
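The core idea of weighted voting on a block tree can be sketched in a few lines. This is a simplification under stated assumptions: real GRANDPA runs prevote and precommit rounds, while the hypothetical `finalize` function below only finds the highest block that more than 2/3 of the total weight supports, directly or through a descendant. The block names and votes in the usage example are made up.

```python
def finalize(parents: dict, votes: dict, weights: dict):
    """Return the highest block backed by more than 2/3 of the weight, or None.

    parents: block -> parent block (the genesis block maps to None)
    votes:   validator -> block it voted for
    weights: validator -> voting weight
    """
    total = sum(weights.values())

    # A vote for a block also counts for every ancestor of that block.
    support = {}
    for validator, block in votes.items():
        cursor = block
        while cursor is not None:
            support[cursor] = support.get(cursor, 0) + weights[validator]
            cursor = parents[cursor]

    def height(block) -> int:
        steps = 0
        while parents[block] is not None:
            block = parents[block]
            steps += 1
        return steps

    # Integer comparison avoids floating-point rounding around the 2/3 threshold.
    candidates = [b for b, w in support.items() if 3 * w > 2 * total]
    return max(candidates, key=height) if candidates else None


# Hypothetical block tree with two branches forking after C.
parents = {"A": None, "B": "A", "C": "B", "D1": "C", "E2": "D1",
           "D2": "C", "E1": "D2", "F1": "E1"}
votes = {"node1": "E2", "node2": "F1", "node3": "D2"}
weights = {"node1": 1, "node2": 1, "node3": 1}
result = finalize(parents, votes, weights)
```

Here only two of the three equally weighted validators support D2, which is not more than 2/3, so the result is the common ancestor C. Because each validator votes once, two blocks on competing branches can never both exceed the threshold.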
Fig 5 is the sequence diagram of the process of Header Syncing. As the source continuously produces new blocks, the relayer keeps fetching new blocks and their corresponding proofs of GRANDPA finality and synchronizing them to the target chain. The light client on the target chain verifies the newly synchronized block and replaces the highest block. Then the light client can use the newest block to verify the subsequent messages.
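The target-chain side of header syncing might look like the following sketch. The names `HeaderSync` and `submit` are hypothetical, and the justification is reduced to a map from authority to the block hash it precommitted to; a real light client would also verify each authority's signature, which is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    number: int
    block_hash: str


class HeaderSync:
    """Simplified light client that tracks the highest verified header."""

    def __init__(self, authorities: dict, genesis: Header):
        self.authorities = authorities  # authority id -> voting weight
        self.best = genesis             # highest verified header so far

    def submit(self, header: Header, precommits: dict) -> bool:
        # precommits: authority id -> block hash that authority voted for.
        # Assumption: signature checks are done elsewhere and omitted here.
        if header.number <= self.best.number:
            return False  # stale header, nothing to update
        total = sum(self.authorities.values())
        signed = sum(
            weight
            for authority, weight in self.authorities.items()
            if precommits.get(authority) == header.block_hash
        )
        if 3 * signed <= 2 * total:
            return False  # not more than 2/3 of the weight
        self.best = header  # replace the highest block
        return True
```

Votes from unknown authorities are ignored because only entries in `authorities` are counted.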
Like Merkle Trees in Bitcoin and Merkle Patricia Tries in Ethereum, there is a specific data structure for verifying GRANDPA finality in Substrate blockchains: the Merkle Mountain Range (MMR). One advantage of MMR is the low cost of storing and updating nodes, and we can devise the structure of a leaf node in an MMR to accommodate the hash of the previous block header or a root hash of messages. Thus the light client on the target chain has a more efficient way to verify what happened in the source chain, be it a storage item or an event.
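To illustrate why appending to an MMR is cheap, here is a minimal sketch that keeps only the peaks of the mountains. The hashing scheme (SHA-256 with `leaf`, `node`, and `bag` prefixes) and the right-to-left bagging order are assumptions for illustration and do not reflect the encoding used by any Substrate pallet.

```python
import hashlib


def digest(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


class MerkleMountainRange:
    def __init__(self):
        self.peaks = []      # list of (height, hash), tallest mountain first
        self.leaf_count = 0

    def append(self, leaf: bytes) -> None:
        node = (0, digest(b"leaf", leaf))
        # Two mountains of equal height merge into one that is a level taller.
        while self.peaks and self.peaks[-1][0] == node[0]:
            height, left = self.peaks.pop()
            node = (height + 1, digest(b"node", left, node[1]))
        self.peaks.append(node)
        self.leaf_count += 1

    def root(self) -> bytes:
        if not self.peaks:
            raise ValueError("empty MMR has no root")
        # "Bag" the peaks from right to left into a single commitment.
        acc = self.peaks[-1][1]
        for _, peak in reversed(self.peaks[:-1]):
            acc = digest(b"bag", peak, acc)
        return acc


mmr = MerkleMountainRange()
for number in range(7):
    mmr.append(f"header-{number}".encode())
heights = [height for height, _ in mmr.peaks]
```

After seven leaves the peaks have heights 2, 1, and 0, one per set bit in the binary representation of 7. An append touches only the peaks, so existing nodes are never rewritten.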
Bridge Message Layer
In this section, we will dive into the design of the Bridge Message Layer.
Fig 6 is a sequence diagram of the message delivery process. Similar to the header syncing process, a source chain, a message relayer, and a target chain are involved. The life cycle of a message delivery is as follows:
- The messages are generated and appended to the message queue on the source chain;
- The relayer fetches messages (by RPC) from the head of the queue along with their Merkle proofs and relays them to the target chain;
- The target chain verifies the received message with the Merkle proofs;
- After the verification, the target chain sends out a Confirm message;
- The relayer passes the Confirm message back to the source chain;
- A counterpart light client on the source chain verifies the Confirm message and deletes the corresponding messages from the queue;
- In the end, the system distributes rewards among the relayers who contributed to the process.
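The proof check in the third step above can be illustrated with a plain binary Merkle tree. This is a swapped-in simplification: Substrate chains commit to storage with a Merkle Patricia trie, and the function names and the duplicate-last-node padding rule below are assumptions made for the sketch.

```python
import hashlib


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list, index: int):
    """Build a binary Merkle tree over non-empty leaves; return (root, proof)."""
    level = [digest(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumed padding: duplicate the last node
        sibling = index ^ 1
        # Record the sibling hash and whether it sits to the left of our node.
        proof.append((level[sibling], sibling < index))
        level = [digest(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    acc = digest(leaf)
    for sibling, sibling_is_left in proof:
        acc = digest(sibling + acc) if sibling_is_left else digest(acc + sibling)
    return acc == root


# Source chain side: commit to the queued messages and prove the third one.
messages = [b"m0", b"m1", b"m2"]
root, proof = merkle_root_and_proof(messages, 2)
```

The target chain only needs the root, which it obtains from a verified header, plus the short proof; it never needs the full message queue.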
OutboundLane and InboundLane are a pair of essential concepts. OutboundLane is on the source chain and responsible for maintaining the queue of the messages for delivery, while InboundLane is on the target chain and responsible for receiving messages forwarded by relayers. The payload of the message is nothing more than a sequence of bytes. The message layer does not interpret these bytes in any semantic or application sense, and is thus decoupled from the application layer.
One thing to note is that there is a nonce contained in the outbound data (the source chain) which can be used for sorting on the receiving end (the target chain) and preventing replay attacks. Similarly, a symmetric mechanism exists in the opposite direction (confirmation messages from the target to the source).
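A pair of lanes with nonce-based ordering might be sketched as follows. The class names echo the concepts above, but the fields and methods (`send`, `confirm`, `receive`) are hypothetical and do not mirror the actual pallet interface.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OutboundLane:
    """Source-chain queue of messages awaiting delivery."""
    next_nonce: int = 1
    queue: deque = field(default_factory=deque)

    def send(self, payload: bytes) -> int:
        nonce = self.next_nonce
        self.queue.append((nonce, payload))
        self.next_nonce += 1
        return nonce

    def confirm(self, latest_received: int) -> None:
        # Prune every message the target chain has confirmed.
        while self.queue and self.queue[0][0] <= latest_received:
            self.queue.popleft()


@dataclass
class InboundLane:
    """Target-chain receiver that accepts messages strictly in nonce order."""
    last_received: int = 0
    delivered: list = field(default_factory=list)

    def receive(self, nonce: int, payload: bytes) -> bool:
        if nonce != self.last_received + 1:
            return False  # replayed or out-of-order message
        self.last_received = nonce
        self.delivered.append(payload)  # opaque bytes, not interpreted here
        return True
```

Requiring each nonce to be exactly one greater than the last rejects both replays and gaps, and the lane never inspects the payload bytes.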
Mapping Token Protocol
This section will introduce an application built on top of the aforementioned infrastructure: the Mapping Token Protocol. We need to answer a question first: what does the cross-chain transfer of assets mean exactly? Or how does it happen in essence?
Fig 7 depicts a typical solution that locks the tokens (backing) on the source chain and issues the corresponding tokens (issuing) on the target chain atomically. The mapping token issued on the target chain may circulate from one holder to another. However, whoever holds the mapping token is entitled to claim and draw the locked token on the source chain. Theoretically, tokens can be fungible (ERC20), non-fungible (ERC721), or any other type of asset, because the message layer does not impose any restrictions on the payload’s content.
Fig 8 illustrates the four types of transactions related to cross-chain transfer. The Issue in the top-right quadrant reflects the key points that we have introduced so far. When a requestor performs a cross-chain transfer, what happens under the hood is as follows:
- The asset on the source chain is locked by the backing module;
- After the locking transaction is included in a block and finalized, the header relayer delivers the corresponding header and the transaction proof to the target chain;
- Meanwhile the message relayer delivers the message of the locking transaction to the target chain;
- Once the light client on the target chain finishes the verification of the locking message, the issuing module transfers the mapping token (asset) to a specified address.
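The lock-and-issue flow above, together with the reverse redemption path, can be sketched with two toy modules. The `Backing` and `Issuing` classes, their methods, and the dictionary used as a message payload are all hypothetical; header relaying and proof verification are assumed to have succeeded before `issue` or `unlock` is called.

```python
class Backing:
    """Source-chain module that locks the original asset."""

    def __init__(self, balances: dict):
        self.balances = dict(balances)
        self.locked = 0

    def lock(self, sender: str, amount: int, recipient: str) -> dict:
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid amount or insufficient balance")
        self.balances[sender] -= amount
        self.locked += amount
        # This dict stands in for the cross-chain message payload.
        return {"action": "issue", "recipient": recipient, "amount": amount}

    def unlock(self, message: dict) -> None:
        if message["action"] != "unlock" or message["amount"] > self.locked:
            raise ValueError("invalid unlock message")
        self.locked -= message["amount"]
        recipient = message["recipient"]
        self.balances[recipient] = self.balances.get(recipient, 0) + message["amount"]


class Issuing:
    """Target-chain module that issues and burns the mapping token."""

    def __init__(self):
        self.mapped = {}

    def issue(self, message: dict) -> None:
        if message["action"] != "issue":
            raise ValueError("invalid issue message")
        recipient = message["recipient"]
        self.mapped[recipient] = self.mapped.get(recipient, 0) + message["amount"]

    def burn(self, holder: str, amount: int, recipient: str) -> dict:
        if amount <= 0 or self.mapped.get(holder, 0) < amount:
            raise ValueError("invalid amount or insufficient mapping tokens")
        self.mapped[holder] -= amount
        return {"action": "unlock", "recipient": recipient, "amount": amount}

    def total_supply(self) -> int:
        return sum(self.mapped.values())
```

The invariant worth noticing is that the amount locked on the source chain always equals the total supply of the mapping token on the target chain, which is what entitles a holder to redeem.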
Prospect of Darwinia Bridge Network
Finally, we introduce the concept of the Darwinia Bridge Network. If we build a bridge for each pair of blockchains, the number of bridges would be $O(n^2)$. However, if we build a bridge between Darwinia and every other blockchain so that Darwinia serves as a hub, we can avoid a lot of reinventing the wheel, and the complexity becomes $O(n)$. Fig 9 is a snapshot of the bridge network we have built so far. There are bi-directional bridges, such as Ethereum<>Darwinia and Crab<>Darwinia (Substrate-to-Substrate), and uni-directional bridges, such as Kusama>Crab (Relaychain). We will build a Darwinia<>BSC bridge in the near future. In the long run, as the number and types of blockchains and bridges grow, we might have a routing mechanism for cross-chain interoperability similar to what we have on the Internet.
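As a rough worked count, assume $n$ blockchains in total with Darwinia as one of them, and count each bi-directional bridge once:

$$
\text{pairwise bridges} = \binom{n}{2} = \frac{n(n-1)}{2}, \qquad \text{hub bridges} = n - 1 .
$$

For example, with $n = 10$ chains, the pairwise approach needs $45$ bridges, whereas the hub approach needs only $9$.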