One year after the first Rayonism prototypes, we now have robust merge implementations across all Ethereum clients.
The path from where we are today to fully transitioning Ethereum to proof-of-stake is now clearly in sight. We need:
... and that's it! Once these things happen, and we've observed them to be stable for a few weeks, we'll be ready to move to mainnet!
Let's expand on each of them.
Over the past year, we've added a new step to our network upgrade process: Shadow Forks.
There have been a few explainers written about shadow forks already. In short, a shadow fork is a new devnet created by forking a live network with a small number of nodes. The shadow fork keeps the same state, history & chain ID as the main chain.
Running these allows us to observe how clients perform in conditions as close to public networks as possible. For the nodes on the shadow fork, The Merge effectively happens. After that, transactions from the main chain can be replayed on the fork, allowing us to see how nodes behave under mainnet conditions. We can also sync new nodes to the shadow fork to ensure that they still join the network as expected.
During these shadow forks, every execution layer (EL) and consensus layer (CL) client combination is tested, and the goal is for each pair to transition and keep running smoothly afterwards. With 4 EL and 5 CL clients, there are 20 pairs to test!
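To make the size of that test matrix concrete, here is a minimal sketch that enumerates every EL/CL pairing. The client names in the two lists are an assumption added for illustration (the post itself only gives the counts), so swap in whichever clients you are actually testing.

```python
from itertools import product

# Illustrative client lists (an assumption: the post only states 4 EL and 5 CL clients).
EL_CLIENTS = ["Besu", "Erigon", "Geth", "Nethermind"]
CL_CLIENTS = ["Lighthouse", "Lodestar", "Nimbus", "Prysm", "Teku"]


def client_pairs(el_clients, cl_clients):
    """Return every (EL, CL) combination that a shadow fork should exercise."""
    return list(product(el_clients, cl_clients))


pairs = client_pairs(EL_CLIENTS, CL_CLIENTS)
print(len(pairs))  # 20
for el, cl in pairs[:3]:
    print(f"{el} + {cl}")
```

Adding a single client on either layer grows the matrix by a full row or column, which is part of why each shadow fork is worth repeating.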
So far, we've had multiple Goerli shadow forks along with two mainnet ones. The second mainnet shadow fork (MSF2) went almost perfectly. Another one, MSF3, is planned for this week. If MSF3 happens without issues and remains stable afterwards, we could move towards upgrading existing testnets. To play it safe, we will continue to launch shadow forks regularly until (and perhaps even during!) testnet deployments.
In parallel to this, we are also doubling down on several other testing efforts.
The Merge has been a unique upgrade to test because it spans both the execution and consensus layers of Ethereum. While we have extensive testing tools for each of these layers in isolation, a lot of new infrastructure to test cross-layer interactions has been necessary.
Hive is the integration testing platform we historically used for the EL. Over the past few months, we've added the ability for it to mock CL behaviour and test various ELs against that. This has helped us test the new Engine APIs which EL and CL clients use to communicate. To test the PoW -> PoS transition, a simulator which mocks the EL behaviour will need to be added as well.
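As a rough idea of what a mocked CL sends to an EL client, the sketch below builds the JSON-RPC body for an `engine_forkchoiceUpdatedV1` call. The helper name `forkchoice_updated_request` and the placeholder hashes are hypothetical, the field names follow the Engine API spec as we understand it at the time of writing, and a real call would also need the authenticated transport the spec requires. Treat it as an illustration rather than how Hive itself is implemented.

```python
import json


def forkchoice_updated_request(head_hash, safe_hash, finalized_hash, request_id=1):
    """Build the JSON-RPC body a mocked CL might send to an EL client.

    Hypothetical helper: it only constructs the request and does not send it.
    """
    forkchoice_state = {
        "headBlockHash": head_hash,
        "safeBlockHash": safe_hash,
        "finalizedBlockHash": finalized_hash,
    }
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "engine_forkchoiceUpdatedV1",
        # Second parameter is the payload attributes; None means
        # "update the fork choice only, do not start building a block".
        "params": [forkchoice_state, None],
    }


# Placeholder 32-byte hashes, not real block hashes.
head = "0x" + "11" * 32
finalized = "0x" + "00" * 32
request = forkchoice_updated_request(head, head, finalized)
print(json.dumps(request, indent=2))
```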
Client teams are currently prioritizing supporting Hive and ensuring they pass all test suites, while the testing teams are focused on adding EL mocking to it.
In addition to our existing testing infrastructure, we have partnered with [Kurtosis](https://www.kurtosistech.com/) to automatically spin up staging networks which we run through The Merge on a daily basis.
These help us find implementation issues across clients and monitor various network health metrics. As things stabilize on that front, the next step is to create harsher network conditions to see how clients recover. For example, pausing EL or CL clients right before the transition and unpausing them after, or removing their database post-merge and checking how they handle syncing.
In addition to improving Hive and working with Kurtosis, a long tail of testing tools has been built by the client, research and testing teams to help us catch every possible edge case. They include fuzzers, bad block generators, EL/CL mockers, debugging APIs and more fuzzers. A wishlist of other tools is available here.
Our priority is getting clients to pass unit/spec tests as well as integration tests in Hive and Kurtosis. The other tools mentioned above, however, can help us identify and debug edge cases we missed, which we can then incorporate into routine test suites.
On a more human front, The Merge has massively increased the amount of cross-team coordination & collaboration on testing. CL and EL teams have had to work closely with each other for the first time, ensuring that their software works with each client on the other layer. This has led to more, and deeper, collaboration across our entire testing infrastructure 😁
Once shadow forks are going smoothly and test suites are passing across all clients, we will be ready to deploy The Merge to the existing public testnets, namely Ropsten, Goerli and Sepolia.
While public testnets do not stress test clients as much as mainnet shadow forks, they require broader coordination within the Ethereum ecosystem.
The Merge requires more from node operators than previous Ethereum upgrades. In past upgrades, node operators and miners on the EL only needed to update a single piece of software: their EL client. For The Merge, they will need to download, configure and run a CL client in parallel.
On the CL side, it was always strongly recommended to run an EL node when operating a validator. Pre-merge, though, this could be outsourced to third-party service providers. For The Merge, stakers will need to run an EL client to verify the validity of blocks and receive transaction fees when proposing a block (or risk losing them if outsourcing their EL!).
Node operators, stakers, and infrastructure providers should make sure they test their configurations on Kiln to be ready for deployments on testnets. EthStaker has also published various tutorials on how to do this.
Once Ropsten, Goerli and Sepolia have forked and stabilized (assuming no further issues are found) we will then be ready to set a merge date for mainnet!
The process for transitioning the Ethereum mainnet to proof-of-stake will be identical to the process followed for testnets. That said, it is worth re-emphasizing that the transition happens in three steps:
The image below by Danny Ryan illustrates the process:
The leftmost blocks show the EL and CL moving in parallel pre-merge, where PoW (EL) blocks contain transactions and Beacon Chain (CL) blocks contain proof-of-stake consensus data.
The second PoW block from the left is when TTD would have been hit or exceeded. The third block on the bottom is the first post-merge block, containing both the proof-of-stake consensus data and the execution layer transactions.
The fourth block, and subsequent ones, have no link back to proof-of-work. Once these are finalized, the network could only be disrupted beyond that point in circumstances comparable to a 51% attack under proof-of-work.
In other words, by that point, we've merged 🍾!
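The "TTD hit or exceeded" condition described above can be sketched in a few lines. This is a toy model with made-up difficulty values, and the function names are hypothetical. The extra check that the parent block is still below TTD is our reading of the merge specification rather than something stated in this post.

```python
def is_terminal_pow_block(block_total_difficulty, parent_total_difficulty, ttd):
    """A PoW block is terminal if it is the first one to reach or exceed TTD."""
    return block_total_difficulty >= ttd and parent_total_difficulty < ttd


def find_terminal_block(difficulties, ttd):
    """Return the height of the terminal PoW block, or None if TTD is not reached.

    `difficulties` holds the per-block difficulty, starting from height 0.
    """
    total = 0
    for height, difficulty in enumerate(difficulties):
        parent_total = total
        total += difficulty
        if is_terminal_pow_block(total, parent_total, ttd):
            return height
    return None


# Toy chain: cumulative difficulty is 10, 20, 30, 40.
print(find_terminal_block([10, 10, 10, 10], ttd=25))  # 2
print(find_terminal_block([10, 10], ttd=25))          # None
```

The block after the terminal one is the first post-merge block, which is why the threshold is expressed as total difficulty rather than a block height: nodes can agree on it without predicting exactly when it will be reached.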
The Merge is by far the most complex upgrade we've ever planned for Ethereum. Teams and individual contributors have been working tirelessly for over a year now, and the finish line is finally in sight.
While everyone is excited to see Ethereum transition to proof-of-stake, this is not the time to cut corners: ensuring a safe and seamless transition for Ethereum users and the rich ecosystem built on the network is our #1 priority. We're almost there 😁!
Wen merge? 🔜.
Thank you to Trent Van Epps, Danny Ryan, Mario Vega, and others for reviews & additions to this update.