Over the past year, TrustSec has contributed to the security of OP chain through several initiatives. We began with a private audit of the transition to the Superchain. Later, we secured first place in the Sherlock Fault Dispute contest. Recently we took part in the Optimism-Java node contest on Cantina, which is still in the judging phase. To top it off, we also submitted two paid bug bounties.
In this series of posts, we'll highlight some of the more unique bugs uncovered. Hopefully, it will inspire fellow white-hats for their future projects (blackhats, please stop reading now).
The Hidden Attack Surface of Protocol Upgrades
The first issue revolves around the upgrade procedure for the L1-L2 communication. As with most bugs, understanding this issue beyond a "hand-waving" level requires a thorough introduction. There's a great breakdown of the relevant code paths on The Bytecode podcast here. For those that don't have 30 minutes to 2x it, we'll cover the key details below.
Intro to OP communication
OP implements L1 -> L2 ETH deposits using the OptimismPortal. It locks ETH and emits an event which will be automatically translated to an ETH minting call on L2. For ETH withdrawals, users send ETH to the L2ToL1MessagePasser, this saves a withdrawal mapping in its state, which can be proven on L1 by a Merkle proof from the trusted L2 state root. These primitives can also be used to pass arbitrary data between mainnet and OP. However, using these functions directly is highly error-prone, as insufficient gas can cause the operation to become permanently stuck. For that reason, OP introduced the CrossDomainMessenger (XDM from now). We can assume it will use the low-level piping safely. On top of the pure-messaging logic of the XDM, token bridges are implemented - StandardBridge (ERC20 and ETH), and ERC721Bridge. Each of these has an L1 and L2 variant to reduce code duplication. With this background, we now have a broad picture of the deposit and withdrawal flows
Deposit ETH:
L1:
User -> L1StandardBridge depositETH() -> L1CrossDomainMessager sendMessage() -> OptimismPortal depositTransaction()
L2 (Through injection of deposit transaction):
L1CrossDomainMessenger -> L2CrossDomainMessenger relayMessage() -> L2StandardBridge finalizeBridgeETH() -> User
Withdraw ETH:
L2:
User -> L2StandardBridge withdraw() -> L2CrossDomainMessenger sendMessage() -> L2ToL1MessagePasser initiateWithdrawal()
L1 (Through proving of withdrawal existence):
OptimismPortal finalizeWithdrawalTransaction() -> L1CrossDomainMessenger relayMessage() -> L1StandardBridge finalizeETHWithdrawal -> User
An important detail is that the XDM supports resending failed messages (the safety net). Let's examine how the core of XDM is implemented.
The call() is the external call (e.g. to StandardBridge) that may fail. The xDomainMsgSender is a storage variable that signifies the original sender of the TX and also doubles as a reentrancy guard. After the call it returns to the default 0xDEAD (DEFAULT_L2_SENDER) value. Then according to whether the call succeeded or failed, the respective mapping for the withdrawal hash is set to true.
We can look at how the mapping is used below.
This part early in the function is basically stating:
Either this is the first delivery of the message, in this case ETH is supplied from the OptimismPortal in the current call.
Or this is a re-delivery of a failed message, in this case ETH for the withdrawal is already in the XDM ready for sending and is not passed in.
The check below makes sure we never execute a withdrawal that already succeeded.
For closure, the reentrancy check discussed before is in the line below, prior to the call. It will fail the delivery.
What could happen without the msgSender check above?
An attacker could intentionally fail delivery of an ETH withdrawal to their smart account (e.g. by reverting).
Then, it would redeliver it (failedMessages == true flow):
Within the delivery, instead of reverting it would now reenter into relayMessage() passing the same withdrawal request.
The message is not yet marked as success (we saw this happens after the call).
It would reach the call again, which this time does not reenter.
It would store the withdrawal as success.
Contract returns from reentering relayMessage()
It would store the withdrawal as success again.
With the reentrancy check, this attack won't work as the sender will not be 0xDEAD at the second call, failing it. In other words, the design does not follow Check-Effects-Interactions, and counts on an explicit reentrancy check to protect from attacks.
This concludes the intro to the mechanics of the XDM and now we can move to the upgrade.
SuperchainConfig upgrade
The scope of the private audit was the SuperchainConfig contract, a very simple program allowing designated roles to flip a pause button (other functionalities would be added down the line). It was integrated to the Bridge, OptimismPortal, and XDM contracts, so that everything would be pausable. As part of the upgrade procedure, the L1XDM proxy would be upgraded to a new implementation, and initialize() would be called to set the config contract.
For the sharp-eyed readers, normally initialize() can't be called because the proxy is already initialized, however an in-between upgrade manually switches off the initialized bit of the contract (via assembly) to allow it. The easier way of achieving the same result is using the reinitializer modifier.
The upgrade is signed by a Gnosis Safe M-of-N multisig with default configuration.
It was quite tempting to mark the function as safe as it only added a single line which sets an address variable. By doing that, we would have missed a critical issue. See if you can spot it.
The Bug
We've discussed that xDomainMsgSender is DEFAULT_L2_SENDER (0xDEAD) at all times except inside the withdrawal. How does it get this value in the first place? Clearly it must be part of the XDM initialization.
If not initialized, it defaults to a zero value, causing all withdrawals to fail as if they were reentering.
So how is all that relevant? Because it means that during an upgrade xDomainMsgSender is force-reset to DEFAULT_L2_SENDER, even if we are inside a withdrawal. Wait, what?
Once enough signers sign an upgrade, it can be executed by anyone. Normally that's not a problem since the upgrade logic does not depend on who initiated it as long as the upgrade payload is trusted. But in this case, there is a hidden assumption that the msgSender setter line is a no-op, which is not true mid-withdrawal. In fact, the withdrawal receiver contract can perform a double-withdrawal by executing the upgrade before reentering the XDM. It would look as follows:
Store a failed delivery to the smart contract.
Wait for the upgrade to be in the mempool.
Copy the upgrade parameters into the contract.
Reattempt the failed delivery.
Within the delivery, perform the upgrade using the parameters observed.
The msgSender is now DEFAULT_L2_SENDER.
Reenter into relayMessage() passing the same withdrawal request.
The reentrancy check is passed.
Other checks are passed as described in the naive flow.
The contract returns successfully from the inner execution.
It would store the withdrawal as success.
It would restore msgSender to DEFAULT_L2_SENDER.
Contract returns from reentering relayMessage().
It would store msgSender as DEFAULT_L2_SENDER though it is already that value (should never happen).
It would store the withdrawal as success again.
The result is double withdrawal of ETH from the XDM contract. The stolen amount comes from any failed withdrawals parking at the XDM.
The honest upgrade transaction reverts as it was already executed.
The idea of using an upgrade to gain primitives through its initialization was to our knowledge never seen before. We recommended several layers of defense to address this issue:
The initializer could simply not touch msgSender as it is already set.
A more invariant-based fix:
When marking the message as succeeded after the call, make sure it is not already set. This seems redundant because we already have a check it did not succeed, but after executing the external call, the inner call frame could have set it to true. The code intention is to change the value of the mapping, not to unconditionally set it (this is a recurring motif in the exploit).
Takeaways for devs/auditors:
Look to perform the minimum required operations in upgrades.
Always consider the impact of upgrades being performed in the middle of an external call.
Whenever possible convert assumptions into validation code. Here this would have prevented the issue without being aware of the hidden attack surface.
For those wondering why the XDM does not simply have a nonReentrant wrap on top of it - that was the case prior to the first Optimism contest, where an ingenious exploit was detected which could abuse the reentrancy protection to permanently block a user's withdrawal. Due to the delicate nature of the withdrawal path a more fine-grained approach is necessary.
In the next post, we'll unpack some interesting bug bounty issues reported to Optimism. See you all soon!
About TrustSec
TrustSec is a world-renowned boutique auditing and bug bounty firm, founded by famed white-hat hacker Trust in 2022. Since then, we have published over 100 reports for a wide range of clients and received bug bounties from more than 40 projects. Composed exclusively of top competitive bounty hunters, TrustSec employs unique, fully collaborative methodologies to secure even the most complex codebases. Visit our website to learn why top names like Optimism, zkSync, OlympusDAO, The Graph, BadgerDAO, Reserve Protocol, and many others trust us as their security partner.
Comments