Geth at the core
The second key design idea in Nitro is "Geth at the core." Here "geth" refers to go-ethereum, the most common node software for Ethereum. As its name would suggest, go-ethereum is written in the Go programming language, as is almost all of Nitro.
The software that makes up a Nitro node can be thought of as built in three main layers, which are shown above:
- The base layer is the core of geth--the parts of Geth that emulate the execution of EVM contracts and maintain the data structures that make up the Ethereum state. Nitro compiles in this code as a library, with a few minor modifications to add necessary hooks.
- The middle layer, which we call ArbOS, is custom software that provides additional functions associated with Layer 2 functionality, such as decompressing and parsing the Sequencer's data batches, accounting for Layer 1 gas costs and collecting fees to reimburse for them, and supporting cross-chain Bridge functionalities such as deposits of Ether and tokens from L1 and withdrawals of the same back to L1. We'll dig in to the details of ArbOS below.
- The top layer consists of node software, mostly drawn from geth. This handles connections and incoming RPC requests from clients and provides the other top-level functionality required to operate an Ethereum-compatible Blockchain node.
Because the top and bottom layers rely heavily on code from geth, this structure has been dubbed a "geth sandwich." Strictly speaking, Geth plays the role of the bread in the sandwich, and ArbOS is the filling, but this sandwich is named for the bread.
The State Transition Function consists of the bottom Geth layer, and a portion of the middle ArbOS layer. In particular, the STF is a designated function in the source code, and implicitly includes all of the code called by that function. The STF takes as input the bytes of a Transaction received in the inbox, and has access to a modifiable copy of the Ethereum state tree. Executing the STF may modify the state, and at the end will emit the header of a new block (in Ethereum's block header format) which will be appended to the Nitro chain.
The rest of this section will be a deep dive into Geth and ArbOS. If deep technical knowledge does not suit you, skip to the next section Separating Execution from Proving.
Geth
Nitro makes minimal modifications to Geth in hopes of not violating its assumptions. This section will explore the relationship between Geth and ArbOS, which consists of a series of hooks, interface implementations, and strategic re-appropriations of Geth's basic types.
We store ArbOS's state at an address inside a Geth statedb
. In doing so, ArbOS inherits the statedb
's statefulness and lifetime properties. For example, a transaction's direct state changes to ArbOS are discarded upon a revert.
0xA4B05FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
The fictional account representing ArbOS
Hooks
Arbitrum uses various hooks to modify Geth's behavior when processing
transactions. Each provides an opportunity for ArbOS to update its state and make decisions about the
transaction during its lifetime. Transactions are applied using Geth's ApplyTransaction
function.
Below is ApplyTransaction
's callgraph, with additional info on where the various Arbitrum-specific hooks are inserted. Click on any to go to their section. By default, these hooks do nothing so as to leave Geth's default behavior unchanged, but for chains configured with EnableArbOS
set to true, ReadyEVMForL2
installs the alternative L2 hooks.
core.ApplyTransaction
⮕core.applyTransaction
⮕core.ApplyMessage
core.NewStateTransition
core.TransitionDb
StartTxHook
core.transitionDbImpl
- if
IsArbitrum()
remove tip GasChargingHook
evm.Call
core.vm.EVMInterpreter.Run
PushCaller
PopCaller
core.StateTransition.refundGas
- if
EndTxHook
- added return parameter:
transactionResult
What follows is an overview of each hook, in chronological order.
ReadyEVMForL2
A call to ReadyEVMForL2
installs the other transaction-specific hooks into each Geth EVM
right before it performs a state transition. Without this call, the state transition will instead use the default DefaultTxProcessor
and get exactly the same results as vanilla Geth. A TxProcessor
object is what carries these hooks and the associated Arbitrum-specific state during the transaction's lifetime.
StartTxHook
The StartTxHook
is called by Geth before a transaction starts executing. This allows ArbOS to handle two Arbitrum-specific transaction types.
If the transaction is ArbitrumDepositTx
, ArbOS adds balance to the destination account. This is safe because the L1 bridge submits such a transaction only after collecting the same amount of funds on L1.
If the transaction is an ArbitrumSubmitRetryableTx
, ArbOS creates a retryable based on the transaction's fields. If the transaction includes sufficient gas, ArbOS schedules a retry of the new retryable.
The hook returns true
for both of these transaction types, signifying that the state transition is complete.
GasChargingHook
This fallible hook ensures the user has enough funds to pay their poster's L1 calldata costs. If not, the transaction is reverted and the EVM
does not start. In the common case that the user can pay, the amount paid for calldata is set aside for later reimbursement of the poster. All other fees go to the network account, as they represent the transaction's burden on validators and nodes more generally.
If the user attempts to purchase compute gas in excess of ArbOS's per-block gas limit, the difference is set aside and refunded later via ForceRefundGas
so that only the gas limit is used. Note that the limit observed may not be the same as that seen at the start of the block if ArbOS's larger gas pool falls below the MaxPerBlockGasLimit
while processing the block's previous transactions.
PushCaller
These hooks track the callers within the EVM callstack, pushing and popping as calls are made and complete. This provides ArbSys
with info about the callstack, which it uses to implement the methods WasMyCallersAddressAliased
and MyCallersAddressWithoutAliasing
.
L1BlockHash
In Arbitrum, the BlockHash and Number operations return data that relies on underlying L1 blocks instead of L2 blocks, to accommodate the normal use-case of these opcodes, which often assume Ethereum-like time passes between different blocks. The L1BlockHash and L1BlockNumber hooks have the required data for these operations.
ForceRefundGas
This hook allows ArbOS to add additional refunds to the user's tx. This is currently only used to refund any compute gas purchased in excess of ArbOS's per-block gas limit during the GasChargingHook
.
NonrefundableGas
Because poster costs come at the expense of L1 aggregators and not the network more broadly, the amounts paid for L1 calldata should not be refunded. This hook provides Geth access to the equivalent amount of L2 gas the poster's cost equals, ensuring this amount is not reimbursed for network-incentivized behaviors like freeing storage slots.
EndTxHook
The EndTxHook
is called after the EVM
has returned a transaction's result, allowing one last opportunity for ArbOS to intervene before the state transition is finalized. Final gas amounts are known at this point, enabling ArbOS to credit the network and poster each's share of the user's gas expenditures as well as adjust the pools. The hook returns from the TxProcessor
a final time, in effect discarding its state as the system moves on to the next transaction where the hook's contents will be set afresh.
Interfaces and components
APIBackend
APIBackend implements the ethapi.Backend
interface, which allows simple integration of the Arbitrum chain to existing Geth API. Most calls are answered using the Backend member.
Backend
This struct was created as an Arbitrum equivalent to the Ethereum
struct. It is mostly glue logic, including a pointer to the ArbInterface interface.
ArbInterface
This interface is the main interaction-point between geth-standard APIs and the Arbitrum chain. Geth APIs mostly either check status by working on the Blockchain struct retrieved from the Blockchain
call, or send transactions to Arbitrum using the PublishTransactions
call.