Starknet Mainnet Reorg
Incident Report for Starknet

What was the bug?

In a nutshell, the blockifier (the sequencer’s execution engine) charged a transaction fee higher than the fee the user had agreed to. Here are the details:

  1. Transaction fees are charged for both gas usage and blob gas usage.
  2. However, v3 transactions currently submit only a max amount of gas: blob gas is converted to gas behind the scenes.
  3. There are two instances of such a conversion:

    1. To compute the amount of gas the user’s fee must cover,
    2. To compute the amount of gas to charge for.
  4. There was a discrepancy in the logic of these two instances: the first rounded down while the second rounded up.

  5. Hence, the fee submitted by the user could cover the computed amount of gas, but in fact the sequencer would charge for a larger amount of gas.

  6. Example: the user agreed to pay 1. The first check computes a combined cost of 0.99, which passes because it is lower than the user’s max fee. The actual charge is 1.01, which is more than the user agreed to pay.

  7. The Starknet OS ensures users are not charged higher fees than they agreed to pay, so the block containing the overcharged transaction could not be proven, leading to a reorg.

  8. The bug was in the following conversion:
    blob_gas_as_regular_gas = (blob_gas_amount * blob_gas_price) / regular_gas_price

  9. The fix:
    blob_gas_as_regular_gas = (blob_gas_amount * blob_gas_price).div_ceil(regular_gas_price)

  10. The actual incident: a transaction was charged 542202566082576 FRI after the user had signed the following max resources:

    'L1_GAS': {'max_amount': '0x15',
       'max_price_per_unit': '0x177b7e71b5d0'} 

    But 0x177b7e71b5d0 * 0x15 = 542202565749264, and 542202565749264 < 542202566082576. The un-rounded amount of gas after blob gas conversion was 21.0000000012 gas, yet the user had signed for only 21 (0x15) gas.
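
The rounding gap described in the list above can be sketched in a few lines of Python. This is an illustration only, not the blockifier's code: the function names and the usage figures in the first half are hypothetical, chosen so that the floor and ceiling results differ. Only the figures in the second half (max amount, max price per unit, charged fee) are taken from the incident description. Python's `//` floors, and `-(-a // b)` is used here as a stand-in for `div_ceil` on positive integers.

```python
def floor_conversion(blob_gas_amount: int, blob_gas_price: int, regular_gas_price: int) -> int:
    """Conversion that truncates the remainder (integer division)."""
    return (blob_gas_amount * blob_gas_price) // regular_gas_price


def ceil_conversion(blob_gas_amount: int, blob_gas_price: int, regular_gas_price: int) -> int:
    """Conversion that rounds up, standing in for div_ceil."""
    return -(-(blob_gas_amount * blob_gas_price) // regular_gas_price)


# Hypothetical usage figures, picked only to make the rounding gap visible.
regular_gas_used = 20
blob_gas_amount, blob_gas_price, regular_gas_price = 128, 8, 1000
max_amount = 21  # gas the user signed for in this made-up example

# One code path rounds down, the other rounds up (128 * 8 / 1000 = 1.024).
checked_gas = regular_gas_used + floor_conversion(blob_gas_amount, blob_gas_price, regular_gas_price)
charged_gas = regular_gas_used + ceil_conversion(blob_gas_amount, blob_gas_price, regular_gas_price)
check_passes = checked_gas <= max_amount
overcharged = charged_gas > max_amount

# Figures quoted in the incident description.
signed_max_amount = int("0x15", 16)
signed_max_price = int("0x177b7e71b5d0", 16)
charged_fee_fri = 542202566082576
signed_max_fee_fri = signed_max_amount * signed_max_price
excess_fri = charged_fee_fri - signed_max_fee_fri

print("check passes:", check_passes, "| overcharged:", overcharged)
print("signed max fee (FRI):", signed_max_fee_fri)
print("charged above signed max (FRI):", excess_fri)
```

With both code paths using the same rounding direction, the checked amount and the charged amount coincide, so a transaction that passes the check cannot be charged more than it signed for.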

How did Starknet behave following the fix?

  1. The reorg replayed the same ledger of transactions.
  2. Blocks [630,029, 630,064] were reorganized into [630,029, 630,059].

    1. First aborted (by hash)
    2. Last aborted (by hash)
  3. There were fewer blocks post-reorg:

    1. Pre-reorg, several of the blocks in this range were closed due to reaching a time limit.
    2. Post-reorg, the transactions from the reorg were simultaneously dropped into the sequencer’s backlog, so they were processed faster and created fuller blocks.
  4. The execution of several transactions was impacted by the reorg. The vast majority of these changes in execution status were due to different ambient conditions during their post-reorg re-execution, namely block timestamps. Indeed, it is customary for wallets to render their transactions invalid several hours after their creation.
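
To see why a later block timestamp alone can flip a transaction's execution status, consider a minimal sketch of a time-limited transaction. Everything here is assumed for illustration: the `SignedTransaction` type, its field names, the three-hour window, and the timestamps are hypothetical and do not describe any particular wallet's validation logic.

```python
from dataclasses import dataclass

HOUR = 3600  # seconds


@dataclass(frozen=True)
class SignedTransaction:
    """Hypothetical transaction that the wallet treats as expired after a window."""
    created_at: int       # Unix seconds when the wallet created the transaction
    validity_window: int  # seconds during which the wallet accepts execution

    def is_valid_at(self, block_timestamp: int) -> bool:
        # Validation depends on the timestamp of the block executing the transaction.
        return block_timestamp <= self.created_at + self.validity_window


# Arbitrary creation time and a made-up three-hour validity window.
tx = SignedTransaction(created_at=1_700_000_000, validity_window=3 * HOUR)

original_block_timestamp = tx.created_at + 60      # executed shortly after creation
replay_block_timestamp = tx.created_at + 5 * HOUR  # re-executed hours later

valid_originally = tx.is_valid_at(original_block_timestamp)
valid_on_replay = tx.is_valid_at(replay_block_timestamp)

print("valid in original block:", valid_originally)
print("valid when replayed:", valid_on_replay)
```

The transaction itself is unchanged between the two runs; only the block timestamp it is checked against differs.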

Next steps

Due to the inherent separation between execution and proving phases, we are taking further steps to mitigate the effects of any potential discrepancies in the future.

  1. Proactive: we plan to add an intermediate status between ACCEPTED_ON_L2 and ACCEPTED_ON_L1, which will drastically reduce the wiggle room. In the near future, such a status will reflect successful execution of the OS. In the slightly more distant future, it will reflect the availability of a proof on L2.
  2. Reactive: we’re looking into a smooth replay feature, which will allow the reorged list of transactions to be replayed in ambient conditions as similar as possible to those of the original execution. This means not only the same timestamps, but also the same grouping of transactions into blocks. At any rate, such a feature is rather involved and will not ship in the very near future.
Posted Apr 05, 2024 - 22:24 UTC

A rounding error bug caused the sequencer to create an unprovable block. See postmortem for details.
Posted Apr 05, 2024 - 02:30 UTC