Starknet mainnet – slow block creation alert

Incident Report for Starknet

Postmortem

The sequencer uses a cloud DB for storage. An internal GCP fault - verified with Google - caused one of the resources storing our DB to crash. Consequently, the sequencer was unable to read from or write to storage, so it could not execute transactions or produce new blocks.

The storage nodes did not recover automatically after the GCP fault was resolved because a strong consistency condition is enforced: recovery must be initiated manually, and only after confirming that no data was corrupted. The nodes were revived once we had verified that the DB was in order.

The total sequencer downtime was roughly 2 hours. The only effect was a break in block production.

Improvements to the response protocol will be discussed in the coming days.

Posted May 08, 2024 - 07:05 UTC

Resolved

The issue has been resolved and blocks are being produced normally again. A post-mortem will be published once we understand what triggered the bug.

Posted May 08, 2024 - 05:32 UTC

Update

The issue in the sequencer storage was identified about 30 minutes ago, and we're working on resolving it.

Posted May 08, 2024 - 04:57 UTC

Update

A bug in the sequencer storage has stalled block production. We are investigating.

Posted May 08, 2024 - 04:19 UTC

Investigating

Starknet is taking longer than expected to produce a block.

Posted May 08, 2024 - 03:46 UTC

This incident affected: Starknet Mainnet.