Coinbase reviews its May outage: cascading AWS failure exposes architectural risks
Coinbase has published a retrospective report on its large-scale service outage of May 7, 2026.
The outage lasted approximately 8 hours, and full recovery took about 12 hours. During this time, trading, deposits, withdrawals, and most other core services were unavailable or severely degraded. Coinbase stated that the root cause was the simultaneous failure of multiple cooling units at a data center in one availability zone (use1-az4) of the AWS us-east-1 region. The failure triggered thermal-protection shutdowns of server racks, which took EC2 instances and EBS volumes offline and affected multiple internet services.
During the recovery, Coinbase's trade matching engine lost quorum because its cluster was deployed in a single AWS data center and most of its nodes went down. Restoring it required urgent code changes and the construction of a new node group, and market trading was restarted gradually as the recovery progressed.
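To illustrate what losing quorum means here, the following is a minimal sketch that assumes a simple majority rule, under which a cluster can make progress only while more than half of its configured nodes are reachable. The report as summarized does not say which consensus protocol the matching engine uses, and the function name and node counts below are hypothetical.

```python
def has_quorum(total_nodes: int, live_nodes: int) -> bool:
    """Return True if a strict majority of the configured nodes is reachable.

    Assumption: a plain majority quorum. The actual rule used by the
    matching engine is not described in the source.
    """
    return live_nodes > total_nodes // 2


# Hypothetical five-node cluster placed in a single data center.
cluster_size = 5

# Losing one node still leaves a majority (4 of 5).
one_node_down = has_quorum(cluster_size, live_nodes=4)

# Losing most nodes at once, as in a data center shutdown, does not (1 of 5).
most_nodes_down = has_quorum(cluster_size, live_nodes=1)
```

Under this rule, a cluster confined to one data center cannot keep a majority when that data center goes down, which is consistent with the need to build a new node group before trading could resume.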
In addition, Amazon MSK, the AWS-managed Kafka service, suffered control plane failures that prevented the automatic re-election of partition leaders. This further blocked quotes, fees, and some settlement and data flow systems, widening the overall impact.
The system gradually returned to normal after partitions were migrated manually in collaboration with the AWS engineering team. Coinbase stated that the incident exposed weaknesses in its automatic failover across availability zones and in its disaster recovery for managed middleware. The company will upgrade its cross-region hot standby architecture, run failure drills more regularly, and migrate its Kafka system from a two-availability-zone deployment to a three-availability-zone deployment, while working with AWS on root cause fixes and improvements.
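One way to see why a third availability zone helps is to check whether a majority of nodes survives the loss of any single zone. The sketch below assumes a majority-based component, such as a cluster controller quorum. The source does not describe Coinbase's actual Kafka replication settings, so the zone names, node counts, and function name are hypothetical.

```python
def survives_any_single_zone_loss(nodes_per_zone: dict[str, int]) -> bool:
    """Return True if a strict majority of nodes remains after losing any one zone.

    Assumption: availability depends on a simple majority of all nodes.
    """
    total = sum(nodes_per_zone.values())
    return all(total - lost > total // 2 for lost in nodes_per_zone.values())


# Hypothetical layouts: the same kind of cluster spread over two or three zones.
two_zones = {"zone-a": 2, "zone-b": 1}
three_zones = {"zone-a": 1, "zone-b": 1, "zone-c": 1}

two_zone_ok = survives_any_single_zone_loss(two_zones)      # False: losing zone-a leaves 1 of 3
three_zone_ok = survives_any_single_zone_loss(three_zones)  # True: any loss leaves 2 of 3
```

With only two zones, one zone always holds at least half of the nodes, so its loss removes the majority. With nodes spread evenly over three zones, no single zone is large enough to do that.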