Amazon Web Services had downtime yesterday. Their S3 storage service had what was described as elevated error rates in the US-East zone, and as a result failures started in other AWS services in US-East that depended on S3.
The failures were reminiscent of the DynamoDB downtime of September 20, 2015, in that seemingly unrelated services share a common failure mode because of unforeseen interdependencies.
A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. Leslie Lamport, 1987
The current Techmeme roundup has a good summary of the impacts on the net. Notably, Twitter was unaffected, but some elements of Slack like file sharing failed to work. And crucially Amazon’s own ability to communicate with its many customers was the first system to be noticably affected; their status page showed all green checkmarks for more than an hour after the problem was noticed, with the explanation coming that the status page was itself dependent on services that were down.
The dashboard not changing color is related to S3 issue. See the banner at the top of the dashboard for updates.— Amazon Web Services (@awscloud) February 28, 2017
Some news media sites were also down or running in impaired service, prompting the Associated Press wire service to use Facebook to issue its news story about the AWS outage.