Underestimation Caused Gmail Outage
Category: Internet
Posted: September 2, 2009 03:57PM
Users of Google's Gmail service will likely have noticed some issues yesterday, as the company experienced some downtime of its email web interface. Google officially report that the outage lasted for around 100 minutes, though some of you may have noticed it taking a little longer to get fully back to normal. The problem was caused when engineers took down a small portion of Gmail servers for routine upgrades, but the effect this would have on the rest of the service was underestimated. Some other recent changes meant that the request routers, which process queries and direct them to the appropriate servers, became overloaded. The service effectively ground to a halt until measures could be taken to remedy the situation. Notably, IMAP and POP access was not affected, as these requests don't use the same routers.
With Gmail now among the top three email services, further incidents such as this could hurt its reputation and its potential suitability for critical applications. Google say they will be putting resources in place to ensure a similar event doesn't happen again.