14 cloud outages in 7 months, who is next?

Deepa

07 Aug 2012 00:00 IST

Updated On 07 Aug 2012 00:48 IST

BANGALORE, INDIA: Cloud outages are nothing new, and there is no need to fret when one happens; they are part and parcel of computing over the cloud. That said, we do not deny that outages affect users and businesses alike, and can even cause financial damage.

Since January this year we have seen over 10 outages, in which cloud providers were forced to eat their own words (read: claims) with regard to 99.999 per cent uptime and availability.
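To put the 99.999 per cent claim in perspective, here is a rough sketch of the arithmetic behind it. The function names are illustrative, and the calculation assumes a 365-day year and uptime measured over a full year, which may not match how any particular provider defines its commitment.

```python
# Rough downtime-budget arithmetic for an uptime claim.
# Assumption: a 365-day year; real service agreements may measure monthly.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year that a given uptime percentage permits."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


def achieved_uptime_percent(downtime_minutes: float) -> float:
    """Yearly uptime percentage after a given total downtime in minutes."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    # Budget implied by "five nines".
    print(round(allowed_downtime_minutes(99.999), 3))
    # Effect of a single 12-hour outage on the yearly figure.
    print(round(achieved_uptime_percent(12 * 60), 3))
```

On these assumptions, five nines leaves a little over five minutes of downtime a year, so a single outage lasting several hours uses up that budget many times over.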

This year's cloud outages began with online service provider Zoho's outage, which lasted for several hours on January 21 and affected about 5 million customers.

The reason it cited was a power outage. According to Zoho’s blog, the outage began at 8:25 am PST, and by 6:12 pm all of its services and applications were back up and running.

Then, in February, came the famous (or is it infamous?) Leap Year Bug, which caused an outage of Microsoft Windows Azure.

On February 29, the Windows Azure cloud platform was down for over 12 hours. Windows Azure Compute and dependent services - Access Control Service (ACS), Windows Azure Service Bus, SQL Azure Portal, and Data Sync Services - were knocked out. A glitch with a security certificate on leap day was cited as the reason.

"When the 'guest agent' (GA) creates the transfer certificate, it gives it a one-year validity range. It uses midnight UTC of the current day as the valid-from date and one year from that date as the valid-to date. The leap day bug is that the GA calculated the valid-to date by simply taking the current date and adding one to its year. That meant that any GA that tried to create a transfer certificate on leap day set a valid-to date of February 29, 2013, an invalid date that caused the certificate creation to fail."
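The date arithmetic described in that explanation is easy to reproduce. The sketch below is only an illustration in Python, not Microsoft's code: the function names are hypothetical, and the fallback to February 28 is one possible fix, assumed here for the sake of the example.

```python
from datetime import date


def naive_valid_to(valid_from: date) -> date:
    """Mirror the described bug: keep the month and day, add one to the year."""
    return date(valid_from.year + 1, valid_from.month, valid_from.day)


def safe_valid_to(valid_from: date) -> date:
    """One possible fix (an assumption, not the actual patch):
    fall back to February 28 when the next year has no February 29."""
    try:
        return naive_valid_to(valid_from)
    except ValueError:
        return date(valid_from.year + 1, 2, 28)


leap_day = date(2012, 2, 29)

try:
    naive_valid_to(leap_day)
    naive_failed = False
except ValueError:
    # February 29, 2013 does not exist, so the date cannot be constructed.
    naive_failed = True

print("naive calculation failed:", naive_failed)
print("safe valid-to date:", safe_valid_to(leap_day))
```

On any day other than February 29 the two functions agree, which is why a bug of this kind can stay hidden for years before it surfaces.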

After this incident, Microsoft announced that it would give customers affected by the outage a 33 per cent credit on their entire monthly bills.

Then came Amazon's turn in March. On March 15, Amazon's EC2 suffered downtime for about 20 minutes, between 2:22 AM and 2:43 AM PST. The connectivity issue affected Amazon’s Elastic Compute Cloud in its Northern Virginia data center.

On April 17, Google's Gmail webmail service saw an outage, the first of its three outages so far this year.

The outage affected up to 10 per cent of its global users. Gmail has over 350 million active users, so as many as 35 million Gmail accounts were inaccessible between 12:40 p.m. and 1:45 p.m. ET. No reasons were specified.

Then on May 15, Apple’s iCloud service crashed, keeping 15 million of its users out of its cloud for about 90 minutes, from 8:00 am to 9:30 am Pacific Daylight Time.

Gmail went offline again on May 29. About 400,000 of its users were denied access to their accounts. Once again, Google did not feel it owed users an explanation and gave no details on what caused the outage.

Two days later, on May 31, online social networking giant Facebook saw an outage that lasted for over two hours.

June scored highest in terms of cloud outages. The month saw four cloud computing outages, involving Google Gmail, Amazon Web Services, Apple iCloud and Twitter.

The month began with an outage of the Gmail webmail service on June 7, causing 90 minutes of downtime, between 11 am and 12:40 pm US Eastern Time.

According to Google, this outage affected 'less than' 1.38 per cent of the Google Mail user base, but that still amounted to almost 4.8 million of its 350 million active users.

On June 15, Amazon saw its second outage of the year. It experienced an outage in its US-East-1 availability zone, in Northern Virginia.

In another instance, iCloud and iMessage services suffered on June 20, when the iCloud data center in North Carolina had an outage. The outage lasted from 9:00 am to 3:00 pm PST.

On July 26, Google Talk, Google's messaging and telephony service, was knocked offline for over four hours for unknown reasons. Yet again, Google refrained from specifying any reason, and kept its users in the dark as to what caused the outage.

On the same day, just after GTalk came back online, the 140-character microblogging site Twitter was inaccessible for two hours due to a data center glitch.

Amazon's June 15 outage was triggered by a series of failures in the power infrastructure in a northern Virginia data centre, including the failure of a generator cooling fan while the facility was on emergency power.

On June 29, Amazon again saw problems in its US East data centre. This outage, which lasted for over an hour, affected its Elastic Compute, ElastiCache, Elastic MapReduce and Relational Database services. According to the provider, the outage was due to power cuts following lightning in the area.

July was comparatively lean except for iCloud's outage on July 30, in which a small number of users had difficulty accessing older e-mail messages.

Wikipedia's outage on the evening of August 6 was the first in August. Now the question is: who is next, and when?
