
When WAN turns peachy at Cosmos Bank

It cut data replication time by up to 70 per cent, pared data loss by up to 70 per cent, and tapped FCIP to do it. Here's more on this optimization story

Pratima Harigunani

PUNE, INDIA: For a bank that started its journey way back in 1906, Cosmos Co-operative Bank Ltd., by its own account the second oldest and second largest co-operative bank, is bound to run into unique challenges while sprinting down the road of growth and modernization.


In a span of 108 years of service, it has been trying to balance its traditional values with a readiness to embrace new technologies and advanced banking tools, beefing up services and adding a contemporary sheen to its portfolio. For a bank with such a deep heritage and a wide span of 137 branches across seven states and 37 major cities, optimizing network and application time becomes more of a necessity than a nice-to-have IT add-on.

We talk to Rajendra Godbole, Manager – IT at Cosmos Bank, to find out if and how this recently rolled-out WAN optimization exercise worked for the bank.

Tell us something about the broader context for this initiative from the bank’s stance. Were there any specific triggers?


Cosmos Bank has a primary site in Pune and a Disaster Recovery (DR) site in Hyderabad for some critical applications. Initially, only the core application was part of DR, but hundreds of servers have since come up, so the management decided to cover as many applications as possible in DR. To replicate them all, we were aiming for a transfer rate of 100 Mbps, in contrast to the erstwhile 20 Mbps.

Customer expectations are changing very rapidly, and reducing bandwidth usage became a priority. We also store customer-related information in a database, and that (as per the Reserve Bank of India mandate for banks) must be replicated to a disaster recovery site so that banks can recover data during a disaster at their primary data center. We were confronting issues like limited bandwidth capacity, traffic congestion and latency; data replication was turning slow between the DC and DR. That was also impeding our RTO (Recovery Time Objective) and RPO (Recovery Point Objective) goals, hence we decided to opt for WAN optimization.
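To put those link speeds in perspective, here is a quick back-of-envelope calculation in Python. Only the 20 Mbps and 100 Mbps figures come from the bank; the 100 GB data set is a hypothetical size chosen purely for illustration.

# Rough replication-time arithmetic for the old and target link speeds.
# The 100 GB data set is an assumed figure, not one quoted by the bank.

def replication_hours(data_gb: float, link_mbps: float) -> float:
    """Hours to push data_gb gigabytes over a link running at link_mbps."""
    bits = data_gb * 8 * 1000**3            # GB -> bits (decimal units)
    seconds = bits / (link_mbps * 1000**2)  # Mbps -> bits per second
    return seconds / 3600

for mbps in (20, 100):
    print(f"100 GB at {mbps} Mbps: ~{replication_hours(100, mbps):.1f} hours")

# 100 GB at 20 Mbps: ~11.1 hours
# 100 GB at 100 Mbps: ~2.2 hours

At the old speed, a single pass over even a modest data set eats up most of a working day; the 100 Mbps target brings it down to a couple of hours.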

What were you looking for during the evaluation phase, and which solution worked best for you?


Well, for one, database replication had to be faster and accelerated to reduce any risk of data loss during transfers. Our limited bandwidth held the data transfer rate to about 40 MB per hour. Now, data transfer is generally faster when the bandwidth of a given path is high, but other WAN optimization vendors were unable to manage the bandwidth on our networks and could not optimize or accelerate data replication through the Fibre Channel over IP (FCIP) tunnel.

We chose the Array Networks WAN acceleration solution (aCelera WAN 2300) because we found its configuration very simple, and we also tested it against the SAN compatibility issues that started popping up as a major concern during the evaluation stage. Array's aCelera handled these smoothly, and it also turned out to be a solution that optimizes data replication through the FCIP tunnel. The integration was very simple: the engineer did not even have to visit the DR site, and the new set-up took off in half an hour. It also aligned well with both the file-based and storage-based replication we were covering. This took care of the initial storage replication, and we have found the solution easy to use and configure so far.

How does it work?

One thing is that only newly changed data is sent across the WAN, which helps reduce bandwidth requirements and replication times. The solution also allows for zero data loss during a disaster and helps the bank meet its RTO and RPO goals. We accelerated database replication and reduced utilization of the replication link from 40 Mbps to 10-15 Mbps. Its application blueprint technique also brought the volume of data shipped down from 40 MB per hour to 10 MB per hour.
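As a rough sketch of the "only newly changed data" idea, the Python below does chunk-level delta replication: hash fixed-size chunks of the data set and ship only those the DR site does not already hold. This is not Array Networks' actual aCelera implementation; the 64 KB chunk size and SHA-256 hashing are assumptions made for the example.

import hashlib

CHUNK = 64 * 1024  # 64 KB chunks; an arbitrary illustrative size

def chunk_hashes(data: bytes) -> list[str]:
    """SHA-256 digest of each fixed-size chunk of the data set."""
    return [hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

def delta(new: bytes, dr_hashes: list[str]) -> dict[int, bytes]:
    """Chunks that must cross the WAN, keyed by chunk index."""
    changed = {}
    for idx, digest in enumerate(chunk_hashes(new)):
        # Send a chunk only if the DR copy is missing it or holds a
        # different version.
        if idx >= len(dr_hashes) or dr_hashes[idx] != digest:
            changed[idx] = new[idx * CHUNK:(idx + 1) * CHUNK]
    return changed

When most chunks are unchanged between replication cycles, only a small fraction of the data crosses the link, which is consistent with the drop in utilization from 40 Mbps to 10-15 Mbps described above.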


If you were to pick major impacts so far, what would they be?

Now some 50 to 60 per cent of applications are being replicated to DR, and Cosmos Bank can meet the RBI compliance requirements for business continuity. Further, the encrypted FCIP tunnel has helped accelerate data replication between the data center and disaster recovery sites. Less data traffic going over the WAN translated into less congestion, which in turn led to faster backup and replication times. With that, the bank can take up more frequent backup and replication through the FCIP tunnel. The best part is that there was no downtime; the process was so seamless that no change requests or approvals had to be dealt with.

Have you measured the progress?


The solution has made way for faster replication of applications and reduced data loss by 65 to 70 per cent. We have customized the solution and attained faster data replication for efficient retrieval of data from DR. Overall, storage data replication time has improved by up to 70 per cent without increasing bandwidth; at the same time, the risk of data loss has been cut by up to 70 per cent, flanked by an eight-times improvement in response time.

Has FCIP required any equipment upgrades or other changes?

At both ends, FCIP is used for storage, with an application-based replication protocol for some devices. No upgrades were necessary, and that is the beauty of the product as I see it. It did not entail a single change in the entire architecture. FCIP makes a transparent point-to-point connection between geographically separated SANs over IP networks, and the protocol makes room for a large amount of data to travel over the WAN link. Basically, it encapsulates Fibre Channel and transports it over a TCP socket.
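As a toy illustration of that encapsulate-and-transport idea, the sketch below wraps a raw Fibre Channel frame in a simple length-prefixed header and writes it to a TCP socket. Real FCIP (RFC 3821) defines its own frame format, flow control and error handling; none of that is modelled here.

import socket
import struct

def send_fc_frame(sock: socket.socket, fc_frame: bytes) -> None:
    # Prefix the raw FC frame with a 4-byte length so the far end can
    # delimit frames inside the continuous TCP byte stream.
    sock.sendall(struct.pack("!I", len(fc_frame)) + fc_frame)

def recv_fc_frame(sock: socket.socket) -> bytes:
    # Read the length header, then exactly that many payload bytes.
    header = sock.recv(4, socket.MSG_WAITALL)
    (length,) = struct.unpack("!I", header)
    return sock.recv(length, socket.MSG_WAITALL)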

With this in place, have you also started segueing to another trend that is catching up: WAN acceleration? Is it a better option?

Acceleration is an additional stage, but we have not considered it so far. I am aware of these capabilities, though, and in the near future it would be an interesting thing for the banking industry as bandwidth-hungry applications start dominating the scenario.