
Data Dedupe: So much with so little!

Deepa

BANGALORE, INDIA: Shrinking budgets and dwindling resources have only added to the misery of CIOs, who are today under tremendous pressure to do more with less.

However, with the digital world expanding like never before, they are left with no choice but to add more storage to their data centres to meet the ever-growing demand. According to IDC's report, 'Economy Contracts, the Digital Universe Expands', 480 billion gigabytes of data have been created so far, and this is expected to increase fivefold by 2012.

Venkatesh Iyer, head - India & SAARC, Backup, Recovery and Archival Solutions, EMC, says: "We know that for every terabyte of information that we have, about 10-15 times of the information is lapped-up by organizations. On an average, people store in the back-up, and thus storage space is growing tremendously at around six per cent year-on-year."

This is certainly not exciting news for CIOs. Isn't there a way out?

Enter Data Deduplication

Data deduplication, or simply dedupe, is one of the hottest topics in the network storage space today.

Despite what the word might suggest, data deduplication does not duplicate data; it is a method of reducing storage needs by eliminating redundant data.

“Dedupe,” adds Iyer, “can be done at source, or in-line or at target levels. In the deduplication process, data is scanned and unique patterns are identified, assigned corresponding unique data fingerprints, indexed and retained. Duplicate copies of data already fingerprinted are deleted, leaving only one stored copy of each unique data pattern along with its corresponding fingerprint. Redundant data is replaced with a pointer to the unique data copy.”
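The process Iyer describes can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the fixed-size chunking, the SHA-256 fingerprint, and the names `DedupeStore`, `put`, `get` and `stored_bytes` are all assumptions made for illustration, and the tiny chunk size is chosen only to keep the example readable.

```python
import hashlib


class DedupeStore:
    """Toy deduplicating store (illustrative assumption, not a real product)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # hypothetical, unrealistically small
        self.chunks = {}  # index: fingerprint -> the single stored copy
        self.files = {}   # file name -> ordered list of fingerprints (pointers)

    def put(self, name, data):
        """Scan data chunk by chunk, keeping only chunks not seen before."""
        pointers = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # A duplicate chunk is not stored again; only a pointer is kept.
            self.chunks.setdefault(fingerprint, chunk)
            pointers.append(fingerprint)
        self.files[name] = pointers

    def get(self, name):
        """Rebuild the original data by following the pointers."""
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self):
        """Bytes physically kept after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupeStore(chunk_size=4)
store.put("monday.bak", b"AAAABBBBAAAA")
store.put("tuesday.bak", b"BBBBCCCC")
print(store.stored_bytes())  # 12 bytes kept for 20 bytes of logical data
print(store.get("monday.bak") == b"AAAABBBBAAAA")  # True
```

Only three unique chunks are retained, yet both back-ups can still be reconstructed in full from their pointer lists.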

There are multiple ways to achieve dedupe: it can be source-based, target-based or in-line. Low line suits SMBs, where ERP could be good for in-line data dedupe technology. Source-based data deduplication is gaining importance in VMware, virtualized and remote-office environments, where customers face bandwidth constraints.
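Why source-based dedupe helps when bandwidth is scarce can be shown with a rough Python sketch. It assumes fixed-size chunks and SHA-256 fingerprints, and models the back-up target as a plain dictionary; the function name `source_side_backup` is hypothetical, and the small cost of exchanging fingerprints is ignored.

```python
import hashlib


def source_side_backup(data, target_index, chunk_size=4):
    """Send only the chunks the target does not already hold.

    target_index stands in for the remote target's fingerprint index
    (an assumption for illustration). Returns the payload bytes that
    would have to cross the network.
    """
    bytes_sent = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in target_index:
            # New data: this chunk must be transmitted and indexed.
            target_index[fingerprint] = chunk
            bytes_sent += len(chunk)
        # Known data: only the fingerprint is referenced, no payload is sent.
    return bytes_sent


target = {}
print(source_side_backup(b"AAAABBBBCCCC", target))  # 12: first back-up sends everything
print(source_side_backup(b"AAAABBBBDDDD", target))  # 4: only the changed chunk is sent
```

The second back-up differs from the first in a single chunk, so only that chunk travels over the link.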

Aman Munglani, principal analyst, Gartner, says: “Given the present circumstances, companies have become extremely conscious with regard to storage. So, if you don't have repeated data on your disk and you are able to store more data, it reduces the back-up need quite considerably. When you are talking about 300:1 kind of savings, it is pretty significant amount of data that you can save on a disk.”
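To put a ratio such as 300:1 in perspective, the arithmetic can be sketched in a few lines of Python. The helper names `space_saved_percent` and `stored_size_gb` are hypothetical, and the 3,000 GB figure is only an example input, not a number from the analysts quoted here.

```python
def space_saved_percent(ratio):
    """Percentage of disk space saved at a given dedupe ratio (ratio:1)."""
    if ratio < 1:
        raise ValueError("a dedupe ratio cannot be below 1:1")
    return (1 - 1 / ratio) * 100


def stored_size_gb(logical_gb, ratio):
    """Physical capacity needed to hold logical_gb of back-up data."""
    return logical_gb / ratio


# At 300:1, a hypothetical 3,000 GB of back-up data needs about 10 GB of disk.
print(round(stored_size_gb(3000, 300), 2))
# That corresponds to a saving of roughly 99.67 per cent.
print(round(space_saved_percent(300), 2))
```

The same functions show how quickly the benefit flattens: a 10:1 ratio already saves 90 per cent, so most of the gain comes well before the headline ratios.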

Reducing the amount of data also frees up bandwidth on the network.

“There is a good amount of acceptability for this kind of technology because in various surveys the two things that CIOs might have come back and said is that their key challenges are the amount of information growth, so backing them up and disaster recovery,” avers Iyer.

In the view of Anand Naik, director, Systems Engineering, Symantec, dedupe enables centralised management of the overall infrastructure. “Customers are not only adopting just reduction in storage, but an overall reduction in the complexity of storage environment with dedupe,” he adds.

What causes this huge data growth?

Every moment we spend on the Internet, networking on social networking sites or sending e-mails or SMSs from a mobile phone, we add to this growing pile of information.

'Data Explosion in India - Trends & Challenges', a survey by The Nielsen Company in association with NetApp, finds that heavy IT usage has a tremendous financial impact on organizations.

Primarily, managing e-mail traffic and maintaining remote-office data back-ups have a high impact on a company's finances. In addition, much of the data stored on disk is redundant, and a large portion of storage capacity lies unused. Not only is this a waste of storage, it is also a waste of power and floor space, all of which increase IT costs.

“Data explosion is being caused by high adoption of IT, such as CRM, ERP, and trends such as remote office back-up. Moreover, more and more customers are today adopting virtualization,” notes Iyer.

How is virtualization a culprit?

Virtualization is arriving in a big way; according to EMC, 65 per cent of companies will be virtualized within a year.

“Virtualization is throwing up a huge challenge because utilization levels of the server have become very high. This, in turn, throws a huge amount of challenge on the back-up with the shared infrastructure that virtualization provides,” notes Iyer.

“Data deduplication is capable of providing back-up in a virtualized environment. Dedupe improves the efficiency of servers that are being virtualized. Huge amount of value add is provided through dedupe by simplifying and reducing the kind of data that goes through shared network and enabling them to do back-up in a much more efficient manner. This also, in turn, increases bandwidth,” avers Naik.

Benefits of data dedupe

Reducing traffic over the metro is said to increase available bandwidth by over 300 times, and the benefits don't end there.

“Data deduplication reduces the data that must be sent across a WAN for remote backups, replication, and disaster recovery. Customers are not only adopting just reduction in storage but an overall reduction in the complexity of environment,” Naik says.

Sunil Chavan, director, content and file services, Hitachi Data Systems, APAC, says that deduplication can revolutionize the way data is stored and protected, if it is implemented in the right way for an organization’s needs.

“Some approaches can result in insufficient data reduction (therefore increasing costs), performance bottlenecks, increased management complexity or islands of deduplication creating unwanted vendor lock-in. The best approach depends on where you are starting from, within or outside the infrastructure, what skills you have in your IT organization and having a realistic expectation on the savings you will get.”

“Deduplication is a strategy or journey that a company can go for,” Iyer signs off.
