Data De-Duplication Is No Longer Just for Backup
If a document is emailed to 500 people within your organization, do you think it makes more sense to store all 500 copies of that document… or just one?
Data de-duplication eliminates redundant copies of data so storage space isn't wasted on multiple instances of the same content. In the example above, only one copy of the document would be stored instead of 500; each of the remaining 499 copies would be replaced with a pointer to the single stored document.
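The pointer-replacement idea can be sketched as a toy content-addressed store: content is hashed, stored once per unique hash, and every "copy" is just a pointer (the hash) into that store. This is an illustrative sketch, not any vendor's implementation, and the names used are hypothetical:

```python
import hashlib

class DedupStore:
    """Minimal content-addressed store: identical content is kept once."""

    def __init__(self):
        self.blocks = {}  # content hash -> bytes, stored exactly once
        self.files = {}   # logical name -> content hash (the "pointer")

    def put(self, name: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        # Store the bytes only if this content hasn't been seen before.
        self.blocks.setdefault(digest, content)
        self.files[name] = digest  # every duplicate is just a pointer

    def get(self, name: str) -> bytes:
        # Follow the pointer back to the single stored copy.
        return self.blocks[self.files[name]]

store = DedupStore()
attachment = b"quarterly report"
for i in range(500):
    store.put(f"mailbox_{i}/report.doc", attachment)

print(len(store.files))   # 500 logical copies
print(len(store.blocks))  # 1 physical copy actually stored
```

Real systems typically de-duplicate at the block or chunk level rather than whole files, so even partially similar files share storage, but the pointer mechanism is the same.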
Traditionally used as part of backup and archival processes, data de-duplication can reduce the amount of data needing to be backed up by 90 percent or more. Because significantly less data is transmitted for remote backup and disaster recovery purposes, bandwidth requirements also drop by up to 99 percent. By conserving storage capacity and bandwidth, data de-duplication enables organizations to reduce storage costs and recover data faster.
While the benefits are significant, data de-duplication hasn't been widely used in primary storage, the tier that houses data in active use. This has been largely due to performance issues and data integrity concerns. In addition, backup data contains far more redundancy than active data, so de-duplication solutions focused on that tier of storage.
However, thanks to evolving data center technology, primary storage data de-duplication is likely to produce more business value today than it might have five years ago. Increasingly virtualized environments are producing more redundant data, while cloud storage transfers data more efficiently when volumes are smaller. At the same time, the performance penalty associated with data de-duplication can be offset by flash storage, which is much faster, though more expensive, than traditional disk and tape storage.
In a virtualized environment, most data originates in primary storage and is distributed to other storage tiers. Consequently, using data de-duplication to improve primary storage efficiency and optimization can produce significant downstream cost savings across the entire storage infrastructure.
A stronger business case for primary storage de-duplication has led more vendors to offer such solutions. For example, data de-duplication is built into Microsoft Windows Server 2012 R2 for primary storage. Administrators can minimize the performance impact by scheduling de-duplication jobs at specific times and configuring policies to control which files are processed. Microsoft's de-duplication feature promises data integrity, bandwidth efficiency and faster file access times.
Let ICG assess the storage requirements of your network and help you determine how your organization could benefit by implementing a data de-duplication solution for backup, primary storage or both.