5 Data Deduplication Best Practices

Computers & Technology

  • Author Chuck Swanson
  • Published November 3, 2010
  • Word count 422

Studies suggest that organizations holding multiple copies of data buy, administer and use two to fifty times the amount of storage space they’d need with data deduplication. It’s no wonder that data redundancy is a major contributor to explosive data growth.

At the outset, data deduplication reduced data redundancy only in specific circumstances, such as full backups, VMware images and email attachments. Duplicate data still persisted, however, mainly because test and development data multiplies across an organization over time. Backup, archiving, and replication create numerous data copies throughout an organization. Add to that the fact that users often copy data to other locations for their own convenience.

Organizations are now realizing these facts, and are seeing data deduplication as a mandatory and integrated element of their overall IT strategy.

Essentially, there are two methods of reducing the cost of data storage. First, you can use a lower-cost storage platform, but that introduces numerous additional problems that I won’t go into here. Second, you can leverage a sound data deduplication strategy designed to reduce required storage and curb data growth.

Data deduplication can reduce your data storage costs by lowering the amount of disk space required to store data – whether that be data backups or primary production data. This article highlights 5 best practices to help you select and implement the best data deduplication solution for your environment.
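At its core, deduplication works by splitting data into blocks, fingerprinting each block with a hash, and storing each unique block only once. The following is a minimal sketch of fixed-block deduplication, not any vendor's implementation; the 4 KB block size and SHA-256 fingerprint are illustrative assumptions, and real products add chunking heuristics, compression and collision safeguards on top of this idea.

```python
import hashlib

def dedupe_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks; store each unique block once,
    keyed by its SHA-256 digest. Returns the block store and an ordered
    index of digests sufficient to reconstruct the original data."""
    store = {}   # digest -> block bytes (each unique block stored once)
    index = []   # ordered digests, one per logical block
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only new blocks consume space
        index.append(digest)
    return store, index

def reconstruct(store: dict, index: list) -> bytes:
    """Rebuild the original byte stream from the index of digests."""
    return b"".join(store[d] for d in index)
```

For example, a stream containing three identical 4 KB blocks and one distinct block yields an index of four entries but a store of only two unique blocks, so physical storage is roughly halved while the data remains fully reconstructible.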

  1. Consider the broad implications of deduplication. You’ll want to consider how a deduplication strategy fits within your entire data management and storage strategy, accounting for tradeoffs in things like computational time, accuracy, index size, the level of deduplication detected and the scalability of the solution.

  2. Learn what data does not dedupe well. Human-created data dedupes differently than data created by computers, so you’ll want to consider which types of data to exclude from your deduplication efforts.

  3. Don’t obsess over space reduction ratios. The length of time that data is retained affects your space reduction ratios, so rather than inflating ratios by increasing the number of full backups, consider increasing your backup retention period.

  4. Don’t use multiplexing if you’re backing up to a VTL. Multiplexing data in a virtual tape library (VTL) wastes computing cycles.

  5. Pilot multiple systems before you select your system. This ensures that the deduplication solution you choose integrates best with your IT environment and the data you currently have in-house.
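Tip 3's point about ratios can be made concrete with a little arithmetic. In a simple model of repeated full backups, each retained full adds its whole size to the logical total but only its changed fraction to the physical total, so the reported ratio climbs with retention. This is a hypothetical back-of-the-envelope model with made-up example figures, not a vendor's ratio formula.

```python
def dedup_ratio(full_size_gb: float, change_rate: float, retained_fulls: int) -> float:
    """Estimate the space reduction ratio for repeated full backups.

    Simplified model: logical data is every retained full at its
    original size; physical data is the first full plus only the
    changed fraction of each subsequent full (the rest deduplicates).
    """
    logical = retained_fulls * full_size_gb
    physical = full_size_gb + (retained_fulls - 1) * full_size_gb * change_rate
    return logical / physical

# Example (assumed figures): a 1 TB full backup, 5% daily change,
# 30 retained fulls -> logical 30 TB vs physical ~2.45 TB.
ratio = dedup_ratio(1000, 0.05, 30)  # ≈ 12.2:1
```

Notice that the same data protected with fewer, longer-retained fulls yields a lower headline ratio even though the physical storage consumed is what actually matters, which is exactly why the ratio alone is a poor selection criterion.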

For more information on data deduplication best practices, visit the full blog series here: http://blog.virtual.com/2010/5-data-deduplication-best-practices-post-one

Article source: https://articlebiz.com