FAST 2008: Avoiding the Disk Bottleneck in the Data Domain Deduplication File System
Such systems also use file hashes to address files; some call these systems content-addressed storage, or CAS. Since their deduplication operates at the file level, they can achieve only limited global compression.
Venti removes duplicate fixed-size data blocks by comparing their secure hashes [QD02]. It uses a large on-disk index with a straightforward index cache to look up fingerprints. Since fingerprints have no locality, its index cache is not effective: even when using 8 disks to look up fingerprints in parallel, its throughput is limited to less than 7 MB/sec. Venti used a container abstraction to lay out data on disks, but it was stream-agnostic and did not apply Stream-Informed Segment Layout.
To tolerate shifted contents, modern deduplication systems remove redundancies at variable-size data blocks divided based on their contents. Manber described a method to determine anchor points of a large file where certain bits of rolling fingerprints are zeros [Man93] and showed that Rabin fingerprints [Rab81, Bro93] can be computed efficiently. Brin et al. [BDH94] described several ways to divide a file into content-based data segments and use such segments to detect duplicates in digital documents. Removing duplication at the content-based data segment level has been applied to network protocols and applications [SW00, SCPC*02, RLB03, MCK04] and has reduced network traffic for distributed file systems [MCM01, JDT05]. Kulkarni et al. compared the compression efficiency of an identity-based approach (fingerprint comparison of variable-length segments) and a delta-compression approach [KDLT04]. These studies did not address deduplication throughput issues.
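As a concrete illustration, the following is a minimal Python sketch of anchor-based, content-defined chunking in the spirit of [Man93]. It uses a simple polynomial rolling hash in place of true Rabin fingerprints, and the window size, mask, and size bounds are illustrative choices, not values taken from any of the cited systems.

```python
# Content-defined chunking: declare a chunk boundary wherever the low
# bits of a rolling hash over the last WINDOW bytes are all zero, so
# boundaries depend on content and survive insertions and deletions.
WINDOW = 48                  # rolling window size in bytes
MASK = (1 << 13) - 1         # 13 zero bits -> roughly 8 KB expected chunks
BASE, MOD = 257, (1 << 61) - 1
TOP = pow(BASE, WINDOW - 1, MOD)   # weight of the byte leaving the window

def chunk_boundaries(data: bytes, min_size=2048, max_size=65536):
    """Yield (offset, length) pairs of content-defined chunks of data."""
    start, h = 0, 0
    for i, b in enumerate(data):
        if i - start < WINDOW:
            h = (h * BASE + b) % MOD              # window still filling
        else:
            out = data[i - WINDOW]                # byte sliding out
            h = ((h - out * TOP) * BASE + b) % MOD
        size = i - start + 1
        if (size >= min_size and (h & MASK) == 0) or size >= max_size:
            yield start, size
            start, h = i + 1, 0                   # restart after the cut
    if start < len(data):
        yield start, len(data) - start            # final partial chunk
```

Each chunk would then be fingerprinted with a secure hash such as SHA-1, and the fingerprint compared against the index to detect duplicates.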
The idea of using a Bloom filter [Blo70] to implement the Summary Vector is inspired by the summary data structure for the proxy cache in [FCAB98]; that work also provided an analysis of the false positive rate. In addition, Broder and Mitzenmacher wrote an excellent survey on network applications of Bloom filters [BM05]. The TAPER system used a Bloom filter to detect duplicates rather than to detect whether a segment is new [JDT05]; it did not investigate throughput issues.
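To make the Summary Vector idea concrete, here is a small Python sketch of a Bloom filter over segment fingerprints; the sizes and hashing scheme are illustrative, not the system's actual parameters. A Bloom filter has no false negatives, so a negative answer safely rules out a disk index lookup; positive answers are only probable, with a false positive rate of roughly (1 - e^(-kn/m))^k for m bits, k probes, and n inserted keys, as analyzed in [FCAB98].

```python
import hashlib

class SummaryVector:
    """Bloom filter over segment fingerprints (illustrative sketch).
    'No' answers are exact, so they skip the on-disk index entirely;
    'maybe' answers fall through to the cache and the on-disk index."""

    def __init__(self, m_bits: int = 1 << 23, k: int = 4):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8)

    def _probes(self, fingerprint: bytes):
        # Derive k bit positions by rehashing the fingerprint; since a
        # fingerprint is already a strong hash, slicing it would also do.
        digest = hashlib.sha256(fingerprint).digest()
        for i in range(self.k):
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m

    def add(self, fingerprint: bytes) -> None:
        for p in self._probes(fingerprint):
            self.bits[p >> 3] |= 1 << (p & 7)

    def maybe_contains(self, fingerprint: bytes) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._probes(fingerprint))
```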
7 Conclusions

This paper presents a set of techniques that substantially reduce disk I/Os in high-throughput deduplication storage systems. Our experiments show that the combination of these techniques achieves over 210 MB/sec with 4 write data streams and over 140 MB/sec with 4 read data streams on a storage server with two dual-core processors and one shelf of 15 drives.

We have shown that the Summary Vector alone can reduce disk index lookups by about 17%, and Locality Preserved Caching alone by over 80%, while the combined caching techniques reduce disk index lookups by about 99%. Stream-Informed Segment Layout is an effective abstraction for preserving spatial locality, and it is what enables Locality Preserved Caching.

These techniques are general methods for improving the throughput of deduplication storage systems. Our approach of minimizing disk I/Os to achieve good deduplication performance matches well with the industry trend toward many-core processors. With quad-core CPUs already available and eight-core CPUs around the corner, it should be a relatively short time before a large-scale deduplication storage system delivers 400 to 800 MB/sec of throughput with a modest amount of physical memory.
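As a summary of how these pieces fit together, the sketch below shows the write-path fingerprint lookup combining the three techniques. All helper names (index.lookup, containers.fingerprints, and so on) are hypothetical stand-ins for illustration, not the system's actual interfaces.

```python
def is_duplicate(fp, summary, cache, index, containers):
    """Decide whether segment fingerprint fp is a duplicate, touching
    disk as rarely as possible (all names here are illustrative)."""
    if not summary.maybe_contains(fp):
        return False                   # definitely new: zero disk I/Os
    if fp in cache:
        return True                    # Locality Preserved Cache hit
    container_id = index.lookup(fp)    # one disk I/O to the full index
    if container_id is None:
        return False                   # Summary Vector false positive
    # Stream-Informed Segment Layout stored neighboring segments of the
    # same stream in one container, so prefetching the whole container's
    # fingerprints makes the stream's next lookups likely cache hits.
    for neighbor in containers.fingerprints(container_id):
        cache[neighbor] = container_id
    return True
```

The cache is bounded in practice and evicts cached fingerprints at container granularity; the sketch omits eviction for brevity.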
8 References
[ABCC*02] A. Adya, W. J. Bolosky, M. Castro, G. Cermak, R. Chaiken, J. R. Douceur, J. Howell, J. R. Lorch, M. Theimer, and R. P. Wattenhofer. FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment. In Proceedings of USENIX Operating Systems Design and Implementation (OSDI), December 2002.
[BDH94] S. Brin, J. Davis, and H. Garcia-Molina. Copy Detection Mechanisms for Digital Documents. 1994; also in Proceedings of ACM SIGMOD, 1995.
[Blo70] Burton H. Bloom. Space/Time Trade-offs in Hash Coding with Allowable Errors. Communications of the ACM, 13(7):422-426, 1970.
[BM05] Andrei Z. Broder and Michael Mitzenmacher. Network Applications of Bloom Filters: A Survey. Internet Mathematics, 2005.
[Dat05] Data Domain. Data Domain Appliance Series: High-Speed Inline Deduplication Storage, 2005.
[FCAB98] Li Fan, Pei Cao, Jussara Almeida, and Andrei Z. Broder. Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol. In Proceedings of ACM SIGCOMM '98, Vancouver, Canada, 1998.
[JDT05] N. Jain, M. Dahlin, and R. Tewari. TAPER: Tiered Approach for Eliminating Redundancy in Replica Synchronization. In Proceedings of USENIX File and Storage Technologies (FAST), 2005.
[KDLT04] P. Kulkarni, F. Douglis, J. D. LaVoie, and J. M. Tracey. Redundancy Elimination Within Large Collections of Files. In Proceedings of USENIX Annual Technical Conference, pages 59-72, 2004.