Re: [BackupPC-users] improving the deduplication ratio
2008-04-14 17:31:45
On Apr 14, 2008, at 11:20 AM, Tino Schwarze wrote:
>
> Of course, you shouldn't underestimate the cost of managing a lot of
> small files (my pool has about 5 million files, some of them are
> pretty large), so the pool will have even more files which means more
> seeking and looking up file blocks.
>
> Introducing file chunking would introduce a new abstraction layer - a
> file would need to be split into chunks and recreated for restore. You
Tino -- thanks for posting this. These issues are exactly what I had
in mind when I posted about adding sub-file deduplication. There's a
lot more work to do and definitely a bunch more housekeeping. Right
now, BackupPC gets off "easy" by using hardlinks to do the
dedupe. Once we delve below the file level, a brand new data structure
or mechanism needs to be designed and built to efficiently link all of
these blocks together.
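
To make the housekeeping concrete, here is a minimal sketch of the kind of
structure that would be needed, assuming fixed-size chunks and an in-memory
index. It is not anything BackupPC implements; ChunkStore, CHUNK_SIZE,
put_file, restore and delete_file are hypothetical names chosen for
illustration. Each file becomes a manifest (an ordered list of chunk
digests), and a per-chunk reference count does the job that the hardlink
count does for whole files today.

```python
import hashlib

# Assumption: fixed-size chunking with a tiny chunk size for demonstration.
# A real design would use far larger chunks and persistent storage.
CHUNK_SIZE = 4


class ChunkStore:
    """Hypothetical content-addressed chunk pool with reference counting."""

    def __init__(self):
        self.chunks = {}    # digest -> chunk bytes (each chunk stored once)
        self.refcount = {}  # digest -> number of manifest entries using it

    def put_file(self, data, chunk_size=CHUNK_SIZE):
        """Split data into chunks, store new ones, return the manifest."""
        manifest = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
            manifest.append(digest)
        return manifest

    def restore(self, manifest):
        """Recreate the file by concatenating its chunks in order."""
        return b"".join(self.chunks[digest] for digest in manifest)

    def delete_file(self, manifest):
        """Drop one file's references; free chunks nobody uses any more."""
        for digest in manifest:
            self.refcount[digest] -= 1
            if self.refcount[digest] == 0:
                del self.refcount[digest]
                del self.chunks[digest]


store = ChunkStore()
first = store.put_file(b"AAAABBBBCCCC")
second = store.put_file(b"AAAABBBBDDDD")  # shares two chunks with the first
```

Even in this toy form, every backup has to hash each chunk and look it up,
every restore has to reassemble the file from its manifest, and every
expiry has to walk the manifest to release chunks. That is the extra layer
that file-level hardlinks currently let us avoid.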
If you look at the commercial solutions that provide this
functionality exclusively in software (as opposed to appliance-based
solutions), you see that it is quite processor intensive. If there
are flaws in the design of the mechanism to track the chunks, you
will most definitely see pain in the backup and restore processes
compared to the existing mechanism of deduping at the file level.
--
Michael Barrow
michael at michaelbarrow dot name