Re: [BackupPC-users] Backing up a BackupPC server

Jeffrey J. Kosowsky wrote at about 08:57:54 -0400 on Tuesday, June 2, 2009:
 > Peter Walter wrote at about 06:27:35 -0400 on Tuesday, June 2, 2009:
 >  > I have read with interest various threads on this list concerning 
 >  > methods of how to back up a backuppc server to a remote file system over 
 >  > the internet. My impression from reading the threads is that there is no 
 >  > *good* way - that rsync is a poor choice if you have many hardlinks, and 
 >  > methods like copying a "snapshot"  of a block-level device are 
 >  > inefficient if only a relatively small proportion of the data changes. I 
 >  > have tried both methods, and am not satisfied with the performance and 
 >  > efficiency of either. In addition, BackupPC is not compatible with 
 >  > 'cloud' storage systems - at least the ones I have looked at do not seem 
 >  > to support hardlinks.
 >  > 
 >  > As a Linux newbie, I have only a partial understanding of the technology 
 >  > underlying Linux and BackupPC, but I get the impression that the problem 
 >  > with a rsync-like solution is that processing hardlinks is very 
 >  > expensive in terms of cpu time and memory resources. This may be a 
 >  > stupid question, but, if hardlinks are the problem, has any thought been 
 >  > given to adding to BackupPC an option to use some form of database 
 >  > (text, SQL or otherwise) to associate hashes to files, instead? It seems 
 >  > to me that using hardlinks is in fact using that feature of the file 
 >  > system *as* a database, a use that does not appear to be optimal ... if 
 >  > I have misunderstood, please educate me :-)
 >  > 
 >  > Peter
 >  > 
 > 
 > Indeed this has been discussed many times before ;) -- see the archives.
 > 
 > That being said, I agree that using a database to store both the
 > hardlinks and the metadata stored in the attrib files would be
 > a more elegant, extensible, and platform-independent solution, though
 > presumably it would require a major re-write of BackupPC.
 > 
 > I certainly understand why BackupPC uses hardlinks since it allows for
 > an easy way to do the pooling and in a sense as you suggest uses the
 > filesystem as a rudimentary database.
 > 
 > On the other hand as I and others have mentioned before moving to a
 > database would add the following advantages:
 > 
 > 1. Platform and filesystem independence -- BackupPC would no longer
 >    depend on the specific hard link behaviors of Linux and associated
 >    filesystems.
 > 
 > 2. It would be easier to extend the attrib notion to store extended
 >    attributes whether for Linux (e.g., selinux attributes), Windows
 >    (e.g., ACL attributes) or any other OS.
 > 
 > 3. The pool could be split among multiple disks and filesystems since
 >    it would no longer depend on hard-link behavior
 > 
 > 4. Backing up BackupPC backups would be much easier and faster since
 >    you no longer would have hard links to worry about -- just back up
 >    the database and any portion of the pool that you want to.
 > 
 > 5. The whole system would be more elegant and extensible since all
 >    types of metadata could be stored in the database rather than being
 >    stored in various files in the BackupPC tree. For example,
 >       - You wouldn't need the kludge of file mangling
 >       - Checksums could be stored in the database rather than being
 >         appended in a non-standard way to the end of the file
 >       - File level encryption could easily be added
 >       - Alternative file-level compression schemes could easily be
 >         supported.
 >       - The host-specific config data (and maybe even all the config
 >         data) could be stored in tables rather than in individual
 >         config files
 >       - The 'backups' file could also be stored as a table
 > 
 > 6. Presumably a database architecture would also make it easier to
 >    have more granular control over user access and permissions at the
 >    feature and file level.
 > 
 > The challenge though is that to do this right (i.e. in a way that is
 > both elegant and extensible) would require a substantial if not almost
 > complete re-write of BackupPC and I'm not sure that Craig (or anybody
 > else for that matter) is willing to sign up for that...
 > 
 > Still, it would be awesome to combine the simplicity and pooling
 > structure of BackupPC with the flexibility of a database
 > architecture...
 > 

One more advantage of a database architecture:

7. Reconstructing incremental backups would be simpler and faster
   since the database could point directly to the file rather than
   having to crawl through a tree of attrib files to reconstruct the
   hierarchy of which files have changed or not.
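
To make points 4 and 7 a bit more concrete, here is a rough sketch of what
I have in mind. None of this is BackupPC code: the table names, the
columns, the use of SHA-256 and the helper functions (open_db, add_file,
mark_deleted, view) are all invented for illustration, and it assumes a
simplified model in which each backup records only the paths that changed
since the previous one. File bodies would still live on disk, named by
digest; the database holds only the references that hardlinks hold today.

```python
import hashlib
import sqlite3

# Hypothetical schema: "pool" replaces the hardlinked pool directory,
# "files" replaces the per-directory attrib files.
SCHEMA = """
CREATE TABLE pool (
    digest TEXT PRIMARY KEY,   -- content hash, one row per unique file body
    refs   INTEGER NOT NULL    -- reference count, stands in for the link count
);
CREATE TABLE files (
    host   TEXT NOT NULL,
    backup INTEGER NOT NULL,   -- backup number
    path   TEXT NOT NULL,
    digest TEXT,               -- NULL marks a file deleted in this backup
    mode   INTEGER,
    PRIMARY KEY (host, backup, path)
);
"""


def open_db(path=":memory:"):
    """Create a database with the sketch schema."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_file(db, host, backup, path, data, mode=0o644):
    """Record one file; identical content shares a single pool row."""
    digest = hashlib.sha256(data).hexdigest()
    cur = db.execute(
        "UPDATE pool SET refs = refs + 1 WHERE digest = ?", (digest,))
    if cur.rowcount == 0:  # first time this content is seen
        db.execute("INSERT INTO pool (digest, refs) VALUES (?, 1)", (digest,))
    db.execute("INSERT INTO files VALUES (?, ?, ?, ?, ?)",
               (host, backup, path, digest, mode))
    return digest


def mark_deleted(db, host, backup, path):
    """Record that a path disappeared in this backup."""
    db.execute("INSERT INTO files VALUES (?, ?, ?, NULL, NULL)",
               (host, backup, path))


def view(db, host, backup):
    """Return {path: digest} as the host looked at the given backup number.

    For each path, take the newest entry at or before `backup` and drop
    paths whose newest entry is a deletion marker.
    """
    rows = db.execute(
        """
        SELECT f.path, f.digest FROM files f
        WHERE f.host = ?
          AND f.backup = (SELECT MAX(backup) FROM files
                          WHERE host = f.host AND path = f.path
                            AND backup <= ?)
          AND f.digest IS NOT NULL
        """, (host, backup))
    return dict(rows)
```

With something like this, view(db, "hostA", 2) is a single query rather
than a walk through attrib files, and copying the server elsewhere means
copying one database file plus whichever pool files you care about. A real
design would also have to deal with BackupPC's actual full/incremental
levels, compression and expiry, which this sketch ignores.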

_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/