Re: [BackupPC-users] Backing up a BackupPC server
2009-06-02 10:50:45
Jeffrey J. Kosowsky wrote at about 08:57:54 -0400 on Tuesday, June 2, 2009:
> Peter Walter wrote at about 06:27:35 -0400 on Tuesday, June 2, 2009:
> > I have read with interest various threads on this list concerning
> > methods of how to back up a backuppc server to a remote file system over
> > the internet. My impression from reading the threads is that there is no
> > *good* way - that rsync is a poor choice if you have many hardlinks, and
> > methods like copying a "snapshot" of a block-level device are
> > inefficient if only a relatively small proportion of the data changes. I
> > have tried both methods, and am not satisfied with the performance and
> > efficiency of either. In addition, BackupPC is not compatible with
> > 'cloud' storage systems - at least the ones I have looked at do not seem
> > to support hardlinks.
> >
> > As a Linux newbie, I have only a partial understanding of the technology
> > underlying Linux and BackupPC, but I get the impression that the problem
> > with an rsync-like solution is that processing hardlinks is very
> > expensive in terms of cpu time and memory resources. This may be a
> > stupid question, but, if hardlinks are the problem, has any thought been
> > given to adding to BackupPC an option to use some form of database
> > (text, SQL or otherwise) to associate hashes to files, instead? It seems
> > to me that using hardlinks is in fact using that feature of the file
> > system *as* a database, a use that does not appear to be optimal ... if
> > I have misunderstood, please educate me :-)
> >
> > Peter
> >
>
> Indeed this has been discussed many times before ;) -- see the archives.
>
> That being said, I agree that using a database to store both the
> hardlinks and the metadata now kept in the attrib files would be
> a more elegant, extensible, and platform-independent solution, though
> presumably it would require a major re-write of BackupPC.
>
> I certainly understand why BackupPC uses hardlinks since it allows for
> an easy way to do the pooling and in a sense as you suggest uses the
> filesystem as a rudimentary database.
>
> On the other hand as I and others have mentioned before moving to a
> database would add the following advantages:
>
> 1. Platform and filesystem independence -- BackupPC would no longer
> depend on the specific hard link behaviors of Linux and associated
> filesystems.
>
> 2. It would be easier to extend the attrib notion to store extended
> attributes whether for Linux (e.g., SELinux attributes), Windows
> (e.g., ACL attributes) or any other OS.
>
> 3. The pool could be split among multiple disks and filesystems since
> it would no longer depend on hard links, which cannot cross
> filesystem boundaries.
>
> 4. Backing up BackupPC backups would be much easier and faster since
> you no longer would have hard links to worry about -- just back up
> the database and any portion of the pool that you want to.
>
> 5. The whole system would be more elegant and extensible since all
> types of metadata could be stored in the database rather than being
> stored in various files in the BackupPC tree. For example,
> - You wouldn't need the kludge of file mangling
> - Checksums could be stored in the database rather than being
> appended in a non-standard way to the end of the file
> - File level encryption could easily be added
> - Alternative file-level compression schemes could easily be
> supported.
> - The host-specific config data (and maybe even all the config
> data) could be stored in tables rather than in individual
> config files
> - The 'backups' file could also be stored as a table
>
> 6. Presumably a database architecture would also make it easier to
> have more granular control over user access and permissions at the
> feature and file level.
>
> The challenge, though, is that doing this right (i.e., in a way that is
> both elegant and extensible) would require a substantial if not almost
> complete re-write of BackupPC, and I'm not sure that Craig (or anybody
> else for that matter) is willing to sign up for that...
>
> Still, it would be awesome to combine the simplicity and pooling
> structure of BackupPC with the flexibility of a database
> architecture...
>
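To make the pooling idea above a bit more concrete, here is a rough sketch of how a database could stand in for hard links. This is not how BackupPC works today, and everything in it is an assumption for illustration: the table names (pool, files), the columns, the helper add_file, and the use of SHA-256 as the content hash. It uses Python with an in-memory SQLite database only to keep the example short.

```python
import hashlib
import sqlite3

# Hypothetical schema: none of these names come from BackupPC itself.
SCHEMA = """
CREATE TABLE pool (
    digest TEXT PRIMARY KEY,   -- content hash identifying one pooled file
    size   INTEGER NOT NULL
);
CREATE TABLE files (
    host   TEXT NOT NULL,
    backup INTEGER NOT NULL,
    path   TEXT NOT NULL,
    digest TEXT NOT NULL REFERENCES pool(digest),
    mode   INTEGER,            -- example of metadata now kept in attrib files
    PRIMARY KEY (host, backup, path)
);
"""

def add_file(db, host, backup, path, data, mode=0o644):
    """Record one backed-up file; identical content is pooled only once."""
    digest = hashlib.sha256(data).hexdigest()
    # A second file with the same content adds no new pool row. The row in
    # 'files' plays the role that a hard link into the pool plays today.
    db.execute("INSERT OR IGNORE INTO pool (digest, size) VALUES (?, ?)",
               (digest, len(data)))
    db.execute("INSERT INTO files (host, backup, path, digest, mode) "
               "VALUES (?, ?, ?, ?, ?)",
               (host, backup, path, digest, mode))
    return digest

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
add_file(db, "hostA", 0, "/etc/motd", b"hello\n")
add_file(db, "hostB", 0, "/etc/motd", b"hello\n")   # same content, pooled
add_file(db, "hostA", 0, "/etc/hosts", b"127.0.0.1 localhost\n")

pool_count = db.execute("SELECT COUNT(*) FROM pool").fetchone()[0]
file_count = db.execute("SELECT COUNT(*) FROM files").fetchone()[0]
# Per-digest reference counts take the place of the inode link count.
refs = dict(db.execute("SELECT digest, COUNT(*) FROM files GROUP BY digest"))
```

In a layout like this, the pool itself would be a flat set of files named by digest with no hard links, so an ordinary rsync of the pool plus a database dump would be enough to copy the whole thing.
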
One more advantage of a database architecture:
7. Reconstructing incremental backups would be simpler and faster,
since the database could point directly to each file rather than
having to crawl through a tree of attrib files to work out which
files have changed and which have not.
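
As a rough illustration of point 7, the sketch below resolves the full view of an incremental backup with a single query. It is only a sketch under simplifying assumptions: the entries table and the view function are hypothetical, backups are assumed to form one linear chain starting from full backup 0, and a NULL digest is my own convention for "file deleted in this backup". Real BackupPC incrementals have levels and are more involved than this.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical table: each backup stores rows only for paths that changed.
db.executescript("""
CREATE TABLE entries (
    backup INTEGER NOT NULL,   -- backup number; 0 is the full backup
    path   TEXT NOT NULL,
    digest TEXT,               -- pool hash; NULL marks a deletion
    PRIMARY KEY (backup, path)
);
""")
db.executemany("INSERT INTO entries VALUES (?, ?, ?)", [
    (0, "/a", "h1"), (0, "/b", "h2"), (0, "/c", "h3"),   # full
    (1, "/b", "h4"),                                     # /b modified
    (2, "/c", None), (2, "/d", "h5"),                    # /c deleted, /d added
])

def view(db, backup):
    """Return {path: digest} as the tree looked at the given backup."""
    rows = db.execute("""
        SELECT e.path, e.digest
        FROM entries e
        JOIN (SELECT path, MAX(backup) AS latest
              FROM entries
              WHERE backup <= ?
              GROUP BY path) m
          ON e.path = m.path AND e.backup = m.latest
        WHERE e.digest IS NOT NULL      -- drop paths whose latest row is a deletion
        ORDER BY e.path
    """, (backup,))
    return dict(rows)

full = view(db, 0)
incr1 = view(db, 1)
incr2 = view(db, 2)
```

The point is that the "most recent row per path" lookup is done by the database, with no need to walk per-directory metadata for every backup in the chain.
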
_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/