Amanda-Users

Re: DAT hardware or software compression

2002-11-25 06:29:39
Subject: Re: DAT hardware or software compression
From: Gene Heskett <gene_heskett AT iolinc DOT net>
To: Sven Rudolph <rudsve AT drewag DOT de>
Date: Mon, 25 Nov 2002 05:52:40 -0500
On Monday 25 November 2002 01:41, Sven Rudolph wrote:
>Gene Heskett <gene_heskett AT iolinc DOT net> writes:
>> Which should be reason enough to see if the use of bzip2 could
>> be incorporated into amanda.  AIUI, bz2 can re-synch, losing
>> only the actual file that the error affected.
>
>Thanks to Niall for actually testing this.

Yes, it corrected what apparently is some misinformation about 
bz2 that's still in circulation.

>> Since its compression is even better than gzip's
>> best, the question becomes "can amanda tolerate the increased
>> compression time that using bz2 would result in?"
>
>This is the wrong question. Gzip decompression is way faster than
>compression, whereas bzip2 decompression is as slow as bzip2
>compression.
>
>Hence "Can you tolerate the slow restore when you are in a hurry to
> get a broken machine up again?"

Again, on a decent machine, the decompress rate is still faster than 
the data rate actually coming into the pipe from the media.  Here, 
in decompressing a .bz2 kernel source image, I estimate the bunzip2 
throughput to be around 2x the media's output, where the media is 
DDS2 tape with a 390kb data rate.  With data coming from one of the 
latest gee-whiz multi-megabyte-per-second drives, I could see where 
that might be a factor.
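
For anyone who wants to check this on their own hardware, here is a rough 
sketch in Python. It is not anything amanda does; it only times how fast 
gzip and bzip2 decompression consume compressed input and compares that 
with a tape rate. The 390 * 1024 bytes/s reading of "390kb", the sample 
data, and the helper names are all assumptions made for illustration.

```python
import bz2
import gzip
import time

# Assumption: the post's "390kb" DDS2 rate is read as 390 * 1024 bytes/s.
TAPE_RATE = 390 * 1024


def make_sample(lines=50_000):
    """Build deterministic, text-like sample data (a hypothetical stand-in for a dump)."""
    return b"".join(b"entry %d: some source-like text\n" % i for i in range(lines))


def compressed_input_rate(decompress, blob, repeats=3):
    """Compressed bytes consumed per second while decompressing (best of `repeats`)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decompress(blob)
        best = min(best, time.perf_counter() - start)
    return len(blob) / max(best, 1e-9)


def keeps_up(rate, tape_rate=TAPE_RATE):
    """True if the decompressor can consume data at least as fast as the tape delivers it."""
    return rate >= tape_rate


if __name__ == "__main__":
    data = make_sample()
    for name, mod in (("gzip", gzip), ("bzip2", bz2)):
        blob = mod.compress(data)
        rate = compressed_input_rate(mod.decompress, blob)
        print(f"{name}: {rate / 1024:.0f} KiB/s in, keeps up with tape: {keeps_up(rate)}")
```

The numbers depend heavily on the CPU and on how compressible the data 
is, so a real dump image from your own tapes is a better test input than 
the synthetic sample used here.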

OTOH, how often would that inconvenience actually occur?  Recovery, 
while it should be done as quickly as possible, isn't such a part 
of the normal daily routine that one has to optimize it to the 
lowest common denominator just to accumulate saved time.

If recovery time really needs to be just the reboot time of the 
machine plus the downtime to swap a bad drive out, then a software 
raid5 setup is the way to go.  The rebuilding of the new drive's 
contents can proceed (it's automatic when md finds a new drive has 
been connected, and starts as soon as md is started during the 
bootup, even before the operator has logged in) while the database 
on that raid5 array is in fact being actively used.  It's probably 
not recommended to do it that way, but it's worked in our tests at 
the tv station.  The disadvantage is that it took four 160 gig 
drives to make a 320 gig array.  With a big enough tape drive, that 
can be backed up after business hours are over for the day for that 
last 1% confidence level.
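
As a side note on the capacity arithmetic: RAID5 usable space is 
(n - 1) times the drive size across the n active members. The post does 
not say how the four drives were arranged, so the hot-spare layout in 
this small Python sketch is only an assumption that happens to match the 
320 gig figure. The function name is made up for illustration.

```python
def raid5_usable_gb(active_drives, drive_gb):
    """Usable RAID5 capacity: one drive's worth of space goes to parity."""
    if active_drives < 3:
        raise ValueError("RAID5 needs at least three active drives")
    return (active_drives - 1) * drive_gb


if __name__ == "__main__":
    # All four drives active in the array.
    print(raid5_usable_gb(4, 160))
    # Assumed layout: three active drives plus one hot spare.
    print(raid5_usable_gb(3, 160))
```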

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz  512M
99.19% setiathome rank, not too shabby for a WV hillbilly