Amanda-Users

Re: Crashing machine

2003-09-19 14:49:44
Subject: Re: Crashing machine
From: "C.Scheeder" <christoph.scheeder AT scheeder DOT de>
To: "Brashers, Bart -- MFG, Inc." <Bart.Brashers AT mfgenv DOT com>
Date: Fri, 19 Sep 2003 07:34:32 +0200
Hi,
Like many others, I guess you have a hardware problem.
Something similar happened to me a few years ago: a system had been
running fine for years, but suddenly it started to fail. It ran fine until
amanda kicked in to do her job, and after a short time it went south....
After some months of searching I found out it was a new SCSI disk I had
placed in the chain. Before I installed it, the chain was passively
terminated by a CD-ROM. But that disk was the first Ultra SCSI device, and
it had built-in active termination. So I thought it would be a good idea to
put it at the end of the chain and let it do its job and terminate the bus.
Bad idea. The internal terminators didn't do their job under load.
There were no error messages in the logfiles, as the system never had the
chance to write them to disk.
How did I find the error in the end? The new disk ran out of space a few
months later and got replaced by a bigger one with working terminators,
and all problems went away.
Since then I only use dedicated active terminators for SCSI buses,
no internal terminators of drives anymore.
Double-check your hardware.
It could even be the CPU fan. You haven't switched on BIOS throttle
control for a fan with its own thermal regulation, have you?
This can lead to very strange results.
Christoph
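
A crash like the one described above may leave nothing in the logfiles,
because the system never gets the chance to write them out. One way to at
least narrow down the time of failure is a small heartbeat logger that
forces every line to disk. The following is only a minimal sketch, not part
of amanda: the names write_heartbeat and run and the file heartbeat.log are
made up for illustration, and on hardware that is already failing even a
synced write may not survive.

```python
import os
import time
from datetime import datetime


def write_heartbeat(path, note=""):
    """Append one timestamped line and force it to disk before returning."""
    stamp = datetime.now().isoformat(timespec="seconds")
    line = f"{stamp} {note}".rstrip() + "\n"
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, line.encode("utf-8"))
        # Ask the kernel to push the data to the device instead of
        # leaving it in the page cache, where a crash would lose it.
        os.fsync(fd)
    finally:
        os.close(fd)
    return line


def run(path="heartbeat.log", interval=5.0, count=None):
    """Write a heartbeat every `interval` seconds; count=None means forever."""
    written = 0
    while count is None or written < count:
        write_heartbeat(path, note=f"beat {written}")
        written += 1
        if count is None or written < count:
            time.sleep(interval)
    return written
```

Started shortly before the backup run (for example by calling run() from a
small script), the last line in the file gives an approximate time of the
crash, which can then be compared with the time amanda started.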

Brashers, Bart -- MFG, Inc. wrote:
I've been using amanda-2.4.2p2 for a long time now, without problems.  In
the last week or so, my Linux (2.4.20) machine has been crashing, apparently
when amanda runs.  I see entries in the various logs in /var/log when
amanda starts (e.g. xinetd in /var/log/secure with user amanda, from
127.0.0.1) and then nothing until the next morning when I restart the
computer.
The real kicker was just now when I ran amflush (after amcleanup) to flush
the last failed dump to the disk.  The system panicked after just a few
minutes, with the "Machine check exception (kernel panic: cpu context
corrupt)" error.  That usually happens when the system is too hot, or you
have a bad motherboard, or something.  This machine has been in operation
for about 6 months, so it's probably not the MB.  It's not that hot in the
room, and I checked that the fins on the CPU fan weren't clogged with dust.

Any ideas here?  Anyone heard of such a thing?  Am I barking up the wrong
tree thinking that amanda might be responsible for my crashes?  It's a real
pain, not being able to run stuff at night (and not having backups makes me
nervous).

Bart
--
Bart Brashers, Ph.D.
Air Quality Meteorologist
MFG Inc.
19203 36th Ave W Suite 101
Lynnwood WA 98036-5707

bart.brashers AT mfgenv DOT com
Phone: 425.921.4000
Fax:   425.921.4040


