Networker

Re: [Networker] adhoc manual "save" reporting successfull completion while failing

2003-03-28 15:28:00
Subject: Re: [Networker] adhoc manual "save" reporting successfull completion while failing
From: Carl Farnsworth <carl.farnsworth AT DIGIDYNE DOT CA>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Fri, 28 Mar 2003 15:27:59 -0500
I'm not totally sure what you're after here, but it looks like you're
suffering from bad tapes or a bad drive (you mention it only happens on one
of your servers).  When NetWorker reaches end-of-tape, it 'verifies' the
media by backspacing a bit and re-reading.  When NetWorker encounters a
failed write due to bad media or a faulty drive, it treats it the same as
"tape full".  Any savesets in progress are then marked as "aborted".  In the
case of scheduled saves, savegrp will usually do a client retry, starting
the saveset all over again from scratch on a new tape.

However, manual saves will simply fail (I am surprised you're not getting
an error message - maybe double-check that err_out really is being
redirected to the file?).  Your log file stops at the last file saved.

I wouldn't spend much time trying to verify these saves - they're no good,
although you can use scanner to retrieve anything up to the failure point.
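
One rough way to spot such truncated logs is to check whether the closing
"save: ... N files" summary line is present.  The sketch below is
illustrative only: the function name find_save_summary and the regular
expression are assumptions, modelled on the single summary line shown in
the quoted message, and paths containing spaces are not handled.

```python
import re

# Assumed shape of the closing summary line, taken from the quoted example:
#   save: /path_1/dir_1/dir_2/dir_3  1418 MB 00:15:26 9 files
SUMMARY_RE = re.compile(
    r"save:\s+(?P<path>\S+)\s+(?P<size>\d+(?:\.\d+)?)\s+(?P<unit>[KMGT]?B)\s+"
    r"(?P<elapsed>\d+:\d{2}:\d{2})\s+(?P<files>\d+)\s+files?"
)


def find_save_summary(log_text):
    """Return the parsed summary of a redirected save log, or None if absent.

    A missing summary suggests the save stopped before completing.
    """
    # Collapse all whitespace so a summary wrapped over two lines still matches.
    flat = " ".join(log_text.split())
    match = SUMMARY_RE.search(flat)
    if match is None:
        return None
    return {
        "path": match.group("path"),
        "size": float(match.group("size")),
        "unit": match.group("unit"),
        "elapsed": match.group("elapsed"),
        "files": int(match.group("files")),
    }
```

A log for which this returns None is a candidate for a closer look with
mminfo; a log with a summary is not proof of a good save on its own.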

On Thu, 27 Mar 2003 16:10:22 -0500, O'Brien, Pat
<Pat.Obrien AT CHOICEPOINT DOT COM> wrote:

>os tru64 5.1
>legato server 6.1.1
>
>We perform a normal save -s servername -e expiration_date "file or
>list of files" with std_out and err_out redirected to a file, which looks
>like the following:
>                        /path_1/dir_1/dir_2/dir_3/file_1
>                        /path_1/dir_1/dir_2/dir_3/file_2
>                        /path_1/dir_1/dir_2/dir_3/file_3
>                        /path_1/dir_1/dir_2/dir_3/file_4
>                        /path_1/dir_1/dir_2/dir_3/
>                        /path_1/dir_1/dir_2/
>                        /path_1/dir_1/
>                        /path_1/
>                        /
>
>                        save: /path_1/dir_1/dir_2/dir_3  1418 MB 00:15:26 9 files
>
>I seem to have just stumbled into a known bug, or at least am close to known
>bugs; the jury is still out.  It would seem, though, that some bad event
>occurring on a drive directly attached to the server results in any saves
>running on that drive being killed by the server.  We have documented this
>occurring on nightly incrementals, but the adhocs for us happen very often
>and only on data not in incremental streams due to size (TB), staleness, and
>usually multiple copies.
>
>syslog: NetWorker media: (warning) verification of volume "BQJ997", volid 1850121217 failed, can not read record 7077 of file 65 on sdlt tape BQJ997
>syslog: NetWorker media: (notice) verification of volume "BQJ997", volid 1850121217 failed, volume is being marked as full.
>syslog: NetWorker media: (notice) Save set (2047402497) clienta:/path_1 volume BQJ997 on /dev/ntape/tape5_d1 is being terminated because: Media verification failed
>syslog: NetWorker media: (notice) Save set (2047360513) clienta:/path_2 volume BQJ997 on /dev/ntape/tape5_d1 is being terminated because: Media verification failed
>syslog: NetWorker media: (notice) Save set (2047331329) clientb:/path_1 volume BQJ997 on /dev/ntape/tape5_d1 is being terminated because: Media verification failed
>syslog: NetWorker media: (notice) Save set (2043590657) clientb:/path_2 volume BQJ997 on /dev/ntape/tape5_d1 is being terminated because: Media verification failed
>
>When the above occurs, only the following is output by the save.  Most notably
>missing is the summary of the save; there are no direct error messages, and
>the save has a status of incomplete in the indexes.  (Note: we have discovered
>this event happening only 7-10 times, and only on 1 of our servers.  I do
>realize the tape went full, but with thousands of tapes, this occurs
>regularly.)
>
>                        /path_1/dir_1/dir_2/dir_3/file_1
>                        /path_1/dir_1/dir_2/dir_3/file_2
>                        /path_1/dir_1/dir_2/dir_3/file_3
>
>I can programmatically scrub through the std_out files and interrogate the
>NetWorker server with mminfo.  The issue is that we launch these backups in a
>manner that spreads them across several threaded paths concurrently.  We prefer
>not to add a saveset label with unique serial numbers, and even parsing within
>the last day could result in multiple hits for a path.  I am looking for
>ideas to perform adhoc manual save verification directly against the server,
>preferably with mminfo.
>thanks
>pmob
>
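
The "is being terminated" notices quoted above each carry a number in
parentheses (presumably the save set id), plus the client, path, volume and
device.  Pulling those out of syslog gives an unambiguous key for each failed
save set, which sidesteps the problem of several save sets sharing a path
within the same day.  The sketch below is an assumption-laden illustration:
the function name terminated_save_sets is hypothetical, and the pattern is
derived only from the four notices in this post, each assumed to start with
"syslog:".

```python
import re

# Pattern modelled on the quoted notices, e.g.
#   Save set (2047402497) clienta:/path_1 volume BQJ997 on
#   /dev/ntape/tape5_d1 is being terminated because: Media verification failed
TERMINATED_RE = re.compile(
    r"Save set \((?P<ssid>\d+)\)\s+(?P<client>[^\s:]+):(?P<path>\S+)\s+"
    r"volume\s+(?P<volume>\S+)\s+on\s+(?P<device>\S+)\s+"
    r"is being terminated because:\s+(?P<reason>.+?)(?=\s+syslog:|$)"
)


def terminated_save_sets(syslog_text):
    """Return one dict per terminated save set found in the given text."""
    # Drop mail quoting ('>') and indentation, then join wrapped lines so a
    # notice split across several lines is matched as a single message.
    lines = (line.lstrip("> ").strip() for line in syslog_text.splitlines())
    flat = " ".join(" ".join(lines).split())
    return [m.groupdict() for m in TERMINATED_RE.finditer(flat)]
```

Each returned "ssid" can then be looked up on the server with mminfo to
confirm how that save set is recorded.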
