Re: [Networker] Session statistics broken for over 2Tb.

2012-10-26 07:36:04
Subject: Re: [Networker] Session statistics broken for over 2Tb.
From: Francis Swasey <Frank.Swasey AT UVM DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Fri, 26 Oct 2012 07:33:52 -0400
My own mediadb (and the rest of /nsr/res) has come along from 32-bit OSes into 
the 64-bit era as well.  I also have experience with the savegroup emails not 
always being correct.  When the amount of data crosses the 2TB limit, it is a 
crapshoot whether the data displayed in nsradmin's 'show session' will be 
correct or not.  I often get pushback from my customers, and I have to explain 
to them that their backups are too big (in that regard, I like this bug!) and 
that, because of that, they will need to run an mminfo query to see the real 
size of their saveset.

I also opened an issue with EMC, and got my name added to the RFE for this 
problem, which is NW113348.  I don't know if EMC will ever have a real fix for 
it (other than a clean-slate restart, running scanner on all your media 
volumes [shudder]).  However, eventually some bright programmer will stumble on 
the exact combination that causes it and be able to write a conversion 
program to read in the 32-bit /nsr/res constructs and write out the correct 
64-bit /nsr/res constructs.  Still, I'm not going to hold my breath waiting!   
It is yet another reason to keep individual save sets below 2TB...  Yeah, I 
know, that's not realistic anymore.
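
For what it's worth, the numbers quoted further down in this thread are 
consistent with a plain 32-bit wraparound.  The following is only a sketch 
under assumptions that nothing here confirms: that the "kb" counters in the 
session statistics are unsigned 32-bit values (so they wrap every 2**32 KB) 
and that the "GB" figures in the session attribute are decimal (10**6 KB).  
The helper names unwrap_kb and kb_to_gb are made up for illustration.  A 
signed counter would already go wrong at 2**31 KB (2 TiB), which may be why 
trouble is usually noticed around 2TB.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the "kb" counters are unsigned 32-bit values, so they wrap
# every 2**32 KB. This is inferred only from the numbers in this thread.
my $WRAP_KB = 2**32;

# Hypothetical helper: rebuild the true KB count from a wrapped counter,
# given how many times the counter is believed to have wrapped.
sub unwrap_kb {
    my ($reported_kb, $wraps) = @_;
    return $reported_kb + $wraps * $WRAP_KB;
}

# Hypothetical helper: convert KB to decimal GB (1 GB = 10**6 KB),
# rounded down.
sub kb_to_gb {
    my ($kb) = @_;
    return int($kb / 1_000_000);
}

# "total kb = 1018277321" comes from the session statistics quoted below.
my $total_kb = unwrap_kb(1018277321, 1);
printf "total: %.0f KB, about %d GB\n", $total_kb, kb_to_gb($total_kb);
# prints "total: 5313244617 KB, about 5313 GB"
```

That matches the "5313 GB" total that the session attribute reports for the 
same cloning session.  The number of wraps cannot be read from the counter 
itself, so it has to come from another source such as mminfo.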

-- Frank


On Oct 26, 2012, at 4:57 AM, Yaron Zabary <yaron AT aristo.tau.ac DOT il> wrote:

>  This doesn't make sense, because mminfo and nsradmin's 'show session' know 
> the correct size, as can be seen below. The problem seems to be with some 
> variable defined as 'int' and not 'long int' in nsradmin and NMC.
> 
> On 10/26/2012 09:47 AM, Tony Albers wrote:
>> AFAIK this is a known issue if you've upgraded a 32-bit media database
>> from an old NetWorker to a newer 64-bit NetWorker and mediadb.
>> 
>> I don't think there's any other way around it than building a completely new
>> 64-bit backup server and then moving the data to it. That is, use scanner
>> to populate the new media db (yes, I know).
>> 
>> /tony
>> 
>> 
>> Tony Albers  - Technical Consultant  -  Proact Systems A/S
>> Tel: +45 7010 1132 - Mobile: +45 2210 5208 - Fax: +45 7010 1142
>> toal AT proact DOT dk  www.proact.dk - We secure mission-critical information -
>> 
>> On 10/25/2012 05:48 PM, Yaron Zabary wrote:
>>> Hello all,
>>> 
>>>    I have this script which tries to dig some statistics out of nsradmin's
>>> session statistics. It works nicely for sessions smaller than 2TB, but
>>> breaks above that. I suspect that nsradmin uses 32-bit counters. For
>>> example:
>>> 
>>> [root@legato ~]# nsradmin
>>> NetWorker administration program.
>>> Use the "help" command for help, "visual" for full-screen mode.
>>> nsradmin>  . type: NSR
>>> Current query set
>>> nsradmin>  option hidden;
>>> 
>>> Hidden display option turned on
>>> 
>>> Display options:
>>>      Dynamic: Off;
>>>      Hidden: On;
>>>      Raw I18N: Off;
>>>      Resource ID: Off;
>>>      Regexp: Off;
>>> nsradmin>  option dynamic
>>> Dynamic display option turned on
>>> 
>>> Display options:
>>>      Dynamic: On;
>>>      Hidden: On;
>>>      Raw I18N: Off;
>>>      Resource ID: Off;
>>>      Regexp: Off;
>>> nsradmin>  show session statistics
>>> nsradmin>  print
>>>            session statistics: id = 285113144, jobid = 0,
>>>                                name = dayan-ng.tau.ac.il, mode = browsing,
>>>                                "group = ", "pool = ", "volume = ", rate kb = 0,
>>>                                amount kb = 0, total kb = 0, amount files = 0,
>>>                                total files = 0, start time = 1350993680,
>>>                                connect time = 185605, num volumes = 0,
>>>                                used volumes = 0, completion = running,
>>>                                flags = 0, "level = ", id = 285113524,
>>>                                jobid = 76501, name = cloning session,
>>>                                mode = recovering, "group = ", pool = DDPool,
>>>                                volume = DDPool.001.RO, rate kb = 0,
>>>                                amount kb = 129176321, total kb = 1018277321,
>>>                                amount files = 0, total files = 0,
>>>                                start time = 1351029303, connect time = 149982,
>>>                                num volumes = 0, used volumes = 0,
>>>                                completion = running, flags = 4, "level = ",
>>>                                id = 285113525, jobid = 76501,
>>>                                name = legato.tau.ac.il, mode = saving,
>>>                                "group = ", pool = TAUDefault, volume = JDF648,
>>>                                rate kb = 0, amount kb = 0, total kb = 0,
>>>                                amount files = 0, total files = 0,
>>>                                start time = 1351029303, connect time = 149982,
>>>                                num volumes = 0, used volumes = 5,
>>>                                completion = running, flags = 26, "level = ";
>>> nsradmin>
>>> [root@legato ~]# /usr/local/TAUSRC/Local/ToolBox/monstage.pl
>>> 76501 r=0MB/s size=841GB/971GB time=16205m ETA=5/23:43
>>> 
>>>    The size is reported correctly with nsradmin's session attribute:
>>> 
>>> [root@legato ~]# /usr/local/TAUSRC/Local/ToolBox/showsessions.pl|nl
>>>       1    dayan-ng.tau.ac.il:root browsing
>>>       2    cloning session:1 of 7 save set(s) reading from DDPool.001.RO 4431 GB of 5313 GB
>>>       3    legato.tau.ac.il:cloning session saving to pool 'TAUDefault' (JDF648)
>>> 
>>>   NMC is no better. It thinks that the size of this staging session is
>>> 1018GB. I had this investigated under SR#44358972, but they claimed that
>>> this was OK with 7.6.3HF and was related to NW138153. NetWorker is now
>>> 7.6.4.2.Build.1060, but the problem is still here.
>>> 
>>>   Does anyone know which version has this corrected?
>>> 
>>> 
>>> 
>>> #!/usr/bin/perl
>>> 
>>> use lib "/usr/local/TAUSRC/Local/ToolBox";
>>> use Nsradmin;
>>> require "timelocal.pl";
>>> 
>>> set_nsradmin("/usr/sbin/nsradmin");
>>> 
>>> $server = "legato";
>>> $query  = "type: NSR ";
>>> $show   = "session statistics";
>>> $options = "hidden; dynamic";
>>> 
>>> @reslist = query($server, $query, $show, $options);
>>> 
>>> #
>>> # A reslist is a list of resources.  Resources are a
>>> # hash of attributes, which have a name and value lists.
>>> #
>>> 
>>> $found = 0;
>>> foreach $res (@reslist) {
>>>        %attrlist = %{$res};
>>>        $attr = "session statistics";
>>>        @vallist = @{$attrlist{$attr}};
>>>        foreach $val (@vallist) {
>>>           if ($val =~ "jobid")
>>>           {
>>>             ($a,$jobid) = split(/ = /,$val);
>>>           }
>>>           if ($val =~ "total kb")
>>>           {
>>>             ($a,$totalkb) = split(/ = /,$val);
>>>           }
>>>           if ($val =~ "amount kb")
>>>           {
>>>             ($a,$amountkb) = split(/ = /,$val);
>>>           }
>>>           if ($val =~ "connect time")
>>>           {
>>>             ($a,$ctime) = split(/ = /,$val);
>>>             # Guard against a zero connect time to avoid dividing by zero.
>>>             $rate = $ctime > 0 ? $amountkb/$ctime : 0;
>>>             if ($found == 1)
>>>             {
>>>              #print "$totalkb $amountkb \n";
>>>              $left = $totalkb - $amountkb;
>>>              $leftt = $left/$rate if $rate > 0;
>>>              $eta = time() + $leftt;
>>>              ($sec,$min,$hour,$mday,$monx,$year,$wday,$yday,$isdst) =
>>>                    localtime($eta);
>>>              $rate = int($rate/1024);
>>>              $left = int($left/1024/1024);
>>>              $leftt = int($leftt/60);
>>>              $totalkb = int($totalkb/1024/1024);
>>>              printf "%d r=%dMB/s size=%dGB/%dGB time=%dm ETA=%d/%d:%02d\n",
>>>                    $jobid,$rate,$left,$totalkb,$leftt,$mday,$hour,$min
>>>                    if ($found == 1);
>>>              $found = 0;
>>>              #last;
>>>             }
>>>           }
>>>           $found = 1 if ($val =~ 'cloning session');
>>>        }
>>> }
>>> 
> 
> 
> -- 
> 
> -- Yaron.
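
Since the session attribute appears to carry the right figures (the 
"4431 GB of 5313 GB" line in the showsessions.pl output above), one possible 
workaround for a monitoring script is to read the progress from that text 
instead of from the "kb" counters.  The following is only a rough sketch: 
the helper name parse_progress is made up, and the text format is assumed 
from the single sample line in this thread, so other units or wordings may 
well exist.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull "<done> <unit> of <total> <unit>" out of a
# session description. The format is assumed from the one sample line
# quoted in this thread.
sub parse_progress {
    my ($line) = @_;
    if ($line =~ /(\d+)\s*(KB|MB|GB|TB)\s+of\s+(\d+)\s*(KB|MB|GB|TB)/) {
        return { done => $1, done_unit => $2, total => $3, total_unit => $4 };
    }
    return;    # no progress figures found on this line
}

my $line = "cloning session:1 of 7 save set(s) reading from DDPool.001.RO "
         . "4431 GB of 5313 GB";

my $p = parse_progress($line);
if ($p && $p->{done_unit} eq $p->{total_unit} && $p->{total} > 0) {
    # Only compute a percentage when both figures use the same unit.
    printf "%d %s of %d %s (%.0f%%)\n",
        $p->{done}, $p->{done_unit}, $p->{total}, $p->{total_unit},
        100 * $p->{done} / $p->{total};
}
```

The pattern requires a unit after the first number, so the "1 of 7 save 
set(s)" part of the line is not mistaken for a size.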