Veritas-bu

[Veritas-bu] Monitoring performance at the buffer level

2004-08-12 09:08:13
Subject: [Veritas-bu] Monitoring performance at the buffer level
From: dave.markham AT icl DOT net (Dave Markham)
Date: Thu, 12 Aug 2004 14:08:13 +0100
Dave Markham wrote:

> Ed Wilts wrote:
>
>> On Tue, Aug 10, 2004 at 03:59:35PM -0600, Mark.Donaldson AT cexp DOT com wrote:
>>  
>>
>>> Here's a quick and dirty script that sweeps the bptm logs on a media
>>> server for a supplied policy name and reports the "fill_buffer, waiting
>>> on empty buffer" and "write_backup, waiting on full buffer" statistics.
>>>
>>> Output looks like this:
>>>
>>> > policy_perf Hot_PRD
>>>
>>> ## Gathering data..........Done.
>>> ## Write to buffer waiting on available buffer:
>>> Min: 0  Avg: 356  Max: 5877 with 285 samples
>>>
>>> ## Write to tape waiting on full buffer:
>>> Min: 0  Avg: 43373  Max: 290583 with 7 samples
>>>   
>>
>>
>> I've added a section to optionally pass in a date so I can go back
>> through previous days' logs. Here's a sample:
>>
>> [root@osiris ewilts]# ./perf.sh osiris-vpn 081004
>> Using /usr/openv/netbackup/logs/bptm/log.081004
>> ## Gathering data.................................................Done.
>> ## Write to buffer waiting on available buffer:
>> Min: 0  Avg: 1216  Max: 42479 with 150 samples
>>
>> ## Write to tape waiting on full buffer:
>> Min: 317  Avg: 34382  Max: 157723 with 48 samples
>>
>>  
>>
>>> If the write to buffer is waiting for an available empty buffer a whole
>>> bunch, then perhaps you should increase your buffer count.  If your tape
>>> writing process is waiting on a full buffer a lot, then you're starving
>>> your tape drives and you should find a way to increase the delivery of
>>> client data to your media server or increase your multiplexing factor.
>>>   
>>
>>
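For anyone wanting to reproduce the "Min / Avg / Max with N samples" lines, here is a minimal sketch of just the summary step. It assumes the wait or delay counts have already been pulled out of the bptm log, one integer per line; that extraction is not shown in this thread, so it is not attempted here. The function name `summarize` is made up for illustration and is not taken from the original script.

```shell
#!/bin/sh
# summarize: read one integer per line on stdin and print the same style of
# summary as the report quoted above. Hypothetical helper, not the original.
summarize() {
    awk '
        NF == 0 { next }                      # ignore blank lines
        {
            v = $1 + 0                        # force numeric comparison
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            sum += v
            n++
        }
        END {
            if (n == 0) { print "No samples"; exit }
            # %d truncates the average to a whole number
            printf "Min: %d  Avg: %d  Max: %d with %d samples\n", min, sum / n, max, n
        }'
}
```

Feeding it three values, for example `printf '10\n20\n30\n' | summarize`, gives a single summary line in the same layout as the report.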
>> So what's a "whole bunch"?  Is what I'm seeing an issue I should deal
>> with?  Don't things like incrementals really slow down the tape
>> processing?
>>
>> Can it be broken down by host instead of by policy?  Having multiple
>> hosts per policy would make it difficult to target a system to fix.
>> There's also the minor issue of not knowing which hosts or policies even
>> have buffer messages in bptm.  The script is an excellent start though.
>>
>> My overall issue is that although we have GigE connections between many
>> hosts and the media servers, and are trying to drive 8 SDLT220 drives in
>> an L700, we almost never exceed 11 MB/s of traffic coming into the media
>> servers. It's like there's a cap there that we just haven't been able to
>> remove.
>> Thanks,
>>        .../Ed
>>
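To put the 11 MB/s figure next to nominal link speeds, here is a small arithmetic sketch. The helper names `to_mbit` and `link_pct` are invented for illustration, it assumes a shell with POSIX `$(( ))` arithmetic, and it ignores protocol overhead entirely, so treat it as a rough comparison rather than a diagnosis.

```shell
#!/bin/sh
# Convert a throughput in MB/s to Mbit/s (1 byte = 8 bits).
to_mbit() {
    echo $(( $1 * 8 ))
}

# Percentage of a nominal link speed (in Mbit/s) that a MB/s figure
# represents. Integer arithmetic, so the result is rounded down.
link_pct() {
    echo $(( $1 * 8 * 100 / $2 ))
}

to_mbit 11          # 88   (Mbit/s)
link_pct 11 100     # 88   (% of a 100 Mbit/s link)
link_pct 11 1000    # 8    (% of a 1000 Mbit/s link)
```

So 11 MB/s is roughly 88 Mbit/s, which would be a busy 100 Mbit/s link but only a small fraction of a gigabit one. That is one reason it is worth confirming the speed the interfaces actually negotiated.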
>>  
>>
> I just saw this script at the beginning of this thread, and today's log
> file has no policy info, so I quickly added a dirty way to check
> yesterday's log with a -y flag:
>
> policy=$1
> yesterday=$2
>
>
> today=`date +%m%d%y`
> TMPFILEf=/tmp/`basename $0`.tmp.f
> TMPFILEw=/tmp/`basename $0`.tmp.w
>
> [ -f $TMPFILEf ] && rm -f $TMPFILEf
> [ -f $TMPFILEw ] && rm -f $TMPFILEw
>
> [ "$2" = "" ] && yesterday="undef"
>
> if [ "$yesterday" = "-y" ]
> then
>        if [ -n "`echo $today | grep '^0'`" ]
>        then
>                today1=`expr $today - 1`
>                today=0$today1
>        else
>                today=`expr $today - 1`
>        fi
> fi
>
> echo "## Gathering data.\c"
> ...............script continues
>
>
>
> In answer to your capping question, have you checked the gigabit ndd
> settings for the device? This may help:
>
> This is nicked from a script I wrote to set device parameters
> automatically on boot, or to check them when run manually.
>
> do_check_ge()
> {
>        DEV=$1
>        INST=$2
>
>        ndd -set $DEV instance ${INST}
>        echo "+-------------------+"
>        if [ `ndd $DEV link_status` = 0 ];then
>                echo "$DEV${INST} status is down";else
>                echo "$DEV${INST} status is up"
>        fi
>
>        if [ `ndd $DEV link_speed` = 1000 ];then
>                echo "$DEV${INST} link speed 1000 Mbps";else
>                echo "$DEV${INST} link not up"
>        fi
>
>        if [ `ndd $DEV link_mode` = 0 ];then
>                echo "$DEV${INST} link mode Half-Duplex";else
>                echo "$DEV${INST} link mode Full-Duplex"
>        fi
>
>        if [ `ndd $DEV adv_1000autoneg_cap` = 0 ];then
>                echo "$DEV${INST} Auto-Negotiation-OFF";else
>                echo "$DEV${INST} Auto-Negotiation-ON"
>        fi
>        if [ `ndd $DEV adv_pauseTX` = 0 ];then
>                echo "$DEV${INST} Transmit PAUSE Not Capable(default)";else
>                echo "$DEV${INST} Transmit PAUSE Capable"
>        fi
>        if [ `ndd $DEV adv_pauseRX` = 0 ];then
>                echo "$DEV${INST} Receive PAUSE Not Capable";else
>                echo "$DEV${INST} Receive PAUSE Capable(default)"
>        fi
> }
>
> do_set_ge()
> {
>        DEV=$1
>        INST=$2
>        ndd -set $DEV instance ${INST}
>        echo "Setting $DEV${INST} adv_1000autoneg_cap 0"
>        ndd -set $DEV adv_1000autoneg_cap 0
>        echo "Setting $DEV${INST} adv_1000fdx_cap 1"
>        ndd -set $DEV adv_1000fdx_cap 1
>        echo "Setting $DEV${INST} adv_1000hdx_cap 0"
>        ndd -set $DEV adv_1000hdx_cap 0
>        echo "Setting $DEV${INST} adv_pauseTX 0"
>        ndd -set $DEV adv_pauseTX 0
>        echo "Setting $DEV${INST} adv_pauseRX 1"
>        ndd -set $DEV adv_pauseRX 1
> }
>
> # Workings
>
> case "$1" in
> 'check')
>
> ### Ge gigabit interface different
>        GE_=`nawk '$NF == "\"ge\"" {print $2}' /etc/path_to_inst | uniq`
>        if [ "$GE_" != "" ];then
>                for x in ${GE_};do
>                        do_check_ge /dev/ge $x
>                done
>
>                ANS=`ckyorn -p "Do you want to force all nics 1000 Mbps, Full-Duplex, Auto Negotiation off?~"`
>                if [ "$ANS" = y ] || [ "$ANS" = Y ] || [ "$ANS" = YES ] || [ "$ANS" = yes ];then
>                        echo "Setting Interfaces"
>                        for x in ${GE_};do
>                                do_set_ge /dev/ge $x
>                        done
>                fi
>        fi
> ;;
>
> 'start')
>
> ### Ge gigabit interface different
>        GE_=`nawk '$NF == "\"ge\"" {print $2}' /etc/path_to_inst | uniq`
>        if [ "$GE_" != "" ];then
>                for x in ${GE_};do
>                        do_set_ge /dev/ge $x
>                done
>        fi
>
> ;;
>
> *)
>        echo "Usage: $0 { check | start }"
>        exit 1
> esac
> exit 0
>
>
>
> Thanks
>
I must apologize, I hadn't tested this properly. The -y option takes 1 off
the year part of the date, not the day. Sorry, I'll fix it.
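One possible way to step back a day without treating the whole mmddyy string as a single number is sketched below. This is only an illustration, not the fix that went into the script: the function name `prev_day` is made up, it assumes a shell with POSIX parameter expansion and `$(( ))` arithmetic (for example ksh), and it assumes two-digit years mean 20YY.

```shell
#!/bin/sh
# prev_day MMDDYY: print the previous calendar day in the same MMDDYY form.
# Hypothetical helper; assumes years 2000-2099, where every fourth year is
# a leap year.
prev_day() {
    mm=`echo "$1" | cut -c1-2`
    dd=`echo "$1" | cut -c3-4`
    yy=`echo "$1" | cut -c5-6`

    # Strip one leading zero so the shell does not read "08" as octal.
    m=${mm#0}
    d=${dd#0}
    y=${yy#0}

    d=$((d - 1))
    if [ "$d" -eq 0 ]; then
        # Rolled past the first of the month: move to the previous month.
        m=$((m - 1))
        if [ "$m" -eq 0 ]; then
            m=12
            y=$(( (y + 99) % 100 ))    # previous year, wrapping 00 to 99
        fi
        case $m in
            4|6|9|11) d=30 ;;
            2) if [ $((y % 4)) -eq 0 ]; then d=29; else d=28; fi ;;
            *) d=31 ;;
        esac
    fi
    printf '%02d%02d%02d\n' "$m" "$d" "$y"
}
```

In the script above it could be used as `today=$(prev_day "$today")` inside the -y branch. Where GNU date is available, `date -d yesterday +%m%d%y` gives the same result directly.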

Dave