[Veritas-bu] Monitoring performance at the buffer level
2004-08-12 09:08:13
Subject: [Veritas-bu] Monitoring performance at the buffer level
From: dave.markham AT icl DOT net (Dave Markham)
Date: Thu, 12 Aug 2004 14:08:13 +0100
Dave Markham wrote:
> Ed Wilts wrote:
>
>> On Tue, Aug 10, 2004 at 03:59:35PM -0600, Mark.Donaldson AT cexp DOT com
>> wrote:
>>
>>
>>> Here's a quick and dirty script that sweeps the bptm logs on a media
>>> server for a supplied policy name and reports the "fill_buffer, waiting
>>> on empty buffer" and "write_backup, waiting on full buffer" statistics.
>>>
>>> Output looks like this:
>>>
>>>
>>>
>>>> policy_perf Hot_PRD
>>>>
>>>
>>> ## Gathering data..........Done.
>>> ## Write to buffer waiting on available buffer:
>>> Min: 0 Avg: 356 Max: 5877 with 285 samples
>>>
>>> ## Write to tape waiting on full buffer:
>>> Min: 0 Avg: 43373 Max: 290583 with 7 samples
>>>
>>
>>
>> I've added a section to optionally pass in a date so I can go back
>> through previous days' logs, and here's a sample:
>>
>> [root@osiris ewilts]# ./perf.sh osiris-vpn 081004
>> Using /usr/openv/netbackup/logs/bptm/log.081004
>> ## Gathering data.................................................Done.
>> ## Write to buffer waiting on available buffer:
>> Min: 0 Avg: 1216 Max: 42479 with 150 samples
>>
>> ## Write to tape waiting on full buffer:
>> Min: 317 Avg: 34382 Max: 157723 with 48 samples
>>
>>
>>
>>> If the write to buffer is waiting for an available empty buffer a whole
>>> bunch, then perhaps you should increase your buffer count. If your tape
>>> writing process is waiting on a full buffer a lot, then you're starving
>>> your tape drives and you should find a way to increase the delivery of
>>> client data to your media server or increase your multiplexing factor.
>>>
>>
>>
>> So what's a "whole bunch"? Is what I'm seeing an issue I should deal
>> with? Don't things like incrementals really slow down the tape
>> processing?
>>
>> Can it be broken down by host instead of by policy? Having multiple
>> hosts per policy would make it difficult to target a system to fix.
>> There's also the minor issue of not knowing which hosts or policies even
>> have buffer messages in bptm. The script is an excellent start though.
>>
>> My overall issue is that although we have GigE connections between many
>> hosts and the media servers, and are trying to drive 8 SDLT220 drives in
>> an L700, we almost never exceed 11 MB/s of traffic coming into the media
>> servers. It's like there's a cap there that we just haven't been able to
>> remove.
>> Thanks,
>> .../Ed
>>
>>
>>
> I just saw this script at the beginning of this thread. My log file for
> today has no policy info, so I quickly added a dirty way to check
> yesterday's log with a -y flag:
>
> policy=$1
> yesterday=$2
>
>
> today=`date +%m%d%y`
> TMPFILEf=/tmp/`basename $0`.tmp.f
> TMPFILEw=/tmp/`basename $0`.tmp.w
>
> [ -f $TMPFILEf ] && rm -f $TMPFILEf
> [ -f $TMPFILEw ] && rm -f $TMPFILEw
>
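> # Default the second argument when it is not supplied
> [ "$2" = "" ] && yesterday="undef"
>
> if [ "$yesterday" = "-y" ]
> then
>     # NOTE: expr treats mmddyy as a single number, so subtracting 1
>     # changes the trailing (year) digits, not the day.
>     if [ -n "`echo $today | grep '^0'`" ]
>     then
>         # expr drops the leading zero, so put it back
>         today1=`expr $today - 1`
>         today=0$today1
>     else
>         today=`expr $today - 1`
>     fi
> fi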
>
> echo "## Gathering data.\c"
> ...............script continues
>
>
>
> As for the cap you're seeing, have you checked the gigabit ndd settings
> for the device? This may help:
>
> This is nicked from a script I wrote to set device parameters
> automatically at boot, or to check them when run manually.
>
> do_check_ge()
> {
> DEV=$1
> INST=$2
>
> ndd -set $DEV instance ${INST}
> echo "+-------------------+"
> if [ `ndd $DEV link_status` = 0 ];then
> echo "$DEV${INST} status is down";else
> echo "$DEV${INST} status is up"
> fi
>
> if [ `ndd $DEV link_speed` = 1000 ];then
> echo "$DEV${INST} link speed 1000 Mbps";else
> echo "$DEV${INST} link not up"
> fi
>
> if [ `ndd $DEV link_mode` = 0 ];then
> echo "$DEV${INST} link mode Half-Duplex";else
> echo "$DEV${INST} link mode Full-Duplex"
> fi
>
> if [ `ndd $DEV adv_1000autoneg_cap` = 0 ];then
> echo "$DEV${INST} Auto-Negotiation-OFF";else
> echo "$DEV${INST} Auto-Negotiation-ON"
> fi
> if [ `ndd $DEV adv_pauseTX` = 0 ];then
> echo "$DEV${INST} Transmit PAUSE Not Capable(default)";else
> echo "$DEV${INST} Transmit PAUSE Capable"
> fi
> if [ `ndd $DEV adv_pauseRX` = 0 ];then
> echo "$DEV${INST} Receive PAUSE Not Capable";else
> echo "$DEV${INST} Receive PAUSE Capable(default)"
> fi
> }
>
> do_set_ge()
> {
> DEV=$1
> INST=$2
> ndd -set $DEV instance ${INST}
> echo "Setting $DEV${INST} adv_1000autoneg_cap 0"
> ndd -set $DEV adv_1000autoneg_cap 0
> echo "Setting $DEV${INST} adv_1000fdx_cap 1"
> ndd -set $DEV adv_1000fdx_cap 1
> echo "Setting $DEV${INST} adv_1000hdx_cap 0"
> ndd -set $DEV adv_1000hdx_cap 0
> echo "Setting $DEV${INST} adv_pauseTX 0"
> ndd -set $DEV adv_pauseTX 0
> echo "Setting $DEV${INST} adv_pauseRX 1"
> ndd -set $DEV adv_pauseRX 1
> }
>
> # Workings
>
> case "$1" in
> 'check')
>
> ### Ge gigabit interface different
> GE_=`nawk '$NF == "\"ge\"" {print $2}' /etc/path_to_inst | uniq`
> if [ "$GE_" != "" ];then
> for x in ${GE_};do
> do_check_ge /dev/ge $x
> done
>
> ANS=`ckyorn -p "Do you want to force all nics 1000 Mbps, Full-Duplex, Auto Negotiation off?~"`
> if [ "$ANS" = y ] || [ "$ANS" = Y ] || [ "$ANS" = YES ] || [ "$ANS" = yes ];then
> echo "Setting Interfaces"
> for x in ${GE_};do
> do_set_ge /dev/ge $x
> done
> fi
> fi
> ;;
>
> 'start')
>
> ### Ge gigabit interface different
> GE_=`nawk '$NF == "\"ge\"" {print $2}' /etc/path_to_inst | uniq`
> if [ "$GE_" != "" ];then
> for x in ${GE_};do
> do_set_ge /dev/ge $x
> done
> fi
>
> ;;
>
> *)
> echo "Usage: $0 { check | start }"
> exit 1
> esac
> exit 0
>
>
>
> Thanks
>
>
>
>
>
>
>
> _______________________________________________
> Veritas-bu maillist - Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
I must apologize, I hadn't tested this properly. The -y option takes 1 off
the year part of the date, not the day. Sorry, I'll fix it.
Dave