[Orca-users] Re: Not reporting I/O wait under 2.6?

Blair Zajac blair at orcaware.com
Sat Nov 24 13:18:39 PST 2001


"Nick Steel (Lists)" wrote:
> 
> > Hi Blair -
> >
> > > Nick,
> > >
> > > I haven't seen this problem myself.
> > >
> > > Which version of SE are you using on the 2.6 hosts?  Can you try the
> > > other example SE scripts and see if they generate the same types of
> > > data?
> >
> > I'm using setoolkit 3.2.1 on all the hosts.
> >
> > I had a dig through the examples and contribs and didn't find any scripts
> > that explicitly show wio. What I did was modify cpustats.se a bit.
> > cpustats.se calculated idle by adding wait_time + idle_time together. I
> > added a couple more cols to its output to list wait_time and idle_time
> > individually. I've had a look at the output on both a Sol8 and Sol6 and it
> > looks like it's reporting it properly, in that it's reporting wait_time
> > higher than what the orcallator script is showing.
> >
> > I had a poke through the percol files and verified that orcallator.se is
> > writing out the wrong data.
> >
> > One thing that's interesting (and I'm not quite sure what to make of this,
> > but I thought it was worth mentioning): I modified orcallator.se 1.28 on one
> > of the Sol8 boxes to only look at cpu stats, gave it a very short refresh
> > rate, and ran it from the command line, and the numbers it put out looked
> > right. When I tried the same thing on a couple of the 2.6 boxes, I did not
> > get any output at all. I'm beginning to wonder if se could be broken on the
> > 2.6 boxes, although the rest of the se scripts in examples/ run without any
> > problems.
> >
> > A very confused
> > - Nick
> >
> > >
> > > Orcallator.se uses the SE supplied classes to get this information, so
> > > I'm guessing right now it's an issue with SE.
> 
> Ok, I'm replying to my own post. I had a poke around with orcallator.se and
> I found something quirky with vmglobal_total(): it's set up so that it will
> average the .wait_times for all the CPUs together if the OS minor version
> is < 70, otherwise it uses the sum. Here's the code from live_rules.se:
> 
>   if (GLOBAL_pvm_ncpus > 1) {
>     /* average over cpu count */
>     pvm.user_time        /= GLOBAL_pvm_ncpus;
>     pvm.system_time      /= GLOBAL_pvm_ncpus;
>     pvm.wait_time        /= GLOBAL_pvm_ncpus;
>     pvm.idle_time        /= GLOBAL_pvm_ncpus;
> #if MINOR_VERSION < 70
>     /* reduce wait time - only one CPU can ever be waiting - others are idle */
>     /* system counts all idle CPUs as waiting if any I/O is outstanding */
>     pvm.wait_time        /= GLOBAL_pvm_ncpus;
> #endif
> 
> I found that if I stopped it from averaging the wait_io the numbers I get
> back are much more reasonable and run pretty close to what I'm seeing from
> iostat being run at the same time.
> 
> Does anyone know if later kernel revisions of 2.6 changed the way wio is
> stored and reported that could be affecting vmglobal_total()?
> 
> - Nick

Nick,

I'm not too familiar with the changes made between OSes and SE in this regard.
I would send a message off to the SE feedback mailing list asking for help:

	se-feedback at setoolkit.co

If you could cc this mailing list to keep us in the loop, I'd appreciate it.
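
For anyone following along, the scaling in the snippet you quoted can be
sketched in Python. This is only a rough illustration, not SE code: the
function name scale_cpu_times, the sample figures, and the assumption that
MINOR_VERSION is 60 on a 2.6 host and 80 on a Solaris 8 host are all mine.

```python
def scale_cpu_times(times, ncpus, minor_version):
    """Mimic the per-CPU scaling quoted from live_rules.se (illustrative only)."""
    scaled = dict(times)
    if ncpus > 1:
        # average over cpu count, as in the quoted block
        for key in ("user_time", "system_time", "wait_time", "idle_time"):
            scaled[key] = scaled[key] / ncpus
        # the "#if MINOR_VERSION < 70" branch divides wait_time a second time
        if minor_version < 70:
            scaled["wait_time"] = scaled["wait_time"] / ncpus
    return scaled


# Hypothetical 4-CPU host; the values are made-up percentages summed across CPUs.
summed = {"user_time": 40.0, "system_time": 20.0,
          "wait_time": 120.0, "idle_time": 220.0}

old = scale_cpu_times(summed, ncpus=4, minor_version=60)  # assumed 2.6 value
new = scale_cpu_times(summed, ncpus=4, minor_version=80)  # assumed Sol8 value

print(old["wait_time"], new["wait_time"])
```

With these made-up inputs the pre-70 branch leaves wait_time at a quarter of
the plain average, and the four scaled figures no longer add up to 100, which
is the kind of under-reporting you describe.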

Best,
Blair

-- 
Blair Zajac <blair at orcaware.com> - Perl & sysadmin services for hire
Web and OS performance plots - http://www.orcaware.com/orca/


