[Orca-users] Re: Orcallator CPU Questions
Blair Zajac
blair at orcaware.com
Fri Aug 16 17:23:35 PDT 2002
"Gary M.Blumenstein" wrote:
>
> Dear Orca Users,
>
> I'm trying to better understand how Orcallator and SEtoolkit determine
> CPU utilization on a multiprocessor Sparc system. First here's a little
> background...
>
> We have a 16 processor Sparc machine used mainly for running an image
> processing application. Processing time for each image is anywhere
> between 2 and 6 minutes depending on image size, complexity, etc. Right now I
> have Orca and Orcallator.se set up to generate graphs using the default 5
> minute sampling interval and the results show max CPU usage rarely exceeds
> 25% user time. Very little time is spent in system and only occasional
> blips in wait. The vast majority (80-90%) of the CPU time remains idle
> and that has a few people around here a little perplexed.
>
> The author of the image processing code doesn't believe our Orcallator
> numbers accurately show how the CPUs are being used by his application.
> He says our sampling interval is too long and that we're "missing" periods
> where images are being processed and completed before the next Orcallator
> interval occurs. For example, where the image takes 2 minutes to complete
> but Orcallator reports every 5 minutes.
>
> He explained - and he's correct about this - when you watch mpstat every 5
> seconds while an image is being processed, you see instances where all 16
> processors are 100 percent busy executing a mix of system, user, i/o wait,
> and system calls. However, there are other times while the same image is
> being processed, where the CPUs go from busy, to kinda' busy, to
> not-so-busy, then back to fully busy again. Once the image is complete,
> the CPUs return back to idle.
>
> Based on mpstat, the programmer thinks we're running our Sparc E6500
> system at full-bore during image processing and we would see that if we
> decreased Orcallator's sample interval. In the past he has made the case
> to management that his application is very CPU intensive and thus requires
> massive amounts of hardware to run. He was a little perturbed when I
> showed the Orcallator stats during a presentation in front of the whole
> program management group because this seemed to contradict a lot of the
> justification that was used to purchase the big iron. I'm not trying to
> demonstrate under-utilization or trivialize the application. I'm just
> trying to find a tool that accurately reports the system's true
> utilization.
I would run several instances of orcallator.se for a day or two,
each at a different time resolution, and look at the raw data.
The last command line argument to orcallator.se tells it how many
seconds to wait between measurements. So you may want to try something
like this:
    se -DWATCH_OS orcallator.se 5 > orcallator-data-5
    se -DWATCH_OS orcallator.se 30 > orcallator-data-30
    se -DWATCH_OS orcallator.se 60 > orcallator-data-60
    se -DWATCH_OS orcallator.se 120 > orcallator-data-120
The 5 second interval may be excessive, but with this data you can
compare the numbers to mpstat to make sure they match.
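
To see what a longer interval does to the numbers, here is a rough
Python sketch. It assumes each reported value is the average busy
percentage over its interval, and it uses made-up data (a 2 minute job
at 100% busy followed by 8 idle minutes), not real orcallator.se
output. The function name window_means is only for illustration.

```python
def window_means(samples, window):
    """Average consecutive groups of `window` samples.

    A trailing partial group, if any, is dropped.
    """
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]

# Hypothetical load: one busy-percentage value every 5 seconds for 10 minutes.
# 24 samples at 100% (a 2 minute job), then 96 samples at 0% (8 idle minutes).
five_sec = [100.0] * 24 + [0.0] * 96

for seconds in (5, 30, 60, 120, 300):
    means = window_means(five_sec, seconds // 5)
    peak = max(means)
    overall = sum(means) / len(means)
    print(f"{seconds:>3}s interval: peak {peak:5.1f}%  mean {overall:5.1f}%")
```

Under that assumption the overall mean comes out the same at every
interval, while the highest single value drops once the interval is
longer than the busy period. So a long interval would flatten the
peaks rather than skip the busy periods altogether.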
If you want, you can always set up a separate orcallator.se process
later on that measures over a shorter time period and feed this into
Orca for plotting. You'll have to edit the orcallator.cfg file to
change the `interval' parameter to match the new measurement interval
and then regenerate your RRD files.
Best,
Blair
--
Blair Zajac <blair at orcaware.com>
Web and OS performance plots - http://www.orcaware.com/orca/