[Orca-users] Old data / parsed data
David Michaels
dragon at raytheon.com
Thu Sep 6 16:41:05 PDT 2007
Francisco Mauro Puente wrote:
> Thanks Michael, Dragon,
>
> Here is the situation:
>
> I used to run Orca on the Linux box (Pentium 4 2.8 GHz / 512 MB RAM), and
> whenever Orca started to run, the disk activity brought the system to its
> knees...
>
> Since I had all the servers scp'ing their files over to the Linux machine,
> I couldn't change that overnight, so I decided to leave the files where
> they are, but share the rrd and html directories to a Sun v490 so Orca
> could run and process the data remotely, storing the results on the
> NFS-mounted directory.
>
> While Orca ran on the Linux box, the disk and CPU activity caused VERY
> high I/O (there are some other things running on that box). I'm processing
> data for 30 servers, and Orca dies after some time here.
>
> Now that the files are being processed on the v490, I've managed to move
> the CPU load to the Sun box, but the disks are being accessed the same
> way, and the Linux box becomes useless... nothing else can be done on it
> once Orca starts to run.
>
Ah, I see. Ideally you should change things so that the raw files are
written (scp'ed) to the Sun instead. If that's not an option, maybe you
can change things so that the RRD files are written to one of the Sun's
local drives instead of the Linux box. That would help a bit. Moving
the HTML files so they reside on the Sun's local disk would also help.
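For instance (a sketch only -- rrd_dir and html_dir are parameter names from
a typical orcallator.cfg, and the paths here are hypothetical), something
along these lines would keep the derived files off the NFS mount:

    # write the RRDs and the generated HTML to local disk on the v490
    rrd_dir     /export/orca/rrd
    html_dir    /export/orca/html

The raw data files would still be read from wherever your config points
today, so only the write traffic moves off the Linux box.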
I would also look at the Linux box's paging activity -- you might
simply not have enough RAM, causing the system to do a lot of swapping
(thus increasing I/O load tremendously). Adding more RAM would help there.
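A quick way to check is to watch the box while Orca is running -- nonzero
si/so columns from vmstat mean it is actively swapping:

    vmstat 5 12      # sample every 5 seconds for about a minute; watch si/so
    free -m          # how much RAM is really free vs. just sitting in cache

(Both are standard Linux commands; the sampling interval is just an example.)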
Consider also looking at how you scp files -- if you are scp'ing all
files all the time, that would kill your I/O over time. Try adjusting
it to only scp recent files (find . -mtime -2 -exec scp {}
remotehost:/remote/path \; ). Or perhaps use "rsync" or equivalent
tools to distribute only new data.
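As a rough sketch (hypothetical paths and hostnames -- adjust to wherever
your raw files actually live), each client could run something like:

    # let rsync work out what is new or changed and send only that
    rsync -az /var/orca/percol/ v490:/export/orca/raw/`hostname`/

The -z compression also cuts the network traffic compared to plain scp.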
> I'm in the process of getting a new server with 2 or 4 CPUs in order to
> run Orca.
>
If you haven't made a decision yet, you might want to avoid a Niagara box
-- they're fantastic transaction machines, but not very good at floating
point, which Orca does a lot of. Niagara 2s are much better, as are the
UltraSPARCs and Intel clones.
> I'm using RICHPse-3.4.1, and will update Orca to r529 or later as you
> suggested.
>
> I've attached one of my servers' raw data directories, so you can see the
> size of the files.
>
Didn't see it -- it was probably too big for the mailing list. Maybe
you can just cut & paste a "du -sk *" for me?
> I know a simple 'find' will remove them, but once the data is already
> generated, couldn't I just remove them all? Same thing on the client
> side... right? I should keep only the files generated in the html
> directory, right?
>
I believe the common practice is to hold on to the raw data
(compressed), as everything else can be derived from that. If you lose
your RRD files or your HTML files, for example, you can recreate them
from the raw data files. If you lose the raw data files, though, a later
loss of the RRD or HTML files is no longer recoverable. Also, you may encounter
instances where you need to regenerate the RRD files or HTML files from
scratch. Without the raw data files, you lose this option, and that
could be problematic down the road.
If you want to reduce space, perhaps archiving your raw data would be
the way to go. I wouldn't remove old raw data except as a last resort.
If you have a lot of change in your environment, comb through the raw,
RRD, and HTML directories, and see if you can find directories for
servers that no longer exist or that have changed in substantial ways --
remove those areas first, to help mitigate your space crunch.
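If you do go the archiving route, something like the following would keep
everything recoverable at a fraction of the space (hypothetical path and
file naming, and note that -- as far as I recall -- Orca can read gzipped
raw files directly):

    # compress raw files older than 30 days, skipping anything already gzipped
    find /export/orca/raw -type f -name 'percol-*' -mtime +30 ! -name '*.gz' \
         -exec gzip {} \;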
And of course, consider altering the "orca" script and/or orcallator.se
to record less data.
> I hope this information helps a bit more.
>
Yes, very informative, thanks. :)
--Dragon