[Orca-users] Re: Orcallator on Solaris 10

Dmitry Berezin dberezin at surfside.rutgers.edu
Tue Nov 1 07:20:00 PST 2005


Liston,

Have you tried removing '#define WATCH_IO 1' at the beginning of
orcallator.se? If that does not help, try adding -DNO_KSTAT_IO to the
line in the start_orcallator shell script where SE is called (at the
end of the script). Let me know if this does not help either.
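
A rough sketch of the first step, in case it is useful. It assumes the
define sits at the start of a line exactly as '#define WATCH_IO 1'; the
helper name disable_watch_io and the file arguments are placeholders of
mine, not anything shipped with Orca. It writes a modified copy instead
of editing the file in place:

```shell
#!/bin/sh
# Hypothetical helper: copy an SE script with the WATCH_IO define
# commented out. SE scripts use // comments, as orcallator.se does.
# Usage: disable_watch_io input.se output.se
disable_watch_io() {
    # Prefix the matching line with //; all other lines pass through.
    # [[:space:]] needs a POSIX sed (on Solaris, /usr/xpg4/bin/sed).
    sed 's|^#define[[:space:]][[:space:]]*WATCH_IO[[:space:]]|//&|' "$1" > "$2"
}
```

The second step needs no script: it is only an extra -D flag on the se
command line, passed the same way as the -DWATCH_OS flag in the traces
quoted below.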

  -Dmitry.
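
P.S. On the earlier suggestion to check for broken links in /dev/dsk,
here is a minimal sketch of one way to list them. The function name
list_broken_links is made up for illustration, and the script assumes a
POSIX shell: the old Solaris /bin/sh does not support "test -e", so run
it under ksh or /usr/xpg4/bin/sh there.

```shell
#!/bin/sh
# Hypothetical helper: print every symbolic link directly under a
# directory whose target no longer exists.
# Usage: list_broken_links /dev/dsk
list_broken_links() {
    for link in "$1"/*; do
        # -h is true for a symlink; -e follows the link, so it is
        # false when the target is missing.
        if [ -h "$link" ] && [ ! -e "$link" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

It only reports the dangling entries and does not remove anything.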

> -----Original Message-----
> From: orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com
> [mailto:orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com] On
> Behalf Of orca at bias.org
> Sent: Monday, October 31, 2005 10:24 PM
> To: orca-users at orcaware.com
> Subject: Re: [Orca-users] Re: Orcallator on Solaris 10
> 
> I'm certain I need to edit the orcallator.se file.  The orcallator script
> bombs (segmentation fault) when it is traversing the disk/tape devices
> after I remove LUNs that were there previously.  As a result, I get
> none/zip/nada data in my orcallator data files to generate graphs from.
> 
> This appears to be a Solaris 10 issue, as I am seeing problems with other
> things (like EMC PowerPath) too, so I am also researching it from that
> angle.
> 
> Thanks,
> Liston
> 
> On Tue, 1 Nov 2005, Ditlev, Unix Consulting wrote:
> 
> > You can disable disk presentation from your orcallator.cfg.
> >
> > They will however still appear in data collection; if you want them
> > removed from there, try looking in the se script.
> >
> > brgds
> > Tommy
> >
> > /// uncomment ... from cfg
> > plot {
> > title			%g Disk System Wide Reads/Writes Per Second
> > source			orcallator
> > data			disk_rd/s
> > data			disk_wr/s
> > line_type		area
> > line_type		line1
> > legend			Reads/s
> > legend			Writes/s
> > y_legend		Ops/s
> > data_min		0
> >
> > href			http://www.orcaware.com/orca/docs/orcallator.html#disk_system_wide_reads_writes_per_second
> > }
> >
> > maybe try fiddling with your se file
> > // If WATCH_OS is defined, then measure every part of the operating
> > // system.
> > //#ifdef WATCH_OS
> > #define WATCH_CPU		1
> > #define WATCH_MUTEX		1
> > #define WATCH_NET		1
> > #define WATCH_TCP		1
> > #define WATCH_NFS_CLIENT	1
> > #define WATCH_NFS_SERVER	1
> > #define WATCH_MOUNTS		1
> > //#define WATCH_DISK		1
> > #define WATCH_DNLC		1
> > #define WATCH_INODE		1
> > #define WATCH_RAM		1
> > #define WATCH_PAGES		1
> > //#endif
> >
> >
> >
> > ----- Original Message ----- From: <orca at bias.org>
> > To: <orca-users at orcaware.com>
> > Sent: Monday, October 31, 2005 7:58 PM
> > Subject: RE: [Orca-users] Re: Orcallator on Solaris 10
> >
> >
> >> Still having issues with a disks.se Segmentation Fault after
> >> adding/removing disks on a Solaris 10 system with Veritas 4.1.
> >>
> >> Does anyone know an easy way to turn off disk data collection, so I can
> >> get the other (network, CPU, etc.) info on the system?  I have tried
> >> changing some parameters in orcallator.se to 0, but that doesn't seem
> >> to be working.
> >>
> >> Thanks,
> >> Liston
> >>
> >> On Wed, 28 Sep 2005 orca at bias.org wrote:
> >>
> >>> disks.se seems to be the problem.  I am getting consistent
> >>> Segmentation Faults on it (tail of verbose output below comments):
> >>>
> >>>  # /opt/RICHPse/bin/se disks.se
> >>>  Segmentation Fault(coredump)
> >>>
> >>> It seems to only be a problem on Solaris 10 systems doing lots of LUN
> >>> adding/removing.  I have not seen the problem on Solaris 8, where we
> >>> also do LUN adding/removing, but the activity on Solaris 8 systems is
> >>> less.
> >>>
> >>> I did rebuild the devices from CD boot and the problem went away again
> >>> on the one system in question.  I could do a clean se disks.se run and
> >>> start orcallator.  The devfsadm -C suggestion worked on a couple of
> >>> systems, but did not work on another 5 systems having the same issue.
> >>>
> >>> SE disks.se seems to have some issues with processing LUNs that still
> >>> show up in the devices but are not currently attached to the system.
> >>>
> >>> Thanks,
> >>> Liston
> >>>
> >>>
> >>> Tail Output from /opt/RICHPse/bin/se -d -DWATCH_OS disks.se
> >>>
> >>> if (strchr(dp.d_name<c3t61d0>, <116>) != <(nil)>)
> >>> sscanf(dp.d_name<c3t61d0>, <c%dt%dd%d>, &c<0>, &t<0>, &d<95>)
> >>> GLOBAL_disk_info[2].controller = c<3>
> >>> GLOBAL_disk_info[2].target = t<61>
> >>> GLOBAL_disk_info[2].device = d<0>
> >>> GLOBAL_disk_info[2].short_name = <(nil)>
> >>> if ((np = str2int_get(path_tree<4297124416>,
> >>> points_at</pci@1d,700000/lpfc@1,1/sd@3d,0>)) != <0>)
> >>> anp = avlget(tree<4297124416>, ((ulonglong)
> >>> tag</pci@1d,700000/lpfc@1,1/sd@3d,0>))
> >>> if (anp<0> == <0>)
> >>> return <0>;
> >>> if (GLOBAL_disk_info[2].short_name<(nil)> == <(nil)>)
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217648>)
> >>> if (dp.d_name<c3t61d0s1> == <.> || dp.d_name<c3t61d0s1> == <..>)
> >>> if (!(dp.d_name<c3t61d0s1> =~ <s0$>))
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217680>)
> >>> if (dp.d_name<c3t61d0s2> == <.> || dp.d_name<c3t61d0s2> == <..>)
> >>> if (!(dp.d_name<c3t61d0s2> =~ <s0$>))
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217712>)
> >>> if (dp.d_name<emcpower8a> == <.> || dp.d_name<emcpower8a> == <..>)
> >>> if (!(dp.d_name<emcpower8a> =~ <s0$>))
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217744>)
> >>> if (dp.d_name<c3t61d0s3> == <.> || dp.d_name<c3t61d0s3> == <..>)
> >>> if (!(dp.d_name<c3t61d0s3> =~ <s0$>))
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217776>)
> >>> if (dp.d_name<c3t61d0s4> == <.> || dp.d_name<c3t61d0s4> == <..>)
> >>> if (!(dp.d_name<c3t61d0s4> =~ <s0$>))
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217808>)
> >>> if (dp.d_name<c3t61d0s5> == <.> || dp.d_name<c3t61d0s5> == <..>)
> >>> if (!(dp.d_name<c3t61d0s5> =~ <s0$>))
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217840>)
> >>> if (dp.d_name<c3t61d0s6> == <.> || dp.d_name<c3t61d0s6> == <..>)
> >>> if (!(dp.d_name<c3t61d0s6> =~ <s0$>))
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217872>)
> >>> if (dp.d_name<c3t61d0s7> == <.> || dp.d_name<c3t61d0s7> == <..>)
> >>> if (!(dp.d_name<c3t61d0s7> =~ <s0$>))
> >>> ld = readdir(dirp<18446744071543194240>)
> >>> if (count<2> == GLOBAL_diskinfo_size<146>)
> >>> dp = *((dirent_t *) ld<18446744071543217904>)
> >>> Segmentation Fault(coredump)
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, 27 Sep 2005, Cockcroft, Adrian wrote:
> >>>
> >>>> Also try running # se disks.se
> >>>> This should give you the full set of disks and mappings. If it also
> >>>> crashes, then there needs to be a generic fix in SE rather than a fix
> >>>> in Orcallator.
> >>>>
> >>>> Adrian
> >>>>
> >>>> -----Original Message-----
> >>>> From: orca-users-bounces+acockcroft=ebay.com at orcaware.com
> >>>> [mailto:orca-users-bounces+acockcroft=ebay.com at orcaware.com] On
> Behalf
> >>>> Of orca at bias.org
> >>>> Sent: Tuesday, September 27, 2005 2:37 PM
> >>>> To: orca-users at orcaware.com
> >>>> Subject: RE: [Orca-users] Re: Orcallator on Solaris 10
> >>>>
> >>>> devfsadm -C worked for one of the systems that has no SAN drives
> >>>> currently attached, but did not work for the others.  I still get a
> >>>> Segmentation Fault on those.  Thanks for the suggestions.
> >>>>
> >>>> I am using Version 3.4 of SE.  I never tried another version with
> >>>> Solaris 10, because I thought I needed 3.4 for it.
> >>>>
> >>>> I will look into the iostat class more closely.
> >>>>
> >>>> TimeFinder is not an option for us with CLARiiON, but we do something
> >>>> very similar by manually rewriting diskid and importing mapfiles of
> >>>> snapped Veritas volumes.
> >>>>
> >>>>
> >>>> On Tue, 27 Sep 2005, Dmitry Berezin wrote:
> >>>>
> >>>>> Liston,
> >>>>>
> >>>>> Unfortunately I don't have a Solaris 10 box yet, so I can't try this
> >>>>> out... However, we use EMC TimeFinder on some of our Solaris 8 servers
> >>>>> and have no problems with orcallator.se when volumes disappear and
> >>>>> reappear on the system.
> >>>>> You are correct about not finding RAWDISK code in the current releases
> >>>>> of orcallator.se - it has been removed.
> >>>>> From your snippet it appears that the problem is in the diskinfo.se
> >>>>> that comes with SE Toolkit. It is included in p_iostat_class.se (which
> >>>>> also comes with SE), which is in turn included in orcallator.se. What
> >>>>> version of SE are you using?
> >>>>> Try devfsadm -C (I believe it works the same on Solaris 10 as on 8).
> >>>>> Also, check for broken links in /dev/dsk.
> >>>>>
> >>>>>  -Dmitry.
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com
> >>>>>> [mailto:orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com]
> On
> >>>>>> Behalf Of orca at bias.org
> >>>>>> Sent: Tuesday, September 27, 2005 4:26 PM
> >>>>>> To: orca-users at orcaware.com
> >>>>>> Subject: [Orca-users] Re: Orcallator on Solaris 10
> >>>>>>
> >>>>>>
> >>>>>> I should have done more research before writing to the list last
> >>>>>> night about the segmentation error for orcallator... but here is more
> >>>>>> info now.
> >>>>>>
> >>>>>> It looks like I did have problems in March 2005 on this server with
> >>>>>> orcallator getting segmentation errors.  I even wrote to the list
> >>>>>> about this, thinking it was caused by the Veritas 4.1 install.  After
> >>>>>> a complete erase of all my disk paths through a reboot, the issue
> >>>>>> appeared to be solved and orcallator was working again.
> >>>>>>
> >>>>>> I saw a few comments about MAX_RAWDISKS and USE_RAWDISK in the
> >>>>>> orcallator.se file as a resolution to my issues.  There is no rawdisk
> >>>>>> reference in the versions I have (r497 and r412), so that does not
> >>>>>> appear to be the issue.
> >>>>>>
> >>>>>> Although I'm still not sure how, I am thinking that my disk paths
> >>>>>> coming and going are the culprit here.  We do a lot of moving around
> >>>>>> of volumes via snapshots and clones, so disks that appear today may
> >>>>>> not appear ever again.
> >>>>>>
> >>>>>> Does anyone else do a lot of disk path altering on their systems?  Do
> >>>>>> you have problems?
> >>>>>>
> >>>>>> I used the "-d -DWATCH_OS" options on the se run and got the following
> >>>>>> tail before it hit a segmentation fault:
> >>>>>>
> >>>>>> break;
> >>>>>> return sderr(STRUCTURE);
> >>>>>> count++;
> >>>>>> ld = readdir(dirp<18446744071543194240>)
> >>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
> >>>>>> dp = *((dirent_t *) ld<18446744071543217808>)
> >>>>>> if (dp.d_name<c3t61d91s1> == <.> || dp.d_name<c3t61d91s1> == <..>)
> >>>>>> if (!(dp.d_name<c3t61d91s1> =~ <s0$>))
> >>>>>> ld = readdir(dirp<18446744071543194240>)
> >>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
> >>>>>> dp = *((dirent_t *) ld<18446744071543217840>)
> >>>>>> if (dp.d_name<emcpower1a> == <.> || dp.d_name<emcpower1a> == <..>)
> >>>>>> if (!(dp.d_name<emcpower1a> =~ <s0$>))
> >>>>>> ld = readdir(dirp<18446744071543194240>)
> >>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
> >>>>>> dp = *((dirent_t *) ld<18446744071543217872>)
> >>>>>> if (dp.d_name<emcpower1b> == <.> || dp.d_name<emcpower1b> == <..>)
> >>>>>> if (!(dp.d_name<emcpower1b> =~ <s0$>))
> >>>>>> ld = readdir(dirp<18446744071543194240>)
> >>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
> >>>>>> dp = *((dirent_t *) ld<18446744071543217904>)
> >>>>>> Segmentation Fault
> >>>>>>
> >>>>>>
> >>>>>> I think I may need to rebuild the paths again, but I certainly would
> >>>>>> like to find a non-reboot option for this.
> >>>>>>
> >>>>>> I may also alter orcallator.se to completely eliminate gathering disk
> >>>>>> info, since load/cpu/network/etc. would be better than nothing.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Liston
> >>>>>> _______________________________________________
> >>>>>> Orca-users mailing list
> >>>>>> Orca-users at orcaware.com
> >>>>>> http://www.orcaware.com/mailman/listinfo/orca-users
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
> 
