[Orca-users] Re: Orcallator on Solaris 10
orca at bias.org
Mon Oct 31 19:23:46 PST 2005
I'm certain I need to edit the orcallator.se file. The orcallator script
bombs (segmentation fault) while traversing the disk/tape devices after
I remove LUNs that were previously there. As a result, I get
none/zip/nada in my orcallator data files to generate graphs from.
This appears to be a Solaris 10 issue, as I am seeing problems with
other things (like EMC PowerPath) too, so I am also researching it from
that angle.
Thanks,
Liston
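[Editor's note: Dmitry's suggestion later in this thread to check for
broken links in /dev/dsk can be scripted. A minimal sketch follows; the
DSK_DIR variable is my own addition so the one-liner can be pointed at a
test directory, and is not from the thread. On a real system it would
simply be /dev/dsk.]

```shell
# Check for dangling symlinks, per the /dev/dsk suggestion in this
# thread. DSK_DIR is a hypothetical knob added for testing; the real
# target directory is /dev/dsk.
DSK_DIR=${DSK_DIR:-/dev/dsk}

# Each /dev/dsk entry is a symlink into the /devices tree. "test -e"
# follows the link, so a link whose target is gone (e.g. a removed
# LUN) fails the test and gets printed.
if [ -d "$DSK_DIR" ]; then
    find "$DSK_DIR" -type l ! -exec test -e {} \; -print
fi
```

Any paths it prints are candidates for cleanup with devfsadm -C, which
was also suggested in the thread.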
On Tue, 1 Nov 2005, Ditlev, Unix Consulting wrote:
> You can disable the disk presentation in your orcallator.cfg.
>
> The disks will, however, still appear in data collection; if you want
> them removed from there, try looking in the se script.
>
> brgds
> Tommy
>
> // comment out ... from cfg
> plot {
> title %g Disk System Wide Reads/Writes Per Second
> source orcallator
> data disk_rd/s
> data disk_wr/s
> line_type area
> line_type line1
> legend Reads/s
> legend Writes/s
> y_legend Ops/s
> data_min 0
> href http://www.orcaware.com/orca/docs/orcallator.html#disk_system_wide_reads_writes_per_second
> }
>
> Maybe try fiddling with your se file:
> // If WATCH_OS is defined, then measure every part of the operating
> // system.
> //#ifdef WATCH_OS
> #define WATCH_CPU 1
> #define WATCH_MUTEX 1
> #define WATCH_NET 1
> #define WATCH_TCP 1
> #define WATCH_NFS_CLIENT 1
> #define WATCH_NFS_SERVER 1
> #define WATCH_MOUNTS 1
> //#define WATCH_DISK 1
> #define WATCH_DNLC 1
> #define WATCH_INODE 1
> #define WATCH_RAM 1
> #define WATCH_PAGES 1
> //#endif
>
>
>
> ----- Original Message ----- From: <orca at bias.org>
> To: <orca-users at orcaware.com>
> Sent: Monday, October 31, 2005 7:58 PM
> Subject: RE: [Orca-users] Re: Orcallator on Solaris 10
>
>
>> Still having issues with the disks.se Segmentation Fault after
>> adding/removing disks on a Solaris 10 system with Veritas 4.1.
>>
>> Does anyone know an easy way to turn off disk data collection, so I can
>> get the other (network, CPU, etc.) info on the system? I have tried
>> changing some parameters in orcallator.se to 0, but that doesn't seem
>> to be working.
>>
>> Thanks,
>> Liston
>>
>> On Wed, 28 Sep 2005 orca at bias.org wrote:
>>
>>> disks.se seems to be the problem. I am getting consistent Segmentation
>>> Faults from it (tail of verbose output below these comments):
>>>
>>> # /opt/RICHPse/bin/se disks.se
>>> Segmentation Fault(coredump)
>>>
>>> It seems to only be a problem on Solaris 10 systems doing lots of LUN
>>> adding/removing. I have not seen the problem on Solaris 8, where we
>>> also do LUN adding/removing, but the activity on the Solaris 8 systems
>>> is lower.
>>>
>>> I did rebuild the devices from a CD boot, and the problem went away
>>> again on the one system in question. I could then run a clean se
>>> disks.se and start orcallator. The devfsadm -C suggestion worked on a
>>> couple of systems, but did not work on another 5 systems having the
>>> same issue.
>>>
>>> SE's disks.se seems to have some issues processing LUNs that still
>>> show up in the device tree but are not currently attached to the
>>> system.
>>>
>>> Thanks,
>>> Liston
>>>
>>>
>>> Tail Output from /opt/RICHPse/bin/se -d -DWATCH_OS disks.se
>>>
>>> if (strchr(dp.d_name<c3t61d0>, <116>) != <(nil)>)
>>> sscanf(dp.d_name<c3t61d0>, <c%dt%dd%d>, &c<0>, &t<0>, &d<95>)
>>> GLOBAL_disk_info[2].controller = c<3>
>>> GLOBAL_disk_info[2].target = t<61>
>>> GLOBAL_disk_info[2].device = d<0>
>>> GLOBAL_disk_info[2].short_name = <(nil)>
>>> if ((np = str2int_get(path_tree<4297124416>,
>>> points_at</pci at 1d,700000/lpfc at 1,1/sd at 3d,0>)) != <0>)
>>> anp = avlget(tree<4297124416>, ((ulonglong)
>>> tag</pci at 1d,700000/lpfc at 1,1/sd at 3d,0>))
>>> if (anp<0> == <0>)
>>> return <0>;
>>> if (GLOBAL_disk_info[2].short_name<(nil)> == <(nil)>)
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217648>)
>>> if (dp.d_name<c3t61d0s1> == <.> || dp.d_name<c3t61d0s1> == <..>)
>>> if (!(dp.d_name<c3t61d0s1> =~ <s0$>))
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217680>)
>>> if (dp.d_name<c3t61d0s2> == <.> || dp.d_name<c3t61d0s2> == <..>)
>>> if (!(dp.d_name<c3t61d0s2> =~ <s0$>))
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217712>)
>>> if (dp.d_name<emcpower8a> == <.> || dp.d_name<emcpower8a> == <..>)
>>> if (!(dp.d_name<emcpower8a> =~ <s0$>))
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217744>)
>>> if (dp.d_name<c3t61d0s3> == <.> || dp.d_name<c3t61d0s3> == <..>)
>>> if (!(dp.d_name<c3t61d0s3> =~ <s0$>))
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217776>)
>>> if (dp.d_name<c3t61d0s4> == <.> || dp.d_name<c3t61d0s4> == <..>)
>>> if (!(dp.d_name<c3t61d0s4> =~ <s0$>))
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217808>)
>>> if (dp.d_name<c3t61d0s5> == <.> || dp.d_name<c3t61d0s5> == <..>)
>>> if (!(dp.d_name<c3t61d0s5> =~ <s0$>))
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217840>)
>>> if (dp.d_name<c3t61d0s6> == <.> || dp.d_name<c3t61d0s6> == <..>)
>>> if (!(dp.d_name<c3t61d0s6> =~ <s0$>))
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217872>)
>>> if (dp.d_name<c3t61d0s7> == <.> || dp.d_name<c3t61d0s7> == <..>)
>>> if (!(dp.d_name<c3t61d0s7> =~ <s0$>))
>>> ld = readdir(dirp<18446744071543194240>)
>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>> dp = *((dirent_t *) ld<18446744071543217904>)
>>> Segmentation Fault(coredump)
>>>
>>>
>>>
>>>
>>>
>>> On Tue, 27 Sep 2005, Cockcroft, Adrian wrote:
>>>
>>>> Also try running # se disks.se
>>>> This should give you the full set of disks and mappings. If it also
>>>> crashes, then there needs to be a generic fix in SE rather than a fix
>>>> in Orcallator.
>>>>
>>>> Adrian
>>>>
>>>> -----Original Message-----
>>>> From: orca-users-bounces+acockcroft=ebay.com at orcaware.com
>>>> [mailto:orca-users-bounces+acockcroft=ebay.com at orcaware.com] On Behalf
>>>> Of orca at bias.org
>>>> Sent: Tuesday, September 27, 2005 2:37 PM
>>>> To: orca-users at orcaware.com
>>>> Subject: RE: [Orca-users] Re: Orcallator on Solaris 10
>>>>
>>>> devfsadm -C worked for one of the systems that has no SAN drives
>>>> currently attached, but did not work for the others. I still get a
>>>> Segmentation Fault on those. Thanks for the suggestions.
>>>>
>>>> I am using version 3.4 of SE. I never tried another version with
>>>> Solaris 10, because I thought I needed 3.4 for it.
>>>>
>>>> I will look into the iostat class more closely.
>>>>
>>>> TimeFinder is not an option for us with CLARiiON, but we do something
>>>> very similar by manually rewriting the diskid and importing mapfiles
>>>> of snapped Veritas volumes.
>>>>
>>>>
>>>> On Tue, 27 Sep 2005, Dmitry Berezin wrote:
>>>>
>>>>> Liston,
>>>>>
>>>>> Unfortunately I don't have a Solaris 10 box yet, so I can't try this
>>>>> out... However, we use EMC TimeFinder on some of our Solaris 8
>>>>> servers and have no problems with orcallator.se when volumes
>>>>> disappear and reappear on the system.
>>>>>
>>>>> You are correct about not finding RAWDISK code in the current
>>>>> releases of orcallator.se - it has been removed.
>>>>>
>>>>> From your snippet it appears that the problem is in the diskinfo.se
>>>>> that comes with the SE Toolkit. It is included in p_iostat_class.se
>>>>> (which also comes with SE), which is in turn included in
>>>>> orcallator.se. What version of SE are you using?
>>>>>
>>>>> Try devfsadm -C (I believe it works the same on Solaris 10 as on 8).
>>>>> Also, check for broken links in /dev/dsk.
>>>>>
>>>>> -Dmitry.
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com
>>>>>> [mailto:orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com] On
>>>>>> Behalf Of orca at bias.org
>>>>>> Sent: Tuesday, September 27, 2005 4:26 PM
>>>>>> To: orca-users at orcaware.com
>>>>>> Subject: [Orca-users] Re: Orcallator on Solaris 10
>>>>>>
>>>>>>
>>>>>> I should have done more research before writing to the list last
>>>>>> night about the orcallator segmentation error... but here is more
>>>>>> info now.
>>>>>>
>>>>>> It looks like I did have problems in March 2005 on this server with
>>>>>> orcallator getting segmentation errors. I even wrote to the list
>>>>>> about this, thinking it was caused by the Veritas 4.1 install. A
>>>>>> complete erase of all my disk paths through a reboot appeared to
>>>>>> solve the issue, and orcallator was working again.
>>>>>>
>>>>>> I saw a few comments about MAX_RAWDISKS and USE_RAWDISK in the
>>>>>> orcallator.se file as a resolution to my issues. There is no
>>>>>> rawdisk reference in the versions I have (r497 and r412), so that
>>>>>> does not appear to be the issue.
>>>>>>
>>>>>> Although I'm still not sure how, I am thinking that my disk paths
>>>>>> coming and going are the culprit here. We do a lot of moving around
>>>>>> of volumes via snapshots and clones, so disks that appear today may
>>>>>> not ever appear again.
>>>>>>
>>>>>> Does anyone else do a lot of disk path altering on their systems?
>>>>>> Do you have problems?
>>>>>>
>>>>>> I ran se with the "-d -DWATCH_OS" options and got the following
>>>>>> tail before it does a segmentation fault:
>>>>>>
>>>>>> break;
>>>>>> return sderr(STRUCTURE);
>>>>>> count++;
>>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
>>>>>> dp = *((dirent_t *) ld<18446744071543217808>)
>>>>>> if (dp.d_name<c3t61d91s1> == <.> || dp.d_name<c3t61d91s1> == <..>)
>>>>>> if (!(dp.d_name<c3t61d91s1> =~ <s0$>))
>>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
>>>>>> dp = *((dirent_t *) ld<18446744071543217840>)
>>>>>> if (dp.d_name<emcpower1a> == <.> || dp.d_name<emcpower1a> == <..>)
>>>>>> if (!(dp.d_name<emcpower1a> =~ <s0$>))
>>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
>>>>>> dp = *((dirent_t *) ld<18446744071543217872>)
>>>>>> if (dp.d_name<emcpower1b> == <.> || dp.d_name<emcpower1b> == <..>)
>>>>>> if (!(dp.d_name<emcpower1b> =~ <s0$>))
>>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
>>>>>> dp = *((dirent_t *) ld<18446744071543217904>)
>>>>>> Segmentation Fault
>>>>>>
>>>>>>
>>>>>> I think I may need to rebuild the paths again, but would certainly
>>>>>> like to find a non-reboot option for this.
>>>>>>
>>>>>> I may also alter orcallator.se to completely eliminate gathering
>>>>>> disk info, since load/CPU/network/etc. would be better than
>>>>>> nothing.
>>>>>>
>>>>>> Thanks,
>>>>>> Liston
>>>>>> _______________________________________________
>>>>>> Orca-users mailing list
>>>>>> Orca-users at orcaware.com
>>>>>> http://www.orcaware.com/mailman/listinfo/orca-users
>>>>>
>>>>
>>>
>>
>>
>