[Orca-users] Re: Orcallator on Solaris 10

Tue Nov 1 12:43:19 PST 2005

Thanks for the suggestions.

Neither option worked.  I had already tried removing WATCH_IO, as well as
WATCH_DISK and WATCH_TAPE, and just tried removing WATCH_IO again, with no
luck.

- Liston

On Tue, 1 Nov 2005, Dmitry Berezin wrote:

> Liston,
>
> Have you tried removing '#define WATCH_IO 1' in the beginning of the
> orcallator.se? If that does not help, try adding -DNO_KSTAT_IO in the
> start_orcallator shell script to the line where SE is being called (at the
> end of the script). Let me know if this does not help either.
>
>  -Dmitry.
>
>> -----Original Message-----
>> From: orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com
>> [mailto:orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com] On
>> Behalf Of orca at bias.org
>> Sent: Monday, October 31, 2005 10:24 PM
>> To: orca-users at orcaware.com
>> Subject: Re: [Orca-users] Re: Orcallator on Solaris 10
>>
>> I'm certain I need to edit the orcallator.se file.  The orcallator script
>> bombs (segmentation fault) when it is traversing the disk/tape devices
>> after I remove luns that were there previously.  As such, I get
>> none/zip/nada data in my orcallator data files to generate graphs from.
>>
>> This appears to be a Solaris 10 issue, as I am seeing problems with other
>> things (like EMC PowerPath) too, so I am also researching it from that
>> angle.
>>
>> Thanks,
>> Liston
>>
>> On Tue, 1 Nov 2005, Ditlev, Unix Consulting wrote:
>>
>>> You can disable disk presentation from your orcallator.cfg.
>>>
>>> They will however still appear in data collection; if you want them moved
>>> from there, try looking in the se script.
>>>
>>> brgds
>>> Tommy
>>>
>>> /// uncomment ... from cfg
>>> plot {
>>> title			%g Disk System Wide Reads/Writes Per Second
>>> source			orcallator
>>> data			disk_rd/s
>>> data			disk_wr/s
>>> line_type		area
>>> line_type		line1
>>> legend			Reads/s
>>> legend			Writes/s
>>> y_legend		Ops/s
>>> data_min		0
>>>
>>> href			http://www.orcaware.com/orca/docs/orcallator.html#disk_system_wide_reads_writes_per_second
>>> }
>>>
>>> maybe try fiddling with your se file
>>> // If WATCH_OS is defined, then measure every part of the operating
>>> // system.
>>> //#ifdef WATCH_OS
>>> #define WATCH_CPU		1
>>> #define WATCH_MUTEX		1
>>> #define WATCH_NET		1
>>> #define WATCH_TCP		1
>>> #define WATCH_NFS_CLIENT	1
>>> #define WATCH_NFS_SERVER	1
>>> #define WATCH_MOUNTS		1
>>> //#define WATCH_DISK		1
>>> #define WATCH_DNLC		1
>>> #define WATCH_INODE		1
>>> #define WATCH_RAM		1
>>> #define WATCH_PAGES		1
>>> //#endif
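
For anyone trying the edit quoted above, here is a minimal sketch of
commenting out one of the WATCH_* defines with sed instead of by hand.  The
function name disable_define is made up for illustration, it assumes a POSIX
shell and sed, and it only matches lines of the form "#define NAME value".
It writes to stdout, so the original orcallator.se is left alone.

```shell
# Comment out a "#define NAME value" line and print the result to stdout.
# Hypothetical helper for illustration; not part of Orca or the SE Toolkit.
# Usage: disable_define NAME FILE
disable_define() {
    name=$1
    file=$2
    # Prefix the matching line with "//", the comment style orcallator.se uses.
    # Whitespace is required after NAME so that WATCH_IO does not also match
    # a longer name that merely starts with WATCH_IO.
    sed "s|^\(#define[[:space:]][[:space:]]*$name[[:space:]]\)|//\1|" "$file"
}
```

For example, "disable_define WATCH_DISK orcallator.se > orcallator.nodisk.se"
(the output file name is hypothetical) produces an edited copy.  As the rest
of this thread suggests, this may not be enough to avoid the crash if the
disk code is still pulled in through the SE include files.
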
>>>
>>>
>>>
>>> ----- Original Message ----- From: <orca at bias.org>
>>> To: <orca-users at orcaware.com>
>>> Sent: Monday, October 31, 2005 7:58 PM
>>> Subject: RE: [Orca-users] Re: Orcallator on Solaris 10
>>>
>>>
>>>> Still having issues with disks.se Segmentation Fault after adding/removing
>>>> disk on Solaris 10 system with Veritas 4.1.
>>>>
>>>> Does anyone know an easy way to turn off disk data collection, so I can get
>>>> the other (network, CPU, etc) info on the system?  I have tried changing
>>>> some parameters in orcallator.se to 0, but that doesn't seem to be
>>>> working.
>>>>
>>>> Thanks,
>>>> Liston
>>>>
>>>> On Wed, 28 Sep 2005 orca at bias.org wrote:
>>>>
>>>>> The disks.se seems to be the problem.  I am getting consistent Segmentation
>>>>> Faults on that (tail of verbose output below comments):
>>>>>
>>>>>  # /opt/RICHPse/bin/se disks.se
>>>>>  Segmentation Fault(coredump)
>>>>>
>>>>> It seems to only be a problem on Solaris 10 systems doing a lot of LUN
>>>>> adding/removing.  I have not seen the problem on Solaris 8, where we also
>>>>> do LUN adding/removing, but the activity on Solaris 8 systems is less.
>>>>>
>>>>> I did rebuild the devices from CD boot and the problem went away again on
>>>>> one system in question.  I could do a clean se disks.se and start
>>>>> orcallator.  The devfsadm -C suggestion worked on a couple of systems, but
>>>>> did not work on another 5 systems having the same issue.
>>>>>
>>>>> SE disks.se seems to have some issues with processing luns that still show
>>>>> up in the devices but are not currently attached to the system.
>>>>>
>>>>> Thanks,
>>>>> Liston
>>>>>
>>>>>
>>>>> Tail Output from /opt/RICHPse/bin/se -d -DWATCH_OS disks.se
>>>>>
>>>>> if (strchr(dp.d_name<c3t61d0>, <116>) != <(nil)>)
>>>>> sscanf(dp.d_name<c3t61d0>, <c%dt%dd%d>, &c<0>, &t<0>, &d<95>)
>>>>> GLOBAL_disk_info[2].controller = c<3>
>>>>> GLOBAL_disk_info[2].target = t<61>
>>>>> GLOBAL_disk_info[2].device = d<0>
>>>>> GLOBAL_disk_info[2].short_name = <(nil)>
>>>>> if ((np = str2int_get(path_tree<4297124416>,
>>>>> points_at</pci@1d,700000/lpfc@1,1/sd@3d,0>)) != <0>)
>>>>> anp = avlget(tree<4297124416>, ((ulonglong)
>>>>> tag</pci@1d,700000/lpfc@1,1/sd@3d,0>))
>>>>> if (anp<0> == <0>)
>>>>> return <0>;
>>>>> if (GLOBAL_disk_info[2].short_name<(nil)> == <(nil)>)
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217648>)
>>>>> if (dp.d_name<c3t61d0s1> == <.> || dp.d_name<c3t61d0s1> == <..>)
>>>>> if (!(dp.d_name<c3t61d0s1> =~ <s0$>))
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217680>)
>>>>> if (dp.d_name<c3t61d0s2> == <.> || dp.d_name<c3t61d0s2> == <..>)
>>>>> if (!(dp.d_name<c3t61d0s2> =~ <s0$>))
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217712>)
>>>>> if (dp.d_name<emcpower8a> == <.> || dp.d_name<emcpower8a> == <..>)
>>>>> if (!(dp.d_name<emcpower8a> =~ <s0$>))
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217744>)
>>>>> if (dp.d_name<c3t61d0s3> == <.> || dp.d_name<c3t61d0s3> == <..>)
>>>>> if (!(dp.d_name<c3t61d0s3> =~ <s0$>))
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217776>)
>>>>> if (dp.d_name<c3t61d0s4> == <.> || dp.d_name<c3t61d0s4> == <..>)
>>>>> if (!(dp.d_name<c3t61d0s4> =~ <s0$>))
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217808>)
>>>>> if (dp.d_name<c3t61d0s5> == <.> || dp.d_name<c3t61d0s5> == <..>)
>>>>> if (!(dp.d_name<c3t61d0s5> =~ <s0$>))
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217840>)
>>>>> if (dp.d_name<c3t61d0s6> == <.> || dp.d_name<c3t61d0s6> == <..>)
>>>>> if (!(dp.d_name<c3t61d0s6> =~ <s0$>))
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217872>)
>>>>> if (dp.d_name<c3t61d0s7> == <.> || dp.d_name<c3t61d0s7> == <..>)
>>>>> if (!(dp.d_name<c3t61d0s7> =~ <s0$>))
>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>> if (count<2> == GLOBAL_diskinfo_size<146>)
>>>>> dp = *((dirent_t *) ld<18446744071543217904>)
>>>>> Segmentation Fault(coredump)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, 27 Sep 2005, Cockcroft, Adrian wrote:
>>>>>
>>>>>> Also try running # se disks.se
>>>>>> This should give you the full set of disks and mappings, if it also
>>>>>> crashes then there needs to be a generic fix in SE rather than a fix in
>>>>>> Orcallator
>>>>>>
>>>>>> Adrian
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: orca-users-bounces+acockcroft=ebay.com at orcaware.com
>>>>>> [mailto:orca-users-bounces+acockcroft=ebay.com at orcaware.com] On Behalf
>>>>>> Of orca at bias.org
>>>>>> Sent: Tuesday, September 27, 2005 2:37 PM
>>>>>> To: orca-users at orcaware.com
>>>>>> Subject: RE: [Orca-users] Re: Orcallator on Solaris 10
>>>>>>
>>>>>> devfsadm -C worked for one of the systems that has no SAN drives currently
>>>>>> attached, but did not work for the others.  I still get Segmentation
>>>>>> Fault on those.  Thanks for the suggestions.
>>>>>>
>>>>>> I am using Version 3.4 of SE.  I never tried another version with Solaris
>>>>>> 10, because I thought I needed 3.4 on this.
>>>>>>
>>>>>> I will look into the iostat class more closely.
>>>>>>
>>>>>> Timefinder is not an option for us with Clariion, but we do something very
>>>>>> similar by manually rewriting diskid and importing mapfiles of snapped
>>>>>> veritas volumes.
>>>>>>
>>>>>>
>>>>>> On Tue, 27 Sep 2005, Dmitry Berezin wrote:
>>>>>>
>>>>>>> Liston,
>>>>>>>
>>>>>>> Unfortunately I don't have a Solaris 10 box yet, so I can't try this out...
>>>>>>> However, we use EMC TimeFinder on some of our Solaris 8 servers and have no
>>>>>>> problems with orcallator.se when volumes disappear and reappear on the
>>>>>>> system.
>>>>>>> You are correct about not finding RAWDISK code in the current releases of
>>>>>>> orcallator.se - it has been removed.
>>>>>>> From your snippet it appears that the problem is in the diskinfo.se that
>>>>>>> comes with SE Toolkit. It is being included in p_iostat_class.se (also comes
>>>>>>> with SE) that is included in orcallator.se. What version of SE are you
>>>>>>> using?
>>>>>>> Try devfsadm -C (I believe it works the same on Solaris 10 as on 8). Also,
>>>>>>> check for broken links in /dev/dsk.
>>>>>>>
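
A minimal sketch of the broken-link check suggested above.  It lists the
symlinks in a directory whose targets no longer resolve.  The function name
list_dangling_links is made up for illustration, and it assumes a POSIX shell
(for example ksh or /usr/xpg4/bin/sh on Solaris), since it relies on
"test -h" and "test -e".

```shell
# Print every symbolic link in DIR whose target cannot be resolved.
# Hypothetical helper for illustration; not part of Orca or the SE Toolkit.
# Usage: list_dangling_links DIR   (DIR defaults to /dev/dsk)
list_dangling_links() {
    dir=${1:-/dev/dsk}
    for entry in "$dir"/*; do
        # -h is true for a symlink; -e follows the link, so it is false
        # when the link points at something that no longer exists.
        if [ -h "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}
```

Run it as "list_dangling_links /dev/dsk".  Anything it prints is a link that
devfsadm -C would normally be expected to clean up.
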
>>>>>>>  -Dmitry.
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com
>>>>>>>> [mailto:orca-users-bounces+dberezin=acs.rutgers.edu at orcaware.com] On
>>>>>>>> Behalf Of orca at bias.org
>>>>>>>> Sent: Tuesday, September 27, 2005 4:26 PM
>>>>>>>> To: orca-users at orcaware.com
>>>>>>>> Subject: [Orca-users] Re: Orcallator on Solaris 10
>>>>>>>>
>>>>>>>>
>>>>>>>> Should have done more research before writing to the list last night
>>>>>>>> about the segmentation error for orcallator... but here is more info now.
>>>>>>>>
>>>>>>>> It looks like I did have problems in March 2005 on this server with
>>>>>>>> orcallator getting segmentation errors.  I even wrote the list on this,
>>>>>>>> thinking it was caused by the Veritas 4.1 install.  After a complete erase
>>>>>>>> of all my disk paths through a reboot, the issue appeared to be solved and
>>>>>>>> orcallator was working again.
>>>>>>>>
>>>>>>>> I saw a few comments about MAX_RAWDISKS and USE_RAWDISK in the
>>>>>>>> orcallator.se file as a resolution to my issues.  There is no rawdisk
>>>>>>>> reference in the versions I have (r497 and r412), so that does not appear
>>>>>>>> to be at issue.
>>>>>>>>
>>>>>>>> Although I'm still not sure how, I am thinking that my disk paths coming
>>>>>>>> and going is the culprit here.  We do a lot of moving around of volumes
>>>>>>>> via snapshots and clones, so disks that appear today may not appear ever
>>>>>>>> again.
>>>>>>>>
>>>>>>>> Does anyone else do a lot of diskpath altering on their systems?  Do you
>>>>>>>> have problems?
>>>>>>>>
>>>>>>>> I ran se with the "-d -DWATCH_OS" options and got the following tail
>>>>>>>> before it hits a segmentation fault:
>>>>>>>>
>>>>>>>> break;
>>>>>>>> return sderr(STRUCTURE);
>>>>>>>> count++;
>>>>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
>>>>>>>> dp = *((dirent_t *) ld<18446744071543217808>)
>>>>>>>> if (dp.d_name<c3t61d91s1> == <.> || dp.d_name<c3t61d91s1> == <..>)
>>>>>>>> if (!(dp.d_name<c3t61d91s1> =~ <s0$>))
>>>>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
>>>>>>>> dp = *((dirent_t *) ld<18446744071543217840>)
>>>>>>>> if (dp.d_name<emcpower1a> == <.> || dp.d_name<emcpower1a> == <..>)
>>>>>>>> if (!(dp.d_name<emcpower1a> =~ <s0$>))
>>>>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
>>>>>>>> dp = *((dirent_t *) ld<18446744071543217872>)
>>>>>>>> if (dp.d_name<emcpower1b> == <.> || dp.d_name<emcpower1b> == <..>)
>>>>>>>> if (!(dp.d_name<emcpower1b> =~ <s0$>))
>>>>>>>> ld = readdir(dirp<18446744071543194240>)
>>>>>>>> if (count<27> == GLOBAL_diskinfo_size<150>)
>>>>>>>> dp = *((dirent_t *) ld<18446744071543217904>)
>>>>>>>> Segmentation Fault
>>>>>>>>
>>>>>>>>
>>>>>>>> I think I may need to rebuild the paths again, but I would certainly like
>>>>>>>> to find a non-reboot option for this.
>>>>>>>>
>>>>>>>> I may also alter orcallator.se to completely eliminate gathering disk info,
>>>>>>>> since load/cpu/network/etc would be better than nothing.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Liston
>>>>>>>> _______________________________________________
>>>>>>>> Orca-users mailing list
>>>>>>>> Orca-users at orcaware.com
>>>>>>>> http://www.orcaware.com/mailman/listinfo/orca-users
>>>>>>>