[Orca-users] Solution: SE Toolkit and Orca Segmentation Fault on dirent_t

Dagobert Michelsen dam at baltic-online.de
Thu Feb 8 08:20:53 PST 2007


Hi,

I finally managed to find the problem with the segmentation
fault on dereferencing dirent_t which bothered so many of us
on the list. The problem is rooted in the SE Toolkit.

Best regards

  -- Dagobert


Orcallator segmentation fault on dereferencing dirent_t
-------------------------------------------------------

1. Identifying the problem

The problem occurs when dereferencing a (dirent_t *) pointer
returned by readdir(), for example in /opt/RICHPse/include/diskinfo.se,
and results in a segmentation fault:

# /opt/RICHPse/bin/se.sparcv9 -d /opt/RICHPse/examples/disks.se
...
ld = readdir(dirp<18446744071543194240>)
if (count<22> == GLOBAL_diskinfo_size<76>)
dp = *((dirent_t *) ld<18446744071543217856>)
if (dp.d_name<c3t50050763051844FEd2s3> == <.> || dp.d_name<c3t50050763051844FEd2s3> == <..>)
if (!(dp.d_name<c3t50050763051844FEd2s3> =~ <s0$>))
ld = readdir(dirp<18446744071543194240>)
if (count<22> == GLOBAL_diskinfo_size<76>)
dp = *((dirent_t *) ld<18446744071543217904>)
Segmentation Fault (core dumped)


# truss /opt/RICHPse/bin/se.sparcv9 /opt/RICHPse/examples/disks.se
...
17038: readlink("/dev/dsk/c3t50050763051844FEd2s0", "../../devices/pci@8,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50050763051844fe,2:a", 256) = 72
17038: ioctl(4, KSTAT_IOC_CHAIN_ID, 0x00000000) = 162417
17038: ioctl(4, KSTAT_IOC_READ, "sd1,err") = 162417
17038: ioctl(4, KSTAT_IOC_CHAIN_ID, 0x00000000) = 162417
17038: Incurred fault #6, FLTBOUNDS %pc = 0xFFFFFFFF7DF0092C
17038: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF7EE06000
17038: Received signal #11, SIGSEGV [default]
17038: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF7EE06000


# truss -u '*' /opt/RICHPse/bin/se.sparcv9 /opt/RICHPse/examples/disks.se
...
/1@1: -> libc_psr:memcpy(0xffffffff7fff9080, 0x1001519f8, 0x78, 0x0)
/1@1: <- libc_psr:memcpy() = 0xffffffff7fff9080
/1@1: -> libc_psr:memcpy(0xffffffff7fff9080, 0x100242e10, 0x78, 0x1)
/1@1: <- libc_psr:memcpy() = 0xffffffff7fff9080
/1@1: -> libc_psr:memcpy(0x10024cb90, 0xffffffff7ee05f02, 0x100, 0x0)
/1: Incurred fault #6, FLTBOUNDS %pc = 0xFFFFFFFF7DF0092C
/1: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF7EE06000
/1: Received signal #11, SIGSEGV [default]
/1: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF7EE06000


# mdb se core
Loading modules: [ libc.so.1 libuutil.so.1 ld.so.1 ]
> ::stack
libc_psr.so.1`memcpy+0x42c(10025a960, ffffffff7ee05ef0, 0, 100023cac, ffffffff7e2e4000, 1)
struct_fill+0x128(10025a720, ffffffff7ee05ef0, 0, ffffffff7eb0946c, ffffffff7ee02000, 1ef0)
run_indirection+0x130(ffffffff7fff9180, 2, ffffffff7fff96d1, 100046604, 0, ffffffff7eb08000)
run_call+0x38(ffffffff7fff9180, 10025a1e0, 1, fffffffffffffffa, 4, 0)
resolve_expression+0x58(ffffffff7fff9d50, 10025a600, 1, 100023cac, ffffffff7e2e4000, 1)
...
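
The addresses in the traces above can be checked against each other.
What follows is only a small arithmetic sketch in C. The constants are
copied from the se -d and truss -u output (assuming the separate runs are
comparable), while the function names are made up for illustration. It
relates the start of the last record, the source of the last memcpy(),
the requested length and the SEGV_MAPERR address.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the traces above. */
#define RECORD_ADDR 0xffffffff7ee05ef0ULL /* ld in the last dereference (18446744071543217904) */
#define MEMCPY_SRC  0xffffffff7ee05f02ULL /* source argument of the last memcpy() */
#define MEMCPY_LEN  0x100ULL              /* length argument of the last memcpy() */
#define FAULT_ADDR  0xffffffff7ee06000ULL /* SEGV_MAPERR address */

/* Offset of the copied member inside the record. */
uint64_t member_offset(uint64_t record, uint64_t src)
{
    assert(src >= record);
    return src - record;
}

/* Bytes that can be read from src before the faulting address is reached. */
uint64_t readable_bytes(uint64_t src, uint64_t fault)
{
    assert(fault >= src);
    return fault - src;
}

/* Bytes of the requested copy that lie at or beyond the faulting address. */
uint64_t overrun_bytes(uint64_t src, uint64_t len, uint64_t fault)
{
    assert(src + len >= fault);
    return src + len - fault;
}
```

With the constants above, member_offset() gives 18 (0x12), readable_bytes()
gives 254 (0xfe) and overrun_bytes() gives 2. So the copy starts 18 bytes
into the record and asks for 256 bytes although only 254 are readable
before the faulting address.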


2. Understanding the problem

The cause of the problem lies in the unusual layout of the dirent structure and
its improper handling by the SE Toolkit. A directory entry is defined in
/usr/include/sys/dirent.h and looks like this:
    typedef struct dirent {
            ino_t d_ino; /* "inode number" of entry */
            off_t d_off; /* offset of disk directory entry */
            unsigned short d_reclen; /* length of this record */
            char d_name[1]; /* name of file */
    } dirent_t;
Here, d_name is basically a zero-terminated array of characters with variable length
and a maximum size of MAXNAMLEN (which is 512). This is mapped in the SE Toolkit to
the structure dirent_t in /opt/RICHPse/include/dirent.se:

    #define MAXNAMELEN 256

    struct dirent_t {
      ulong d_ino; /* (ino_t) "inode number" of entry */
      long d_off; /* (off_t) offset of disk directory entry */
      ushort d_reclen; /* length of this record */
      char d_name[MAXNAMELEN]; /* name of file */
    };

There are two problems with this:
1. MAXNAMELEN is obviously too short (not the problem here and easy to fix)
2. d_name is defined as an array with a constant size of 256 bytes. When
   dirent_t is dereferenced, the character array is copied at its full
   size. If opendir/readdir has allocated enough buffer space, everything is
   fine: the name is copied along with some trailing garbage, up to the
   maximum size. But when we reach the end of the directory list and the
   buffer was allocated to fit exactly, memcpy() tries to read past the end
   of the allocated buffer and causes the segmentation fault.
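
To make the second point concrete, here is a minimal sketch in C. It does
not touch real directories: it only models records packed back to back in
a buffer that fits them exactly, and checks whether a fixed 256-byte copy
of d_name would stay inside that buffer. The struct and function names are
hypothetical, and the alignment padding that real systems add to each
record is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SE_NAME_LEN 256 /* MAXNAMELEN from dirent.se */

/* Layout comparable to the SE Toolkit's dirent_t: fixed-size name. */
struct se_dirent {
    unsigned long  d_ino;
    long           d_off;
    unsigned short d_reclen;
    char           d_name[SE_NAME_LEN];
};

/* Bytes a packed record needs: header up to d_name, the name, and its NUL. */
size_t record_size(const char *name)
{
    return offsetof(struct se_dirent, d_name) + strlen(name) + 1;
}

/* Would a full-size copy of d_name for the record at offset off stay
 * inside a buffer of buf_len bytes? */
int fixed_copy_in_bounds(size_t off, size_t buf_len)
{
    return off + offsetof(struct se_dirent, d_name) + SE_NAME_LEN <= buf_len;
}

/* Index of the first record whose full-size copy would leave the buffer,
 * or n if every record is safe. */
size_t first_unsafe_record(const char *const *names, size_t n, size_t buf_len)
{
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        if (!fixed_copy_in_bounds(off, buf_len))
            return i;
        off += record_size(names[i]);
    }
    return n;
}
```

For a list of many short names in a tightly fitting buffer, the early
records pass the check and the records near the end fail it, which matches
the crash appearing only after a number of successful readdir() calls.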



3. The solution

Due to the special nature of the string in dirent_t I am unsure of the best
solution. Some suggestions:
(1) Introduce a new type for zero-terminated character arrays with variable length in SE
    Possible, but may be overkill for this small problem
(2) Don't dereference dirent_t directly, but use a copy function instead
    This means changes in lots of code
(3) Use strlcpy in member_fill to copy character arrays
    Easy to implement, but may break programs which store \0 within char arrays
Any ideas?
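
For suggestion (2), a copy function could look roughly like the following.
This is a sketch in plain C rather than SE, and every name in it
(sys_dirent, copy_dirent_name, make_record) is hypothetical. The idea is
to read d_reclen first and never touch more name bytes than the record
claims to contain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same field order as the system's dirent; the name is variable length. */
struct sys_dirent {
    unsigned long  d_ino;
    long           d_off;
    unsigned short d_reclen;
    char           d_name[1];
};

#define NAME_OFFSET offsetof(struct sys_dirent, d_name)

/* Copy the name of the record at rec into dst without reading beyond
 * d_reclen bytes of the record. Returns the number of name bytes copied
 * (without the NUL), or (size_t)-1 if the record looks malformed. */
size_t copy_dirent_name(char *dst, size_t dst_len, const void *rec)
{
    unsigned short reclen;
    memcpy(&reclen,
           (const char *)rec + offsetof(struct sys_dirent, d_reclen),
           sizeof reclen);
    if (reclen <= NAME_OFFSET || dst_len == 0)
        return (size_t)-1;

    size_t avail = reclen - NAME_OFFSET; /* name bytes owned by this record */
    const char *name = (const char *)rec + NAME_OFFSET;
    size_t n = 0;
    while (n < avail && n + 1 < dst_len && name[n] != '\0') {
        dst[n] = name[n];
        n++;
    }
    dst[n] = '\0';
    return n;
}

/* Test helper: write one tightly sized record for name into buf and
 * return its length. buf must hold NAME_OFFSET + strlen(name) + 1 bytes. */
size_t make_record(void *buf, const char *name)
{
    struct sys_dirent hdr = {0};
    size_t reclen = NAME_OFFSET + strlen(name) + 1;
    hdr.d_reclen = (unsigned short)reclen;
    memcpy(buf, &hdr, NAME_OFFSET);
    memcpy((char *)buf + NAME_OFFSET, name, strlen(name) + 1);
    return reclen;
}
```

The cost mentioned above remains: every place that currently dereferences
a dirent_t pointer would have to call such a function instead.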

As a quick solution for the sufferers, here's a patch implementing (3), which
gave me salvation:

diff -Naur -x bin /opt/RICHPse/src/run.c src/run.c
--- /opt/RICHPse/src/run.c	2005-01-05 22:39:46.000000000 +0100
+++ src/run.c	2007-02-08 16:33:10.994092000 +0100
@@ -988,6 +988,10 @@
     if (vp->var_type == VAR_STRING) {
       for(i=0; i<vp->var_dimension; i++)
         ((char **) t_array)[i] = ((char **) f_array)[i];
+    } else if (vp->var_type == VAR_CHAR || vp->var_type == VAR_UCHAR ) {
+      strlcpy(t_array, f_array, size);
     } else {
       if ((t_array == 0) || (f_array == 0))
         se_fatal("reference through nil pointer accessing: %s", vp->var_name);
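
The caveat noted for (3) can be shown with a small C sketch. Since
strlcpy() is not available in every libc, the example uses a local
stand-in, bounded_strcpy(), which is a hypothetical helper written to
mimic strlcpy() semantics. It is not part of the SE Toolkit or of the
patch. The second function copies a char array with an embedded \0 both
ways and reports whether the bytes after the first \0 arrived.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in with strlcpy() semantics: copy at most size-1 bytes, stop at
 * the first NUL, always terminate dst, return strlen(src). */
size_t bounded_strcpy(char *dst, const char *src, size_t size)
{
    size_t i = 0;
    if (size > 0) {
        for (; i + 1 < size && src[i] != '\0'; i++)
            dst[i] = src[i];
        dst[i] = '\0';
    }
    while (src[i] != '\0')
        i++;
    return i;
}

/* Copy an 8-byte char array holding "ab\0cd" with memcpy() or with the
 * bounded string copy; return 1 if the bytes after the first NUL were
 * copied, 0 otherwise. */
int tail_copied(int use_memcpy)
{
    const char src[8] = { 'a', 'b', '\0', 'c', 'd', '\0', 0, 0 };
    char dst[8];

    memset(dst, '#', sizeof dst);
    if (use_memcpy)
        memcpy(dst, src, sizeof dst);
    else
        bounded_strcpy(dst, src, sizeof dst);
    return dst[3] == 'c' && dst[4] == 'd';
}
```

tail_copied(1) returns 1 and tail_copied(0) returns 0. Any SE program that
relies on char arrays as raw byte buffers would therefore see truncated
data after the patch.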

--
Dagobert Michelsen (Leiter IT)          Baltic Online Computer GmbH
"Of course computer servers don't need thrust, since they generally
don't go anywhere."  -- Comment in TR on new HP server turbine fans


