[Orca-dev] Solution: SE Toolkit and Orca Segmentation Fault on dirent_t
Dagobert Michelsen
dam at baltic-online.de
Thu Feb 8 08:05:48 PST 2007
Hi,
I finally managed to find the problem with the segmentation
fault on dereferencing dirent_t which bothered so many of us
on the list. The problem is rooted in the SE Toolkit.
Best regards
-- Dagobert
Orcallator segmentation fault on dereferencing dirent_t
-------------------------------------------------------
1. Identifying the problem
The problem occurs when dereferencing a (dirent_t *) pointer
returned by readdir(), for example in /opt/RICHPse/include/diskinfo.se,
and results in a segmentation fault:
# /opt/RICHPse/bin/se.sparcv9 -d /opt/RICHPse/examples/disks.se
...
ld = readdir(dirp<18446744071543194240>)
if (count<22> == GLOBAL_diskinfo_size<76>)
dp = *((dirent_t *) ld<18446744071543217856>)
if (dp.d_name<c3t50050763051844FEd2s3> == <.> || dp.d_name<c3t50050763051844FEd2s3> == <..>)
if (!(dp.d_name<c3t50050763051844FEd2s3> =~ <s0$>))
ld = readdir(dirp<18446744071543194240>)
if (count<22> == GLOBAL_diskinfo_size<76>)
dp = *((dirent_t *) ld<18446744071543217904>)
Segmentation Fault (core dumped)
# truss /opt/RICHPse/bin/se.sparcv9 /opt/RICHPse/examples/disks.se
...
17038: readlink("/dev/dsk/c3t50050763051844FEd2s0", "../../devices/pci@8,600000/SUNW,qlc@1,1/fp@0,0/ssd@w50050763051844fe,2:a", 256) = 72
17038: ioctl(4, KSTAT_IOC_CHAIN_ID, 0x00000000) = 162417
17038: ioctl(4, KSTAT_IOC_READ, "sd1,err") = 162417
17038: ioctl(4, KSTAT_IOC_CHAIN_ID, 0x00000000) = 162417
17038: Incurred fault #6, FLTBOUNDS %pc = 0xFFFFFFFF7DF0092C
17038: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF7EE06000
17038: Received signal #11, SIGSEGV [default]
17038: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF7EE06000
# truss -u '*' /opt/RICHPse/bin/se.sparcv9 /opt/RICHPse/examples/disks.se
...
/1@1: -> libc_psr:memcpy(0xffffffff7fff9080, 0x1001519f8, 0x78, 0x0)
/1@1: <- libc_psr:memcpy() = 0xffffffff7fff9080
/1@1: -> libc_psr:memcpy(0xffffffff7fff9080, 0x100242e10, 0x78, 0x1)
/1@1: <- libc_psr:memcpy() = 0xffffffff7fff9080
/1@1: -> libc_psr:memcpy(0x10024cb90, 0xffffffff7ee05f02, 0x100, 0x0)
/1: Incurred fault #6, FLTBOUNDS %pc = 0xFFFFFFFF7DF0092C
/1: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF7EE06000
/1: Received signal #11, SIGSEGV [default]
/1: siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFFFF7EE06000
# mdb se core
Loading modules: [ libc.so.1 libuutil.so.1 ld.so.1 ]
> ::stack
libc_psr.so.1`memcpy+0x42c(10025a960, ffffffff7ee05ef0, 0, 100023cac, ffffffff7e2e4000, 1)
struct_fill+0x128(10025a720, ffffffff7ee05ef0, 0, ffffffff7eb0946c, ffffffff7ee02000, 1ef0)
run_indirection+0x130(ffffffff7fff9180, 2, ffffffff7fff96d1, 100046604, 0, ffffffff7eb08000)
run_call+0x38(ffffffff7fff9180, 10025a1e0, 1, fffffffffffffffa, 4, 0)
resolve_expression+0x58(ffffffff7fff9d50, 10025a600, 1, 100023cac, ffffffff7e2e4000, 1)
...
2. Understanding the problem
The cause of the problem lies in the unusual layout of the dirent structure and
the improper handling of it by the SE Toolkit. A directory entry is defined in
/usr/include/sys/dirent.h and looks like this:
typedef struct dirent {
ino_t d_ino; /* "inode number" of entry */
off_t d_off; /* offset of disk directory entry */
unsigned short d_reclen; /* length of this record */
char d_name[1]; /* name of file */
} dirent_t;
Here, d_name is effectively a zero-terminated character array of variable length
with a maximum size of MAXNAMLEN (which is 512); the declared size of 1 is only a
placeholder. The SE Toolkit maps this to the structure dirent_t in
/opt/RICHPse/include/dirent.se:
#define MAXNAMELEN 256
struct dirent_t {
ulong d_ino; /* (ino_t) "inode number" of entry */
long d_off; /* (off_t) offset of disk directory entry */
ushort d_reclen; /* length of this record */
char d_name[MAXNAMELEN]; /* name of file */
};
There are two problems with this:
1. MAXNAMELEN (256) is smaller than the system's MAXNAMLEN (512). This is
not the problem here and is easy to fix.
2. d_name is defined as an array with a constant size of 256 bytes. When
dirent_t is dereferenced, the character array is copied at its full
declared size, regardless of the actual length of the name. If
opendir/readdir has allocated enough buffer space, everything is fine:
the name is copied along with some trailing garbage. But when we reach
the end of the directory list and the last entry sits close to the end
of the allocated buffer, memcpy() reads past the end of the buffer and
causes the segmentation fault.
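
The addresses in the traces from section 1 appear to be consistent with this
explanation. The following is only a small sketch of the arithmetic, not code
from the SE Toolkit. It assumes an LP64 layout in which d_name starts 18 bytes
into the structure (8 bytes d_ino, 8 bytes d_off, 2 bytes d_reclen); the
function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the se -d, truss -u and mdb output in section 1. */
#define DIRENT_ADDR   UINT64_C(0xffffffff7ee05ef0) /* ld / 2nd arg of struct_fill */
#define MEMCPY_SRC    UINT64_C(0xffffffff7ee05f02) /* source of the failing memcpy */
#define FAULT_ADDR    UINT64_C(0xffffffff7ee06000) /* SEGV_MAPERR address */
#define SE_NAME_SIZE  UINT64_C(256)                /* MAXNAMELEN in dirent.se */

/* Assumption: LP64 layout, d_ino (8) + d_off (8) + d_reclen (2). */
#define D_NAME_OFFSET UINT64_C(18)

/* First byte read when the fixed-size d_name array is copied. */
uint64_t copy_start(void) { return DIRENT_ADDR + D_NAME_OFFSET; }

/* One past the last byte a 256-byte copy would read. */
uint64_t copy_end(void) { return copy_start() + SE_NAME_SIZE; }

/* Bytes available between d_name and the unmapped address. */
uint64_t readable_bytes(void) { return FAULT_ADDR - copy_start(); }

/* Cross-check the computed values against the recorded trace. */
void check_against_trace(void)
{
    assert(copy_start() == MEMCPY_SRC);  /* matches the truss -u line */
    assert(copy_end() > FAULT_ADDR);     /* the copy crosses the fault address */
    assert(readable_bytes() == 254);     /* fewer than the 256 bytes requested */
}
```

Under these assumptions, only 254 bytes are readable starting at d_name of the
last entry, so a 256-byte copy has to touch the unmapped address.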
3. The solution
Due to the special nature of the string in dirent_t I am unsure of the best
solution. Some suggestions:
(1) Introduce a new type for zero-terminated character arrays with variable length in SE
Possible, but may be overkill for this small problem
(2) Don't just dereference dirent_t, but use a copy function instead
This means changes in lots of code
(3) Use strlcpy in member_fill to copy character arrays
Easy to implement, but may break programs which store \0 within char arrays
Any ideas?
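
To make suggestion (2) a bit more concrete, here is a rough sketch in plain C
(not SE) of what such a copy function might look like. The name
copy_dirent_name is hypothetical and nothing like it exists in the SE Toolkit
as far as this mail is concerned. The point is only that the copy stops at the
terminating zero instead of reading a fixed 256 bytes.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the name of a directory entry into dst,
 * reading no further than the terminating zero of d_name.
 * Returns the number of characters stored (excluding the zero).
 */
size_t copy_dirent_name(const struct dirent *entry, char *dst, size_t dst_size)
{
    size_t len;

    assert(entry != NULL && dst != NULL && dst_size > 0);

    len = strlen(entry->d_name);   /* stops at the zero byte of the name */
    if (len >= dst_size)
        len = dst_size - 1;        /* truncate if dst is too small */
    memcpy(dst, entry->d_name, len);
    dst[len] = '\0';
    return len;
}
```

A caller would pass the pointer returned by readdir() together with its own
buffer, so the entry itself is never copied as a whole.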
As a quick solution for the sufferers, here's a patch implementing (3) which
gave me salvation:
diff -Naur -x bin /opt/RICHPse/src/run.c src/run.c
--- /opt/RICHPse/src/run.c	2005-01-05 22:39:46.000000000 +0100
+++ src/run.c	2007-02-08 16:33:10.994092000 +0100
@@ -988,6 +988,10 @@
if (vp->var_type == VAR_STRING) {
for(i=0; i<vp->var_dimension; i++)
((char **) t_array)[i] = ((char **) f_array)[i];
+ } else if (vp->var_type == VAR_CHAR || vp->var_type == VAR_UCHAR ) {
+ strlcpy(t_array, f_array, size);
} else {
if ((t_array == 0) || (f_array == 0))
se_fatal("reference through nil pointer accessing: %s", vp->var_name);
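
The caveat mentioned for (3) can be illustrated with a small sketch. Since
strlcpy is not available in every libc, the code below uses a stand-in called
bounded_copy (a made-up name) that is meant to behave like strlcpy: it copies
up to the first zero byte, always terminates the destination, and returns the
length of the source string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in with strlcpy-like semantics (assumption: this mirrors what
 * strlcpy does; it is not the libc implementation).
 */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);

    len = strlen(src);                 /* length up to the first zero byte */
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';                 /* destination is always terminated */
    }
    return len;
}

/*
 * Returns 1 if all `size` bytes of src arrive in a scratch buffer after
 * bounded_copy, 0 if data behind the first zero byte was left out.
 * Only meant for arrays of up to 64 bytes.
 */
int survives_bounded_copy(const char *src, size_t size)
{
    char scratch[64];

    assert(size <= sizeof scratch);
    memset(scratch, 'x', sizeof scratch);
    bounded_copy(scratch, src, size);
    return memcmp(scratch, src, size) == 0;
}
```

An array holding "ab", a zero byte and then "cd" would lose the "cd" part with
this kind of copy, whereas the previous memcpy of the full array kept it.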
--
Dagobert Michelsen (Leiter IT) Baltic Online Computer GmbH
Firmensitz: Alter Markt 1-2, 24103 Kiel, Telefon: +49 431 54003-0
"Of course computer servers don't need thrust, since they generally
don't go anywhere." -- Comment in TR on new HP server turbine fans