[Ocaml-i18n] Re: [Ocaml-lib-devel] Some (simple) functions I'd like to see in ExtLib ...
Yamagata Yoriyuki
yoriyuki at mbg.ocn.ne.jp
Thu Jun 3 04:36:41 PDT 2004
From: Richard Jones <rich at annexia.org>
Subject: [Ocaml-i18n] Re: [Ocaml-lib-devel] Some (simple) functions I'd like to see in ExtLib ...
Date: Thu, 27 May 2004 15:59:44 +0100
> On Thu, May 27, 2004 at 03:53:41PM +0100, Richard Jones wrote:
> > > > ** ExtChar (or perhaps better in UChar):
> > > >
> > > > is_space, is_alnum, is_digit, is_xdigit, etc. It's inexplicable why
> > > > these were left out of the standard OCaml library.
> > >
> > > you're welcome to send a full featured ExtChar module.
> >
> > OK, will look at this. Do you think it should be ExtChar or UChar
> > though? Since so much of the code I now write uses UTF-8 exclusively
> > I'm loath to contribute any more 8-bit-char-specific code to the
> > world ...
>
> Actually I can answer my own question here. We could define the
> ExtChar.is_* functions to only work correctly on 7-bit ASCII. They
> would return false on any character codes >= 128. This way they
> should do the Right Thing when presented with UTF-8 strings too.
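For concreteness, the ASCII-only proposal quoted above could be sketched roughly as follows. This is only an illustration: the function names come from the is_* list in the quoted message, but the definitions are assumptions and not actual ExtLib code.

```ocaml
(* Hypothetical ASCII-only predicates, not taken from ExtLib.
   Every function returns false for any byte >= 128. *)
let is_digit c = c >= '0' && c <= '9'
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_alnum c = is_alpha c || is_digit c
let is_xdigit c =
  is_digit c || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
(* Space, plus the control characters \t \n \v \f \r (codes 9 to 13). *)
let is_space c = c = ' ' || (c >= '\t' && c <= '\r')
```

Because every byte of a multi-byte UTF-8 sequence is >= 128, none of these predicates will match part of a non-ASCII character. For example, both bytes of "é" (0xC3 0xA9) are rejected by is_alnum.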
I think the general consensus among I18N experts is that the ISO C
character classes are not enough. The Unicode standard defines an
elaborate set of character properties
(http://camomile.sourceforge.net/dochtml/UCharInfo.html). You can
define the ISO C character classes from these properties, though
(glibc actually does this).
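As a rough sketch of what deriving such classes from a property could look like: the category type and the lookup table below are toy assumptions written for this example. They are not Camomile's UCharInfo API, and the mapping is a simplification, not the one glibc uses.

```ocaml
(* Hypothetical subset of the Unicode general categories; a real
   library would expose the full set. *)
type category = Lu | Ll | Nd | Zs | Sm | Other

(* Toy lookup covering ASCII letters and digits, two spaces, and the
   Latin-1 letters U+00C0..U+00FF. Anything else maps to Other. *)
let general_category (cp : int) : category =
  if cp >= 0x41 && cp <= 0x5A then Lu
  else if cp >= 0x61 && cp <= 0x7A then Ll
  else if cp >= 0x30 && cp <= 0x39 then Nd
  else if cp = 0x20 || cp = 0xA0 then Zs
  else if cp = 0xD7 || cp = 0xF7 then Sm   (* multiplication, division signs *)
  else if cp >= 0xC0 && cp <= 0xDE then Lu
  else if cp >= 0xDF && cp <= 0xFF then Ll
  else Other

(* Character classes defined purely in terms of the category. *)
let is_upper cp = general_category cp = Lu
let is_lower cp = general_category cp = Ll
let is_alpha cp = is_upper cp || is_lower cp
let is_digit cp = general_category cp = Nd
let is_alnum cp = is_alpha cp || is_digit cp
```

With this, is_alpha 0xE9 (é) holds while is_alpha 0xD7 (the multiplication sign) does not, which an ASCII-only test cannot express.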
--
Yamagata Yoriyuki