[Ocaml-pxp-users] Handling undeclared entities
Gerd Stolpmann
gerd at gerd-stolpmann.de
Thu Jul 16 12:45:21 PDT 2009
On Thursday, 16.07.2009, at 09:43 -0700, Dario Teixeira wrote:
> Hi,
>
> I am using PXP to parse a small HTML-like markup. I would like to allow
> the use of common HTML entities in the source text (such as &euro;), but I
> don't want to include a list of *all* of them in the DTD (note that these
> are eventually checked for validity somewhere else; I just don't need this
> task to be performed also by PXP).
>
> Now, the PXP manual mentions several times that entities are automatically
> converted into regular #PCDATA, and there doesn't seem to be a way of passing
> them unmodified to the processing code. Therefore, if they are not declared
> in the DTD I get a parsing error.
>
> One solution I can think of is to preprocess the source file, using regexps
> to replace entity references by a special node. Something like this:
> "the symbol is &euro;" -> "the symbol is <entity>euro</entity>".
>
> This solution is of course way too kludgy and error-prone. Is there a better
> alternative within PXP?
Sure. The entity declarations only need to be added to the dtd object at
the right moment. The parsing functions take a callback for exactly that
purpose, ~transform_dtd, e.g.
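  parse_document_entity
    ~transform_dtd:(fun dtd ->
      (* Declare the general entity "euro" before parsing starts. *)
      let e = Pxp_dtd.Entity.create_internal_entity
                ~name:"euro" ~value:"&#8364;" dtd in
      dtd # add_gen_entity e false;
      (* ~transform_dtd expects the DTD object as its result. *)
      dtd
    )
    config src spec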
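
For more than one entity, the same callback can loop over a table. What
follows is only a minimal sketch under some assumptions: the names
html_entities, char_ref and register_all are made up for illustration and
are not part of PXP, and the registration step is passed in as a function
(add) so that the sketch itself needs nothing beyond the standard library.
The PXP wiring shown in the trailing comment is untested.

```ocaml
(* Hypothetical table of entity names and their Unicode code points.
   Extend it with whatever entities the markup should accept. *)
let html_entities = [
  ("euro", 8364);
  ("nbsp", 160);
  ("copy", 169);
  ("mdash", 8212);
]

(* Build a numeric character reference such as "&#8364;" for a code point. *)
let char_ref code = Printf.sprintf "&#%d;" code

(* Call [add] once per table entry. [add] is a stand-in for the two PXP
   calls (create_internal_entity, then add_gen_entity). *)
let register_all ~(add : name:string -> value:string -> unit) entities =
  List.iter (fun (name, code) -> add ~name ~value:(char_ref code)) entities

(* Assumed wiring with PXP, not compiled here:

   ~transform_dtd:(fun dtd ->
     register_all html_entities
       ~add:(fun ~name ~value ->
         let e = Pxp_dtd.Entity.create_internal_entity ~name ~value dtd in
         dtd # add_gen_entity e false);
     dtd)
*)
```

Keeping the table separate from the registration step means the list of
accepted entities can be changed without touching the parser call.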
Gerd
>
> Thanks!
> Best regards,
> Dario Teixeira
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
gerd at gerd-stolpmann.de http://www.gerd-stolpmann.de
Phone: +49-6151-153855 Fax: +49-6151-997714
------------------------------------------------------------