Is there a Java XML API that can parse a document without resolving character entities?

The StAX API supports the notion of not replacing entity references, by way of the IS_REPLACING_ENTITY_REFERENCES property, which is documented as follows:

"Requires the parser to replace internal entity references with their replacement text and report them as characters"

This property can be set on an XMLInputFactory, which is then used to construct an XMLEventReader or XMLStreamReader. However, the API is careful to say that the property is only intended to force the implementation to perform the replacement, rather than to force it not to. Whether setting it to false actually leaves references unresolved is therefore up to the implementation. Still, it’s got to be worth a try.
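
To try this out, here is a minimal sketch that sets the property to false and lists the text-related events the parser reports. The class name UnresolvedEntities, the method name scan, and the sample document are made up for illustration. The behaviour is implementation-dependent: a parser that honours the setting should report an ENTITY_REFERENCE event for &foo;, while one that ignores it will simply deliver the replacement text as characters. Note also that the property is documented in terms of internal entity references declared in a DTD, so numeric character references and the predefined entities such as &amp; may well still be expanded.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class UnresolvedEntities {

    // Hypothetical helper: parses the XML and records each CHARACTERS and
    // ENTITY_REFERENCE event as a short description.
    static List<String> scan(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser not to replace internal entity references.
        // An implementation is not obliged to honour a value of false.
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> events = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.ENTITY_REFERENCE) {
                    // For an entity reference, getLocalName() is the entity's name.
                    events.add("ENTITY_REFERENCE " + reader.getLocalName());
                } else if (event == XMLStreamConstants.CHARACTERS) {
                    events.add("CHARACTERS " + reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return events;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample document with one internal entity declared in the DTD subset.
        String xml = "<!DOCTYPE doc [<!ENTITY foo \"bar\">]><doc>a &foo; b</doc>";
        scan(xml).forEach(System.out::println);
    }
}
```

If the output includes an ENTITY_REFERENCE line for foo, the parser in use leaves the reference unresolved; if it only shows character data containing "bar", the setting was not honoured and a different StAX implementation would be needed.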
