Why are “control” characters illegal in XML 1.0?

My understanding is that the C0 control range (everything below U+0020 except tab, line feed and carriage return) is barred on the grounds that a markup language has no need to support transmission and flow-control characters, and that including them would create problems for editors and parsers during binary conversion.

I’m struggling to find anything ex cathedra on this from Tim Bray et al., though.

Edit: here is some discussion of control characters, including a vague admission from Tim Bray that the decision wasn’t exactly over-engineered:

At 09:27 AM 17/06/00 -0500, Mark Volkmann wrote:

I’ve never seen a discussion of the reason why most ASCII control
characters, such as a form feed, are not allowed in XML documents. Can
anyone tell me the reason behind that decision or point me to a spec. that
explains that?

I’m not sure we’d do it the same way if we were doing it again. I
don’t see that they do any real harm. Clearly, if you’re optimizing
for a highly interoperable content markup language (and XML is) it’s
legitimate to be suspicious of things like vertical-tab and backspace
and so on… but then how can it be consistent to leave in \n and DEL
and so on? -Tim
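
To make the inconsistency Tim mentions concrete, here is a minimal Python sketch that checks single characters against the `Char` production in the XML 1.0 specification (section 2.2). The function name `is_xml10_char` and the sample table are my own, purely for illustration; this is not taken from any parser.

```python
# Char production from XML 1.0, section 2.2:
#   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

def is_xml10_char(ch: str) -> bool:
    """Return True if the single character ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # includes DEL (U+007F)
        or 0xE000 <= cp <= 0xFFFD      # skips surrogates, U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )

# Hypothetical sample set: the characters named in the thread, plus tab and CR.
samples = {
    "backspace": "\x08",
    "tab": "\t",
    "newline": "\n",
    "vertical tab": "\x0b",
    "form feed": "\x0c",
    "carriage return": "\r",
    "DEL": "\x7f",
}

for name, ch in samples.items():
    verdict = "legal" if is_xml10_char(ch) else "illegal"
    print(f"{name:16} U+{ord(ch):04X} {verdict}")
```

Under that production, backspace, vertical tab and form feed come out illegal, while tab, newline, carriage return and DEL are all legal, which is exactly the asymmetry the reply points at.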
