Why does XSLT output all text by default?

why did the former code output TEXT,
why should I insist XSL to ignore all
other text? is that the behavior of
all XML parsers or only of my own

You are discovering one of the most fundamental XSLT features, defined in the W3C XSLT Specification: the built-in template rules. This is not a quirk of your XML parser; every conforming XSLT processor behaves this way.

From the Spec:

There is a built-in template rule to
allow recursive processing to continue
in the absence of a successful pattern
match by an explicit template rule in
the stylesheet. This template rule
applies to both element nodes and the
root node. The following shows the
equivalent of the built-in template
rule:

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

There is also a built-in template rule
for each mode, which allows recursive
processing to continue in the same
mode in the absence of a successful
pattern match by an explicit template
rule in the stylesheet. This template
rule applies to both element nodes and
the root node. The following shows the
equivalent of the built-in template
rule for mode m.

<xsl:template match="*|/" mode="m">
  <xsl:apply-templates mode="m"/>
</xsl:template>

There is also a built-in template rule
for text and attribute nodes that
copies text through:

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

The built-in template rule for
processing instructions and comments
is to do nothing.

<xsl:template match="processing-instruction()|comment()"/>

The built-in template rule for
namespace nodes is also to do nothing.
There is no pattern that can match a
namespace node; so, the built-in
template rule is the only template
rule that is applied for namespace
nodes.

The built-in template rules are
treated as if they were imported
implicitly before the stylesheet and
so have lower import precedence than
all other template rules. Thus, the
author can override a built-in
template rule by including an explicit
template rule.

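So the reported behavior results from two of the built-in template rules: the rule for element and root nodes, which keeps recursing into children, and the rule for text and attribute nodes, which copies each text node to the output.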
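If the goal is simply to stop text from leaking through, the built-in rule for text nodes can be overridden with an empty template. The following is a minimal sketch, not the asker's stylesheet: the element name `title` is hypothetical and stands for whatever element you actually want to output.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <!-- Explicit rule: overrides the built-in rule for text nodes
      and produces no output for them -->
 <xsl:template match="text()"/>

 <!-- Hypothetical element "title": the only text we want to see -->
 <xsl:template match="title">
  <xsl:value-of select="."/>
  <xsl:text>&#10;</xsl:text>
 </xsl:template>
</xsl:stylesheet>
```

Note that `xsl:value-of` still works inside the `title` template, because it reads the string value directly and does not apply templates to the text node.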

It is a good XSLT design pattern to override the built-in templates with your own, which issue a message whenever they are invoked, so that the programmer immediately knows the transformation is "leaking".

For example, if there is this XML document:

<a>
  <b>
    <c>Don't want to see this</c>
  </b>
</a>

and it is processed with this transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 
 <xsl:template match="a|b">
   <xsl:copy>
      <xsl:attribute name="name">
        <xsl:value-of select="name()"/>
      </xsl:attribute>
      <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

the result is:

<a name="a">
   <b name="b">Don't want to see this</b>
</a>

and the programmer will be left wondering where the unwanted text came from.

However, adding this catch-all template avoids such confusion and exposes the problem immediately:

 <xsl:template match="*">
  <xsl:message terminate="no">
   WARNING: Unmatched element: <xsl:value-of select="name()"/>
  </xsl:message>
  
  <xsl:apply-templates/>
 </xsl:template>

Now, in addition to the confusing output, the programmer gets a warning that pinpoints the problem:

 WARNING: Unmatched element: c
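If a warning is too easy to overlook, a stricter variant of the same idea stops the transformation at the first unmatched element. This is a sketch of one possible alternative, intended to replace the catch-all template above in the same stylesheet; the wording of the message is arbitrary.

```xml
 <!-- Strict catch-all: abort as soon as an element has no explicit rule -->
 <xsl:template match="*">
  <xsl:message terminate="yes">
   ERROR: Unmatched element: <xsl:value-of select="name()"/>
  </xsl:message>
 </xsl:template>
```

Either version is safe to add alongside more specific rules: the pattern `*` has a default priority of -0.5, so a template such as `match="a|b"` (priority 0) is still chosen for the elements it names.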

Later Addition by Michael Kay for XSLT 3.0

In XSLT 3.0, rather than adding a catch-all template rule, you can specify the fallback behaviour on an xsl:mode declaration. For example, <xsl:mode on-no-match="shallow-skip"/> causes all nodes that are not matched (including text nodes) to be skipped, while <xsl:mode on-no-match="fail"/> treats a no-match as an error, and <xsl:mode warning-on-no-match="true"/> results in a warning.
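As an illustration, the earlier stylesheet might be adapted for XSLT 3.0 as follows. This is a sketch that assumes an XSLT 3.0 processor; only the `version` attribute and the `xsl:mode` declaration differ from the original transformation.

```xml
<xsl:stylesheet version="3.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Unnamed (default) mode: unmatched elements are skipped while their
      children are still processed; unmatched text nodes produce nothing -->
 <xsl:mode on-no-match="shallow-skip"/>

 <xsl:template match="a|b">
   <xsl:copy>
      <xsl:attribute name="name">
        <xsl:value-of select="name()"/>
      </xsl:attribute>
      <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
```

Applied to the sample document, the text inside `c` should no longer appear, leaving `b` as an empty element with only its `name` attribute.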
