XSLT – How to keep only wanted elements from XML

This general transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ns="some:ns">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <ns:WhiteList>
  <name>ns:currency</name>
  <name>ns:currency_code3</name>
 </ns:WhiteList>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match=
  "*[not(descendant-or-self::*[name()=document('')/*/ns:WhiteList/*])]"/>
</xsl:stylesheet>

when applied to the provided XML document (with a namespace declaration added to make it well-formed):

<ns:stuff xmlns:ns="some:ns">
    <ns:things>
        <ns:currency>somecurrency</ns:currency>
        <ns:currency_code/>
        <ns:currency_code2/>
        <ns:currency_code3/>
        <ns:currency_code4/>
    </ns:things>
</ns:stuff>

produces the wanted result (white-listed elements and their structural relations are preserved):

<ns:stuff xmlns:ns="some:ns">
   <ns:things>
      <ns:currency>somecurrency</ns:currency>
      <ns:currency_code3/>
   </ns:things>
</ns:stuff>

Explanation:

  1. The identity rule/template copies all nodes “as-is”.

  2. The stylesheet contains a top-level <ns:WhiteList> element whose <name> children specify the names of all white-listed elements, that is, the elements to be preserved together with their structural relationships in the document.

  3. The <ns:WhiteList> element is best kept in a separate document, so that the stylesheet itself does not have to be edited whenever names are added. Here the whitelist is in the same stylesheet just for convenience.

  4. A single template overrides the identity template. It matches any element that is not white-listed and has no white-listed descendant; because its body is empty, such an element is not copied to the output (it is effectively deleted).
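
As a minimal sketch of point 3, the whitelist could live in its own file. The file name whitelist.xml and its un-namespaced <WhiteList> root are assumptions made for illustration, not part of the original answer. Because XSLT 1.0 does not allow variable references in a match pattern, the file's URI is written as a literal inside the pattern; with a single string argument, document() resolves it relative to the stylesheet.

```xml
<!-- whitelist.xml (hypothetical file name), placed next to this stylesheet:

     <WhiteList>
       <name>ns:currency</name>
       <name>ns:currency_code3</name>
     </WhiteList>
-->
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Identity rule: copies every node and attribute as-is -->
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <!-- Empty template: drops any element that is neither white-listed
      nor an ancestor of a white-listed element. The names are read
      from the external whitelist.xml instead of from document('') -->
 <xsl:template match=
  "*[not(descendant-or-self::*
         [name()=document('whitelist.xml')/*/name])]"/>
</xsl:stylesheet>
```

Applied to the same source document, this should give the same result as the single-file version. Both variants compare the string returned by name(), so the prefix used in the whitelist (here ns:) has to match the prefix used in the source document.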
