Java: Most efficient method to iterate over all elements in a org.w3c.dom.Document?

Basically you have two ways to iterate over all elements:

1. Using recursion (the most common way I think):

public static void main(String[] args) throws SAXException, IOException,
        ParserConfigurationException, TransformerException {

    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
        .newInstance();
    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
    Document document = docBuilder.parse(new File("document.xml"));
    doSomething(document.getDocumentElement());
}

public static void doSomething(Node node) {
    // do something with the current node instead of System.out
    System.out.println(node.getNodeName());

    NodeList nodeList = node.getChildNodes();
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node currentNode = nodeList.item(i);
        if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
            //calls this method for all the children which is Element
            doSomething(currentNode);
        }
    }
}

2. Avoiding recursion using getElementsByTagName() method with * as parameter:

public static void main(String[] args) throws SAXException, IOException,
        ParserConfigurationException, TransformerException {

    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
            .newInstance();
    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
    Document document = docBuilder.parse(new File("document.xml"));

    NodeList nodeList = document.getElementsByTagName("*");
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node node = nodeList.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            // do something with the current element
            System.out.println(node.getNodeName());
        }
    }
}

I think these ways are both efficient.
Hope this helps.

Leave a Comment