XML Server XML performance optimization

I can give you one answer and one guess:

First I use a declared table variable to mock up your scenario:

DECLARE @tbl TABLE(s NVARCHAR(MAX));
INSERT INTO @tbl VALUES
(N'<root>
    <SomeElement>This is first text of element1
        <InnerElement>This is text of inner element1</InnerElement>
        This is second text of element1
    </SomeElement>
    <SomeElement>This is first text of element2
        <InnerElement>This is text of inner element2</InnerElement>
        This is second text of element2
    </SomeElement>
</root>')
,(N'<root>
    <SomeElement>This is first text of elementA
        <InnerElement>This is text of inner elementA</InnerElement>
        This is second text of elementA
    </SomeElement>
    <SomeElement>This is first text of elementB
        <InnerElement>This is text of inner elementB</InnerElement>
        This is second text of elementB
    </SomeElement>
</root>');

–This query will read the XML with a cast out of a sub-select. You might use a CTE instead, but this should be syntactical sugar only…

SELECT se.value(N'(.)[1]','nvarchar(max)') SomeElementsContent
      ,se.value(N'(InnerElement)[1]','nvarchar(max)') InnerElementsContent
      ,se.value(N'(./text())[1]','nvarchar(max)') ElementsFirstText
      ,se.value(N'(./text())[2]','nvarchar(max)') ElementsSecondText
FROM (SELECT CAST(s AS XML) FROM @tbl) AS tbl(TheXml)
CROSS APPLY TheXml.nodes(N'/root/SomeElement') AS A(se);

–The second part uses a table to write in the typed XML and read from there:

DECLARE @tbl2 TABLE(x XML)
INSERT INTO @tbl2
SELECT CAST(s AS XML) FROM @tbl;

SELECT se.value(N'(.)[1]','nvarchar(max)') SomeElementsContent
      ,se.value(N'(InnerElement)[1]','nvarchar(max)') InnerElementsContent
      ,se.value(N'(./text())[1]','nvarchar(max)') ElementsFirstText
      ,se.value(N'(./text())[2]','nvarchar(max)') ElementsSecondText
FROM @tbl2 t2
CROSS APPLY t2.x.nodes(N'/root/SomeElement') AS A(se);

Why is /text() faster than without /text()?

If you look at my example, the content of an element is everything from the opening tag down to the closing tag. The text() of an element is the floating text between these tags. You can see this in the results of the select above. The text() is one separately stored portion in a tree structure actually (read next section). To fetch it, is a one-step-action. Otherwise a complex structure has to be analysed to find everything between the opening tag and its corresponding closing tag – even if there is nothing else than the text().

Why should I store XML in the appropriate type?

XML is not just text with some silly extra characters! It is a document with a complex structure. The XML is not stored as the text you see. XML is stored in a tree structure. Whenever you cast a string, which represents an XML, into a real XML, this very expensive work must be done. When the XML is presented to you (or any other output) the representing string is (re)built from scratch.

Why is the pre-casted approach faster

This is guessing…
In my example both approaches are quite equal and lead to (almost) the same execution plan.
SQL Server will not work down everything the way you might expect this. This is not a procedural system where you state do this, than do this and after do this!. You tell the engine what you want, and the engine decides how to do this best. And the engine is pretty good with this!
Before execution starts, the engine tries to estimate the costs of approaches. CONVERT (or CAST) is a rather cheap operation. It could be, that the engine decides to work down the list of your calls and do the cast for each single need over and over, because it thinks, that this is cheaper than the expensive creation of a derived table…

Leave a Comment