Why are logical reads for windowed aggregate functions so high?

Logical reads are counted differently for worktables: there is one ‘logical read’ per row read. This does not mean that worktables are somehow less efficient than a ‘real’ spool table (quite the reverse); the logical reads are just in different units.

I believe the thinking was that counting hashed pages for worktable logical reads would not be very useful because these structures are internal to the server. Reporting rows spooled in the logical reads counter makes the number more meaningful for analysis purposes.

This insight should make the reason your formula works clear. The two secondary spools are fully read twice (2 * COUNT(*)), and the primary spool emits (number of group values + 1) rows as explained in my blog entry, giving the (COUNT(DISTINCT CustomerID) + 1) component. The plus one is for the extra row emitted by the primary spool to indicate the final group has ended.

Paul

Leave a Comment