“order by newid()” – how does it work?

I know what NewID() does, I’m just
trying to understand how it would help
in the random selection. Is it that
(1) the select statement will select
EVERYTHING from mytable, (2) for each
row selected, tack on a
uniqueidentifier generated by NewID(),
(3) sort the rows by this
uniqueidentifier and (4) pick off the
top 100 from the sorted list?

Yes. this is pretty much exactly correct (except it doesn’t necessarily need to sort all the rows). You can verify this by looking at the actual execution plan.

SELECT TOP 100 * 
FROM master..spt_values 
ORDER BY NEWID()

The compute scalar operator adds the NEWID() column on for each row (2506 in the table in my example query) then the rows in the table are sorted by this column with the top 100 selected.

SQL Server doesn’t actually need to sort the entire set from positions 100 down so it uses a TOP N sort operator which attempts to perform the entire sort operation in memory (for small values of N)

Plan

Leave a Comment