Looking for an efficient integer square root algorithm for ARM Thumb2

Integer Square Roots by Jack W. Crenshaw could be useful as another reference. The C Snippets Archive also has an integer square root implementation. It goes beyond the integer result and also calculates extra fractional (fixed-point) bits of the answer. (Update: unfortunately, the C Snippets Archive is now defunct. The link points to the