Python faster than compiled Haskell?

The Original Haskell Code

There are two issues with the Haskell version:

  • You’re using string IO, which builds linked lists of characters
  • You’re using a non-quicksort that looks like quicksort.

This program takes 18.7 seconds to run on my Intel Core2 2.5 GHz laptop. (GHC 7.4 using -O2)

Daniel’s ByteString Version

This is much improved, but notice it still uses the inefficient built-in merge sort.

His version takes 8.1 seconds (and doesn’t handle negative numbers, but that’s more of a non-issue for this exploration).

Note

From here on this answer uses the following packages: Vector, attoparsec, text and vector-algorithms. Also notice that kindall’s version using timsort takes 2.8 seconds on my machine (edit: and 2 seconds using pypy).

A Text Version

I ripped off Daniel’s version, translated it to Text (so it handles various encodings) and added better sorting using a mutable Vector in an ST monad:

import Data.Attoparsec.Text.Lazy
import qualified Data.Text.Lazy as T
import qualified Data.Text.Lazy.IO as TIO
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Algorithms.Intro as I
import Control.Applicative
import Control.Monad.ST
import System.Environment (getArgs)

parser = many (decimal <* char '\n')

main = do
    numbers <- TIO.readFile =<< fmap head getArgs
    case parse parser numbers of
        Done t r | T.null t -> writeFile "sorted" . unlines
                                                  . map show . vsort $ r
        x -> error $ Prelude.take 40 (show x)

vsort :: [Int] -> [Int]
vsort l = runST $ do
        let v = V.fromList l
        m <- V.unsafeThaw v
        I.sort m
        v' <- V.unsafeFreeze m
        return (V.toList v')

This runs in 4 seconds (and also doesn’t handle negatives)

Return to the Bytestring

So now we know we can make a more general program that’s faster, what about making the ASCii -only version fast? No problem!

import qualified Data.ByteString.Lazy.Char8 as BS
import Data.Attoparsec.ByteString.Lazy (parse,  Result(..))
import Data.Attoparsec.ByteString.Char8 (decimal, char)
import Control.Applicative ((<*), many)
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Algorithms.Intro as I
import Control.Monad.ST


parser = many (decimal <* char '\n')

main = do
    numbers <- BS.readFile "rands"
    case parse parser numbers of
        Done t r | BS.null t -> writeFile "sorted" . unlines
                                                   . map show . vsort $ r

vsort :: [Int] -> [Int]
vsort l = runST $ do
        let v = V.fromList l
        m <- V.unsafeThaw v
        I.sort m
        v' <- V.unsafeFreeze m
        return (V.toList v')

This runs in 2.3 seconds.

Producing a Test File

Just in case anyone’s curious, my test file was produced by:

import Control.Monad.CryptoRandom
import Crypto.Random
main = do
  g <- newGenIO :: IO SystemRandom
  let rs = Prelude.take (2^20) (map abs (crandoms g) :: [Int])
  writeFile "rands" (unlines $ map show rs)

If you’re wondering why vsort isn’t packaged in some easier form on Hackage… so am I.

Leave a Comment