Compiling numpy with OpenBLAS integration

I just compiled numpy inside a virtualenv with OpenBLAS integration, and it seems to be working OK.

This was my process:

  1. Compile OpenBLAS:

    $ git clone
    $ cd OpenBLAS && make FC=gfortran
    $ sudo make PREFIX=/opt/OpenBLAS install

    If you don’t have admin rights you could set PREFIX= to a directory where you have write privileges (just modify the corresponding steps below accordingly).

  2. Make sure that the directory containing the OpenBLAS shared library (/opt/OpenBLAS/lib in this example) is in your shared library search path.

    • To do this locally, you could edit your ~/.bashrc file to contain the line

      export LD_LIBRARY_PATH=/opt/OpenBLAS/lib:$LD_LIBRARY_PATH

      The LD_LIBRARY_PATH environment variable will be updated when you start a new terminal session (use $ source ~/.bashrc to force an update within the same session).

    • Another option that will work for multiple users is to create a .conf file in /etc/ containing the line /opt/OpenBLAS/lib, e.g.:

      $ sudo sh -c "echo '/opt/OpenBLAS/lib' > /etc/"

    Once you are done with either option, run

    $ sudo ldconfig

  3. Grab the numpy source code:

    $ git clone
    $ cd numpy

  4. Copy site.cfg.example to site.cfg and edit the copy:

    $ cp site.cfg.example site.cfg
    $ nano site.cfg

    Uncomment these lines:

    libraries = openblas
    library_dirs = /opt/OpenBLAS/lib
    include_dirs = /opt/OpenBLAS/include

  5. Check the configuration, then build and install (optionally inside a virtualenv):

    $ python config

    The output should look something like this:

        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]

    Installing with pip is preferable to using python install, since pip will keep track of the package metadata and allow you to easily uninstall or upgrade numpy in the future.

    $ pip install .

  6. Optional: you can use this script to test performance for different thread counts.

    $ OMP_NUM_THREADS=1 python build/
    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807
    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c
    dot: 0.099796795845 sec
    $ OMP_NUM_THREADS=8 python build/
    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807
    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c
    dot: 0.0439578056335 sec
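
The script used for the timings above is not reproduced here, so the following is only a minimal stand-in sketch, not the original. It assumes a square matrix size of 1000 and a best-of-three timing, both arbitrary choices, and it uses np.show_config() to report which BLAS numpy was built against. The names best_of and main are made up for this illustration.

```python
"""Time one large matrix product with whatever BLAS numpy is linked against."""
import time


def best_of(fn, repeats=3):
    """Return the fastest wall-clock time (in seconds) over `repeats` calls of fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def main(n=1000, repeats=3):
    """Print build info and the time for an n x n dot product; n=1000 is an assumption."""
    try:
        import numpy as np  # imported lazily so best_of() works without numpy
    except ImportError:
        print("numpy is not importable in this environment")
        return None

    print("version:", np.__version__)
    np.show_config()  # lists the BLAS/LAPACK libraries numpy was built against

    rng = np.random.RandomState(0)  # fixed seed so every run multiplies the same data
    a = rng.random_sample((n, n))
    b = rng.random_sample((n, n))

    elapsed = best_of(lambda: np.dot(a, b), repeats)
    print("dot: %.6f sec" % elapsed)
    return elapsed


if __name__ == "__main__":
    main()
```

Saved under a name of your choosing (say bench_dot.py, a hypothetical name), it could be run with OMP_NUM_THREADS set in front of the python command, in the same way as the two runs shown above. Absolute timings will differ from those listed here, since the matrix size and hardware are not the same.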

In this run, going from 1 to 8 threads cut the time for the dot benchmark from about 0.100 sec to about 0.044 sec, roughly a 2.3x speedup. However, I haven’t tested this very systematically, and it’s likely that for smaller matrices the additional overhead would outweigh the performance benefit from a higher thread count.
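
One way to probe the caveat about smaller matrices is to time the same product for several sizes and thread counts. The sketch below is an assumption-laden illustration rather than a tested benchmark: it starts a fresh Python process for each measurement, because the thread-count variable generally has to be set before numpy (and with it OpenBLAS) is loaded. The sizes, the thread counts, and the names env_with_threads, time_dot and sweep are all invented for this example.

```python
import os
import subprocess
import sys

# Snippet run in each child process: times one n x n matrix product.
CHILD_CODE = """
import sys, time
import numpy as np
n = int(sys.argv[1])
a = np.random.random_sample((n, n))
b = np.random.random_sample((n, n))
np.dot(a, b)  # warm-up call, not timed
start = time.perf_counter()
np.dot(a, b)
print(time.perf_counter() - start)
"""


def env_with_threads(threads, base=None):
    """Return a copy of the environment with OMP_NUM_THREADS set to `threads`."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


def time_dot(n, threads):
    """Run the child snippet in a fresh interpreter and return its timing in seconds."""
    out = subprocess.check_output(
        [sys.executable, "-c", CHILD_CODE, str(n)],
        env=env_with_threads(threads),
    )
    return float(out.decode().strip())


def sweep(sizes=(50, 200, 1000), thread_counts=(1, 8)):
    """Print one row per matrix size with a timing for each thread count."""
    for n in sizes:
        row = ["%d threads: %.6f sec" % (t, time_dot(n, t)) for t in thread_counts]
        print("n=%4d  " % n + "  ".join(row))
```

Calling sweep() from an interpreter inside the virtualenv prints one line per matrix size. A single timed call per cell is noisy, so the numbers are only a rough indication of where extra threads stop helping.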
