I just compiled numpy inside a virtualenv with OpenBLAS integration, and it seems to be working OK. This was my process:
- Compile OpenBLAS:

      $ git clone https://github.com/xianyi/OpenBLAS
      $ cd OpenBLAS && make FC=gfortran
      $ sudo make PREFIX=/opt/OpenBLAS install

  If you don’t have admin rights you could set PREFIX= to a directory where you have write privileges (just modify the corresponding steps below accordingly).

- Make sure that the directory containing libopenblas.so is in your shared library search path.

  - To do this locally, you could edit your ~/.bashrc file to contain the line

        export LD_LIBRARY_PATH=/opt/OpenBLAS/lib:$LD_LIBRARY_PATH

    The LD_LIBRARY_PATH environment variable will be updated when you start a new terminal session (use $ source ~/.bashrc to force an update within the same session).

  - Another option that will work for multiple users is to create a .conf file in /etc/ld.so.conf.d/ containing the line /opt/OpenBLAS/lib, e.g.:

        $ sudo sh -c "echo '/opt/OpenBLAS/lib' > /etc/ld.so.conf.d/openblas.conf"

  Once you are done with either option, run

      $ sudo ldconfig
- Grab the numpy source code:

      $ git clone https://github.com/numpy/numpy
      $ cd numpy
- Copy site.cfg.example to site.cfg and edit the copy:

      $ cp site.cfg.example site.cfg
      $ nano site.cfg

  Uncomment these lines:

      ....
      [openblas]
      libraries = openblas
      library_dirs = /opt/OpenBLAS/lib
      include_dirs = /opt/OpenBLAS/include
      ....
- Check configuration, build, install (optionally inside a virtualenv):

      $ python setup.py config

  The output should look something like this:

      ...
      openblas_info:
        FOUND:
          libraries = ['openblas', 'openblas']
          library_dirs = ['/opt/OpenBLAS/lib']
          language = c
          define_macros = [('HAVE_CBLAS', None)]

        FOUND:
          libraries = ['openblas', 'openblas']
          library_dirs = ['/opt/OpenBLAS/lib']
          language = c
          define_macros = [('HAVE_CBLAS', None)]
      ...

  Installing with pip is preferable to using python setup.py install, since pip will keep track of the package metadata and allow you to easily uninstall or upgrade numpy in the future.

      $ pip install .
- Optional: you can use this script to test performance for different thread counts.

      $ OMP_NUM_THREADS=1 python build/test_numpy.py
      version: 1.10.0.dev0+8e026a2
      maxint: 9223372036854775807
      BLAS info:
       * libraries ['openblas', 'openblas']
       * library_dirs ['/opt/OpenBLAS/lib']
       * define_macros [('HAVE_CBLAS', None)]
       * language c
      dot: 0.099796795845 sec

      $ OMP_NUM_THREADS=8 python build/test_numpy.py
      version: 1.10.0.dev0+8e026a2
      maxint: 9223372036854775807
      BLAS info:
       * libraries ['openblas', 'openblas']
       * library_dirs ['/opt/OpenBLAS/lib']
       * define_macros [('HAVE_CBLAS', None)]
       * language c
      dot: 0.0439578056335 sec
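There seems to be a noticeable improvement in performance for higher thread counts: in the two runs above, the dot product took about 0.100 sec with 1 thread and about 0.044 sec with 8 threads, i.e. roughly 2.3 times faster. However, I haven’t tested this very systematically, and it’s likely that for smaller matrices the additional overhead would outweigh the performance benefit from a higher thread count.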
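If you want to repeat the thread-count comparison without typing each command by hand, one option is to run the timing in a child process with OMP_NUM_THREADS set, so that each run loads OpenBLAS with a fresh environment. This is only a rough sketch and not the test script used above: the helper names (`env_with_threads`, `time_dot`), the 1000 x 1000 matrix size and the single timed call are my own assumptions, and it needs Python 3.7 or newer.

```python
import os
import subprocess
import sys

# Timing code run in a child process, so every run starts with a fresh
# environment. The matrix size (1000 x 1000) is an arbitrary assumption.
SNIPPET = """
import time
import numpy as np
a = np.random.random_sample((1000, 1000))
start = time.perf_counter()
np.dot(a, a)
print(time.perf_counter() - start)
"""


def env_with_threads(n_threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(n_threads)
    return env


def time_dot(n_threads):
    """Time one np.dot call (in seconds) in a child process with n_threads."""
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env_with_threads(n_threads),
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())
```

Calling `time_dot(1)` and `time_dot(8)` from the virtualenv where numpy was installed should give numbers comparable to the two `dot:` lines above. Changing the matrix size in `SNIPPET` and looping over a few values would be one way to see where the threading overhead starts to dominate.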
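For the shared library search path step above, a quick sanity check is to ask Python whether the loader can see OpenBLAS at all. Here is a minimal sketch using `ctypes.util.find_library` from the standard library. The helper name `openblas_visible` is hypothetical. On Linux, `find_library` mainly consults the ldconfig cache and only falls back to LD_LIBRARY_PATH in Python 3.6 and newer, so a miss here is a hint rather than proof that the build will fail.

```python
from ctypes.util import find_library


def openblas_visible(name="openblas"):
    """Return the library name the loader resolves for `name`, or None."""
    # On Linux, find_library("openblas") looks for libopenblas.so*.
    return find_library(name)


if __name__ == "__main__":
    found = openblas_visible()
    if found is None:
        print("libopenblas not found; check LD_LIBRARY_PATH or /etc/ld.so.conf.d/")
    else:
        print("found:", found)
```

If this prints the "not found" message right after you edited /etc/ld.so.conf.d/, the most likely cause is that `sudo ldconfig` has not been run yet.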
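The site.cfg edit above can also be scripted, which may help if you changed PREFIX from /opt/OpenBLAS and want the paths to stay consistent. Here is a sketch using `configparser` from the standard library. The function name `openblas_section`, and the idea of generating the section rather than uncommenting it by hand, are my assumptions; the keys are just the ones shown in the site.cfg step.

```python
import configparser
import io


def openblas_section(prefix="/opt/OpenBLAS"):
    """Build the [openblas] section of site.cfg for a given install prefix."""
    cfg = configparser.ConfigParser()
    cfg["openblas"] = {
        "libraries": "openblas",
        "library_dirs": prefix + "/lib",
        "include_dirs": prefix + "/include",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the section; append it to site.cfg by hand or with shell redirection.
    print(openblas_section())
```

With the default prefix this prints the same four lines as the uncommented block in site.cfg.example. Passing a different prefix, e.g. `openblas_section("/home/me/OpenBLAS")` (a made-up path), changes both directory entries together.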