I have Jupyter installed, and indeed it is simpler than you think:
- Install Anaconda for OSX.
- Install jupyter by typing the next line in your terminal:
ilovejobs@mymac:~$ conda install jupyter
- Update jupyter just in case:
ilovejobs@mymac:~$ conda update jupyter
- Download Apache Spark and compile it, or download and uncompress Apache Spark 1.5.1 + Hadoop 2.6:
ilovejobs@mymac:~$ cd Downloads
ilovejobs@mymac:~/Downloads$ wget http://www.apache.org/dyn/closer.lua/spark/spark-1.5.1/spark-1.5.1-bin-hadoop2.6.tgz
- Create an Apps folder in your home directory, e.g.:
ilovejobs@mymac:~/Downloads$ mkdir ~/Apps
- Move the uncompressed folder spark-1.5.1 to the ~/Apps directory:
ilovejobs@mymac:~/Downloads$ mv spark-1.5.1/ ~/Apps
- Move to the ~/Apps directory and verify that spark is there:
ilovejobs@mymac:~/Downloads$ cd ~/Apps
ilovejobs@mymac:~/Apps$ ls -l
drwxr-xr-x ?? ilovejobs ilovejobs 4096 ?? ?? ??:?? spark-1.5.1
- Here is the first tricky part. Add the spark binaries to your $PATH (the sketch after these steps shows what the resulting .profile lines should look like):
ilovejobs@mymac:~/Apps$ cd
ilovejobs@mymac:~$ echo "export PATH=$HOME/Apps/spark-1.5.1/bin:$PATH" >> .profile
- Here is the second tricky part. Add these environment variables as well:
ilovejobs@mymac:~$ echo "export PYSPARK_DRIVER_PYTHON=ipython" >> .profile
ilovejobs@mymac:~$ echo "export PYSPARK_DRIVER_PYTHON_OPTS='notebook'" >> .profile
- Source the profile to make these variables available in this terminal:
ilovejobs@mymac:~$ source .profile
- Create a ~/notebooks directory:
ilovejobs@mymac:~$ mkdir notebooks
- Move to ~/notebooks and run pyspark:
ilovejobs@mymac:~$ cd notebooks
ilovejobs@mymac:~/notebooks$ pyspark
Notice that you can also add those variables to the .bashrc located in your home directory.
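For reference, after the two tricky steps your ~/.profile (or ~/.bashrc) should end up with lines roughly like the following. This is a minimal sketch assuming Spark was unpacked to ~/Apps/spark-1.5.1; adjust the path if your layout differs. (Because the echo commands above use double quotes, $HOME and $PATH are expanded when the lines are written, so your file will show the literal paths instead; both forms work.)

# Spark binaries on the PATH (path is an assumption; point it at your own Spark folder)
export PATH="$HOME/Apps/spark-1.5.1/bin:$PATH"
# make pyspark start the IPython/Jupyter notebook as its driver
export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'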
Now be happy. You should be able to run Jupyter with a PySpark kernel (it will show up as a Python 2 kernel, but it will use Spark).
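If you want a quick sanity check before opening a notebook, these two commands only inspect what the steps above configured (paths assume the ~/Apps/spark-1.5.1 layout used here):

ilovejobs@mymac:~$ which pyspark   # should print a path under ~/Apps/spark-1.5.1/bin
ilovejobs@mymac:~$ echo "$PYSPARK_DRIVER_PYTHON $PYSPARK_DRIVER_PYTHON_OPTS"   # should print: ipython notebook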