PixieDust is a Python library for use in Jupyter notebooks. It is fun to use, but it is only officially supported on IBM Watson Studio, and the official documentation only explains how to install it on macOS. In this post I will show how to install it on Ubuntu 18.04. The tutorial is based on the installation guide for macOS.

Prerequisites

  • A Linux VM running Ubuntu 18.04 Server (I am using a Vultr machine)

Install Anaconda

The official instructions are at https://docs.anaconda.com/anaconda/install/linux/ and the commands I used to install it are as follows.

$ cd
$ wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
$ sha256sum Anaconda3-2019.10-Linux-x86_64.sh
46d762284d252e51cd58a8ca6c8adc9da2eadc82c342927b2f66ed011d1d8b53  Anaconda3-2019.10-Linux-x86_64.sh
$ bash Anaconda3-2019.10-Linux-x86_64.sh

Press “Enter” to display the license, press “Q” to leave it, and then type “yes” to agree with the terms.

Now disconnect the SSH session, or close the local terminal, and open a new one so that the changes take effect.

After logging in again, check whether the installation succeeded.

$ conda --version
conda 4.7.12
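
The sha256sum step above can also be scripted, which makes the comparison with the expected digest explicit instead of visual. The following is a minimal sketch using only the standard library; the function names are my own, and the expected digest is simply the one printed in the transcript above.

```python
import hashlib

# Digest copied from the sha256sum output shown above.
EXPECTED_SHA256 = "46d762284d252e51cd58a8ca6c8adc9da2eadc82c342927b2f66ed011d1d8b53"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large installer never sits fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def installer_is_intact(path, expected=EXPECTED_SHA256):
    """Compare the computed digest with the expected one, ignoring case."""
    return sha256_of_file(path) == expected.lower()
```

Calling installer_is_intact("Anaconda3-2019.10-Linux-x86_64.sh") from the download directory should return True if the file matches the digest above.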

Install Java Dependency

PixieDust depends on Java 1.8, i.e. Java SE 8.

$ sudo apt-get update
$ sudo apt-get install openjdk-8-jre-headless
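
To confirm that the Java on the PATH really is 1.8, the version banner can be inspected. The sketch below is only an illustration: the function names are hypothetical, and the sample banner in the usage note is an assumed example of what OpenJDK 8 prints, not output captured from this machine.

```python
import re
import subprocess


def parse_java_version(banner):
    """Extract (major, minor) from a `java -version` banner, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) is not None else 0
    return (major, minor)


def is_java_8(banner):
    """Java 8 reports itself as 1.8.x under the legacy version scheme."""
    return parse_java_version(banner) == (1, 8)


def read_java_banner():
    """Run `java -version`; the banner is written to stderr, not stdout."""
    result = subprocess.run(
        ["java", "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stderr
```

For example, is_java_8('openjdk version "1.8.0_242"') is True, while a banner that starts with a version such as "11.0.6" is rejected. On the server itself, is_java_8(read_java_banner()) performs the real check.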

Install PixieDust

Pip & JupyterLab

PixieDust recommends Python 3.5 and Spark 2.1.0, so we first need to create a Python 3.5 environment with Anaconda.

$ conda deactivate
$ conda create -n pixiedust python=3.5
$ conda activate pixiedust
$ pip --version
pip 10.0.1 from /root/anaconda3/envs/pixiedust/lib/python3.5/site-packages/pip (python 3.5)
$ pip install --upgrade pip
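
Since every later step must run inside this environment, a quick check from Python can catch the case where the base environment is still active. This is a small sketch with hypothetical function names; it assumes the environment is named pixiedust and lives in a directory of the same name, as in the pip path shown above.

```python
import os
import sys

RECOMMENDED_PYTHON = (3, 5)  # the Python version recommended above
ENV_NAME = "pixiedust"       # the name passed to `conda create -n`


def python_matches(version_info=None, recommended=RECOMMENDED_PYTHON):
    """True when the interpreter's (major, minor) equals the recommended pair."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(recommended)


def prefix_matches(prefix=None, env_name=ENV_NAME):
    """True when the interpreter lives in a directory named after the environment."""
    if prefix is None:
        prefix = sys.prefix
    return os.path.basename(os.path.normpath(prefix)) == env_name


print("Python 3.5:", python_matches())
print("pixiedust env:", prefix_matches())
```

Both lines should print True when the script is run with the python from the activated pixiedust environment.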

Then let’s install Jupyter with Anaconda. Choose the default options during the installation.

$ conda install -c conda-forge jupyterlab
$ jupyter --version
jupyter core     : 4.5.0
jupyter-notebook : 5.7.0
qtconsole        : not installed
ipython          : 7.0.1
ipykernel        : 5.1.0
jupyter client   : 5.3.3
jupyter lab      : 1.2.6
nbconvert        : 5.6.0
ipywidgets       : not installed
nbformat         : 5.0.4
traitlets        : 4.3.2

Install PixieDust from Source

The PyPI version of PixieDust is buggy on Linux, so don’t use it. Install it from source instead.

$ git clone https://github.com/ibm-cds-labs/pixiedust
$ pip install -e ./pixiedust

Install a Jupyter Kernel

Make sure that you are in the pixiedust Anaconda environment, then run:

$ jupyter pixiedust install
$ jupyter pixiedust list
Available kernels:
  pythonwithpixiedustspark20    /root/.local/share/jupyter/kernels/pythonwithpixiedustspark20
  python3                       /root/anaconda3/envs/pixiedust/share/jupyter/kernels/python3

During the installation, to be safe, choose Spark 2.0.2.
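
Each kernel in the listing above is a directory that contains a kernel.json file, so the result of the installation can also be inspected directly. The following sketch reads the display name of every kernel in a given directory; the function name is hypothetical, and the directory to pass in is the one shown in the listing.

```python
import json
import os


def list_kernels(kernels_dir):
    """Map each kernel directory name to the display_name in its kernel.json."""
    found = {}
    if not os.path.isdir(kernels_dir):
        return found
    for name in sorted(os.listdir(kernels_dir)):
        spec_path = os.path.join(kernels_dir, name, "kernel.json")
        # Skip anything that is not a kernel directory.
        if os.path.isfile(spec_path):
            with open(spec_path) as handle:
                spec = json.load(handle)
            found[name] = spec.get("display_name", name)
    return found
```

For example, list_kernels(os.path.expanduser("~/.local/share/jupyter/kernels")) should include the pythonwithpixiedustspark20 entry after a successful install.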

Set Up Jupyter Notebook

Now let’s configure the Jupyter Notebook before we start it.

Generate a password

$ python
>>> from notebook.auth import passwd
>>> passwd()
Enter password: 
Verify password: 
'sha1:2d0f350c2e1e:14d600fded80be55bcabbbe750c69f03859e6395'
>>> exit()
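
The string returned by passwd() has three colon-separated parts: the algorithm, a random salt and the digest. As far as I understand the notebook 5.x code, the digest is the hash of the password followed by the salt. The sketch below reproduces that format with the standard library purely for illustration; make_hash and check_hash are hypothetical names, and in practice you should keep using passwd() from notebook.auth.

```python
import binascii
import hashlib
import hmac
import os


def make_hash(passphrase, algorithm="sha1", salt_bytes=6):
    """Build an 'algorithm:salt:digest' string (assumed notebook 5.x layout)."""
    salt = binascii.hexlify(os.urandom(salt_bytes)).decode("ascii")
    digest = hashlib.new(algorithm)
    digest.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, digest.hexdigest()))


def check_hash(hashed, passphrase):
    """Return True when the passphrase reproduces the stored digest."""
    try:
        algorithm, salt, expected = hashed.split(":", 2)
        digest = hashlib.new(algorithm)
    except ValueError:
        # Malformed string or unknown algorithm name.
        return False
    digest.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return hmac.compare_digest(digest.hexdigest(), expected)
```

With the default arguments the salt has 12 hexadecimal characters and the SHA-1 digest has 40, which matches the shape of the value printed above.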

Configure the Notebook

Now let’s apply the password to the configuration file.

$ jupyter notebook --generate-config
Writing default config to: /root/.jupyter/jupyter_notebook_config.py
$ vim /root/.jupyter/jupyter_notebook_config.py

To access the notebook through its public IP address or from a local network, make sure that c.NotebookApp.ip is set to '0.0.0.0' and set c.NotebookApp.port to a port you prefer. To apply the password, set c.NotebookApp.password to the hash generated in the previous step.
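
Instead of editing the file by hand in vim, the settings can be appended with a short script. This is a sketch under a few assumptions: the function names are hypothetical, port 8888 is only a placeholder default, and the open_browser line is my own addition for a headless server rather than something the steps above require.

```python
def render_settings(password_hash, ip="0.0.0.0", port=8888):
    """Return the config lines for remote, password-protected access."""
    return "\n".join([
        "c.NotebookApp.ip = %r" % ip,
        "c.NotebookApp.port = %d" % port,
        "c.NotebookApp.password = %r" % password_hash,
        # Assumption: no browser should be launched on a headless server.
        "c.NotebookApp.open_browser = False",
    ]) + "\n"


def append_settings(config_path, password_hash, ip="0.0.0.0", port=8888):
    """Append the rendered settings to an existing jupyter_notebook_config.py."""
    with open(config_path, "a") as handle:
        handle.write("\n" + render_settings(password_hash, ip=ip, port=port))
```

For example, append_settings("/root/.jupyter/jupyter_notebook_config.py", "sha1:2d0f350c2e1e:14d600fded80be55bcabbbe750c69f03859e6395") writes the four lines at the end of the generated file, using the hash from the previous step.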

Python Prerequisites

The following packages are needed to import pixiedust, but you can always install them later. Note that the yaml module is provided by the pyyaml package.

$ pip install matplotlib pandas pyyaml

Run the Notebook

First, run the notebook as a background task.

$ nohup jupyter notebook pixiedust/notebook --allow-root &
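
Because the server now runs in the background, it is not obvious whether it started correctly. One way to check is to try a TCP connection to the port you configured. The sketch below uses only the standard library; the function name is hypothetical, and the host and port in the usage note are placeholders for your own values.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
    connection.close()
    return True
```

For example, is_port_open("127.0.0.1", 8888) on the server checks the local listener, and the same call with the public IP address from another machine also exercises the firewall. If it returns False, nohup.out usually contains the reason.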

Now log in to the notebook with the password you set.

If you are prompted to choose a Jupyter kernel, choose the one with Spark.

In Case of Missing Python Modules

If you are missing any Python module, you can always go back to the pixiedust Anaconda environment and install it using pip. After that, restart the Jupyter kernel in the web UI.
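
To find out which modules are missing before a notebook cell fails, the import system can be queried without importing anything. A minimal sketch follows; the function name is hypothetical, and the module list in the usage note is simply the set of prerequisites named earlier (yaml is the import name of the pyyaml package).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running missing_modules(["matplotlib", "pandas", "yaml"]) inside the pixiedust environment returns an empty list when everything is installed.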

Example

Following is an example running in Jupyter Notebook. Cool, right?

References

  1. GitHub repo pixiedust/pixiedust: https://github.com/pixiedust/pixiedust
  2. Installation guide for macOS: https://pixiedust.github.io/pixiedust/install.html
  3. PixieDust Documentation: https://pixiedust.github.io/pixiedust/use.html
  4. Install Anaconda on Linux: https://docs.anaconda.com/anaconda/install/linux/
  5. Managing Python with Anaconda: https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html#managing-python
  6. Installing Different Versions of Python with Anaconda: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-python.html
  7. Install Jupyter Notebook: https://jupyter.org/install
  8. Configure Jupyter Notebook: https://jupyter-notebook.readthedocs.io/en/stable/config_overview.html
