Tuesday, July 24, 2012

Accessing IPython Notebook remotely over an SSH tunnel

What is IPython Notebook?

IPython Notebook is the web-based environment that comes with IPython and is used for scientific computing and visualization. From their website:

"A web-based notebook with the same core features but support for code, text, mathematical expressions, inline plots and other rich media."

What did we want to do?

We wanted researchers to be able to access the IPython Notebook environment remotely from their own machines. There were several reasons for this:
  1. Their data had to reside on remote machines that we did not have access to.
  2. The installation and configuration of IPython on these remote machines was set up specifically for this use, and we wanted to keep that environment consistent.
  3. Multiple users wanted to access IPython Notebook, and also each other's information as needed.

The Solution

The solution was to access the IPython Notebook environment remotely over an SSH tunnel.

Step 1

On the remote machine, start the IPython web-based environment on a specific port and tell it not to open a web browser (which it does by default). The command for this should look something like:
ipython notebook --no-browser --port=7000
The notebook server is now listening on port 7000 (we just picked this arbitrarily) and has not started a browser as it normally would.

Step 2

On the local machine, make sure you have credentials that let you access the remote machine without typing a password. This makes the process smoother, and it matters most when the tunnel is started from a script, where nobody may be around to answer a prompt. For example, you could use SSH public-private key pairs with cached credentials, or authenticate with Kerberos where SSH supports GSSAPI.
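
One way to confirm that password-less login is in place is to ask SSH to fail rather than prompt. The following is a minimal sketch, assuming OpenSSH on the local machine; the function name can_login_without_password is made up for illustration, and username@dest.ination.com is the placeholder host used throughout this post.

```shell
#!/bin/sh
# Returns 0 if SSH can log in to the given host without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
# The function name is hypothetical; it is not part of OpenSSH.
can_login_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Example usage (placeholder host):
# if can_login_without_password username@dest.ination.com; then
#     echo "key or Kerberos login works"
# else
#     echo "ssh would have to prompt for a password"
# fi
```

The remote command "true" does nothing, so the check only exercises authentication.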

Step 3

Now you can access this port using an SSH tunnel with port forwarding. The command for that will look like:
ssh -N -f -L localhost:6000:localhost:7000 username@dest.ination.com
This command can be broken down as follows:

-N  This flag tells SSH not to execute a remote command; it is useful when all you want is port forwarding, as in this case.

-f  This flag tells SSH to go into the background before it executes the command but after the port forwards are established. It also implies "-n", which prevents reading from stdin and is necessary when putting SSH in the background.

    Note: "-n" does something completely different from "-N".

-L  This option sets up the port forwards.

In this particular case, the first pair, localhost:6000, tells SSH to listen on port 6000 of the local machine. The second pair, localhost:7000, is the destination as seen from the remote machine: port 7000 on the remote machine itself, which is where IPython Notebook is running. Traffic between the two travels inside the SSH connection, which uses port 22 on the remote machine by default.

It essentially links an arbitrary local port to a port on the remote machine over an SSH tunnel.

The overall idea of the command is to tell SSH to connect to the remote machine, not read data from stdin, and go into the background once the port forwards are established.

Note: The port numbers are arbitrary. I used 6000 and 7000 to clarify the command but you can just use the same port number as well.
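
To make the two halves of the -L argument concrete, here is a small sketch that splits a forward specification into its four fields and labels each one. The function name describe_forward is hypothetical, and it assumes the four-field form used in this post (bind address, local port, destination host, destination port).

```shell
#!/bin/sh
# Split a four-field -L specification and say what each half means.
# describe_forward is an illustrative helper, not an SSH feature.
describe_forward() {
    spec=$1
    # Split on ":" into bind address, local port, destination host and port.
    IFS=: read -r bind_addr local_port dest_host dest_port <<EOF
$spec
EOF
    echo "listen on $bind_addr port $local_port (local machine)"
    echo "deliver to $dest_host port $dest_port (as seen from the remote machine)"
}

describe_forward localhost:6000:localhost:7000
# prints:
# listen on localhost port 6000 (local machine)
# deliver to localhost port 7000 (as seen from the remote machine)
```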


Step 4

That is all the setup. To use the session, open your preferred web browser on your local machine and go to the URL:
http://localhost:6000
Note: Your particular installation of the IPython Notebook server may have SSL enabled in its configuration files. If that is the case use 'https://localhost:6000' instead.

So how did the different users sharing data come into play here?

We assigned each person on the research team a different port number and asked them to always run IPython on that port on the remote machine. Then, to access a different researcher's session, they set up the port forward with that researcher's port number.

Note: only one person can establish the tunnel to an IPython session at a time.

We wrapped this up in a small script so that all they had to do was run the command with the name of the user whose session they wanted to connect to, and it did the rest for them.
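
The original script is not shown here, but a wrapper along those lines might look like the sketch below. Everything specific in it is an assumption for illustration: the user names, the port assignments, the function names, and the choice to use the same port number on both ends of the tunnel. It prints the ssh command instead of running it so that it is safe to try.

```shell
#!/bin/sh
# Hypothetical port assignments: one notebook port per researcher.
port_for_user() {
    case "$1" in
        alice) echo 7001 ;;
        bob)   echo 7002 ;;
        carol) echo 7003 ;;
        *)     return 1 ;;
    esac
}

# Print the ssh command that tunnels to the named researcher's notebook.
# The same port number is used locally and remotely to keep things simple.
tunnel_command_for() {
    port=$(port_for_user "$1") || {
        echo "unknown user: $1" >&2
        return 1
    }
    echo "ssh -N -f -L localhost:$port:localhost:$port username@dest.ination.com"
}

tunnel_command_for bob
# prints: ssh -N -f -L localhost:7002:localhost:7002 username@dest.ination.com
```

A real wrapper would run the command rather than print it, and could then tell the user which local URL to open.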



4 comments:

  1. Great post! Quick note: you can configure the --no-browser and custom ports permanently into a profile (along with SSL and password support if you also want these notebooks available without SSH forwarding). Here are some quick instructions on that: http://ipython.org/ipython-doc/rel-0.13/interactive/htmlnotebook.html#quick-howto-running-a-public-notebook-server

    We've tried to make the profile system an easy way to encapsulate all the details of a specific configuration into a single location.

  2. Thank you! One little thing. I was only able to connect to the IPython notebook using 'https' instead of 'http' as indicated in the tutorial. Really useful!

  3. You're quite welcome. :)

    I looked into the http vs https issue. I believe that is a result of the configuration of the particular IPython Notebook environment. In particular, ipython_notebook_config.py may have the variable c.NotebookApp.certfile set which then enforces SSL. You can find some details on this and other parameters here: http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html

    I will make a note in the blog post regarding this.
