This guide describes how to set up a Python 2 and 3 environment and a Jupyter kernel on a remote server. Please reach out to me with any suggestions or issues! My email is firstname.lastname@example.org
I have run code almost exclusively on a Jupyter kernel set up on a remote server for nearly 4 years, both while working at Uber and now at Stanford. The benefits over running anything locally are substantial: instead of overwhelming my laptop, everything is done remotely. With most large datasets it will be truly impossible to run code locally due to memory constraints.
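First, we need a good Python environment manager. I use pyenv, along with its pyenv-virtualenv and pyenv-virtualenvwrapper plugins, which the commands below rely on.
Follow each project's installation instructions before continuing.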
The following commands should all be run from the remote server. Importantly, you do not need root access for any of them. There is a chance, however, that a package will require a library dependency that you cannot install yourself, in which case you will need to ask a root user (e.g., your university IT department) to install it.
Set up directories and add code to your .bashrc for starting pyenv on launch
    # Change directory names if desired
    mkdir ~/.ve
    mkdir ~/workspace
    echo 'export WORKON_HOME=~/.ve' >> ~/.bashrc
    echo 'export PROJECT_HOME=~/workspace' >> ~/.bashrc
    echo 'eval "$(pyenv init -)"' >> ~/.bashrc
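If you ever rerun this setup, the echo commands above will append the same lines to .bashrc a second time. Here is a minimal sketch of one way around that, assuming a hypothetical helper called append_once (my own name, not part of pyenv) that only writes a line if the file doesn't already contain it:

```shell
# append_once FILE LINE
# Hypothetical helper: add LINE to FILE unless an identical line is already there.
append_once() {
    file=$1
    line=$2
    touch "$file"   # make sure the file exists before searching it
    # -q quiet, -x match the whole line, -F treat LINE as a fixed string
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage (uncomment to apply to your own .bashrc):
# append_once ~/.bashrc 'export WORKON_HOME=~/.ve'
# append_once ~/.bashrc 'export PROJECT_HOME=~/workspace'
# append_once ~/.bashrc 'eval "$(pyenv init -)"'
```

Calling it twice with the same line leaves a single copy in the file, so the setup can be rerun safely.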
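Install one Python 3 version and one Python 2 version. The versions below are the ones I use; run pyenv install --list to see everything that is available.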
    pyenv install 3.7.4
    pyenv install 2.7.13
Set up two virtual environments, one for Python 2 and one for Python 3
    pyenv virtualenv 3.7.4 jupyter3
    pyenv virtualenv 2.7.13 ipython2
Now we will install packages in each of the two virtualenvs. This ensures that our Python 2 and Python 3 environments are separate. Soon, I'll show you how to link them such that they both appear when you launch a Jupyter kernel in the jupyter3 environment.
I use a text document, requirements.txt, to track all packages that I want installed automatically. My current one is available here.
We'll also want to expand Jupyter's capabilities with extensions from jupyter_contrib_nbextensions.
Starting with the jupyter3 environment...
    pyenv activate jupyter3
    pip install --upgrade pip
    pip install jupyter
    python -m ipykernel install --user
    pip install -r requirements.txt  # make sure this is in the same folder!
    # jupyter extensions
    pip install https://github.com/ipython-contrib/jupyter_contrib_nbextensions/tarball/master
    jupyter contrib nbextension install --user
    pip install jupyter_nbextensions_configurator
    jupyter nbextensions_configurator enable --user
    pyenv deactivate
For the ipython2 environment, we just need to install the appropriate versions of the packages. All the Jupyter goodness will be run from the jupyter3 environment.
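    # install python 2 packages
    pyenv activate ipython2
    pip install --upgrade pip
    pip install -r requirements.txt
    # register this environment as a kernel so Jupyter can offer Python 2
    pip install ipykernel
    python -m ipykernel install --user
    pyenv deactivate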
As you work and decide you need additional packages, be sure to install them in the appropriate environments.
Now we need to make our two environments play nicely with each other. This establishes the PATH priority of the environments.
pyenv global 3.7.4 2.7.13 jupyter3 ipython2
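To confirm that both kernels are visible, one option is to list the kernel specs Jupyter knows about. The wrapper below is only a sketch: list_kernels is a hypothetical name of mine, while jupyter kernelspec list is the standard Jupyter command it calls. If the Python 2 kernel is missing from the output, registering it from inside the ipython2 environment with python -m ipykernel install --user is the usual fix.

```shell
# list_kernels
# Hypothetical wrapper: print the kernels Jupyter can see, or explain why it can't.
list_kernels() {
    if ! command -v jupyter >/dev/null 2>&1; then
        echo "jupyter not found on PATH; check the pyenv setup above" >&2
        return 1
    fi
    jupyter kernelspec list
}

# Example usage:
# list_kernels
```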
Finally, we want to ensure the virtualenv wrapper starts the moment you log in to the server
    echo 'pyenv virtualenvwrapper_lazy' >> ~/.bashrc
    exec $SHELL  # restart the shell to confirm everything worked
We can now launch Jupyter! Make sure to activate the jupyter3 environment first. Since we're running this on a remote server—with no UI or browser—we need to specify that we don't want Jupyter to try to open a browser link.
    pyenv activate jupyter3
    jupyter notebook --no-browser
One issue: if we log out now, our kernel will die. If we want to leave code running overnight, for example, we'd have to leave our local computer on. This would defeat one of the main benefits of using a remote server.
Never fear, there's a solution. I use tmux; others use an alternative called screen. Tmux allows you to open a 'window' on a remote server that will persist once you log out. We want to run Jupyter in that window, so that our kernel never dies and we can always access it, even without logging back in to the server. I recommend this guide for installing and understanding tmux.
Once you have tmux installed, we make the following small change to the above code:
    tmux
    pyenv activate jupyter3
    jupyter notebook --no-browser
Now we can log out of the server, but our Jupyter notebook will keep running within the tmux window.
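If you end up with several tmux windows, giving the Jupyter one a name makes it easier to find later. Below is a rough sketch of that idea, not the only way to do it. start_jupyter_session and the default session name jupyter are my own hypothetical choices; new-session, has-session, and send-keys are standard tmux subcommands. It assumes the shell tmux opens loads your .bashrc, so that pyenv activate is available.

```shell
# start_jupyter_session [NAME]
# Hypothetical helper: start Jupyter inside a detached, named tmux session.
start_jupyter_session() {
    session=${1:-jupyter}
    # Don't start a second notebook server if the session already exists
    if tmux has-session -t "$session" 2>/dev/null; then
        echo "session '$session' already exists; attach with: tmux attach -t $session"
        return 0
    fi
    # -d starts the session detached, -s gives it a name
    tmux new-session -d -s "$session"
    # Type the same commands as above into the new session and press Enter
    tmux send-keys -t "$session" 'pyenv activate jupyter3 && jupyter notebook --no-browser' Enter
}

# Example usage:
# start_jupyter_session            # creates a session called "jupyter"
# tmux attach -t jupyter           # look at the notebook server output
```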
When you start the kernel, you want to pay attention to two things. First, the port it's hosted on. It will likely say localhost:8888. Second, the access key, which will be some long string of letters and numbers. Copy this somewhere. If you forget either of these, no worries -- you can open this tmux window again in the future and scroll up to find them.
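Jupyter prints both pieces of information as a single URL, typically of the form http://localhost:8888/?token=... (with the classic notebook, running jupyter notebook list on the server reprints it). As an illustration of which part is which, here is a small sketch that pulls the port and the key out of such a URL. parse_jupyter_url is a hypothetical helper, and the token in the example is made up.

```shell
# parse_jupyter_url URL
# Hypothetical helper: print "PORT TOKEN" for a URL like the one Jupyter shows at startup.
parse_jupyter_url() {
    url=$1
    rest=${url#*://}       # drop the scheme, e.g. localhost:8888/?token=abc123
    hostport=${rest%%/*}   # keep host and port, e.g. localhost:8888
    port=${hostport##*:}   # keep only the port, e.g. 8888
    case $url in
        *token=*) token=${url##*token=} ;;  # everything after "token="
        *)        token= ;;                 # no token in this URL
    esac
    printf '%s %s\n' "$port" "$token"
}

# Example usage:
# parse_jupyter_url 'http://localhost:8888/?token=abc123'   # prints: 8888 abc123
```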
To check on the kernel in the future, we can just log into the server and attach that tmux window:
    tmux ls        # list open tmux sessions
    tmux a -t 0    # attach to session 0
Now that we've set up a Jupyter notebook on a remote server, which will persist thanks to running it in tmux, we want to be able to access it in our local browser.
To do so, we'll use tunnels to create a, well, tunnel, from our local computer to the server. Tunnels are kind of a pain to manage. But fortunately, there's a nice solution: the SSH Tunnel Manager. Install it from that link.
Open the Tunnel Manager and click the gear icon to set up a new tunnel. Set up the tunnel as follows:
The remote port should be the port you saw when you started the Jupyter kernel. The local port is how we will access it from now on. The only requirement is that it not already be in use. The default is usually 8888, but if you are running multiple notebooks on the server then you will need to switch to 8889, and so on. For example, in my setup I use 8889 as the local port because 8888 is already in use.
Whenever you want to access your remote Jupyter notebook, you'll first just open the tunnel by clicking start. It will prompt for your password and, if necessary, 2-factor authentication.
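If you'd rather skip the GUI, plain ssh can open an equivalent tunnel from a terminal on your local computer. This is a sketch under a few assumptions: open_tunnel is a hypothetical helper name, and user@server.example.edu stands in for your own login. The -L flag forwards a local port to a port on the server, and -N tells ssh not to run a remote command, so the connection just holds the tunnel open.

```shell
# open_tunnel HOST [REMOTE_PORT] [LOCAL_PORT]
# Hypothetical helper: forward LOCAL_PORT on this machine to REMOTE_PORT on HOST.
open_tunnel() {
    host=$1
    remote_port=${2:-8888}           # the port Jupyter reported on the server
    local_port=${3:-$remote_port}    # the port you will type into your browser
    ssh -N -L "${local_port}:localhost:${remote_port}" "$host"
}

# Example usage (run on your local computer, not the server):
# open_tunnel user@server.example.edu 8888 8889
# then browse to localhost:8889; press Ctrl-C to close the tunnel
```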
In your local browser, go to localhost:8888 (or whichever port you used for the local connection). You should see a toolbar that looks like this:
If you click on 'new' on the right, you'll be able to start a new notebook in either Python 2 or Python 3.
Anytime you want to work:
Open the tunnel.
In your local browser, go to localhost:8888 (or whichever port you used for the local connection).
A few other tips:
Use htop to check on resource usage. If you're using a shared server, be a good citizen and don't overwhelm its resources!
vim is painful to learn, but incredibly powerful once you invest the time. Perhaps that will be my next guide.
Use git to manage your code so that it's easy to use your code on many servers (and locally). And back up your data, especially if you don't manage the server you're using -- don't trust that your code and data won't be deleted!
To move files to and from the server, use scp or, for an easy UI way, Cyberduck.
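For the scp tip, here is a minimal sketch of copying one file from the server down to your local computer. fetch_results is a hypothetical helper name, and the host and path in the example are placeholders rather than real locations.

```shell
# fetch_results HOST REMOTE_PATH [LOCAL_DIR]
# Hypothetical helper: copy a single file from the server to this machine.
fetch_results() {
    host=$1
    remote_path=$2
    local_dir=${3:-.}   # default to the current directory
    scp "${host}:${remote_path}" "$local_dir"
}

# Example usage (run on your local computer):
# fetch_results user@server.example.edu '~/workspace/results.csv' ~/Downloads
```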