Jupyter is an amazing interactive environment that every developer should know about!
It should not be a surprise that Jupyter received an ACM Software System Award in 2018,
or that Nature explained "Why Jupyter is data scientists’ computational notebook of choice".
However, some users found that Jupyter does not encourage best coding practices:
- I don't like notebooks, Joel Grus (Allen Institute for Artificial Intelligence)
- Jupyter Notebooks Made Me A Lazy Programmer, and not in the right way
How can we write good code with Jupyter and integrate it with the rest of the Python ecosystem?
In this article, I present a simple solution to generate and reload Python code from Jupyter notebooks.
I also explain the pros and cons of using this technique compared to other development environments.
- nbconvert: a utility that converts Jupyter notebooks (.ipynb) to other formats (.py, .pdf, .html, ...)
- jupyterlab: an editing environment for Jupyter notebooks with additional features (tabs, settings, ...)
You can install these requirements with the following command:
$ pip3 install --user nbconvert jupyterlab
$ cookiecutter https://git.fmind.me/fmind/cookiecutter-python-lab
Find out more about organizing a Python project with a layered approach in my previous article.
The goal of this configuration is to generate a Python file from Jupyter whenever a notebook file is saved.
For that purpose, we create a post-save hook that calls nbconvert on the notebook (if its extension is .py.ipynb).
The hook automatically replaces the .py.ipynb extension with .py, lowercases the file name, and makes the Python file executable with chmod u+x.
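# In: jupyterlab.py
import os
import subprocess

from traitlets.config import get_config

c = get_config()


def post_save_hook(model, os_path, contents_manager):
    """Convert a saved .py.ipynb notebook to an executable Python file."""
    cwd, name = os.path.split(os_path)
    # Only handle notebooks that opt in through the .py.ipynb extension.
    if model["type"] == "notebook" and ".py.ipynb" in os_path:
        # Example: Foo.py.ipynb -> foo.py
        output = name.replace(".ipynb", "").lower()
        subprocess.check_call(
            ["jupyter", "nbconvert", "--to", "python", "--output", output, name],
            cwd=cwd,
        )
        # Make the generated file executable for the current user.
        subprocess.check_call(["chmod", "u+x", output], cwd=cwd)


# Register the hook so that it runs after every save.
c.FileContentsManager.post_save_hook = post_save_hook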
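To make the renaming rule easier to check in isolation, here is a minimal sketch of it as a standalone function. The helper python_output_name is hypothetical and not part of the hook above. It also assumes a slightly stricter test (the name must end with .py.ipynb) than the substring check used in the hook.

```python
from typing import Optional


def python_output_name(notebook_name: str) -> Optional[str]:
    """Return the generated file name for a .py.ipynb notebook, or None.

    Hypothetical helper: it mirrors the renaming done in the hook, but uses
    endswith() instead of the hook's substring test.
    """
    if not notebook_name.endswith(".py.ipynb"):
        return None  # regular notebooks are left alone
    # Drop the trailing ".ipynb" and lowercase the result, as the hook does.
    return notebook_name[: -len(".ipynb")].lower()


print(python_output_name("Train.py.ipynb"))  # train.py
print(python_output_name("notes.ipynb"))  # None
```

Because of the lowercasing step, a notebook named Train.py.ipynb produces a module that is imported as train.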
To load this configuration with Jupyterlab, use the following command:
$ jupyter lab --config=jupyterlab.py
Now that we can generate Python files from notebooks, we can import code from other notebooks.
However, code imported this way is not automatically reloaded when its content changes.
To illustrate the problem, here is a small demonstration using IPython on a simple Python module:
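The same behavior can be reproduced in plain Python. The sketch below assumes a hypothetical module named greet_demo, written to a temporary directory: the module is imported, its file is modified, and a second import still returns the old value because Python reuses the module cached in sys.modules.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical module name and location, used only for this demonstration.
workdir = pathlib.Path(tempfile.mkdtemp())
module_file = workdir / "greet_demo.py"
module_file.write_text('MESSAGE = "hello"\n')

# Make the temporary directory importable.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import greet_demo

first = greet_demo.MESSAGE

# Simulate saving a new version of the file (for instance from a notebook).
module_file.write_text('MESSAGE = "hello again"\n')

# A second import returns the module cached in sys.modules:
# the file is not read again.
import greet_demo

second = greet_demo.MESSAGE

print(first, "|", second)  # hello | hello
```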
To automatically reload Python modules, you must create a configuration file that enables one of IPython's extensions:
# In: ~/.ipython/profile_default/ipython_config.py
c.InteractiveShellApp.extensions = ["autoreload"]
c.InteractiveShellApp.exec_lines = ["%autoreload 2"]
We can observe that the module is now reloaded as expected in IPython (and by extension in Jupyterlab):
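Outside IPython, a similar effect can be obtained by hand with importlib.reload from the standard library. The sketch below only illustrates the idea and does not show how the autoreload extension is implemented. The module name reload_demo is hypothetical.

```python
import importlib
import pathlib
import sys
import tempfile

# Skip bytecode caching so the reload always recompiles the source file.
sys.dont_write_bytecode = True

# Hypothetical module name and location, used only for this demonstration.
workdir = pathlib.Path(tempfile.mkdtemp())
module_file = workdir / "reload_demo.py"
module_file.write_text('MESSAGE = "hello"\n')

# Make the temporary directory importable.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import reload_demo

before = reload_demo.MESSAGE

# Simulate saving a new version of the file (for instance from a notebook).
module_file.write_text('MESSAGE = "hello again"\n')

# Re-execute the module source in place; existing references to the
# module object now see the new value.
importlib.reload(reload_demo)
after = reload_demo.MESSAGE

print(before, "|", after)  # hello | hello again
```

With the configuration above, IPython takes care of this step before running each cell, so no explicit reload call is needed in the notebook.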
- You benefit from the main features of the Jupyter environment: interactivity and simplicity.
- Python files can be tested, linted, formatted and packaged with regular Python tools.
- This style encourages literate programming to explain code with rich formatting.
- Jupyterlab (still) lacks some modern features compared to other development environments.
- You cannot edit the Python files directly, as they will be overwritten when you save a notebook.
- The use of notebooks is not common and may reduce engagement with your project.
If you love the Jupyter development workflow and wish it were more Pythonic, then I hope this approach will help.
Interestingly, the Jupytext project chose an alternative approach where notebooks are edited as plain text files.
You can find an implementation of this technique in gampy, an exploratory project about pipeline composition.