Formal Interactive Notebook Testing

This post is for notebook authors who need to ship software, and anyone who wants to start learning more about testing in Python. We'll focus on Python's standard library unit testing tools interactively from the notebook. There are many flavors of testing, but our focus remains on unit testing to provide quantitative metrics about the fitness of program. Readers will leave understanding how to prototype and simulate writing unit tests in interactive notebooks.

A common motivation for using notebooks is to test an idea. Without formal conventions, notebooks can result in scatter-shot code that informally verifies an idea. In this post, we discuss how to mature informal notebooks into formal unit test conventions. With practice, an effective use of the notebook is to compose both code and formal tests that can be moved into your project's module and testing suite; you do have tests don't you?

Why Are Formal Tests Valuable

Tests are investments, and testing over time measures the return on investment. Testing promotes:

  • longevity of ideas

  • protection from upstream changes

  • value to you and consumers of your software

  • health metrics when used in continuous deployment

Learn more about the motivation for testing The Hitchhiker's Guide to Python - Testing your code

Testing is Standard

Most programming languages come with unit testing abilities that allow authors to make formal assertions about the expectations of their code. In Python, doctest and unittest are built-in libraries that enable testing; meanwhile, pytest is the choice of popular projects in the broader Python community. You will not need extra dependencies besides a notebook interface and Python to apply the ideas from this post.

We will not discuss testing notebooks in pytest in this document, but if you want to read ahead you can look at nbval, importnb, or testbook for different flavors of notebook testing.

Python Doctest

Documentation-driven testing was introduced into Python in 1999. It introduced the ability to combine code and narrative into docstrings following a long lineage of literate programming concepts.

Anatomy of a Doctest

The goal of a doctest is to compare the execution of a line_of_code with an expected_result. When a doctest is executed it executes the line_of_code and generates a test_result. When the test_result and the expected_result are the same a test has passed, otherwise, a test has failed.

The list below demonstrates some forms of doctests in pseudocode representation.

  • a doctest with a single line of code and an expected result

>>> { line_of_code }  
{ expected_result }
  • a doctest with multiple lines of code and an expected result

>>> { line_of_code }  
... { line_of_code }  
... { line_of_code }  
{ expected_result }
  • a doctest with multiple lines of code that prints no output

>>> { line_of_code }  
... { line_of_code }

import doctest

We like doctest because it is the easiest way to run tests in notebooks. Below is a concrete example of a doctest:

  • line 4-6 represent the line_of_code

  • line 7 is the expected_result

def a_simple_function_with_a_doctest(x):
    """this function turns every input into its string representation. 
    >>> a_simple_function_with_a_doctest(
    ...      1
    ... )
    return str(x)

The easy invocation of a doctest in the notebook is what makes it the easiest test tool to use.

TestResults (failed=0, attempted=1)

When we invoke doctest.testmod it will look for all the functions and classes in our module that have docstrings and doctest examples. As a result, the previous example finds one test. The test results summarize the test's success and failures. If the test_result (the execution of the line_of_code) matches the expected_result our tests succeed, otherwise they fail. When tests fail, we'll want to inspect both our tests and source code to discover the source of the failure.

Learn more about doctest discovery

Here we add a new class with another doctest.

class ASimpleClassWithADoctest(str):
    """this type turns every input into its string type.

    >>> ASimpleClassWithADoctest(1) 

When we re-run the doctests we notice that another test was discovered.

TestResults ( failed=0, attempted=2 )

There is one more way that doctest discovers tests, which is through the __test__ variable. We can make a __test__ dictionary with keys that name the tests, and values that are objects holding doctest syntaxes.

__test__ = dict(
    a_test_without_a_function=""">>> assert a_simple_function_with_a_doctest(1) == ASimpleClassWithADo
{'a_test_without_a_function': '>>> assert a_simple_function_with_a_doctest(1) == ASimpleClassWithADoctest (1)'}

Now the doctest finds three tests.

TestResults (failed=0, attempted=3)

Unittest When Doctest Isn't Enough

import unittest

Doctest are the easiest to invoke, but they can be difficult to write for some tests you may wish to run. Python provides the alternative unittest library that allows authors to write tests in pure Python, rather than strings. This approach to writing tests will be more familiar to new Python learners.

Doctest relies on the comparison of an expected_result and a test_result whereas unittest provides an extended interface for comparing items using their list of assertion methods.

Learn more about the relationship between doctest and unittest The Hitchhiker's Guide to Python - Testing your code

Python unit tests subclass the unittest.TestCase type.

class UnitTests(unittest.TestCase):
    def test_simple_methods(self):

Running unittest in the notebook requires some keyword arguments that we'll wrap in a function to facilitate our discussion.

def run_unittest():
    unittest.main(argv=[""], exit=False)

When we invoke run_unittest we notice our test is discovered.

Ran 1 test in 0.001s  


We've already written three other tests though. Wouldn't it be nice to include our doctests in the test suite too?! If you dig deep into the doctest documentation you'll find the unittest interface. It demonstrates that including the load_tests function in the namespace means that unittest will know to discover our doctests.

It may appear that we are using multiple testing forms by combining doctest and unittest. It turns out that a doctest is a special test case that compares the output of string values. With experience, authors will find that some tests are easier to write as doctest and others are easier as unittest assertions.

def load_tests(loader, tests, ignore):     
    return tests

Now run_unittest discovers 4 tests, including our doctest defined earlier.

Ran 4 tests in 0.008s  


Restart and Run All, or it Didn't Happen

Now you see how to simulate running formal tests against your interactive notebook code. The notebook is a means, not an end. Now it is time to copy your module code and new tests into your project!

For posterity though, it is helpful that a notebook can restart and run all. Better yet, your notebook can restart and run all... with testing in the last cell! And, this specific document abides as the prior code cell is the last cell, and a test! We now have a little more confidence that this notebook could work in the future, or at least verify that it still works.

When you get to the last cell with no errors, it is time to celebrate a notebook well written.


  • We can simulate unit testing in the notebook.

  • Doctests run tests in strings and docstrings.

  • Unittests run tests on objects and may include doctest.

Testing is a good practice. It helps formalize the scientific method when code is involved. Being able to simulate tests in the notebook, doctest and unittest help expedite the test writing process by taking advantage of the rich interactive features of notebooks. When we write tests we record ideas so that our future selves can thank us for our past practice.

Post Script: Running a Formal Testing Suite

It seems helpful to illustrate how near we are too formal tests. Jupyter notebooks simulate different document forms, and we can export this notebook as a Python script. We can formally run the exported script with unittest. These steps, for this specific document, are:

  1. Convert the notebook to a Python script using Jupyter's nbconvert tool. !jupyter nbconvert --to script 2021-testing-in-notebooks.ipynb

  2. Run the newly generated script against the unittest command line interface with any extra parameters you may want to set. !python -m unittest -v

Note: ! is an IPython feature that executes system commands.

93 views0 comments