Help other people understand your code

Even if you use Pythonic idioms, your code probably won't be perfectly understandable by itself. You want other people to be able to work with the code you write.

Docstrings

An analyst has a function that calculates the distance from a given point to the origin in three dimensions.

def distance_from_origin(x, y, z):
    return (x**2 + y**2 + z**2) ** 0.5

One option is to say that it is perfectly obvious what this function does from its name and parameters. But your functions are much more obvious to you than they are to other people. "Other people" includes future you. You do not want future you mad at current you for not explaining what your code does.

A better option is to write down explicitly what this function does, what kind of arguments you can pass to it, and what kind of value it will return. For example, you might keep this in a text file, a web page, or a Word doc. Hopefully not on a sticky note on your monitor, but even that's better than nothing. Something like:

Calculates the distance from a given point in three dimensions to the origin (0, 0, 0). 

Args: 
    x (float): The x-axis coordinate.
    y (float): The y-axis coordinate.
    z (float): The z-axis coordinate.

Returns:
    float: The distance.

That works OK, but separating your code from your documentation forces people to look in two places. It also means that the built-in help function is mostly useless for learning about your function.

help(distance_from_origin)

A better way to document your code is to include the information as a docstring. You can use docstrings with modules, functions, classes, and methods that you create.

def distance_from_origin_docstring(x, y, z):
    """
    Calculates the distance from a given point in three dimensions to the origin (0, 0, 0). 

    Args: 
        x (float): The x-axis coordinate.
        y (float): The y-axis coordinate.
        z (float): The z-axis coordinate.

    Returns:
        float: The distance.
    """

    return (x**2 + y**2 + z**2) ** 0.5

When you include a docstring, people can use the built-in help function to see the information without having to open the source code file.

help(distance_from_origin_docstring)

Many IDEs will even show the information when you hover over the function name.
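
The same approach extends to classes and methods. The sketch below is a minimal illustration, not part of the original example: the Point3D class is hypothetical. It uses inspect.getdoc from the standard library to retrieve the docstring text that the help function draws on.

```python
import inspect


class Point3D:
    """A point in three-dimensional space (hypothetical example class)."""

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def distance_from_origin(self):
        """
        Calculates the distance from this point to the origin (0, 0, 0).

        Returns:
            float: The distance.
        """
        return (self.x**2 + self.y**2 + self.z**2) ** 0.5


# inspect.getdoc returns the docstring with its indentation cleaned up.
class_doc = inspect.getdoc(Point3D)
method_doc = inspect.getdoc(Point3D.distance_from_origin)
print(class_doc)
print(method_doc)
```

A module docstring works the same way: it is a string literal placed at the very top of the .py file.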

Type hints

An analyst tries using the distance_from_origin_docstring function but gets an error.

coordinates = [2, 5, 4]
distance = distance_from_origin_docstring(*coordinates)
info_string = "The point is " + distance + " meters from the origin"
print(info_string)

The error is reasonably informative, and the analyst can use it to fix their code. But the problem only showed up after the analyst ran the code. It would be nice to get that information beforehand. Type hints are a way to pass information to type checkers and IDEs that can help ensure that you're using the correct types, without having to actually run the code.

def distance_from_origin_typehints(x: float, y: float, z: float) -> float:
    """
    Calculates the distance from a given point in three dimensions to the origin (0, 0, 0). 

    Args: 
        x (float): The x-axis coordinate.
        y (float): The y-axis coordinate.
        z (float): The z-axis coordinate.

    Returns:
        float: The distance.
    """

    return (x**2 + y**2 + z**2) ** 0.5

If the analyst had used this function, type checkers like Mypy would have flagged the attempt to concatenate distance, a float, with a string. Then the analyst could have corrected their code before running it and seeing the error.

coordinates = [2, 5, 4]
distance = distance_from_origin_typehints(*coordinates)
info_string = "The point is " + distance + " meters from the origin"
print(info_string)
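
Once the problem is flagged, the fix is small. The sketch below shows two ways the analyst might correct the code: an f-string, or an explicit str conversion. The function is repeated with a shortened docstring so the block runs on its own, and the info_string_concat name is introduced here only for illustration.

```python
def distance_from_origin_typehints(x: float, y: float, z: float) -> float:
    """Calculates the distance from a point to the origin (0, 0, 0)."""
    return (x**2 + y**2 + z**2) ** 0.5


coordinates = [2, 5, 4]
distance = distance_from_origin_typehints(*coordinates)

# Option 1: an f-string converts the float to text for you.
info_string = f"The point is {distance} meters from the origin"

# Option 2: convert the float to a string before concatenating.
info_string_concat = "The point is " + str(distance) + " meters from the origin"

print(info_string)
```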

Type hints are well-named. They do not force you to use the right types. They will not cause Python to throw an error if you use the wrong types. They give you a hint that you are not using a value correctly.

For example, the distance_from_origin_typehints function still executes without an error when you pass it a complex number as an argument, even though a complex is not a float.

coordinates = [2j, 5, 4]
distance = distance_from_origin_typehints(*coordinates)
info_string = f"The point is {distance} meters from the origin"
print(info_string)
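
A smaller illustration of the same point, using a hypothetical double function that is not part of the original example: Python stores the hints on the function object but does not check them when the function is called.

```python
def double(value: int) -> int:
    """Returns the value multiplied by two (hypothetical example)."""
    return value * 2


# The hints are stored on the function, but Python does not enforce them.
print(double.__annotations__)

# A string is not an int, yet the call runs: "ab" * 2 repeats the string.
result = double("ab")
print(result)
```

A type checker would flag the double("ab") call, but nothing stops it from running.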

Type hints can also describe more complex types, such as when you need a particular kind of container and also need to specify the type of the values inside it. The abstract container types in collections.abc cover the common cases:

Iterable is for containers that you want to use in a for loop.
Sequence is an Iterable that lets you know the length and access an element by index.
MutableSequence is a Sequence that you might need to change.
Mapping is for dictionary-like objects where you want to get values by key.
MutableMapping is a Mapping that you might need to change.
If you know you want a specific type, you can also directly use dict, list, tuple, etc.
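
As a rough sketch of how the choice between these types plays out, the two hypothetical functions below (first_coordinate and distance_by_name are illustrative names, not part of the original example) ask only for what they need: indexing in one case, lookup by key in the other. Subscripting the collections.abc types this way assumes Python 3.9 or later.

```python
from collections.abc import Mapping, Sequence


def first_coordinate(coords: Sequence[float]) -> float:
    """Returns the first coordinate; indexing is needed, so Sequence fits."""
    return coords[0]


def distance_by_name(points: Mapping[str, Sequence[float]], name: str) -> float:
    """Looks up a point by name and returns its distance from the origin."""
    coords = points[name]
    return sum(value**2 for value in coords) ** 0.5


# A dict satisfies Mapping; tuples and lists both satisfy Sequence.
points: dict[str, Sequence[float]] = {"a": (3.0, 4.0), "b": [1.0, 2.0, 2.0]}
print(first_coordinate(points["a"]))
print(distance_by_name(points, "b"))
```

Asking for the least specific type that still works lets callers pass a tuple, a list, or any other compatible container.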

The code below refactors the distance function to use a single parameter and calculates the distance in any number of dimensions.

from collections.abc import Iterable

def n_dimension_distance_from_origin(coords: Iterable[float]) -> float:
    """
    Calculates the distance from a given n-dimensional point to the origin. 

    Args: 
        coords (Iterable[float]): 
            An iterable of coordinate values, one for each dimension.

    Returns:
        float: The distance.
    """

    sum_of_squares = sum(d ** 2 for d in coords)
    return sum_of_squares ** 0.5

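# Tuples and lists of numbers both satisfy Iterable[float].
n_dimension_distance_from_origin((1, 1, 1, 1))
n_dimension_distance_from_origin([1, 1, 1, 1])
# A type checker flags this call because the values are strings; running it raises a TypeError.
n_dimension_distance_from_origin(("1", "1", "1", "1"))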

Exercises

The exercises below invite you to practice applying the different strategies outlined above. They follow the order of the concepts presented, but you can attempt them in any order. Start with the ones that seem most applicable to the work you need to do.

You can find example answers in the ExerciseAnswers.ipynb notebook.

1) Use type hints

Determine the input and output types of the calculate_area function below, then add type hints.

Hint: The correct type for vertices is a compound type (not Python's complex number type). vertices is passed to the cycle function and used in a for loop, which means you need to be able to loop over it. The items inside vertices must have both an x and a y attribute, and you need to be able to do arithmetic using the values of those attributes.

from itertools import cycle
from typing import NamedTuple

class Vertex(NamedTuple):
    x: float
    y: float


def calculate_area(vertices):
    subtotals = []
    vertex_cycle = cycle(vertices)
    next(vertex_cycle)
    for vertex in vertices:
        next_vertex = next(vertex_cycle)
        subtotal = vertex.x * next_vertex.y - vertex.y * next_vertex.x
        subtotals.append(subtotal)
    area = abs(sum(subtotals) / 2)
    return area

vertices = (Vertex(4, 10), Vertex(9, 7), Vertex(11, 2), Vertex(2, 2))
calculate_area(vertices)

2) Add a docstring to a function

Determine what the calculate_area function does, then add a docstring.

The examples above use Google-style docstrings, which is a common standard. You may also want to look at other common formats.

Hint: It is not actually necessary to understand the shoelace algorithm implemented by this function. You can still write an excellent docstring explaining what it does and how to use it.
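
As one example of another common format, the sketch below documents the distance function from earlier using a NumPy-style docstring instead of Google style. The function name distance_from_origin_numpy_style is hypothetical and chosen only for this comparison.

```python
def distance_from_origin_numpy_style(x, y, z):
    """
    Calculate the distance from a point in three dimensions to the origin.

    Parameters
    ----------
    x : float
        The x-axis coordinate.
    y : float
        The y-axis coordinate.
    z : float
        The z-axis coordinate.

    Returns
    -------
    float
        The distance.
    """
    return (x**2 + y**2 + z**2) ** 0.5


print(distance_from_origin_numpy_style(2, 3, 6))
```

The information is the same in both styles; only the layout differs, so pick one format and use it consistently within a project.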