Help other people understand your code¶
Even if you use Pythonic idioms, your code probably won't be perfectly understandable by itself. You want other people to be able to work with the code you write.
Docstrings¶
An analyst has a function that calculates the distance from a given point to the origin in three dimensions.
def distance_from_origin(x, y, z):
    return (x**2 + y**2 + z**2) ** 0.5
One option is to say that it is perfectly obvious what this function does from its name and parameters. But your functions are much more obvious to you than they are to other people. "Other people" includes future you. You do not want future you mad at current you for not explaining what your code does.
A better option is to write down explicitly what this function does, what kind of arguments you can pass to it, and what kind of value it will return. For example, you might keep that description in a text file, on a web page, or in a Word doc. Hopefully not on a sticky note on your monitor, but even that's better than nothing. Something like:
Calculates the distance from a given point in three dimensions to the origin (0, 0, 0).
Args:
x (float): The x-axis coordinate.
y (float): The y-axis coordinate.
z (float): The z-axis coordinate.
Returns:
float: The distance.
That works OK, but separating your code from your documentation forces people to look in two places. It also means that the built-in help function is mostly useless for learning about your function.
help(distance_from_origin)
A better way to document your code is to include the information as a docstring. You can use docstrings with modules, functions, classes, and methods that you create.
def distance_from_origin_docstring(x, y, z):
    """
    Calculates the distance from a given point in three dimensions to the origin (0, 0, 0).

    Args:
        x (float): The x-axis coordinate.
        y (float): The y-axis coordinate.
        z (float): The z-axis coordinate.

    Returns:
        float: The distance.
    """
    return (x**2 + y**2 + z**2) ** 0.5
By including a docstring, people can use the built-in help function to see the information without having to open the source code file.
help(distance_from_origin_docstring)
Many IDEs will even show the information when you hover over the function name.
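Docstrings are stored on the object itself, which is why help can find them. The sketch below uses a hypothetical Point class (not part of the original example) to show a class docstring and a method docstring, then reads them back through the __doc__ attribute and inspect.getdoc.

```python
import inspect


class Point:
    """A point in three-dimensional space (hypothetical example class)."""

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def distance_from_origin(self):
        """
        Calculates the distance from this point to the origin (0, 0, 0).

        Returns:
            float: The distance.
        """
        return (self.x**2 + self.y**2 + self.z**2) ** 0.5


# The raw docstring is available as the __doc__ attribute.
print(Point.__doc__)

# inspect.getdoc returns the docstring with its indentation cleaned up.
print(inspect.getdoc(Point.distance_from_origin))
```

Calling help(Point) would display both docstrings together, along with the method signatures.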
Type hints¶
An analyst tries using the distance_from_origin_docstring function, but gets an error:
coordinates = [2, 5, 4]
distance = distance_from_origin_docstring(*coordinates)
info_string = "The point is " + distance + " meters from the origin"
print(info_string)
The error is reasonably informative, and the analyst can use it to fix their code. But the problem only showed up after the analyst ran the code. It would be nice to get that information beforehand. Type hints are a way to pass information to type checkers and IDEs that can help ensure that you're using the correct types, without having to actually run the code.
def distance_from_origin_typehints(x: float, y: float, z: float) -> float:
    """
    Calculates the distance from a given point in three dimensions to the origin (0, 0, 0).

    Args:
        x (float): The x-axis coordinate.
        y (float): The y-axis coordinate.
        z (float): The z-axis coordinate.

    Returns:
        float: The distance.
    """
    return (x**2 + y**2 + z**2) ** 0.5
If the analyst had used this function, type checkers like Mypy would have flagged the use of distance in the string concatenation as an error, because the function is declared to return a float rather than a str. Then the analyst could have corrected their code before running it and seeing the error.
coordinates = [2, 5, 4]
distance = distance_from_origin_typehints(*coordinates)
info_string = "The point is " + distance + " meters from the origin"
print(info_string)
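One way the analyst might correct the code is to convert the distance to text before joining it to the other strings, for example with an f-string. This is a sketch of one possible fix, not the only one. The function is repeated without its docstring so that the block runs on its own, and the two-decimal formatting is an assumption made for readability.

```python
def distance_from_origin_typehints(x: float, y: float, z: float) -> float:
    return (x**2 + y**2 + z**2) ** 0.5


coordinates = [2, 5, 4]
distance = distance_from_origin_typehints(*coordinates)

# An f-string formats the float as text, so no str + float concatenation occurs.
info_string = f"The point is {distance:.2f} meters from the origin"
print(info_string)
```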
Type hints are well-named. They do not force you to use the right types. They will not cause Python to throw an error if you use the wrong types. They give you a hint that you are not using a value correctly.
For example, the distance_from_origin_typehints function still executes without an error when you pass it a complex number as an argument, even though a complex is not a float.
coordinates = [2j, 5, 4]
distance = distance_from_origin_typehints(*coordinates)
info_string = f"The point is {distance} meters from the origin"
print(info_string)
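The hints are stored on the function as metadata rather than enforced. A minimal sketch, again repeating the function without its docstring so it runs on its own, shows where the hints live and confirms that a call with a complex number returns a complex result despite the declared return type:

```python
from typing import get_type_hints


def distance_from_origin_typehints(x: float, y: float, z: float) -> float:
    return (x**2 + y**2 + z**2) ** 0.5


# The hints are kept as plain metadata on the function object.
hints = get_type_hints(distance_from_origin_typehints)
print(hints)

# Python does not check the hints at call time, so this runs without an error
# and returns a complex number despite the "-> float" hint.
result = distance_from_origin_typehints(2j, 5, 4)
print(type(result))
```

A static type checker reads the same hints and would be expected to report the complex argument before the code is ever run.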
Type hints can also describe more complex types, such as a particular kind of container together with the type of the values inside it.
- Iterable is for containers that you want to use in a for loop.
- Sequence is an Iterable that lets you know the length and access an element by index.
- MutableSequence is a Sequence that you might need to change.
- Mapping is for dictionary-like objects where you want to get values by key.
- MutableMapping is a Mapping that you might need to change.
- If you know you want a specific type, you can also directly use dict, list, tuple, etc.
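As a rough illustration of how the choice of container type follows from what a function does with its argument, the sketch below uses two hypothetical helper functions (middle_value and total_weight are invented for this example). The first needs a length and index access, so it asks for a Sequence. The second only reads values by key, so it asks for a Mapping.

```python
from collections.abc import Mapping, Sequence


def middle_value(values: Sequence[float]) -> float:
    """Returns the element at the middle index of a sequence."""
    # len() and indexing are both part of the Sequence interface.
    return values[len(values) // 2]


def total_weight(weights: Mapping[str, float], names: Sequence[str]) -> float:
    """Sums the weights looked up by name."""
    # Only key lookup is needed, so any Mapping (not just dict) is acceptable.
    return sum(weights[name] for name in names)


print(middle_value([1.0, 2.0, 3.0]))
print(middle_value((4.0, 5.0, 6.0, 7.0)))
print(total_weight({"a": 1.5, "b": 2.5}, ["a", "b"]))
```

Asking for the least specific type that supports what the function needs lets callers pass a list or a tuple to middle_value without the type checker complaining.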
The code below refactors the distance function to use a single parameter and calculates the distance in any number of dimensions.
from collections.abc import Iterable


def n_dimension_distance_from_origin(coords: Iterable[float]) -> float:
    """
    Calculates the distance from a given n-dimensional point to the origin.

    Args:
        coords (Iterable[float]):
            An iterable of coordinate values, one for each dimension.

    Returns:
        float: The distance.
    """
    sum_of_squares = sum(d ** 2 for d in coords)
    return sum_of_squares ** 0.5


n_dimension_distance_from_origin((1, 1, 1, 1))
n_dimension_distance_from_origin([1, 1, 1, 1])
# A type checker flags this call because the elements are str, not float.
# At runtime it raises a TypeError.
n_dimension_distance_from_origin(("1", "1", "1", "1"))
Exercises¶
The exercises below invite you to practice applying the different strategies outlined above. They follow the order of the concepts presented, but you can attempt them in any order. Start with the ones that seem most applicable to the work you need to do.
You can find example answers in the ExerciseAnswers.ipynb notebook.
1) Use type hints¶
Determine the input and output types of the calculate_area function below, then add type hints.
Hint: The correct type for vertices is complicated. It is passed to the cycle function, which means you need to be able to loop over it. The containers inside vertices must have both an x and a y attribute, and you need to be able to do arithmetic using the values of those attributes.
from itertools import cycle
from typing import NamedTuple


class Vertex(NamedTuple):
    x: float
    y: float


def calculate_area(vertices):
    subtotals = []
    vertex_cycle = cycle(vertices)
    next(vertex_cycle)
    for vertex in vertices:
        next_vertex = next(vertex_cycle)
        subtotal = vertex.x * next_vertex.y - vertex.y * next_vertex.x
        subtotals.append(subtotal)
    area = abs(sum(subtotals) / 2)
    return area


vertices = (Vertex(4, 10), Vertex(9, 7), Vertex(11, 2), Vertex(2, 2))
calculate_area(vertices)
2) Add a docstring to a function¶
Determine what the calculate_area function does, then add a docstring.
The examples above use Google-style docstrings, which is a common standard. You may also want to look at other common formats.
Hint: It is not actually necessary to understand the shoelace algorithm implemented by this function. You can still write an excellent docstring explaining what it does and how to use it.